Reinforcement learning can be used to evaluate training data quality by revealing backdoors or ex..., Sonic AI
“Reinforcement learning can be used to evaluate training data quality by revealing backdoors or exploits in an environment when a model learns to ‘game the system.’”
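
A minimal sketch of the idea in the quote, under stated assumptions: the corridor environment, its wall-bump reward flaw, and every name below (`step`, `train`, `greedy_return`, `INTENDED_MAX_RETURN`) are hypothetical and made up for illustration, not taken from the source. A tabular Q-learning agent is trained on the flawed environment, and its greedy return is compared with the most reward the designer meant an episode to pay out. A return above that ceiling suggests the agent is gaming the environment rather than solving the task.

```python
import random

N_STATES = 5               # corridor cells 0..4; reaching cell 4 is the intended goal
START, GOAL = 1, 4
MAX_STEPS = 30             # episode time limit
INTENDED_MAX_RETURN = 1.0  # the designer expects one goal reward per episode
LEFT, RIGHT = 0, 1


def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    if action == LEFT:
        if state == 0:
            # Hypothetical flaw: bumping the left wall pays a small reward
            # and never ends the episode.
            return 0, 0.2, False
        return state - 1, 0.0, False
    nxt = state + 1
    if nxt == GOAL:
        return nxt, 1.0, True
    return nxt, 0.0, False


def greedy(q, state):
    """Pick the action with the higher Q-value (ties go to LEFT)."""
    return LEFT if q[state][LEFT] >= q[state][RIGHT] else RIGHT


def train(episodes=500, alpha=0.5, gamma=0.95, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = START
        for _ in range(MAX_STEPS):
            if rng.random() < epsilon:
                action = rng.choice((LEFT, RIGHT))
            else:
                action = greedy(q, state)
            nxt, reward, done = step(state, action)
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def greedy_return(q):
    """Undiscounted return of one episode under the greedy policy."""
    state, total = START, 0.0
    for _ in range(MAX_STEPS):
        state, reward, done = step(state, greedy(q, state))
        total += reward
        if done:
            break
    return total


q_table = train()
learned_return = greedy_return(q_table)
# A return above the designed ceiling hints at an exploit in the environment.
exploit_suspected = learned_return > INTENDED_MAX_RETURN
print(f"greedy return = {learned_return:.1f}, exploit suspected = {exploit_suspected}")
```

The check relies on knowing an upper bound on the intended return. In a real environment that bound may not be available, so a flag like this is a prompt to inspect the agent's trajectories, not proof of a flaw.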