Misha Laskin asserts that reward models for vision or other sensory signals in robotics are signi..., Sonic AI
“Misha Laskin asserts that reward models for vision or other sensory signals in robotics are significantly more prone to being hacked by RL agents than reward models for language.”