Andrej Karpathy characterizes standard reinforcement learning as "sucking supervision through a s..., Sonic AI
“Andrej Karpathy characterizes standard reinforcement learning as "sucking supervision through a straw" because it derives a single reward signal from a long, complex trajectory of actions, making it highly inefficient.”
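
To make the inefficiency concrete, here is a minimal sketch, assuming a plain outcome-only setup with no baseline, no discounting, and no intermediate rewards. The function names and the 1000-step trajectory are hypothetical, chosen only for illustration. It shows that one terminal reward gets broadcast unchanged to every action, so no individual step receives feedback of its own.

```python
from typing import List


def outcome_only_credit(num_steps: int, final_reward: float) -> List[float]:
    """Assign the single terminal reward to every step of the trajectory.

    This mirrors an outcome-only policy-gradient update (an assumption of
    this sketch): each action is weighted by the same scalar, whether or
    not that action actually contributed to the outcome.
    """
    return [final_reward] * num_steps


def feedback_scalars_per_action(num_steps: int) -> float:
    """One reward scalar spread over num_steps actions."""
    return 1.0 / num_steps


# Hypothetical trajectory: 1000 actions, only the final outcome is scored.
credits = outcome_only_credit(num_steps=1000, final_reward=1.0)

print(len(set(credits)))                  # 1 (every step gets the same credit)
print(feedback_scalars_per_action(1000))  # 0.001
```

In this sketch a good action and a bad action inside the same successful trajectory are reinforced equally, which is the sense in which the supervision arrives "through a straw".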