“Reinforcement learning (RL) was scaled up in projects such as DeepMind’s AlphaGo and AlphaStar and OpenAI’s Dota 2 system, OpenAI Five.”