“AI labs are now reportedly spending more compute on Reinforcement Learning (RL) than on pre-training.”