“Reinforcement Learning (RL) is the main method used to improve AI models by generating synthetic data from large amounts of compute.”