“The primary difference in training OpenAI's O3 model compared to foundational models is the use of reinforcement learning to solve difficult tasks.”