“The recent introduction of reinforcement learning fine-tuning allows customers to improve a model to state-of-the-art performance on a specific use case, a significant advance over supervised fine-tuning which primarily adjusted tone and instruction following.”