Cursor's Composer 2.5 model is trained using targeted reinforcement learning with textual feedbac..., Sonic AI
“Cursor's Composer 2.5 model is trained using targeted reinforcement learning with textual feedback to solve the credit assignment problem in long AI agent trajectories.”