“OpenAI's internal evaluations show that GPT-4.1 reduced the rate of irrelevant code edits to 2%, down from 9% with GPT-4o.”