“The performance gap between the best frontier AI models on software engineering benchmarks has narrowed by more than 50% over the past 12 months.”