Sonic AI
“On the Terminal Bench 2 coding benchmark, which separates performance by model and agent harness, Anthropic's Claude Code is not the top-performing agent.”