On the multilingual version of the SWE-bench benchmark, AI agent performance drops significantly ..., Sonic AI
