“The VBench benchmark, created by a team at Princeton, consists of approximately 2,200 issue-pull request pairs from 12 open-source Python repositories and is used to measure the performance of AI coding agents.” (Sonic AI)