The "Humanity's Last Exam" benchmark, which collects challenging, closed-answer research problems..., Sonic AI
“The "Humanity's Last Exam" benchmark, which collects challenging, closed-answer research problems from professors, is designed to mark the end of the utility of exam-like evaluations for AI models.”