“Current AI models score only about 26% on the "Humanity's Last Exam" benchmark, indicating that it remains a significant challenge.”