“Open leaderboards like the OpenLLM Leaderboard are susceptible to models overfitting the benchmarks, which may not reflect true general capabilities.”
Sonic AI