Sonic AI
“After large language models achieved high performance on Francois Chollet's ARC v1 benchmark, their performance on the subsequent ARC v2 benchmark dropped to nearly 0%.”