“There is a colloquial sense that Chinese AI models perform better on standard benchmarks than on truly held-out, novel problems, suggesting potential benchmark optimization.”
Sonic AI