“According to Winston Weinberg, coding is the only vertical with well-developed, saturated benchmarks for evaluating AI model performance.”