Surge, founded in 2020 by Edwin Chen, has achieved over $1 billion in annual recurring revenue without any venture capital by focusing on high-quality, complex human data for AI.
CEO Edwin Chen is highly critical of current AI evaluation methods, arguing that leaderboards like the LMSYS Chatbot Arena are "terrible" and encourage "benchmark hacking," which sets the industry back by rewarding superficial model traits over genuine capability.
The demand for AI data is rapidly evolving from simple labeling to requiring deep, specialized expertise (e.g., Olympiad-level math, coding in specific dialects) to train frontier models on reasoning and multimodal tasks.
Chen argues that high-quality human feedback data for Reinforcement Learning from Human Feedback (RLHF) is vastly superior to synthetic data, and predicts that the economic incentives in AI will lead closed-source models to continue outperforming open-source alternatives.
Concerns Raised
The AI industry's reliance on flawed benchmarks like the LMSYS Chatbot Arena is promoting superficial model improvements and hindering genuine progress.
An over-reliance on synthetic data makes models good at academic tests but brittle in real-world, open-ended scenarios.
The current economic incentives in AI will force the most successful open-source models to become closed-source, concentrating power.
Opportunities Identified
There is a massive, growing market for high-quality, expert-driven human data to power the next generation of AI models.
Focusing on high-quality RLHF data is a more effective path to improving model capabilities than using massive amounts of synthetic data.
Developing models with deep expertise in specialized domains and niche languages/dialects remains a significant area for differentiation and value creation.