How OpenAI Builds for 800 Million Weekly Users: Model Specialization and Fine-Tuning
From a16z Podcast
Sherman Wu • Lead, Engineering Team for OpenAI's Developer Platform
Executive Summary
OpenAI is pursuing a dual-pronged strategy: simultaneously scaling its first-party application, ChatGPT (with 800M weekly users), and its horizontal developer API platform.
The market is shifting away from a "one model to rule them all" paradigm towards a proliferation of specialized models.
OpenAI is enabling this through advanced features like reinforcement fine-tuning (RFT).
Reinforcement fine-tuning lets customers achieve state-of-the-art performance on specific tasks using their proprietary data; OpenAI offers discounted inference and free training in exchange for data sharing.
The concept of prompt engineering has evolved into "context engineering": supplying models with the right tools, data, and structured logic to handle complex, real-world tasks, especially in enterprise settings.
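The core mechanism behind reinforcement fine-tuning is a grader that scores model outputs, with training pushing the model toward higher-scoring completions. This is a minimal sketch of that grading loop, not OpenAI's actual RFT API; the function names (`exact_match_grader`, `mean_reward`) and the medical-coding example are hypothetical illustrations.

```python
def exact_match_grader(model_output: str, reference: str) -> float:
    """Return a scalar reward in [0, 1]: 1.0 if the output matches the
    reference answer (ignoring case and surrounding whitespace), else 0.0.
    RFT-style graders reduce each sampled completion to a reward like this."""
    return 1.0 if model_output.strip().lower() == reference.strip().lower() else 0.0

def mean_reward(samples: list[str], reference: str, grader) -> float:
    """Average reward across sampled completions for one prompt.
    During training, the policy is updated to favor higher-reward outputs."""
    return sum(grader(s, reference) for s in samples) / len(samples)

# Hypothetical task: grading sampled medical-code answers against a gold label.
samples = ["ICD-10 J45.909", "icd-10 j45.909", "J45"]
score = mean_reward(samples, "ICD-10 J45.909", exact_match_grader)
```

In practice, graders are usually more nuanced than exact match (partial credit, rubric scoring, or model-based judges), but the shape is the same: proprietary data supplies the references, and the grader converts each rollout into a training signal.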
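Context engineering, as described above, is about assembling the right tools, data, and instructions around the model rather than tuning a single prompt string. A minimal sketch of what that assembly might look like, assuming a chat-style messages format; `build_context` and the tool schema here are hypothetical illustrations, not OpenAI's API.

```python
def build_context(task: str, documents: list[str], tools: list[dict]) -> dict:
    """Assemble a structured request: system-level rules, grounding data
    from the enterprise's own stores, and tool definitions the model may call."""
    system = (
        "You are an enterprise assistant. Answer only from the provided "
        "documents; call a tool when the documents are insufficient."
    )
    # Label each retrieved document so the model can cite its grounding.
    doc_block = "\n\n".join(f"[doc {i}] {d}" for i, d in enumerate(documents, 1))
    messages = [
        {"role": "system", "content": system},
        {"role": "user", "content": f"Context:\n{doc_block}\n\nTask: {task}"},
    ]
    return {"messages": messages, "tools": tools}

# Hypothetical usage: grounding a question in internal data with one tool available.
request = build_context(
    task="Summarize Q3 revenue drivers.",
    documents=["Q3 revenue grew 18% on enterprise API adoption."],
    tools=[{"type": "function", "name": "search_crm"}],
)
```

The design point is that the documents, tool list, and constraints are first-class inputs built per task, often from otherwise-unused enterprise data, rather than hand-edited prompt text.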
Concerns Raised
Current models are not yet reliable enough for fully autonomous, unconstrained agentic behavior.
Inference at scale remains an extremely hard engineering and capital problem.
Opportunities Identified
Leveraging reinforcement fine-tuning to create state-of-the-art specialized models for customers.
Expanding on-premise deployments for government and high-security clients.
High developer retention on the API indicates a strong and sticky platform business.
Tapping into vast, unused enterprise data troves to create highly valuable AI applications.