Rerun deliberately open-sourced its core visualization tool, believing it's too critical and widely used across an organization to monetize directly via per-seat licenses. The commercial product is a cloud-based data platform that builds upon the open-source foundation, a common strategy to build trust and drive adoption before upselling enterprise features.
The conversation emphasizes the unique challenges of managing complex, multi-rate, multimodal sensor data in robotics, which are distinct from the more mature data stacks for LLMs. Rerun focuses on the data pipelines *before* model training, addressing a critical bottleneck in the MLOps lifecycle for physical AI systems.
Rerun built its high-performance stack from scratch in Rust, including a custom in-memory database and a new file format. The company has also redesigned its core data model four times, demonstrating a willingness to undergo significant engineering effort to perfect the product's core architecture for user needs.
The discussion points to a "ChatGPT moment" for robotics, where advances in scalable machine learning and the availability of open-source models are enabling more practical, "learning-first" approaches. This shift is creating new opportunities and accelerating progress in historically difficult areas like manipulation.
A key design challenge discussed is creating a tool that is both low-friction and schema-less for researchers during exploration, while also being robust and able to integrate with the rigid, pre-defined schemas of production systems. Rerun has worked to solve this tension through its iterative data model design.
Keep pulling the thread on Nico West.