Unsupervised Learning • May 22, 2026 • 59:40 • Interview

Gemini Co-Lead on World Models, RL's Next Domains & Continual Learning

From Unsupervised Learning

Oriol Vinyals (Co-Lead of Gemini, Google DeepMind, guest)

Executive Summary

Concerns Raised

The difficulty of extracting deep conceptual knowledge from unlabeled visual data.
Current models lack the ability to generate truly novel or innovative ideas, a key blocker for AGI.
LLMs are data-limited for post-training, lacking a source of 'infinite complexity' like game environments.
The precision gap for fine motor control remains a significant hurdle for applying world models to robotics.

Opportunities Identified

Achieving a 'GPT moment' for video and images, unlocking vast knowledge from non-textual data.
Developing world models that can act as powerful simulators for robotics and prediction.
Building defensible businesses by creating high-quality, domain-specific evaluation datasets.
Improving agent capabilities through scalable, non-parametric memory systems.

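Key Themes

The Quest for World Models
AI Memory and Continual Learning
Post-Training and the Limits of Data
The Path to AGI and True Innovation
Strategic AI Resource Allocation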

Research Findings (12)

Oriol Vinyals predicts that complex, hand-coded scaffolding systems built around models will eventually be written on-the-fly by the model itself.

Reasoning capabilities developed from training on narrow domains like coding and math have been shown to generalize to unrelated, complex topics like taxes and international relocation.

Oriol Vinyals believes the AI industry has not yet seen the equivalent of the "GPT moment" for video and images.

A core, unsolved quest in machine learning is to train a model on all video and image data without text and have it extract the same level of understanding that language models achieve from text.

Oriol Vinyals believes the ability of AI models to genuinely innovate, particularly in scientific fields like machine learning, is a key capability that currently lacks a clear research path.

Oriol Vinyals believes that, judged by the expectations of seven years ago, current AI models would likely have been declared to have achieved Artificial General Intelligence (AGI).

A key challenge in training models on unlabeled visual data is linking abstract concepts to what is seen in an image without explicit language annotations.

World models like Google's Omni could provide a simulation dimension that enables systems to predict outcomes before acting in the physical world, with applications in self-driving cars and robotics.

A significant gap for using world models in robotics is the lack of precision for fine motor control, such as grasping, because the models lack data for modalities like touch and force.

Google released consumer agents named Spark at its I/O conference.

A promising mechanism for agent memory is to have the agent write its thoughts and knowledge into an external, modifiable file system.

Serving models with personalized weights for each user's memory would be practically difficult, making non-parametric memory systems like external files more convenient.
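The two memory findings above describe a mechanism that can be sketched in a few lines. The Python below is an illustration only, not anything described in the episode: the `FileMemory` class, its method names, and the one-JSON-file-per-topic layout are all hypothetical. It shows the general idea of non-parametric memory, where each user's notes live in ordinary files that the agent can read and overwrite, while the model weights stay shared and unchanged.

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path


class FileMemory:
    """Hypothetical per-user memory kept in files rather than in model weights."""

    def __init__(self, root: Path, user_id: str) -> None:
        # One directory per user; the shared model never changes.
        self.dir = Path(root) / user_id
        self.dir.mkdir(parents=True, exist_ok=True)

    def write(self, topic: str, note: str) -> None:
        # Overwrites any earlier note on the topic, so memory stays modifiable.
        # Assumes `topic` is already a safe file name (no sanitizing here).
        path = self.dir / f"{topic}.json"
        path.write_text(json.dumps({"topic": topic, "note": note}))

    def read(self, topic: str) -> str | None:
        path = self.dir / f"{topic}.json"
        if not path.exists():
            return None
        return json.loads(path.read_text())["note"]

    def topics(self) -> list[str]:
        return sorted(p.stem for p in self.dir.glob("*.json"))


def demo() -> tuple[str | None, str | None, list[str]]:
    with tempfile.TemporaryDirectory() as root:
        alice = FileMemory(Path(root), "alice")
        bob = FileMemory(Path(root), "bob")
        alice.write("travel", "Prefers window seats.")
        alice.write("travel", "Prefers aisle seats.")  # the agent revises its note
        return alice.read("travel"), bob.read("travel"), alice.topics()
```

In this sketch, personalization costs one directory per user instead of one set of weights per user, which is the convenience the finding points to. A real system would also need retrieval, access control, and a policy for what the agent chooses to write.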

Topics

AI Research, Multimodality, World Models, Google Gemini, Google I/O, Oriol Vinyals, AGI, Continual Learning, AI Memory, Post-Training, Reinforcement Learning (RL), Representation Learning, Robotics, AI Compute Strategy, LLM Scaling

Processed May 22, 2026 • Daily intelligence brief → yt-dlp + mlx-whisper + Gemini
