Y Combinator • Jun 12, 2026 • 1:16:52

5 Papers That Show Where AI Research Is Heading Right Now

François (Host) • Luke (Guest) • Yas Beg (Guest) • Robert George (Guest) • Arnab Mehti (Guest)

Executive Summary

Concerns Raised

LLM self-play can lead to performance plateaus by generating uselessly complex or 'inelegant' synthetic data.
In-context learning (ICL) performance is not monotonic and hits a hard limit at the context length, unlike human continuous learning.
Latency from components like RAG is a major obstacle for natural, real-time conversational AI.
Current AI models may be confined to the 'human-generated subspace' and struggle to explore the full solution space without new methods.

Opportunities Identified

Applying LLM scaling laws to biology can unlock new discoveries in protein science and drug design.
Self-Guided Self-Play (SGS) allows smaller models to achieve the performance of much larger models, making advanced capabilities more accessible.
Agentic, parallelized workflows can dramatically increase developer productivity and change how complex work is done.
AI is achieving superhuman performance in formal mathematics, accelerating discovery and solving long-standing problems.

Research Findings (12)

On antibody design tasks, the single-sequence ESMFold2 model outperforms AlphaFold3, achieving a DockQ pass rate of 50 compared to AlphaFold3's 47.

Standard self-play algorithms for LLMs fail because rewarding the task-generating model (conjecturer) for difficulty incentivizes it to create messy, artificially complex problems rather than useful ones.

The improved scaling performance of the ESM Cambrian model was achieved by increasing its training dataset from 50 million to 2.8 billion protein sequences, primarily by incorporating metagenomic data.

ESMFold2, using only a single input sequence, achieves near-parity with AlphaFold3 on general protein-protein complex prediction, performing within 3 points on the DockQ pass rate metric.

Using the Self-Guided Self-Play (SGS) method, a 7 billion parameter model matched the performance of a 670 billion parameter model on the pass@4 metric, though it required 8 times more compute.

AI systems from OpenAI and DeepMind achieved gold medal-level performance at the 2025 International Mathematical Olympiad (IMO).

OpenAI recently claimed to have solved an 80-year-old mathematics problem posed by Erdős.

Harmonic AI's AxiomProver successfully solved all 12 problems from a recent Putnam mathematical competition.

Channel AI has increased its pull requests per engineer per month by 3.5 times by adopting an agentic, parallelized development workflow inspired by real-time strategy games.

Using sparse autoencoders, researchers found that the latent space of protein language models decomposes into interpretable features corresponding to biological concepts like amino acids, structural motifs, and protein domains.

The BioHub research team created a protein atlas with up to 7 billion folded protein structures, which is larger than the AlphaFold database.

The amount of compute spent on reinforcement learning post-training for large language models is now approaching or surpassing the amount spent on pre-training.
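
The self-play failure noted in the findings above (a conjecturer rewarded purely for difficulty) can be illustrated with a toy sketch. The reward shapes and function names here are assumptions chosen for illustration; they are not taken from the episode and are not the SGS method itself. The sketch only shows that a difficulty-only reward is maximized by problems the solver never solves, while a hypothetical reward that peaks at an intermediate solve rate favors problems the solver can sometimes solve.

```python
# Toy illustration only: both reward shapes are assumptions, not the SGS method.

def difficulty_reward(solve_rate: float) -> float:
    """Reward the conjecturer purely for difficulty: higher when the solver fails more often."""
    return 1.0 - solve_rate


def intermediate_reward(solve_rate: float) -> float:
    """Hypothetical alternative: zero at 0% and 100% solve rate, peaking at 50%."""
    return 4.0 * solve_rate * (1.0 - solve_rate)


def preferred_solve_rate(reward, candidates):
    """Return the solve rate that a reward-maximizing conjecturer would aim for."""
    return max(candidates, key=reward)


# Candidate solve rates from 0.0 (never solved) to 1.0 (always solved).
solve_rates = [i / 10 for i in range(11)]

hardest = preferred_solve_rate(difficulty_reward, solve_rates)
balanced = preferred_solve_rate(intermediate_reward, solve_rates)

print("difficulty-only reward prefers solve rate:", hardest)
print("intermediate reward prefers solve rate:", balanced)
```

Under the difficulty-only reward, the conjecturer's best move is a problem with a solve rate of zero, which gives the solver nothing to learn from; that is one way to read the plateau described in the episode.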

Processed Jun 12, 2026 • Daily intelligence brief → yt-dlp + mlx-whisper + Gemini

Key Themes

The Bitter Lesson in Biology

Overcoming Self-Play Plateaus in LLMs

Latency as the Bottleneck for Conversational AI

Agentic Workflows for Human Productivity

The Convergence of AI and Formal Verification