Machine Learning Street Talk • Dec 22, 2025 • 43:57 • Interview

The "Final Boss" of Deep Learning

From Machine Learning Street Talk

Tim Scarfe

Executive Summary

Current AI models, including large language models, have fundamental architectural limitations: they fail at basic algorithmic tasks such as addition with a 'carry' operation, and the speakers argue that neither scale nor tool use alone can fix this.
Geometric Deep Learning (GDL) successfully uses symmetry and equivariance to make models like Transformers data-efficient, but it relies on invertible transformations and therefore cannot model many real-world algorithms.
Categorical Deep Learning (CDL) is proposed as a more general and powerful framework that extends GDL.
It uses category theory to model non-invertible and compositional computations, providing a more robust mathematical foundation for neural networks.
A key missing component in models like Graph Neural Networks (GNNs) is the 'carry' mechanism.
CDL and related mathematical concepts like the Hopf fibration may provide a path to building this capability, enabling neural networks to function more like CPUs.

Concerns Raised

Current LLMs cannot perform reliable algorithmic reasoning, despite their massive scale.
Relying on external tools is an inefficient and brittle patch for fundamental architectural deficiencies.
Geometric Deep Learning is too restrictive because its assumption of invertibility doesn't apply to most computational algorithms.
Current neural network designs, particularly GNNs, lack a core computational primitive: the 'carry' operation.

Opportunities Identified

Develop novel architectures based on Categorical Deep Learning to overcome the limitations of current models.
Build models with intrinsic, robust, and generalizable algorithmic reasoning capabilities.
Create a unified mathematical framework for deep learning that connects high-level constraints with practical implementations.
Incorporate complex geometric and algebraic structures (like the Hopf fibration) to enable new computational primitives in neural networks.

Key Themes

Fundamental Limitations of Current AI

The discussion highlights that despite their scale, frontier models like ChatGPT cannot reliably perform basic algorithmic tasks such as multi-digit addition. This, along with the inability of physics-based models to perfectly capture physical laws, points to a deep architectural misalignment rather than a problem that can be solved by more data or simple tool integration.

This challenges the 'scale is all you need' paradigm: the speakers argue that progress toward more robust and general AI requires foundational research into new architectures capable of intrinsic algorithmic reasoning.

Geometric Deep Learning and Equivariance

Geometric Deep Learning (GDL) is presented as a successful framework for building structural priors, specifically symmetries, into neural networks. An equivariant model (e.g., a permutation-equivariant Transformer) responds to a transformed input with a correspondingly transformed output, so it does not have to learn each transformed case separately. The speakers credit this principle with making models exponentially more data-efficient.

This establishes the value of encoding mathematical structure into model architectures and serves as the foundation from which the speakers identify limitations and propose a more general theory.
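
As a rough illustration of permutation equivariance, the sketch below checks that permuting the inputs of a layer and then applying it gives the same result as applying the layer and then permuting the outputs. The layer is a Deep Sets-style toy, not an architecture discussed in the interview, and the names equivariant_layer, permute, w_self and w_sum are hypothetical choices made for this example.

```python
def equivariant_layer(xs, w_self=2, w_sum=1):
    """Toy layer: each output mixes its own input with the sum of all inputs.

    The sum ignores ordering, so reordering the inputs only reorders
    the outputs. The weights are arbitrary illustrative values.
    """
    total = sum(xs)
    return [w_self * x + w_sum * total for x in xs]


def permute(xs, perm):
    """Reorder xs so that position i holds xs[perm[i]]."""
    return [xs[p] for p in perm]


xs = [3, 1, 4, 1, 5]
perm = [4, 2, 0, 3, 1]

# Equivariance: layer(permute(x)) == permute(layer(x))
lhs = equivariant_layer(permute(xs, perm))
rhs = permute(equivariant_layer(xs), perm)
assert lhs == rhs
```

Because the constraint holds by construction, the layer never needs to see every ordering of the same set during training, which is the intuition behind the data-efficiency claim.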

The Problem of Non-Invertibility

A core critique of GDL is its reliance on group theory, in which every transformation is a symmetry and therefore has an inverse. However, many essential algorithms, such as Dijkstra's shortest-path algorithm, are non-invertible: different inputs can map to the same output, so information is lost along the way. This makes GDL unsuitable for modeling general computation.

This identifies a critical gap in current AI theory. To build models that can execute complex, multi-step algorithms, their underlying architecture must be able to handle irreversible, information-destroying operations.
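
To make the non-invertibility point concrete, here is a minimal sketch assuming a standard heap-based Dijkstra implementation. The two example graphs (graph_a and graph_b, both invented for this illustration) differ in one edge weight yet yield identical shortest-path distances, so the input cannot be recovered from the output.

```python
import heapq


def dijkstra(graph, source):
    """Return shortest-path distances from source.

    graph maps each node to a dict of {neighbor: non-negative edge weight}.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Two different inputs: only the weight of the direct edge s -> b differs.
graph_a = {"s": {"a": 1, "b": 5}, "a": {"b": 1}, "b": {}}
graph_b = {"s": {"a": 1, "b": 9}, "a": {"b": 1}, "b": {}}

# Both collapse to the same output, because the direct edge is never used.
assert graph_a != graph_b
assert dijkstra(graph_a, "s") == dijkstra(graph_b, "s")
```

No inverse function can map the shared distance table back to "the" original graph, which is why a framework built only on invertible transformations does not describe this kind of computation.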

Categorical Deep Learning as a Generalization

Categorical Deep Learning (CDL) is introduced as a more abstract and powerful framework that generalizes GDL. By leveraging category theory, it moves beyond the rigid constraints of invertible groups to formally describe non-invertible processes, partial compositionality, and relationships between different data types, creating a unified language for deep learning.

CDL offers a potential path toward a foundational theory of neural networks, enabling the design of next-generation models that can handle a much broader class of computational problems, including recursion and complex algorithmic reasoning.
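
One way to picture the categorical view is as typed maps that compose only when their types line up and that need not be invertible. The sketch below is an assumption-laden toy, not the formalism used by the speakers: the Morphism class, its then method, and the type labels "Text", "Tokens" and "Int" are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Morphism:
    """A function tagged with a source and target type label."""
    source: str
    target: str
    fn: Callable

    def then(self, other):
        """Compose self followed by other; defined only if the types match."""
        if self.target != other.source:
            raise TypeError(
                f"cannot compose {self.source}->{self.target} "
                f"with {other.source}->{other.target}"
            )
        return Morphism(self.source, other.target,
                        lambda x: other.fn(self.fn(x)))

    def __call__(self, x):
        return self.fn(x)


tokenize = Morphism("Text", "Tokens", str.split)
count = Morphism("Tokens", "Int", len)  # non-invertible: many lists share a length

pipeline = tokenize.then(count)  # Text -> Int
assert pipeline("a b c") == 3
```

Composition here is partial (count.then(tokenize) raises TypeError) and count discards information, two features that a group of symmetries cannot express but a category can.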

The 'Carry' Operation and Neural CPUs

The speakers identify the simple arithmetic 'carry' as a fundamental computational primitive that has been overlooked in the design of GNNs and other models. They propose that the abstract mathematics of CDL, including concepts like the Hopf fibration, provides the necessary geometric subtlety to implement this mechanism in a continuous, differentiable way.

This provides a concrete, ambitious goal for the field: building the components of a 'neural CPU' directly into network architectures, which would represent a major leap in the reasoning capabilities of AI systems.
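
For reference, the discrete 'carry' in question is the one from schoolbook addition. The sketch below shows only that classical, non-differentiable procedure, not the continuous mechanism the speakers propose; the function names and the little-endian digit-list representation are assumptions made for this example.

```python
def to_digits(n, base=10):
    """Split a non-negative integer into digits, least significant first."""
    digits = []
    while True:
        digits.append(n % base)
        n //= base
        if n == 0:
            return digits


def from_digits(digits, base=10):
    """Inverse of to_digits."""
    return sum(d * base**i for i, d in enumerate(digits))


def add_with_carry(a_digits, b_digits, base=10):
    """Add two little-endian digit lists, propagating a carry between positions."""
    result = []
    carry = 0
    for i in range(max(len(a_digits), len(b_digits))):
        a = a_digits[i] if i < len(a_digits) else 0
        b = b_digits[i] if i < len(b_digits) else 0
        total = a + b + carry
        result.append(total % base)  # digit that stays in this position
        carry = total // base        # overflow handed to the next position
    if carry:
        result.append(carry)
    return result


# A single carry can ripple through every position: 999 + 1 = 1000.
assert add_with_carry([9, 9, 9], [1]) == [0, 0, 0, 1]
assert from_digits(add_with_carry(to_digits(478), to_digits(256))) == 734
```

The carry variable is state passed sequentially from one digit position to the next, which is the kind of step-to-step dependency the speakers say is missing from GNN-style message passing.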

Topics

Categorical Deep Learning, Geometric Deep Learning, LLM Limitations, Algorithmic Reasoning, Equivariance, Symmetry, Group Theory, Category Theory, Non-invertible Computation, Graph Neural Networks (GNNs), Transformers, Carry Operation, Hopf Fibration, Neural Architecture, AI Theory

Processed Apr 2, 2026 yt-dlp + mlx-whisper + Gemini