The discussion highlights that, despite their scale, frontier models like ChatGPT cannot reliably perform basic algorithmic tasks such as multi-digit addition. The speakers argue that this, together with the failure of learned physics models to capture physical laws exactly, points to a deep architectural misalignment rather than a problem that more data or simple tool integration can solve.
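Geometric Deep Learning (GDL) is presented as a successful framework for building structural priors, specifically symmetries, into neural networks. Its central principle is equivariance: when the input is transformed, the output changes in a corresponding, predictable way. Self-attention in Transformers, for example, is permutation-equivariant when no positional encoding is used, so reordering the tokens reorders the outputs identically. The speakers credit this principle with making models exponentially more data-efficient, since the network does not have to relearn the same pattern for every transformed copy of an input.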
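To make the equivariance property concrete, here is a minimal sketch in plain Python. It is an illustration, not code from the discussion: the `self_attention` function is a toy that uses the tokens themselves as queries, keys and values (an assumption made to keep it short; learned projections are applied per token and would not change the argument). It checks that permuting the input and then applying attention gives the same result as applying attention and then permuting the output.

```python
import math
import random

def self_attention(tokens):
    """Toy single-head self-attention without positional encoding.

    Assumption for brevity: queries, keys and values are the tokens themselves.
    """
    dim = len(tokens[0])
    outputs = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        outputs.append([sum(w * v[d] for w, v in zip(weights, tokens)) for d in range(dim)])
    return outputs

random.seed(0)
tokens = [[random.gauss(0, 1) for _ in range(4)] for _ in range(5)]

perm = list(range(len(tokens)))
random.shuffle(perm)

# Route 1: apply attention, then permute the outputs.
out_then_permute = [self_attention(tokens)[i] for i in perm]
# Route 2: permute the inputs, then apply attention.
permute_then_out = self_attention([tokens[i] for i in perm])

is_equivariant = all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-9)
    for row_a, row_b in zip(out_then_permute, permute_then_out)
    for a, b in zip(row_a, row_b)
)
print(is_equivariant)
```

Both routes agree up to floating-point rounding, which is what permutation equivariance means here. Adding a positional encoding to each token before attention would break the property, because the encoding ties each token to its position.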
A core critique of GDL is its reliance on group theory, in which every transformation must have an inverse (that is, be a symmetry). Many essential algorithms, such as Dijkstra's shortest-path algorithm, are not invertible: different inputs can map to the same output, so information is lost and the input cannot be recovered. On this view, GDL alone is ill-suited to modeling general computation.
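The loss of information can be shown with a small example. The sketch below is a standard heap-based Dijkstra implementation; the two graphs `graph_a` and `graph_b` are hypothetical inputs chosen for illustration. They have different edge weights but produce identical shortest-distance tables, so the original graph cannot be reconstructed from the output.

```python
import heapq

def dijkstra(graph, source):
    """Return shortest distances from source; graph maps node -> [(neighbor, weight)]."""
    dist = {node: float("inf") for node in graph}
    dist[source] = 0
    queue = [(0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist[node]:
            continue  # stale queue entry, a shorter path was already found
        for neighbor, weight in graph[node]:
            candidate = d + weight
            if candidate < dist[neighbor]:
                dist[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return dist

# Two different weighted graphs (illustrative inputs).
graph_a = {"s": [("a", 1), ("b", 5)], "a": [("b", 1)], "b": []}
graph_b = {"s": [("a", 1), ("b", 2)], "a": [("b", 9)], "b": []}

dist_a = dijkstra(graph_a, "s")
dist_b = dijkstra(graph_b, "s")
print(dist_a == dist_b, graph_a == graph_b)
```

In `graph_a` the shortest route to `b` goes through `a`, while in `graph_b` it is the direct edge, yet both yield the distances `s: 0, a: 1, b: 2`. No inverse map from distances back to graphs can exist, which is the sense in which the algorithm falls outside a group-based description.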
Categorical Deep Learning (CDL) is introduced as a more abstract and powerful framework that generalizes GDL. By drawing on category theory, it drops the requirement that every transformation be invertible, which lets it formally describe non-invertible processes, partial compositionality, and relationships between different data types, with the aim of providing a unified language for deep learning.
The speakers identify the simple arithmetic 'carry' as a fundamental computational primitive that has been overlooked in the design of GNNs and other models. They propose that the abstract mathematics of CDL, including concepts like the Hopf fibration, provides the necessary geometric subtlety to implement this mechanism in a continuous, differentiable way.
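For reference, the discrete carry mechanism the speakers refer to is the one in schoolbook addition. The sketch below is ordinary digit-by-digit addition, not the continuous, differentiable construction proposed in the discussion; the helper names `to_digits` and `from_digits` are illustrative.

```python
def add_with_carry(a_digits, b_digits, base=10):
    """Add two numbers given as least-significant-first digit lists."""
    result, carry = [], 0
    for i in range(max(len(a_digits), len(b_digits))):
        a = a_digits[i] if i < len(a_digits) else 0
        b = b_digits[i] if i < len(b_digits) else 0
        # The carry is the only state passed from one digit position to the next.
        carry, digit = divmod(a + b + carry, base)
        result.append(digit)
    if carry:
        result.append(carry)
    return result

def to_digits(n):
    """Non-negative integer -> least-significant-first digit list."""
    return [int(c) for c in reversed(str(n))]

def from_digits(digits):
    """Least-significant-first digit list -> integer."""
    return int("".join(str(d) for d in reversed(digits)))

total = from_digits(add_with_carry(to_digits(999), to_digits(1)))
print(total)
```

Adding 1 to 999 changes every output digit, because the carry ripples through all positions. This long-range, sequential dependency on a small discrete state is what makes the operation hard to express as a smooth function of the inputs.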