According to the guest, the AI industry has exhausted the supply of high-quality, diverse training data, a problem that emerged as early as 2022 with the training of GPT-4. Training new models on recent, LLM-generated internet content leads to 'model collapse,' in which model performance degrades; the guest takes this as evidence that simply scaling up current architectures is a dead end.
The guest argues that the massive capital expenditure on data centers to support current-generation LLMs is a historic mistake, because the underlying technology is fundamentally flawed and is being superseded by more efficient models that can run locally on laptops.
Leading AI researchers are 'jumping ship' from major tech firms like Meta and DeepMind to create startups focused on solving AI's core challenges. This shift from an 'age of scaling' back to an 'age of research' signals a major pivot in the industry towards new, brain-inspired architectures that prioritize continual learning and efficiency.
A new class of AI systems is being developed that mimics biological processes, such as Hebbian learning and neural growth, to enable continual learning without 'catastrophic forgetting'. According to the guest, these models are three to four orders of magnitude (roughly 1,000 to 10,000 times) more power- and data-efficient than current LLMs and are already demonstrating superior performance.
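For readers unfamiliar with the term, Hebbian learning can be illustrated with the textbook update rule, in which a connection is strengthened whenever the units on both ends are active together. The sketch below is a minimal illustration of that rule only, not the architecture the guest describes; the function name `hebbian_update`, the learning rate, and the toy activity values are all assumptions chosen for the example.

```python
def hebbian_update(weights, pre, post, lr=0.1):
    """Return new weights after one plain Hebbian step.

    weights[i][j] connects input unit j to output unit i and grows by
    lr * post[i] * pre[j], i.e. only when both units are active.
    (Hypothetical helper for illustration; not from the episode.)
    """
    return [
        [w_ij + lr * post_i * pre_j for w_ij, pre_j in zip(row, pre)]
        for row, post_i in zip(weights, post)
    ]


# Toy example: two inputs, two outputs, all weights start at zero.
weights = [[0.0, 0.0], [0.0, 0.0]]
pre = [1.0, 0.0]   # only the first input is active
post = [1.0, 1.0]  # both outputs are active

# Present the same activity pattern three times.
for _ in range(3):
    weights = hebbian_update(weights, pre, post)

print(weights)
```

Only the connections from the active input grow; the weights attached to the silent input stay at zero. The update is local, using nothing beyond the activity of the two units a connection joins, which is the property that makes the rule attractive for learning without a global training pass. In its plain form the rule lets weights grow without bound, so practical variants add some form of normalization or decay.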