June 17, 2026

What are the best products and strategies for solving the memory wall, and which companies are leading here?

13 episodes · 12 podcasts · Mar 24, 2025 – Jun 4, 2026

The primary challenge in modern AI hardware, often termed the "memory wall," is fundamentally a **crisis of bandwidth, not capacity**. Modern systems have enough memory capacity to store trillion-parameter models, but the rate at which data can be moved to the processing units creates a severe performance bottleneck [2, 29]. As a result, successive GPU generations cannot fully use their gains in compute (FLOPs): memory bandwidth often fails to keep pace, typically improving by no more than 2x between generations [6, 7]. This chokepoint constrains both training and inference workloads, and it drives hardware design and investment across the industry [2, 30]. Demand for AI compute has created a structural deficit in which infrastructure demand outpaces supply, leading to multi-billion dollar backlogs and confirming how severe these hardware constraints are [1, 27].
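To make the bandwidth-versus-compute argument concrete, here is a minimal back-of-the-envelope sketch. It compares the time needed to read a model's weights from memory with the time needed to do the arithmetic for one generated token at batch size 1. All hardware and model figures are hypothetical placeholders rather than numbers from the sources, and the "2 FLOPs per parameter per token" figure is a common rule of thumb for dense models, not an exact count.

```python
# Hypothetical accelerator and model; none of these figures come from the sources.
PEAK_FLOPS = 1e15        # assumed 1 PFLOP/s of dense compute
MEM_BANDWIDTH = 4e12     # assumed 4 TB/s of memory bandwidth
PARAM_COUNT = 1e11       # assumed 100B-parameter dense model
BYTES_PER_PARAM = 2      # assumed 16-bit weights


def decode_step_times(param_count, bytes_per_param, peak_flops, mem_bandwidth):
    """Return (memory_seconds, compute_seconds) for one token at batch size 1."""
    # Every weight has to be read from memory once per generated token.
    memory_seconds = param_count * bytes_per_param / mem_bandwidth
    # Rule of thumb: about 2 FLOPs (multiply + add) per parameter per token.
    compute_seconds = 2 * param_count / peak_flops
    return memory_seconds, compute_seconds


def ridge_point(peak_flops, mem_bandwidth):
    """FLOPs per byte moved that are needed before compute becomes the limit."""
    return peak_flops / mem_bandwidth


memory_s, compute_s = decode_step_times(
    PARAM_COUNT, BYTES_PER_PARAM, PEAK_FLOPS, MEM_BANDWIDTH
)
print(f"time to stream weights per token: {memory_s * 1e3:.2f} ms")
print(f"time to do the math per token:    {compute_s * 1e3:.2f} ms")
print(f"ridge point: {ridge_point(PEAK_FLOPS, MEM_BANDWIDTH):.0f} FLOPs/byte")
```

Under these assumed numbers, moving the weights takes far longer than computing on them, so the processing units sit idle most of the time. Larger batches reuse each weight read across more tokens, which raises the FLOPs performed per byte moved, but the gap only narrows if bandwidth grows alongside compute.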

The dominant strategy for addressing the bandwidth bottleneck is High-Bandwidth Memory (HBM), a specialized component integrated into AI accelerators from companies like NVIDIA, Google, and Amazon [2, 18]. However, the HBM market is an oligopoly controlled by three suppliers: SK Hynix, Samsung, and Micron [16, 19, 23], with SK Hynix holding an estimated **60% market share**. This concentration of supply, coupled with soaring demand from AI leaders, has turned HBM into a critical chokepoint for the entire industry [1, 5, 10, 20]. The resulting supply-demand imbalance is severe: shortages are expected to persist for several years [3, 12, 22], and suppliers hold immense pricing power, as shown by Micron achieving software-like gross margins of 80-85% on its HBM products [4, 21]. The scramble for HBM has become a zero-sum game that disrupts supply chains for consumer electronics and raises significant barriers for companies unable to secure allocations [10, 15].

In response to the HBM bottleneck, alternative strategies and architectures are emerging. Cerebras Systems explicitly designs its wafer-scale architecture to avoid HBM and advanced CoWoS packaging altogether, positioning this as a strategic supply chain advantage that enables more predictable scaling [1, 5, 9]. This approach is part of a broader industry trend toward memory-centric architectures, which decouple compute from memory instead of keeping the fixed compute-to-memory ratio of traditional GPUs, a ratio seen as inefficient for massive models [11, 13, 28]. Such a system-level rethink aims to optimize the entire data center stack for AI rather than relying on incremental component-level improvements. Another approach is to augment HBM with other memory tiers. NVIDIA, for instance, has developed its BlueField-4 DPU and Dynamo software to create a new category of in-rack storage for AI context memory (KV cache), providing an additional **16 terabytes of memory per GPU**.
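For a rough sense of what a context-memory tier of that size could hold, the sketch below applies the standard KV cache sizing formula: keys and values are stored for every layer, every KV head, and every token. Only the 16 terabyte figure comes from the text above. The model shape is hypothetical, and treating a terabyte as 10^12 bytes is an assumption.

```python
def kv_cache_bytes_per_token(num_layers, num_kv_heads, head_dim, bytes_per_value=2):
    """Bytes of KV cache one token occupies across all layers."""
    # Factor of 2: one key vector and one value vector per layer and KV head.
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value


# Hypothetical model shape, chosen only for illustration.
NUM_LAYERS = 80
NUM_KV_HEADS = 8       # assumes grouped-query attention
HEAD_DIM = 128
BYTES_PER_VALUE = 2    # assumes 16-bit cache entries

# Figure from the text; decimal terabytes are an assumption.
TIER_BYTES = 16 * 10**12

per_token = kv_cache_bytes_per_token(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, BYTES_PER_VALUE)
tokens_that_fit = TIER_BYTES // per_token

print(f"KV cache per token: {per_token / 1024:.0f} KiB")
print(f"tokens of context that fit in the tier: {tokens_that_fit:,}")
```

The result scales linearly with every term in the formula, so a model with more layers or more KV heads fits proportionally fewer tokens in the same tier.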

The competitive landscape is defined by how companies navigate this memory constraint. The HBM suppliers (SK Hynix, Samsung, and Micron) are key beneficiaries, with Micron notably investing $200 billion in new US-based manufacturing capacity [16, 22]. Market leader NVIDIA, while a primary driver of HBM demand, also recognizes the bottleneck as a vulnerability and is developing complementary solutions like the aforementioned DPUs [8, 14]. In a move toward vertical integration, Huawei is reportedly developing its own custom HBM, a capability that could provide a significant long-term advantage over competitors who rely on the strained merchant market. These diverging strategies (relying on the HBM oligopoly, architecting around it, or attempting to build a captive supply) highlight the centrality of memory bandwidth in the current AI hardware arms race.

What the sources say

Points of agreement

  • The primary performance bottleneck for AI hardware, often called the 'memory wall', is a crisis of memory bandwidth, not capacity.
  • The market for High-Bandwidth Memory (HBM) is a critical supply chain chokepoint for the AI industry.
  • The HBM market is an oligopoly controlled by three main companies: SK Hynix, Samsung, and Micron.
  • Significant memory shortages are expected to persist for at least the next several years due to sustained high demand and long manufacturing lead times.

Points of disagreement

  • One strategy, employed by NVIDIA, is to supplement existing architectures with more memory, such as creating new in-rack storage with DPUs.
  • An alternative strategy, used by Cerebras, is to design new architectures that completely avoid the use of HBM and other bottlenecks like CoWoS packaging.
  • A third emerging approach involves a holistic, system-level redesign toward memory-centric architectures where compute and memory can be scaled independently.

Sources

Cerebras CEO on the Future of Data Centres, Token Costs & Memory | Should US Companies Sell to China (The Twenty Minute VC, May 26, 2026)

This source identifies HBM and advanced packaging as critical AI supply chain bottlenecks, positioning Cerebras's HBM-free architecture as a strategic advantage.

The math behind how LLMs are trained and served – Reiner Pope (Dwarkesh Podcast, Apr 29, 2026)

This source clarifies that the industry's 'memory wall' is a problem of insufficient memory bandwidth, not a lack of capacity, which explains the strategic importance of HBM.

The RAM Crisis Keeps Getting Worse (ColdFusion, Mar 1, 2026)

This source describes how massive AI-driven demand for HBM is creating a supply shock and shortages across the entire electronics industry.

Businesses Benefiting from the AI Boom: Opportunities in Innovative Hardware (The Montgomery Summit 2026, Mar 16, 2026)

This source argues that the fixed compute-to-memory ratio in GPUs is inefficient, prompting an industry shift toward new, memory-centric hardware architectures.

NVIDIA Live with CEO Jensen Huang (NVIDIA, Jan 5, 2026)

This source reveals NVIDIA's strategy to mitigate memory constraints by using its BlueField DPU and Dynamo software to create large, in-rack storage for AI context memory.

The Memory Pioneer: Sanjay Mehrotra on SanDisk, Micron, and the AI Infrastructure Boom (A Bit Personal with Jodi, Jun 4, 2026)

This source details the structural supply-demand imbalance in the memory market and highlights Micron's strategic position as the sole US-based manufacturer investing heavily in new capacity.
