The size of a single Mixture-of-Experts (MoE) layer is fundamentally limited by the communication..., Sonic AI
“The size of a single Mixture-of-Experts (MoE) layer is fundamentally limited by the communication bandwidth within a single GPU rack, as inter-rack communication becomes a significant bottleneck.”
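A rough way to see why the claim might hold is to estimate the all-to-all token dispatch time of an expert-parallel MoE layer as it grows past one rack. The sketch below is only a back-of-envelope model, not anything taken from the source. The `Cluster` class, the `dispatch_time_ms` function, and every number (GPUs per rack, bandwidths, token counts, hidden size, top-k) are hypothetical values chosen for illustration. It also assumes uniform routing across GPUs and full overlap of intra-rack and inter-rack transfers.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical cluster description; all values are illustrative assumptions."""
    gpus_per_rack: int
    intra_rack_gb_per_s: float  # per-GPU bandwidth inside a rack, in GB/s
    inter_rack_gb_per_s: float  # per-GPU bandwidth across racks, in GB/s


def dispatch_time_ms(cluster, num_gpus, tokens_per_gpu, hidden_size, top_k,
                     bytes_per_elem=2):
    """Estimate the per-GPU time of one all-to-all token dispatch, in ms.

    Assumes tokens are routed uniformly over all GPUs hosting experts and
    that intra-rack and inter-rack transfers overlap completely, so the
    slower of the two paths sets the total time.
    """
    # Bytes each GPU must send: every token goes to top_k experts.
    payload = tokens_per_gpu * top_k * hidden_size * bytes_per_elem

    gpus_in_my_rack = min(num_gpus, cluster.gpus_per_rack)
    local_peers = gpus_in_my_rack - 1            # other GPUs in the same rack
    remote_peers = num_gpus - gpus_in_my_rack    # GPUs in other racks

    # Uniform routing: each destination GPU receives an equal share.
    intra_bytes = payload * local_peers / num_gpus
    inter_bytes = payload * remote_peers / num_gpus

    intra_s = intra_bytes / (cluster.intra_rack_gb_per_s * 1e9)
    inter_s = inter_bytes / (cluster.inter_rack_gb_per_s * 1e9)
    return 1e3 * max(intra_s, inter_s)


# Illustrative numbers only (assumptions, not measurements).
cluster = Cluster(gpus_per_rack=72, intra_rack_gb_per_s=900.0,
                  inter_rack_gb_per_s=50.0)
one_rack = dispatch_time_ms(cluster, num_gpus=72, tokens_per_gpu=8192,
                            hidden_size=7168, top_k=8)
two_racks = dispatch_time_ms(cluster, num_gpus=144, tokens_per_gpu=8192,
                             hidden_size=7168, top_k=8)
print(f"one rack:  {one_rack:.2f} ms per dispatch")
print(f"two racks: {two_racks:.2f} ms per dispatch")
```

Under these assumed numbers, spreading the same layer over two racks makes the slower inter-rack link the limiting path, which is the effect the quote describes. With a different bandwidth ratio or a non-uniform routing pattern the gap would change, so the output should be read as an illustration of the mechanism rather than a prediction.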