The communication pattern required for a Mixture-of-Experts (MoE) layer using expert parallelism ..., Sonic AI
“The communication pattern required for a Mixture-of-Experts (MoE) layer using expert parallelism is an all-to-all pattern, where any GPU may need to communicate with any other GPU.”
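The all-to-all pattern described in the quote can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not any particular framework's implementation: the helper names (`expert_owner`, `build_send_buffers`, `all_to_all`), the contiguous placement of experts on ranks, and the toy routing table are all hypothetical, and plain Python lists stand in for GPU buffers.

```python
def expert_owner(expert_id, experts_per_rank):
    # Assumption: experts are placed on ranks in contiguous blocks.
    return expert_id // experts_per_rank


def build_send_buffers(routed_tokens, num_ranks, experts_per_rank):
    # routed_tokens[src] lists the (token, expert_id) pairs held on rank src.
    # send[src][dst] collects what rank src must ship to rank dst.
    send = [[[] for _ in range(num_ranks)] for _ in range(num_ranks)]
    for src, pairs in enumerate(routed_tokens):
        for token, expert_id in pairs:
            dst = expert_owner(expert_id, experts_per_rank)
            send[src][dst].append((token, expert_id))
    return send


def all_to_all(send):
    # All-to-all exchange: rank dst receives the dst-th buffer of every rank,
    # so recv[dst][src] == send[src][dst].
    n = len(send)
    return [[send[src][dst] for src in range(n)] for dst in range(n)]


# Toy setup (hypothetical): 4 ranks, one expert per rank, two tokens per rank.
routed_tokens = [
    [("a0", 1), ("a1", 3)],  # rank 0
    [("b0", 0), ("b1", 2)],  # rank 1
    [("c0", 3), ("c1", 0)],  # rank 2
    [("d0", 2), ("d1", 1)],  # rank 3
]
send = build_send_buffers(routed_tokens, num_ranks=4, experts_per_rank=1)
recv = all_to_all(send)

for dst, buffers in enumerate(recv):
    print(f"rank {dst} receives {buffers}")
```

Because the router decides the destination expert per token, any `send[src][dst]` buffer may be non-empty, which is why every rank may need to exchange data with every other rank. In a real multi-GPU setup this exchange would typically be performed by a collective communication primitive rather than by list indexing.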