“Current AI hardware has significant overheads when running models with dynamic computation, such as Mixture-of-Experts (MoE) architectures.”