“A key innovation in the DeepSeek Mixture-of-Experts architecture was to activate a larger number of smaller, more fine-grained experts.”
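To make the trade-off in this claim concrete, here is a minimal sketch of the arithmetic. It assumes each expert is a two-matrix feed-forward block, and that "finer-grained" means splitting every expert into `m` smaller ones while activating `m` times as many per token. The layer sizes, the helper names `moe_config` and `split_experts`, and the split factor are illustrative assumptions, not figures from the source.

```python
from math import comb

def moe_config(num_experts, top_k, d_model, d_hidden):
    """Parameter and routing statistics for one MoE layer (illustrative sketch)."""
    # Assumption: each expert is an up-projection plus a down-projection, biases ignored.
    params_per_expert = 2 * d_model * d_hidden
    return {
        "total_expert_params": num_experts * params_per_expert,
        "active_params_per_token": top_k * params_per_expert,
        "expert_combinations": comb(num_experts, top_k),
    }

def split_experts(num_experts, top_k, d_hidden, m):
    """Split each expert into m smaller ones and activate m times as many."""
    assert d_hidden % m == 0, "hidden size must be divisible by the split factor"
    return num_experts * m, top_k * m, d_hidden // m

# Hypothetical sizes chosen only for illustration.
D_MODEL, D_HIDDEN = 1024, 4096

coarse = moe_config(16, 2, D_MODEL, D_HIDDEN)
n, k, h = split_experts(16, 2, D_HIDDEN, m=4)
fine = moe_config(n, k, D_MODEL, h)

print("coarse:", coarse)
print("fine:  ", fine)
```

Under these assumptions, both configurations have the same total and per-token active parameter counts, while the number of distinct expert combinations a token can be routed to grows from 120 to 4,426,165,368.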