Microsoft deliberately avoided using model distillation when training MAI-THINKING-1 to ensure it..., Sonic AI
“Microsoft deliberately avoided using model distillation when training MAI-THINKING-1 to ensure it could eventually surpass any ‘teacher’ model and establish a new performance frontier.”