“For the Grok 4 model, half of the total compute budget was spent on Reinforcement Learning (RL), which resulted in a comparatively poor marginal performance gain per dollar compared to pre-training.”

AriAI Infrastructure
