“For frontier labs, the immense cost of training runs, potentially hundreds of millions of dollars, makes it impractical to fix reward hacking or misaligned rewards by retraining from scratch.”
Sonic AI