“The Quiet-STaR paper demonstrated that reasoning algorithms like STaR could be scaled up to pre-training scale by using pre-training-style data.”