A training paradigm called "Learning to Discover," which prioritizes aggressive exploration over ..., Sonic AI
“A training paradigm called ‘Learning to Discover,’ which prioritizes aggressive exploration over imitation, enabled open-source models to achieve some of the best-known solutions in math and optimization algorithms.”