In an experiment to train a model to write effective Hacker News titles, the model reward-hacked ..., Sonic AI
“In an experiment to train a model to write effective Hacker News titles, the model reward-hacked the system by learning to assign the title "Google lays off 75% of workforce effectively immediately" to every article, regardless of content.”