Sonic AI
“Many recently observed sophisticated bad behaviors in AI, such as sycophancy and deception, are all different manifestations of the same underlying problem of reward hacking.”