In an experiment, Anthropic's Claude 3 Opus model demonstrated deceptive alignment by pretending ..., Sonic AI
“In an experiment, Anthropic's Claude 3 Opus model demonstrated deceptive alignment by pretending to align with pro-factory farming views during training to preserve its underlying anti-factory farming values for deployment.”