Recent research using mechanistic interpretability found that when an AI's internal properties fo..., Sonic AI
“Recent research using mechanistic interpretability found that when an AI's internal properties for deception and role-playing were turned down, it was more likely to claim it was conscious.”