“Anthropic's research on mechanistic interpretability is identifying "circuits" within large-scale models that explain how they compute answers.”