The mechanistic interpretability of AI models, or the ability to deterministically trace why a mo..., Sonic AI
“The mechanistic interpretability of AI models, or the ability to deterministically trace why a model produced a specific output, remains an unsolved research problem.”