Research by Dan Hendrycks identified a "dishonesty vector" in an AI model by determining which we..., Sonic AI
“Research by Dan Hendrycks identified a "dishonesty vector" in an AI model by determining which weights were activated when it was instructed to be dishonest, and found this same vector was active during source hallucination.”
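One way to picture the idea in the quote is a difference-of-means direction: collect internal activations under honest and dishonest instructions, subtract the two averages, and check how strongly a new activation projects onto that direction. The sketch below is only an illustration under stated assumptions, not the method used in the research. The activations are synthetic Gaussian noise, and the names `synthetic_activations`, `difference_of_means`, and `score` are hypothetical helpers introduced here.

```python
import random


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def difference_of_means(pos, neg):
    """Unit vector pointing from the mean of `neg` to the mean of `pos`."""
    diff = [a - b for a, b in zip(mean_vector(pos), mean_vector(neg))]
    norm = sum(d * d for d in diff) ** 0.5
    return [d / norm for d in diff]


def project(vec, direction):
    """Scalar projection of `vec` onto `direction` (a dot product)."""
    return sum(v * d for v, d in zip(vec, direction))


def synthetic_activations(n, dim, shift, rng):
    """Hypothetical stand-in for real activations: unit Gaussian noise,
    with `shift` added along axis 0 to mimic a 'dishonest' signal."""
    return [
        [rng.gauss(0.0, 1.0) + (shift if i == 0 else 0.0) for i in range(dim)]
        for _ in range(n)
    ]


rng = random.Random(0)
honest = synthetic_activations(200, 8, 0.0, rng)      # assumed: honest prompts
dishonest = synthetic_activations(200, 8, 3.0, rng)   # assumed: dishonest prompts

# Candidate "dishonesty vector" in this toy setting.
direction = difference_of_means(dishonest, honest)

# Midpoint between the two class means along the direction.
threshold = (
    project(mean_vector(honest), direction)
    + project(mean_vector(dishonest), direction)
) / 2.0


def score(activation):
    """Positive when the activation sits on the 'dishonest' side of the midpoint."""
    return project(activation, direction) - threshold
```

In this toy setting, an activation recorded during some other behavior (for example, while the model cites a source) could be passed to `score` to see whether it falls on the same side as the dishonest examples. Whether real model activations separate this cleanly is an empirical question the sketch does not answer.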