Techniques like pre-training data filtering and unlearning can remove dangerous capabilities from..., Sonic AI
“Techniques like pre-training data filtering and unlearning can remove dangerous capabilities from open-source models, but this only buys time before general capabilities allow the model to re-acquire the knowledge.”