Supervised fine-tuning (SFT) causes significantly larger changes to a model's weights compared to..., Sonic AI
“Supervised fine-tuning (SFT) causes significantly larger changes to a model's weights compared to reinforcement learning (RL), even with few examples and a low learning rate.”
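
The quote does not say how "changes to a model's weights" are measured. One common way to make such a comparison concrete is the relative L2 norm of the parameter difference between a base checkpoint and a tuned one. The sketch below is an assumption for illustration, not the method behind the quoted claim. The function name `relative_weight_change` and the dict-of-flat-lists checkpoint format are hypothetical.

```python
import math


def relative_weight_change(base, tuned):
    """Return ||tuned - base||_2 / ||base||_2 over all parameters.

    `base` and `tuned` are hypothetical checkpoints: dicts mapping a
    parameter name to a flat list of floats (a flattened tensor).
    """
    if base.keys() != tuned.keys():
        raise ValueError("checkpoints must contain the same parameter names")

    delta_sq = 0.0  # running sum of squared differences
    base_sq = 0.0   # running sum of squared base weights
    for name, w0 in base.items():
        w1 = tuned[name]
        if len(w0) != len(w1):
            raise ValueError(f"size mismatch for parameter {name!r}")
        delta_sq += sum((b - a) ** 2 for a, b in zip(w0, w1))
        base_sq += sum(a ** 2 for a in w0)

    if base_sq == 0.0:
        raise ValueError("base checkpoint has zero norm")
    return math.sqrt(delta_sq) / math.sqrt(base_sq)


# Toy values chosen only to exercise the function; they are not results.
base = {"w": [3.0, 4.0]}
tuned = {"w": [3.0, 4.5]}
print(relative_weight_change(base, tuned))
```

Under this metric, the quoted claim would correspond to an SFT checkpoint yielding a larger value than an RL checkpoint trained from the same base model. A per-layer breakdown, computing the ratio for each parameter name separately, would show where the changes concentrate.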