Reinforcement from Finetuning (RFT) is highly data-efficient, requiring as few as 100 samples to ..., Sonic AI
“Reinforcement from Finetuning (RFT) is highly data-efficient, requiring as few as 100 samples to significantly improve model performance on niche tasks.”