In 2020, Nick Frosst incorrectly believed that Reinforcement Learning from Human Feedback (RLHF) ..., Sonic AI
“In 2020, Nick Frosst incorrectly believed that Reinforcement Learning from Human Feedback (RLHF) would not be data-efficient enough to improve models with small feedback datasets.”