The AlphaGo self-play training process, where the MCTS search provides a better action label for ..., Sonic AI
“The AlphaGo self-play training process, where the MCTS search provides a better action label for a visited state, is analogous to the DAGGER (Dataset Aggregation) algorithm in robotics.”