“LLMs are considered data-limited for post-training because, unlike game environments such as Go, they have no clear source of unlimited novel data.”