Andrej Karpathy asserts that the internet data used for pre-training LLMs is of very low quality,..., Sonic AI
“Andrej Karpathy asserts that the internet data used for pre-training LLMs is of very low quality, describing a random document from a frontier lab's dataset as "total garbage."”