“The internet data used for pre-training large language models was exhausted approximately three years ago.”