“Future AI model training will be limited by compute rather than data, as a significant portion of training data will be synthetically generated.”