“Machine learning models commonly utilize mathematically simple loss functions, such as cross-entropy, for tasks like next-token prediction.”
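
To make "mathematically simple" concrete, here is a minimal sketch of cross-entropy for next-token prediction in plain Python. At each position the loss is just the negative log of the probability the model assigns to the token that actually comes next. The function names, the toy four-token vocabulary, and the logit values are illustrative assumptions, not taken from any particular library or model.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities (shifted by the max for numerical stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Negative log-probability assigned to the correct next token."""
    probs = softmax(logits)
    return -math.log(probs[target_index])


def mean_next_token_loss(logits_per_position, target_ids):
    """Average the per-position cross-entropy over a sequence."""
    losses = [
        cross_entropy(logits, target)
        for logits, target in zip(logits_per_position, target_ids)
    ]
    return sum(losses) / len(losses)


# Hypothetical toy data: 2 positions, vocabulary of 4 tokens.
toy_logits = [
    [0.0, 0.0, 0.0, 0.0],   # model is completely unsure
    [5.0, 0.0, 0.0, 0.0],   # model strongly favors token 0
]
toy_targets = [2, 0]

print(mean_next_token_loss(toy_logits, toy_targets))
```

With uniform logits over four tokens, the per-position loss is log 4 (about 1.386), which is the value a model that knows nothing about the data would score. A confident, correct prediction drives the loss toward zero.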