“Evaluating image models with real user prompts on platforms like LM Arena is considered the best method for assessing model quality.”