“On the HealthBench benchmark, Anthropic's Mythos-5 and Fable-5 models scored 66%, compared to 51.8% for GPT-5.5.”