“When Anthropic first launched its tool use API in the previous year, its accuracy on relevant benchmarks was below 50%.”