“Anthropic tests its models in conjunction with harnesses it has built internally, such as those for Claude CoWork and Claude Code.”