“The key insight of the Time Horizons benchmark is using the time it takes a human to complete a task as a unified metric for task difficulty.”
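
As a rough illustration of that idea, the sketch below places tasks of different kinds on one difficulty axis, the minutes a human needed, and reports a model's success rate per time bucket. The task names, timings, outcomes, and bucket boundaries are all made up for illustration, and this is not presented as the benchmark's actual scoring procedure.

```python
from collections import defaultdict

# Hypothetical records: (task name, minutes a human needed, did the model succeed?)
# All values are invented for illustration only.
RESULTS = [
    ("fix_typo", 1, True),
    ("write_unit_test", 8, True),
    ("rename_module", 12, True),
    ("debug_race_condition", 90, False),
    ("add_cli_flag", 25, True),
    ("port_library", 480, False),
    ("refactor_parser", 120, True),
    ("train_small_model", 300, False),
]

# (upper bound in minutes, label); the cut points are arbitrary assumptions.
BUCKETS = [
    (15, "<=15 min"),
    (60, "15-60 min"),
    (240, "1-4 h"),
    (float("inf"), ">4 h"),
]


def bucket_label(human_minutes):
    """Return the label of the first bucket whose upper bound covers the time."""
    for upper, label in BUCKETS:
        if human_minutes <= upper:
            return label
    raise ValueError("unreachable: the last bucket is unbounded")


def success_by_difficulty(results):
    """Map each human-time bucket to the model's success rate within it."""
    outcomes = defaultdict(list)
    for _name, human_minutes, succeeded in results:
        outcomes[bucket_label(human_minutes)].append(succeeded)
    # Iterate over BUCKETS so the result is ordered from easiest to hardest.
    return {
        label: sum(outcomes[label]) / len(outcomes[label])
        for _upper, label in BUCKETS
        if label in outcomes
    }


if __name__ == "__main__":
    for label, rate in success_by_difficulty(RESULTS).items():
        print(f"{label:>10}: {rate:.0%} success")
```

Because every task is reduced to a single number (human time), unrelated tasks such as fixing a typo and porting a library become directly comparable, which is the sense in which the metric is "unified".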