“The ‘time horizon’ number for a given model is defined as the human task-completion time at which the model is estimated to have a 50% success rate.”
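
To make the definition concrete, here is a minimal sketch of how such a number could be read off a fitted success curve. It assumes (this is not stated in the quote) that success probability is modeled as a logistic function of the base-2 logarithm of human task-completion time. The function names and the parameter values `INTERCEPT` and `SLOPE` are hypothetical and chosen only for illustration.

```python
import math


def success_probability(minutes: float, intercept: float, slope: float) -> float:
    """Predicted success rate on a task that takes a human `minutes` to complete.

    Assumed model (not from the source): logistic in log2(human time).
    """
    z = intercept + slope * math.log2(minutes)
    return 1.0 / (1.0 + math.exp(-z))


def time_horizon(intercept: float, slope: float, target: float = 0.5) -> float:
    """Human task-completion time (minutes) at which predicted success equals `target`."""
    if slope >= 0:
        raise ValueError("slope must be negative: success should fall as tasks get longer")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    # Invert the logistic: intercept + slope * log2(t) = logit(target)
    logit = math.log(target / (1.0 - target))
    return 2.0 ** ((logit - intercept) / slope)


# Hypothetical fitted parameters, for illustration only.
INTERCEPT = 6.0
SLOPE = -1.0

horizon = time_horizon(INTERCEPT, SLOPE)
print(f"50% time horizon: {horizon:.1f} minutes")
print(f"Predicted success at that length: {success_probability(horizon, INTERCEPT, SLOPE):.2f}")
```

With `target=0.5` the logit term is zero, so under this assumed model the horizon reduces to `2 ** (-intercept / slope)`. Passing a different `target` (for example 0.8) gives the task length at which the same curve predicts that success rate instead.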