“Glean uses LLMs as judges to evaluate the quality of its system's subjective answers against its golden datasets.”
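As a rough illustration of the LLM-as-judge pattern the quote describes, the sketch below scores a system's answers against a small golden dataset. It is not Glean's implementation: the names `GoldenExample`, `build_judge_prompt`, `token_overlap_judge`, and `evaluate` are all hypothetical, and the judge here is a simple token-overlap stand-in so the example runs without any model API. In practice the judge would send the prompt to an LLM and parse a score from its reply.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class GoldenExample:
    """One entry in a golden dataset (hypothetical schema)."""
    question: str
    reference_answer: str


# Hypothetical judge signature: (question, reference, candidate) -> score in [0, 1].
Judge = Callable[[str, str, str], float]


def build_judge_prompt(question: str, reference: str, candidate: str) -> str:
    """Build the text an LLM judge might receive (wording is an assumption)."""
    return (
        "You are grading an answer against a reference.\n"
        f"Question: {question}\n"
        f"Reference answer: {reference}\n"
        f"Candidate answer: {candidate}\n"
        "Reply with a score between 0 and 1."
    )


def token_overlap_judge(question: str, reference: str, candidate: str) -> float:
    """Stand-in for an LLM call: fraction of reference tokens found in the candidate."""
    ref_tokens = set(reference.lower().split())
    cand_tokens = set(candidate.lower().split())
    if not ref_tokens:
        return 0.0
    return len(ref_tokens & cand_tokens) / len(ref_tokens)


def evaluate(
    system: Callable[[str], str],
    golden: List[GoldenExample],
    judge: Judge,
    pass_threshold: float = 0.5,
) -> Dict[str, object]:
    """Run the system on each golden question and aggregate the judge's scores."""
    if not golden:
        raise ValueError("golden dataset must not be empty")
    scores = [
        judge(ex.question, ex.reference_answer, system(ex.question))
        for ex in golden
    ]
    passed = sum(score >= pass_threshold for score in scores)
    return {
        "scores": scores,
        "mean_score": sum(scores) / len(scores),
        "pass_rate": passed / len(scores),
    }
```

To use it, pass any callable that maps a question to an answer as `system`, together with a list of `GoldenExample` entries and a judge. Swapping `token_overlap_judge` for a function that calls a model with `build_judge_prompt` would turn this into an actual LLM-judged evaluation. The `pass_threshold` default of 0.5 is an arbitrary choice for illustration.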