Which model is used as LLM-as-a-judge?

Figure: Flowcharts of four research frameworks (Huggingface Open DR, GPT Researcher, Open Deep Research, and Test-Time Diffusion DR)

The LLM-as-a-judge used to evaluate the Test-Time Diffusion Deep Researcher (TTD-DR) is Gemini-1.5-pro. It was calibrated against human ratings so that its assessments of the long-form responses produced by the research agents align with human judgment^[1].
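As a rough illustration of what checking a judge against human ratings can look like, the sketch below computes a simple agreement rate between judge verdicts and human verdicts on the same comparisons. The label format ("A", "B", "tie"), the sample data, and the function name `agreement_rate` are all hypothetical; the source does not describe the actual calibration procedure, so this is only one plausible check, not the authors' method.

```python
from typing import Sequence


def agreement_rate(judge_labels: Sequence[str], human_labels: Sequence[str]) -> float:
    """Return the fraction of items on which the judge and the human gave the same label."""
    if len(judge_labels) != len(human_labels):
        raise ValueError("label lists must have the same length")
    if not judge_labels:
        raise ValueError("need at least one labeled item")
    matches = sum(j == h for j, h in zip(judge_labels, human_labels))
    return matches / len(judge_labels)


# Hypothetical side-by-side verdicts (assumed format): "A", "B", or "tie".
judge = ["A", "B", "A", "tie", "B"]
human = ["A", "B", "B", "tie", "B"]
print(agreement_rate(judge, human))
```

A low agreement rate on such a held-out set would suggest that the judge prompt or rubric needs adjusting before its scores are trusted.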

During evaluation, Gemini-1.5-pro assigned fitness scores and wrote textual critiques, which were used to improve the quality of the outputs generated by the TTD-DR framework^[1].
