
Quiz: Report evaluation metrics in AI research

What are two metrics used for evaluating long-form LLM responses in research? [🎓]
Difficulty: Easy
What methodology is employed to evaluate the performance of deep research agents? [📊]
Difficulty: Medium
How is the correctness of responses measured for multi-hop short-form QA tasks? [🔍]
Difficulty: Hard
