[Figure: Million LLMs Track illustration. Image generated by ChatGPT using the prompt: "Create an image in the style of Van Gogh depicting a million LLMs in humanoid form (as a metaphor), working together to respond to user queries."]

TREC 2025: Million LLMs Track

Defining "Search" in a world where the front-facing entity for every organization is a Conversational AI.

Overview

In the era of Large Language Models (LLMs), we envision a transformation in how information is served to users. Imagine a world where:

Research Context

The challenge bears similarities to established fields such as:

A key distinction lies in how statistics can be collected: a traditional search engine can return thousands of results per query, enough to estimate reliable statistics, whereas LLM responses are typically brief, which makes expertise assessment harder.

Key Challenges

1. Evaluation Framework Development

While we propose a specific simulation and evaluation framework, our assumptions may not fully reflect real-world conditions. Bridging this gap and developing more accurate evaluation methodologies remains a crucial challenge.
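As one illustration of what scoring a ranked list of LLMs could look like, the sketch below computes nDCG@k for a single query. It is a minimal example under stated assumptions, not the track's official measure: the choice of nDCG, the graded labels, and the identifiers (`llm_a`, `llm_b`, `llm_c`) are all hypothetical.

```python
import math


def dcg(gains):
    """Discounted cumulative gain of graded gains given in rank order."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))


def ndcg_at_k(ranked_llm_ids, relevance, k=10):
    """nDCG@k for one query.

    ranked_llm_ids: LLM identifiers in the order a system ranked them.
    relevance: dict of LLM id -> graded label (hypothetical labels;
               ids missing from the dict count as 0).
    """
    gains = [relevance.get(llm_id, 0) for llm_id in ranked_llm_ids[:k]]
    ideal = sorted(relevance.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical labels: how well each LLM can answer one query.
relevance = {"llm_a": 2, "llm_b": 0, "llm_c": 1}
perfect = ndcg_at_k(["llm_a", "llm_c", "llm_b"], relevance)
reversed_run = ndcg_at_k(["llm_b", "llm_c", "llm_a"], relevance)
```

A run that orders the LLMs by decreasing label scores 1.0, and the reversed order scores lower. Whether labels of this kind reflect real-world conditions is exactly the gap described above.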

2. LLM Expertise Quantification

Unlike traditional distributed information retrieval (DIR) systems, where search engines return extensive result lists, LLMs typically respond with concise answers, sometimes with few references. This brevity complicates the assessment of LLM expertise.
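To make the difficulty concrete, here is a minimal sketch of one simple baseline: build a term-count profile from each LLM's past responses and rank the LLMs by cosine similarity to a new query. This is an assumption-laden illustration, not the track's method. The function names, the LLM identifiers, and the sample responses are hypothetical, and plain term matching stands in for whatever expertise signal a real system would use.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_profile(responses):
    """Aggregate term counts over an LLM's (brief) past responses."""
    profile = Counter()
    for response in responses:
        profile.update(tokenize(response))
    return profile


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_llms(query, past_responses):
    """Rank LLM ids by similarity between the query and each profile.

    Ties are broken alphabetically by id so the output is deterministic.
    """
    query_vec = Counter(tokenize(query))
    scores = {
        llm_id: cosine(query_vec, build_profile(responses))
        for llm_id, responses in past_responses.items()
    }
    return sorted(scores, key=lambda llm_id: (-scores[llm_id], llm_id))


# Hypothetical past responses; note how little text each LLM contributes.
past_responses = {
    "llm_health": [
        "Aspirin can reduce fever and mild pain.",
        "Vaccines train the immune system.",
    ],
    "llm_travel": ["Trains from Amsterdam to Paris take about three hours."],
    "llm_empty": [],
}
ranking = rank_llms("what reduces fever", past_responses)
```

With so few tokens per LLM, the ranking here hinges on a single shared term ("fever"), and "reduces" fails to match "reduce". That fragility is the kind of sparse-evidence problem that brief responses create.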

3. LLM Ranking Optimization

Even with robust expertise assessment, determining optimal LLM ranking strategies for meta-LLM agents remains complex. While this year's track focuses primarily on LLM ranking, future iterations may explore this challenge further.

4. Answer Synthesis

Effectively combining multiple LLM responses into a coherent, comprehensive answer remains an open research question with limited existing work.

Track Objectives

We invite the Information Retrieval community to:

  1. Explore methods for selective LLM querying
  2. Develop robust evaluation frameworks
  3. Build a collaborative research community around these challenges

Organizers

Evangelos Kanoulas
University of Amsterdam, The Netherlands
Panagiotis Eustratiadis
University of Amsterdam, The Netherlands
Mark Sanderson
RMIT University, Australia
Jamie Callan
Carnegie Mellon University, USA

Co-Organizers

Vaishali Pal
University of Amsterdam, The Netherlands
Yougang Lyu
University of Amsterdam, The Netherlands
Zihan Wang
University of Amsterdam, The Netherlands