Defining "Search" in a world where the front-facing entity for every organization is a Conversational AI.
In the era of Large Language Models (LLMs), we envision a transformation in how information is served to users. Imagine a world where:
The challenge bears similarities to established fields such as:
A key distinction lies in how collection statistics can be gathered: traditional search engines return thousands of results, enabling reliable statistics to be collected, whereas LLM responses are typically brief, making expertise assessment more challenging.
While we propose a specific simulation and evaluation framework, our assumptions may not fully reflect real-world conditions. Bridging this gap and developing more accurate evaluation methodologies remains a crucial challenge.
Unlike traditional Distributed Information Retrieval (DIR) systems, where search engines provide extensive result lists, LLMs typically respond with concise answers, sometimes with limited references. This brevity complicates the assessment of LLM expertise.
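One way to picture this challenge is to adapt the idea of query-based sampling from DIR to generated answers: probe an LLM with a few topic-related queries and score how well its brief responses match the information need. The sketch below is only illustrative; the callable LLM interface, the bag-of-words scoring, and all names are assumptions, not part of the track's infrastructure.

```python
from collections import Counter
from math import sqrt
from typing import Callable, List


def _bow(text: str) -> Counter:
    """Bag-of-words vector for a piece of text."""
    return Counter(text.lower().split())


def _cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def estimate_expertise(
    llm: Callable[[str], str],   # hypothetical: maps a prompt to a short answer
    probe_queries: List[str],    # a handful of topic-related probe queries
    topic_description: str,      # textual description of the information need
) -> float:
    """Probe the LLM and score how well its brief answers match the topic.

    This mirrors query-based sampling in DIR, except each "sample" is a
    short generated answer rather than a set of retrieved documents.
    """
    topic_vec = _bow(topic_description)
    answers = [llm(q) for q in probe_queries]
    scores = [_cosine(_bow(ans), topic_vec) for ans in answers]
    return sum(scores) / len(scores) if scores else 0.0
```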
Even with robust expertise assessment, determining optimal LLM ranking strategies for meta-LLM agents remains complex. While this year's track focuses primarily on LLM ranking, future iterations may explore this challenge further.
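Given such expertise estimates, a meta-LLM agent still needs a ranking policy. A minimal sketch, reusing the estimate_expertise helper from the previous example and assuming a hypothetical registry of candidate models, is to score every model for the current information need and keep the top-k:

```python
from typing import Callable, Dict, List


def rank_llms(
    llms: Dict[str, Callable[[str], str]],  # hypothetical registry: name -> LLM callable
    probe_queries: List[str],
    topic_description: str,
    k: int = 3,
) -> List[str]:
    """Rank candidate LLMs by estimated expertise and keep the top-k.

    A meta-LLM agent could use such a ranking to decide which models
    to query, or in what order, for a given information need.
    """
    scored = {
        name: estimate_expertise(llm, probe_queries, topic_description)
        for name, llm in llms.items()
    }
    return sorted(scored, key=scored.get, reverse=True)[:k]
```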
The challenge of effectively combining multiple LLM responses into coherent, comprehensive answers represents an open research question with limited existing work.
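As a rough illustration of the fusion problem, one naive baseline is consensus-based sentence selection: split each model's answer into sentences, group near-duplicates, and prefer sentences supported by more models. The sketch below is such a baseline under those assumptions, not a proposed track method.

```python
import re
from collections import defaultdict
from typing import List


def fuse_answers(answers: List[str], max_sentences: int = 5) -> str:
    """Naive consensus fusion: keep sentences that more models agree on.

    Sentences are grouped by a normalised form; groups supported by more
    answers rank higher, with ties broken by order of first appearance.
    """
    support = defaultdict(set)  # normalised sentence -> indices of supporting answers
    first_seen = {}             # normalised sentence -> (appearance order, original text)
    order = 0
    for i, answer in enumerate(answers):
        for sent in re.split(r"(?<=[.!?])\s+", answer.strip()):
            if not sent:
                continue
            key = re.sub(r"\W+", " ", sent.lower()).strip()
            support[key].add(i)
            if key not in first_seen:
                first_seen[key] = (order, sent)
                order += 1
    ranked = sorted(support, key=lambda k: (-len(support[k]), first_seen[k][0]))
    return " ".join(first_seen[k][1] for k in ranked[:max_sentences])
```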
We invite the Information Retrieval community to:
Evangelos Kanoulas, University of Amsterdam, The Netherlands
Panagiotis Eustratiadis, University of Amsterdam, The Netherlands
Mark Sanderson, RMIT University, Australia
Jamie Callan, Carnegie Mellon University, USA
Vaishali Pal, University of Amsterdam, The Netherlands
Yougang Lyu, University of Amsterdam, The Netherlands
Zihan Wang, University of Amsterdam, The Netherlands