Project Summary
Aggregated search is the task of combining results from multiple
independent search engines into a single presentation. The most widely
used aggregated search systems are commercial search portals such as
Google. In addition to web search, commercial search portals provide
access to a wide range of auxiliary search services, or verticals, that
focus on a specific type of media (e.g., images, videos) or search task
(e.g., search for news, local businesses). Aggregated search systems are
responsible for predicting which verticals to present (Does the user want
to see images or news?) and where/how to present them. This project will
study a phenomenon called aggregated search coherence and its effect on
search behavior. Given an ambiguous query (e.g., "saturn"), a common
strategy for a search engine is to diversify its results (e.g., to return
results about the car and the planet). Aggregated search coherence is the
extent to which results from different sources focus on similar senses of
the query. The outcomes of this project will provide the knowledge
required to expand the accessibility of search across a wide range of
domains. The test collection produced will allow others to reproduce the
results of this work and test their own solutions. The software will
enable others to perform large-scale remote studies of search behavior.
Insights gained from the user studies will be of interest to researchers
in other fields such as psychology and marketing.
Prior research by the PI found that the query senses in the vertical
results can affect user interaction with other components on the
aggregated results page, the so-called "spill-over" effect. This project
will investigate how aggregated search coherence affects search behavior
and will incorporate this knowledge into new methods for aggregated
search evaluation and prediction. Specifically, four objectives will be
tackled. (1) A series of user studies will be conducted to investigate
how different factors of the user, the search task, the results
presentation, and the layout determine the level of spill-over from one
component to another. (2) Using the insights gained from these studies, a
new test-collection evaluation methodology will be developed and
validated that models cross-component effects. (3) Due to the pipeline
architecture of existing systems, results from different components are
completely independent of each other. New algorithms will be developed
and evaluated for predicting which results from each component to display
and how. The goal will be to minimize negative cross-component effects.
(4) The generalizability of the methods will be tested on two additional
domains: library search and news story aggregation. Aggregated search
facilitates single-query access to different types of media, which
require customized search solutions. It is the underlying technology
behind commercial search portals and is also widely used in other domains
such as library, mobile, and desktop search. The project will study a
phenomenon that is currently neither well understood nor addressed in
existing evaluation methods and algorithmic solutions for aggregated
search.
Dissemination of Research Results
Jaime Arguello, Sandeep Avula, and Fernando Diaz. Using Query
Performance Predictors to Improve Spoken Queries. To appear in the
Proceedings of the 38th European Conference on Information
Retrieval (ECIR'16), 2016.
Sandeep Avula and Jaime Arguello. Query-expansion Approaches for Microblog
Retrieval. In Proceedings of the 24th Text REtrieval
Conference (TREC'15), National Institute of Standards and
Technology, special publication, 2015.