School of Information & Library Science
305 Manning Hall
University of North Carolina at Chapel Hill
Chapel Hill, NC 27599-3360

NSF CAREER: Making Aggregated Search Results More Effective and Useful

Project Summary

Aggregated search is the task of combining results from multiple independent search engines into a single presentation. The most widely used aggregated search systems are commercial search portals such as Google. In addition to web search, commercial search portals provide access to a wide range of auxiliary search services, or verticals, that focus on a specific type of media (e.g., images, videos) or search task (e.g., searching for news or local businesses). Aggregated search systems are responsible for predicting which verticals to present (Does the user want to see images or news?) and where and how to present them. This project will study a phenomenon called aggregated search coherence and its effect on search behavior. Given an ambiguous query (e.g., "saturn"), a common strategy for a search engine is to diversify its results (e.g., to return results about both the car and the planet). Aggregated search coherence is the extent to which results from different sources focus on similar senses of the query. The outcomes of this project will provide the knowledge required to expand the accessibility of search across a wide range of domains. The test collection produced will allow others to reproduce the results of this work and test their own solutions. The software will enable others to perform large-scale remote studies of search behavior. Insights gained from the user studies will be of interest to researchers in other fields such as psychology and marketing.
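To make the notion of aggregated search coherence concrete, the sketch below scores how much two result lists agree on the senses of an ambiguous query such as "saturn". It is an illustration only and not the measure used in this project: the function names, the hand-assigned sense labels, and the choice of distribution overlap as the score are all assumptions made for this example.

```python
from collections import Counter


def sense_distribution(labels):
    """Convert per-result sense labels into a probability distribution."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {sense: n / total for sense, n in counts.items()}


def coherence(labels_a, labels_b):
    """Hypothetical coherence score: overlap of two sense distributions.

    Sums, over all senses, the smaller of the two probabilities.
    The score is 1.0 when both sources cover the senses in the same
    proportions and 0.0 when they share no sense at all.
    """
    p = sense_distribution(labels_a)
    q = sense_distribution(labels_b)
    return sum(min(p.get(s, 0.0), q.get(s, 0.0)) for s in set(p) | set(q))


# Assumed, hand-labeled senses for the results of the query "saturn".
web_results = ["car", "car", "planet", "planet"]         # diversified web results
image_results = ["planet", "planet", "planet", "planet"]  # image vertical, one sense

score = coherence(web_results, image_results)
print(score)
```

Under these assumptions, a diversified web list paired with a single-sense image vertical receives an intermediate score, while two sources that share no sense receive a score of zero.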

Prior research by the PI found that the query senses in the vertical results can affect user interaction with other components on the aggregated results page, the so-called "spill-over" effect. This project will investigate how aggregated search coherence affects search behavior and will incorporate this knowledge into new methods for aggregated search evaluation and prediction. The project has four objectives. (1) A series of user studies will be conducted to investigate how characteristics of the user, the search task, the results presentation, and the layout determine the level of spill-over from one component to another. (2) Using the insights gained from these studies, a new test-collection evaluation methodology that models cross-component effects will be developed and validated. (3) Due to the pipeline architecture of existing systems, results from different components are selected independently of each other. New algorithms will be developed and evaluated for predicting which results from each component to display and how to display them. The goal will be to minimize negative cross-component effects. (4) The generalizability of the methods will be tested on two additional domains: library search and news story aggregation. Aggregated search facilitates single-query access to different types of media, which require customized search solutions. It is the underlying technology behind commercial search portals and is also widely used in other domains such as library, mobile, and desktop search. The project will study a phenomenon that is not currently well understood and is not addressed in existing evaluation methods and algorithmic solutions for aggregated search.

Project Personnel

Dissemination of Research Results

Jaime Arguello, Sandeep Avula, and Fernando Diaz. Using Query Performance Predictors to Improve Spoken Queries. To appear in Proceedings of the 38th European Conference on Information Retrieval (ECIR'16), 2016.

Sandeep Avula and Jaime Arguello. Query-expansion Approaches for Microblog Retrieval. In Proceedings of the 24th Text REtrieval Conference (TREC'15), National Institute of Standards and Technology, special publication, 2015.

This research is sponsored by the National Science Foundation under grant IIS-1451668. Any opinions, findings, conclusions, or recommendations expressed on this Web site are those of the author(s) and do not necessarily reflect those of the sponsor.