Using Query Performance Predictors to Improve Spoken Queries

In this paper, we focus on the task of automatically reducing verbose spoken queries, using query performance predictors as input features for a regression model. The model is trained to predict the difference in retrieval performance between a candidate sub-query and the original query.
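
As a concrete illustration, a minimal sketch of this pipeline in Python follows. The predictor set (query length, mean and max IDF), the regressor, the IDF table, and the placeholder targets are all illustrative assumptions, not the exact configuration used in the paper.

    from itertools import combinations
    from sklearn.linear_model import LinearRegression

    def qpp_features(query, idf):
        # Pre-retrieval query performance predictors (illustrative choices):
        # query length, mean IDF, and max IDF of the query terms.
        idfs = [idf.get(t, 0.0) for t in query.split()]
        return [len(idfs), sum(idfs) / len(idfs), max(idfs)]

    def candidate_subqueries(query):
        # Every non-empty subset of the original query terms is a candidate.
        terms = query.split()
        for k in range(1, len(terms) + 1):
            for subset in combinations(terms, k):
                yield " ".join(subset)

    # Hypothetical IDF table and training targets; in practice the targets
    # come from scoring each sub-query and the original query against
    # relevance judgments.
    idf = {"locations": 1.8, "grow": 2.0, "tourism": 3.4, "industry": 2.2}
    original = "locations grow tourism industry"
    candidates = list(candidate_subqueries(original))
    X = [qpp_features(q, idf) for q in candidates]
    y = [0.0] * len(candidates)  # placeholder: effectiveness delta vs. original
    model = LinearRegression().fit(X, y)

    # At query time, keep the sub-query with the highest predicted improvement.
    best = max(candidates, key=lambda q: model.predict([qpp_features(q, idf)])[0])

In the actual setup, the targets in y would be the measured differences in retrieval effectiveness (e.g., in average precision) between each sub-query and the original query, computed from the TREC relevance judgments.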

The spoken queries were gathered using Amazon Mechanical Turk (MTurk), and are based on the 250 TREC topics used in the TREC 2004 Robust Track.

Our study participants were given search task descriptions that were slightly modified from the original TREC topic descriptions and narratives. Our goal was to situate each search task in a “real world” scenario.

For example, TREC topic 395 is associated with the following TREC description, narrative, and search task description:

TREC description: Provide examples of successful attempts to attract tourism as a means to improve a local economy.

TREC narrative: To be relevant, a selected document will specify the entity (city, state, country, governmental unit) which has achieved an economic increase due to the entity's efforts at boosting tourism. Documents which only concern plans for increasing tourism are not relevant, only documents which detail an actual increase are relevant.

Search Task: You were recently in Costa Rica and were surprised by the amount of tourists you saw. Now you are curious about other locations (cities, states, or countries) that have also managed to boost their tourism industry. Find information about locations that have recently managed to grow their tourism industry.

Our 250 search task descriptions are provided here.

In total, we gathered 20 spoken queries per search task, for a total of 5,000 spoken queries. Queries were automatically transcribed using the AT&T, IBM, and WIT.AI speech-to-text APIs.

The query transcriptions are provided in transcriptions.xml.

Each query has an ID of the form <TREC_TOPIC_ID>.<VERSION_ID>, where <VERSION_ID> ranges from 1 to 20.
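
For example, an ID can be split back into its two parts (a trivial Python illustration; the example ID is made up):

    topic_id, version_id = "395.7".split(".")  # hypothetical ID: topic 395, version 7
    assert topic_id == "395" and 1 <= int(version_id) <= 20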

A few additional details:

  1. The transcriptions contain speech recognition errors.
  2. There are cases where the API interpreted a long pause as “end of speech” and truncated the transcription.
  3. There are cases where the API was not able to resolve the spoken query and returned a NULL transcription. These are marked as NULL in the XML file (see the loading sketch after this list).
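
A sketch of loading the transcriptions while skipping the NULL entries is shown below. The element and attribute names here are assumptions about the XML schema, not a documented format:

    import xml.etree.ElementTree as ET

    tree = ET.parse("transcriptions.xml")
    queries = {}
    for node in tree.getroot().iter("query"):  # element name is an assumption
        qid = node.get("id")                   # e.g., "395.7"
        text = (node.text or "").strip()
        if text == "NULL":                     # unresolvable spoken query
            continue
        queries[qid] = text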

This research is sponsored by National Science Foundation grant IIS-1451668. Any opinions, findings, conclusions, or recommendations expressed on this Web site are those of the author(s) and do not necessarily reflect those of the sponsor.