Next: Comparing Retrieval or Search
Up: Measuring Search Engine Quality
Previous: Experimental Rankings
Given the analytic model of retrieval,
we may compute the A value for each query (Ai represents the A value for query i)
and the Q value for each retrieval engine, where Qj represents the quality (probability of optimal ranking) of search engine j.
The A values may be interpreted as the level of difficulty associated with retrieving the relevant documents on the topic represented by various formulations of the query.
The Q values may be interpreted as the quality of each search mechanism.
We compute these values by performing a rather lengthy regression.
Our goal is to solve for the various values of Ai and Qj for each query and each search engine,
finding the set of A and Q values that minimize the errors made in estimating the ASL values.
This is a complex problem, and there are no standard simple procedures for solving it.
We can treat the problem as a non-linear regression in which the ASL is the dependent variable and the parameters Ai and Qj are the unknowns to be estimated by the regression package.
The variable xi is an indicator variable that has the value 1 when the
query in question is query i, and 0 otherwise.
The variable yj similarly is an indicator variable that has the value
1 when the retrieval engine being used is retrieval engine number j, and 0 otherwise.
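As an illustration only, suppose the analytic model referred to above takes the form ASL = N(QA + (1 - Q)(1 - A)) + 1/2; that form is an assumption here, not something stated in this section. Writing x_i for the query indicator and y_j for the search engine indicator, a regression of the kind described could then be sketched as:

```latex
\mathrm{ASL} = N\left[\Bigl(\sum_j Q_j y_j\Bigr)\Bigl(\sum_i A_i x_i\Bigr)
  + \Bigl(1 - \sum_j Q_j y_j\Bigr)\Bigl(1 - \sum_i A_i x_i\Bigr)\right] + \frac{1}{2}
```

For any single ranking exactly one x_i and one y_j equal 1, so each sum selects the A value of the query and the Q value of the engine that produced that ranking.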
The data set contains 600 document rankings, one for each combination of the six search techniques and the 100 queries.
The N values are set to the correct number of documents for each database.
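The quantity being minimized can be sketched in a few lines of Python. This is a minimal sketch, not the regression package used for the results reported here: it assumes the model ASL = N(QA + (1 - Q)(1 - A)) + 1/2, and the function names, the tuple layout of the observations, and the toy data are all hypothetical.

```python
def predicted_asl(n_docs, q, a):
    """ASL predicted by the ASSUMED model N(QA + (1-Q)(1-A)) + 1/2."""
    return n_docs * (q * a + (1.0 - q) * (1.0 - a)) + 0.5


def sum_squared_error(observations, a_values, q_values):
    """Sum of squared differences between observed and predicted ASL.

    observations: iterable of (query_index, engine_index, n_docs, observed_asl)
    a_values:     list of A_i, one per query
    q_values:     list of Q_j, one per search engine
    """
    total = 0.0
    for i, j, n_docs, observed in observations:
        residual = observed - predicted_asl(n_docs, q_values[j], a_values[i])
        total += residual ** 2
    return total


# Hypothetical toy data: 2 queries x 2 engines on a 1000-document database,
# generated from the assumed model itself so the true parameters fit exactly.
a_true = [0.10, 0.30]
q_true = [0.90, 0.70]
toy = [(i, j, 1000, predicted_asl(1000, q_true[j], a_true[i]))
       for i in range(2) for j in range(2)]
```

A regression routine would search over `a_values` and `q_values` for the combination that makes `sum_squared_error` as small as possible; in the study that search covers 100 A values and 6 Q values over the 600 rankings.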
The numbers that are obtained from these regressions are inexact.
They are estimates that would be better with a larger sample of queries and documents from which to make the estimates.
The standard errors for estimating Q values are all approximately 0.014, while the standard errors for estimating A values are approximately 0.056.
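As a rough guide to what these standard errors imply, and assuming the estimates are approximately normally distributed (an assumption not made explicit in the text), approximate 95% intervals would be:

```latex
\hat{Q}_j \pm 1.96 \times 0.014 \approx \hat{Q}_j \pm 0.027, \qquad
\hat{A}_i \pm 1.96 \times 0.056 \approx \hat{A}_i \pm 0.110
```

Under that assumption the Q values are pinned down about four times more tightly than the A values.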
The Q values reflect the database from which they are derived.
The A values are query specific and reflect the nature of the relevance judgments and the documents available.
The Q values are computed to complement the A values mathematically, so that the regression formula produces ASL values with minimal error.
While Q values clearly will vary due to the characteristics of a specific database, the variance should be relatively small compared to the variation obtained with other measures of retrieval performance quality, such as precision.
In the following section we examine the Q values and their robustness.
Bob Losee
1999-07-29