A recent (May 2010) excellent doctoral dissertation written by Lewis Church here at
UNC-CH,
Combinatoric Models of Information Retrieval
Ranking Methods and Performance Measures for
Weakly-Ordered Document Collections,
addresses analytic models of retrieval.
List of publications on Retrieval
by Robert Losee
See also the list of publications on
Natural Language Processing
(many of which address retrieval concerns)
and publications on
Classification and Organizing Information for Browsing.
-
Thesaurus Structure, Descriptive
Parameters, and Scale.
In Press,
Journal of the American Society for Information Science and Technology.
-
"Validating a Model Predicting
Retrieval Ordering Performance with Statistically Dependent Binary Features,"
International Journal of Information Retrieval Research. 5 (1): 2015, 1-18.
-
The Effect of Assigning a Metadata or Indexing Term on Document Ordering,
Journal of the American Society for Information Science and Technology.
64 (11), 2013, 2191-2200.
Simply put, this shows how the presence or absence of one term in one document affects retrieval performance.
The associated software that validates these results is at
http://ils.unc.edu/~losee/ima.
-
A Random Walk on an Ontology: Using Thesaurus Structure for Automatic Indexing,
Journal of the American Society for Information Science and Technology.
64 (7), 2013, 1330-1344. Willis and Losee.
-
Decisions in Thesaurus Construction and Use,
Information Processing & Management.
43(4), 2007, 958-968.
(publisher's link)
-
Percent Perfect Performance (PPP),
Information Processing & Management.
43(4), 2007, 1020-1029.
(publisher's link)
-
"Is 1 Noun Worth 2 Adjectives? Measuring the Relative Feature Utility,"
Information Processing & Management.
42(5) 2006, 1248-1259.
-
"Browsing Mixed Structured and Unstructured Data,"
Information Processing & Management.
42 (2), 2006, 440-452.
-
"Are 2 Document Clusters Better Than 1?
The Cluster Performance Question for
Information Retrieval,"
Journal of the American Society for Information Science and Technology.
56 (1) 2005, 106-108.
(pdf of full article).
(Losee and Church)
-
"Information Retrieval with Distributed Databases:
Analytic Models of Performance,"
IEEE Transactions on Parallel and Distributed Systems.
15, 2004, 18-27.
(pdf of full article).
(Losee and Church)
-
"When Information Retrieval Measures
Agree about the
Relative Quality of Document Rankings,"
Journal of the American Society for Information Science,
51 (9), pp. 834-840, 2000.
(pdf of full article.)
(Won JASIS Best Paper of the Year Award, 2000.)
-
"Measuring Search Engine Quality and Query Difficulty:
Ranking with Target and Freestyle,"
Journal of the American Society for Information Science,
50 (10), pp. 882-889, 1999.
(Losee & Paris)
(pdf of full article.)
-
Text Retrieval and Filtering: Analytic Models of
Performance
(Information Retrieval Series), Kluwer, 1998.
Chapters on
Quality of Document Ranking,
Ranking Performance with One Term (most important chapter),
Linguistic Ranking Performance, and
Bibliography
-
"Comparing Boolean and Probabilistic Information Retrieval Systems
across Disciplines and
Queries," Journal of the American Society for Information
Science, 48 (2), pp. 143-156, 1997.
(pdf of full article)
-
"Evaluating Retrieval Performance Given Database and Query
Characteristics: Analytic
Determination of Performance Surfaces," Journal of the
American Society for Information Science, 47 (1), pp. 95-105, 1996.
-
"Feedback in Information Retrieval,"
Annual Review of Information Science and Technology,
31 1996, 33-78. (Spink & Losee)
-
"Determining Information Retrieval and Filtering Performance without
Experimentation,"
Information Processing & Management,
31 (4) 1995, 555-572.
-
"Upper Bounds for Retrieval Performance and Their Use Measuring Performance
and Generating Optimal Boolean Queries:
Can it Get Any Better Than This?"
Information Processing & Management,
30(2), pp. 193-204, 1994.
-
"An Analytic Measure Predicting Information Retrieval System Performance,"
Information Processing & Management,
27 (1) 1991, 1-13.
-
"Minimizing Information Overload: The Ranking of Electronic Messages,"
Journal of Information Science, 15(3), pp. 179-189, 1989.
(pdf of full article)
-
"Integrating Boolean Queries in Conjunctive Normal Form
with Probabilistic Retrieval Models,"
Information Processing & Management,
24(3), pp. 315-321, 1988.
(Losee & Bookstein)
-
"Parameter Estimation for Probabilistic Document Retrieval
Models,"
Journal of the American Society for Information Science,
39(1), pp. 8-16, 1988.
-
"Predicting Document Retrieval System Performance: An Expected Precision Measure,"
Information Processing & Management,
23 (6) 1987, 529-537.
-
"Probabilistic Retrieval and Coordination Level Matching,"
Journal of the American Society for Information Science,
38 (4) July 1987, 529-537.
-
Google Search for some works referring to my research
The past and future:
My recent work has developed analytic (i.e., non-experimental) methods for predicting
retrieval performance. I have emphasized the simple case where there is only a single
term in the query.
This has enabled me to make statements
(e.g., about when incorporating relevance feedback or including part-of-speech tags improves
IR performance)
that are independent of individual test
databases.
Present and future work considers, or will consider, distributed data,
different ranking and learning algorithms,
and more elaborate multiple
term queries.
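
To give a flavor of what an analytic, single-term prediction can look like, here is a minimal sketch. It is not the model from any of the publications listed above. It is a first-principles illustration under stated assumptions: documents containing the query term are ranked ahead of those without it, ties within each group are broken at random, and the function name and parameters (`n_docs`, `p_term_given_rel`, `p_term`) are hypothetical.

```python
def expected_relevant_position(n_docs, p_term_given_rel, p_term):
    """Expected rank of a relevant document for a one-term query.

    Assumes (for illustration only) that documents containing the term
    form a leading block, documents without it form a trailing block,
    and order inside each block is random.
    """
    n_with = n_docs * p_term      # expected number of documents containing the term
    n_without = n_docs - n_with   # expected number of documents lacking the term

    # Mean position of a document inside the leading block.
    pos_with = (n_with + 1) / 2
    # Mean position of a document inside the trailing block.
    pos_without = n_with + (n_without + 1) / 2

    # Weight each block by the probability that a relevant document falls in it.
    return (p_term_given_rel * pos_with
            + (1 - p_term_given_rel) * pos_without)


if __name__ == "__main__":
    # A term that occurs far more often in relevant documents than overall.
    print(expected_relevant_position(1000, 0.8, 0.1))
    # A term that carries no information about relevance.
    print(expected_relevant_position(1000, 0.1, 0.1))
```

When the term is equally likely in relevant and non-relevant documents (`p_term_given_rel == p_term`), the expression reduces to `(n_docs + 1) / 2`, the expected position under random ordering. No test collection is needed to reach that conclusion, which is the sense in which such statements are independent of individual databases.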
Students who have completed a semester or two of master's level work
and have
taken the related introductory courses in this area
might want to consider taking an
independent readings course
if they believe that this area represents their career focus.