Robert M. Losee, Abraham Bookstein, and Clement Yu.
Probabilistic Models for Document Retrieval: A Comparison of Performance on Experimental and Synthetic Databases.
Proceedings of the ACM SIGIR Conference (Pisa, Italy), New York: ACM Press, 1986, pp. 258-264.

Abstract:

Probabilistic document retrieval systems consistent with the two Poisson independence model outperform the binary independence model if the terms are distributed as described by the model's assumptions. The Two Poisson Effectiveness Hypothesis suggests that retrieval models based upon the two Poisson model will outperform binary independence models when used on a "real-world" database, where independence and two Poisson term occurrence distributions fail to hold, because the added information obtained from incorporating term frequency information will more than compensate for the non-Poisson distributions of terms. Searches of the MED1033 database suggest that if terms are not independent and frequencies of term occurrence are not distributed in a two Poisson manner, the binary independence sequential retrieval model outperforms the two Poisson independence retrieval model.
