Most commercial document retrieval systems and search engines require queries to be valid Boolean expressions, which split the set of available documents into a subset of documents to be retrieved and a subset of documents not to be retrieved. Research has suggested that ranking documents and using relevance feedback may significantly improve retrieval performance. We suggest that by placing Boolean database queries into Conjunctive Normal Form (a conjunction of disjunctions) and by assuming that each disjunction represents a hyperfeature, documents to be retrieved can be probabilistically ranked and relevance feedback incorporated, improving retrieval performance. The initial features are statistically dependent upon each other, and each hyperfeature represents a concept formed by combining them in this way. Experimental results compare the performance of a sequential learning probabilistic (Bayesian) retrieval model with both the proposed integrated Boolean-probabilistic model and a fuzzy-set (fuzzy logic) model. Performance using Boolean queries may also be modelled analytically.
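The idea of treating each disjunction of a Conjunctive Normal Form query as a single hyperfeature can be illustrated with a minimal sketch. This is not the model evaluated in the paper: the query, the documents, the probabilities `P` and `Q`, and the function names are all hypothetical, and the scoring rule is an ordinary binary independence log-odds weight applied to hyperfeatures instead of individual terms.

```python
from math import log

# A CNF query: a conjunction (outer list) of disjunctions (inner sets of terms).
# Hypothetical query: (cat OR feline) AND (food OR diet)
QUERY = [{"cat", "feline"}, {"food", "diet"}]

def hyperfeatures(doc_terms, query):
    """One binary value per disjunction: 1 if any of its terms occurs in the document."""
    return [1 if clause & doc_terms else 0 for clause in query]

def boolean_match(doc_terms, query):
    """Conventional Boolean retrieval: every disjunction must be satisfied."""
    return all(hyperfeatures(doc_terms, query))

def score(doc_terms, query, p, q):
    """Log-odds of relevance, treating hyperfeatures as independent binary features.

    p[i]: assumed P(hyperfeature i present | relevant)
    q[i]: assumed P(hyperfeature i present | non-relevant)
    """
    total = 0.0
    for x, pi, qi in zip(hyperfeatures(doc_terms, query), p, q):
        total += log(pi / qi) if x else log((1 - pi) / (1 - qi))
    return total

def rank(docs, query, p, q):
    """Document names ordered from most to least probably relevant."""
    return sorted(docs, key=lambda name: score(docs[name], query, p, q), reverse=True)

# Hypothetical collection and parameter values, chosen only for illustration.
DOCS = {
    "d1": {"cat", "food", "bowl"},
    "d2": {"feline", "habitat"},
    "d3": {"dog", "leash"},
}
P = [0.8, 0.8]
Q = [0.2, 0.2]

if __name__ == "__main__":
    for name in rank(DOCS, QUERY, P, Q):
        print(name, boolean_match(DOCS[name], QUERY), round(score(DOCS[name], QUERY, P, Q), 3))
```

In this sketch a strict Boolean match retrieves only `d1`, while the ranked list also places the partially matching `d2` ahead of the non-matching `d3`. Relevance feedback could be imitated by re-estimating `P` and `Q` from documents the user has judged, though how the paper performs that update is not specified in this abstract.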
Return to Losee home page at http://www.ils.unc.edu/~losee