G. Marchionini, UNC-CH
Vectors
•Each document (or surrogate) is represented by a vector with one component for every word in the collection: 1 if the word occurs in the document, 0 if it does not.
•Doc 1  0 0 1 1 0 0 ..... 0
•Doc 2  0 0 0 0 1 1 ..... 0
•.
•Doc 7  1 0 0 0 0 0 ..... 1  (has aardvark and zygote)
•.
•Doc 33  0 1 0 0 0 0 ..... 1 (has abacus and zygote)
•.
•Doc 67  1 1 0 0 0 0 ..... 1 (has aardvark, abacus and zygote)
•.
•Doc N
•
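•Queries are expressed as vectors over the same words and matched against the document vectors.  Degrees of matching are possible, so documents can be ranked by how closely they match rather than simply retrieved or rejected.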
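The vector representation and query matching described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four-word vocabulary, the function names (to_vector, cosine, rank), and the choice of cosine similarity as the measure of "degree of matching" are all hypothetical and do not come from the slide. The three documents reuse the word lists given for Docs 7, 33, and 67.

```python
import math

# Hypothetical toy vocabulary in alphabetical order; "apple" is a filler
# word standing in for the rest of the collection.
VOCAB = ["aardvark", "abacus", "apple", "zygote"]


def to_vector(words):
    """Return a binary vector: 1 where a vocabulary word is present, else 0."""
    present = set(words)
    return [1 if term in present else 0 for term in VOCAB]


def cosine(u, v):
    """Cosine similarity, one common (assumed) way to score partial matches."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Word lists taken from the slide's annotations for Docs 7, 33, and 67.
DOCS = {
    7: to_vector(["aardvark", "zygote"]),
    33: to_vector(["abacus", "zygote"]),
    67: to_vector(["aardvark", "abacus", "zygote"]),
}


def rank(query_words):
    """Score every document against the query vector, best match first."""
    q = to_vector(query_words)
    scores = {doc_id: cosine(q, vec) for doc_id, vec in DOCS.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for doc_id, score in rank(["aardvark", "zygote"]):
        print(f"Doc {doc_id}: {score:.3f}")
```

For the query "aardvark zygote", this sketch ranks Doc 7 first (it contains exactly the query words), Doc 67 second (it contains both plus an extra word), and Doc 33 last (it shares only "zygote"), which is one way to see what "degrees of matching" means in practice.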