•Each document (or surrogate) is represented by a vector with one component for every word in the collection.
•Doc 1 0 0 1 1 0 0 ..... 0
•Doc 2 0 0 0 0 1 1 ..... 0
•.
•Doc 7 1 0 0 1 0 0 ..... 1 (has aardvark and zygote)
•.
•Doc 33 0 1 0 0 0 0 ..... 1 (has abacus and zygote)
•.
•Doc 67 1 1 0 0 0 0 ..... 1 (has aardvark, abacus and zygote)
•.
•Doc N
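The rows above can be reproduced with a minimal sketch. It assumes a toy collection whose whole vocabulary is the three words named on the slide; the document texts and the helper `to_vector` are invented for illustration and are not part of the original material.

```python
# Hypothetical three-document collection; the texts are invented for illustration.
docs = {
    "Doc 7": "aardvark zygote",
    "Doc 33": "abacus zygote",
    "Doc 67": "aardvark abacus zygote",
}

# The vocabulary is every distinct word in the collection, in alphabetical order.
vocabulary = sorted({word for text in docs.values() for word in text.split()})


def to_vector(text, vocabulary):
    """Return a binary vector: 1 if the vocabulary word occurs in text, else 0."""
    words = set(text.split())
    return [1 if term in words else 0 for term in vocabulary]


vectors = {name: to_vector(text, vocabulary) for name, text in docs.items()}

for name, vector in vectors.items():
    print(name, vector)
```

In a real collection the vocabulary runs to many thousands of words, so each vector is mostly zeros, as the rows above suggest.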
•Queries are expressed as vectors in the same space and matched against the document vectors. Matching is a matter of degree, not just yes or no.
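One way to turn "degrees of matching" into numbers is cosine similarity between the query vector and each document vector. The slide does not name a specific measure, so cosine is an assumption here, as are the three-word vocabulary and the example query.

```python
import math

# Assumed toy vocabulary and binary document vectors (illustrative only).
vocabulary = ["aardvark", "abacus", "zygote"]
doc_vectors = {
    "Doc 7": [1, 0, 1],
    "Doc 33": [0, 1, 1],
    "Doc 67": [1, 1, 1],
}


def cosine(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical query containing "aardvark" and "abacus".
query = [1, 1, 0]

# Rank documents from best match to worst.
ranking = sorted(
    doc_vectors,
    key=lambda name: cosine(query, doc_vectors[name]),
    reverse=True,
)

for name in ranking:
    print(name, round(cosine(query, doc_vectors[name]), 3))
```

Doc 67 contains both query words and ranks first; Doc 7 and Doc 33 each contain one and receive a lower, equal score, so every document matches to some degree rather than simply matching or not.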