Automatic Approaches
Discover the categories (bins)
Use clustering techniques
Name the categories (bins)
Use statistical techniques (e.g., centroid)
Classify data into the categories (bins)
Unsupervised: similarity measures in vector space
Supervised: use a training set
Compare against manual classification as the gold standard
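The three steps above can be sketched end to end for the unsupervised path. This is a minimal illustration, not a reference implementation: the toy documents, the seed choice, the helper names (vectorize, kmeans, name_bin, classify) and the use of raw term counts with cosine similarity are all assumptions made for the example. A supervised variant would replace classify with a model trained on labelled examples.

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term counts (assumed stand-in for any vector-space weighting)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def centroid(vectors):
    """Mean term vector of a group of documents."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return Counter({t: c / len(vectors) for t, c in total.items()})

def kmeans(vectors, seeds, iterations=10):
    """Discover bins: k-means style clustering using cosine similarity.
    `seeds` are indices of the documents used as initial centroids."""
    centroids = [vectors[i] for i in seeds]
    assignment = []
    for _ in range(iterations):
        assignment = [
            max(range(len(centroids)), key=lambda k: cosine(v, centroids[k]))
            for v in vectors
        ]
        for k in range(len(centroids)):
            members = [v for v, a in zip(vectors, assignment) if a == k]
            if members:
                centroids[k] = centroid(members)
    return centroids, assignment

def name_bin(c, n=2):
    """Name a bin by its n heaviest centroid terms (ties broken alphabetically)."""
    ranked = sorted(c.items(), key=lambda kv: (-kv[1], kv[0]))
    return " / ".join(t for t, _ in ranked[:n])

def classify(text, centroids):
    """Classify into the bin whose centroid is most similar to the text."""
    v = vectorize(text)
    return max(range(len(centroids)), key=lambda k: cosine(v, centroids[k]))

# Hypothetical toy collection, for illustration only.
docs = [
    "stock market shares trading",
    "market shares fall stock",
    "football match goal team",
    "team wins football match",
]
vectors = [vectorize(d) for d in docs]
centroids, assignment = kmeans(vectors, seeds=[0, 2])
names = [name_bin(c) for c in centroids]

# Compare against manual labels (the gold standard) on unseen documents.
# Bin ids here line up with the manual ids only because the seeds fix the order.
test_docs = ["shares trading on the stock market", "the team scored a goal"]
manual = [0, 1]
predicted = [classify(d, centroids) for d in test_docs]
accuracy = sum(p == m for p, m in zip(predicted, manual)) / len(manual)

print(names, predicted, accuracy)
```

In practice the bin ids produced by clustering are arbitrary, so a real evaluation would first map each discovered bin to a manual category before computing agreement.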