Document Alternatives
Paragraphs, passages
SGML/HTML/XML codes
‘Shape’ of text
Related problems:
text summarization/auto abstracting
auto categorization
question answering
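
As a rough illustration of the "paragraphs, passages" alternative to whole documents, here is a minimal sketch of two ways to cut a text into smaller retrieval units. The function names, the blank-line paragraph rule, and the window sizes are all assumptions chosen for illustration, not something the outline specifies.

```python
import re


def split_paragraphs(text):
    """Split text on blank lines and return the non-empty paragraphs.

    Assumption: a paragraph boundary is one or more blank lines.
    Internal whitespace in each paragraph is collapsed to single spaces.
    """
    parts = re.split(r"\n\s*\n", text)
    return [" ".join(p.split()) for p in parts if p.strip()]


def sliding_passages(text, size=50, step=25):
    """Return overlapping passages of up to `size` words.

    Each passage starts `step` words after the previous one, so
    consecutive passages overlap when step < size. The defaults are
    arbitrary illustrative values.
    """
    words = text.split()
    if not words:
        return []
    passages = []
    for start in range(0, len(words), step):
        passages.append(" ".join(words[start:start + size]))
        # Stop once a passage has reached the end of the text.
        if start + size >= len(words):
            break
    return passages
```

Either list of units could then be indexed and ranked in place of full documents. Fixed word windows ignore the author's own paragraph boundaries, which is one reason overlapping windows are often used.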