Document Alternatives
• Paragraphs, passages
• SGML/HTML/XML codes
• ‘Shape’ of text
• Related problems:
  – text summarization/auto abstracting
  – auto categorization
  – question answering
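
As a rough illustration of the first bullet (paragraphs or passages used as the unit in place of whole documents), here is a minimal sketch. The function names, the blank-line paragraph rule, and the window and overlap sizes are all assumptions made for this example; the slide does not prescribe any particular splitting method.

```python
import re


def split_paragraphs(text):
    """Split text into paragraphs at blank lines (assumed convention)."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def split_passages(text, size=50, overlap=10):
    """Split text into fixed-size, overlapping word windows.

    size and overlap are counted in whitespace-separated words; both
    defaults are illustrative, not recommended values.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    passages = []
    for start in range(0, len(words), step):
        passages.append(" ".join(words[start:start + size]))
        # Stop once a window has reached the end of the text.
        if start + size >= len(words):
            break
    return passages


if __name__ == "__main__":
    doc = "First paragraph of the text.\n\nSecond paragraph, a bit longer."
    for unit in split_paragraphs(doc):
        print(unit)
    for unit in split_passages(doc, size=5, overlap=2):
        print(unit)
```

Each returned string could then be indexed or scored as if it were a small document of its own.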