Assigning Weights to Terms
Raw term frequency
tf x idf
Recall the Zipf distribution
Want to weight terms highly if they are
frequent in relevant documents … BUT
infrequent in the collection as a whole
Automatically derived thesaurus terms
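
The tf x idf idea above can be sketched in a few lines of Python. This is a minimal illustration, not the slide's own formula: it assumes idf = log(N / df), which is one common variant (the slide does not say which is meant). The function name `tf_idf` and the toy corpus are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document.

    Assumption: idf(t) = log(N / df(t)), where N is the number of documents
    and df(t) is the number of documents containing term t.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    weights = []
    for tokens in docs:
        tf = Counter(tokens)  # raw term frequency within this document
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights

# Toy corpus, invented for illustration only.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the cat chased the dog".split(),
]
w = tf_idf(docs)
```

In the first toy document, "the" has the highest raw frequency but appears in every document, so its weight is zero; "mat" appears in only one document and gets the highest weight. That is the behaviour the slide asks for: frequent in the document, but infrequent in the collection as a whole.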