21. Vector Models

IS 202 - 8 November 2005

Copyright © 2005 Robert J. Glushko

Plan for Today's Class


Schedule for Upcoming Lectures


Recall and Precision


Recall and Precision [2]
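As a minimal sketch of the two measures named on these slides, assuming the standard set-based definitions (precision = relevant retrieved / all retrieved; recall = relevant retrieved / all relevant). The function name and the document ids are hypothetical and are not taken from the lecture.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, using set-based definitions."""
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical result set and relevance judgments.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d7"]
precision, recall = precision_recall(retrieved, relevant)
```

Here two of the four retrieved documents are relevant, and two of the three relevant documents were retrieved, so retrieving more documents could raise recall while lowering precision.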


High Recall but Low Precision


Low Recall but High Precision


High Recall and High Precision


The Boolean Model


Boolean Search with Inverted Indexes (last slide on 11/3)


Relevance in the Boolean Model


Models of Information Retrieval [1]


Models of Information Retrieval [2]


Models of Information Retrieval [3]


Some Mathematical Foundations (and Review, I Hope)


Vectors [1]


Vectors [2]


Summation Notation


Cosines


Overview of Vector Model


Document x Term Matrix


Document Vector [1]


Document Vector [2]


Word (or Term) Vectors


Documents in Term Space - 2D Example


Term Weighting


Weighting Using Term Frequency


Weighted Vectors in 3D


Word Frequency vs Discriminability / Resolving Power


Term Resolving Power


Weighting with Inverse Document Frequency


IDF Calculations


Weighting Term Frequency with IDF (Simplified)
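A minimal sketch of tf x idf weighting, assuming the common simplified form w = tf * log10(N / n), where N is the number of documents and n is the number of documents containing the term. The log base, the function name, and the toy collection are assumptions, not the slide's own example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return (per-document weight dicts, idf dict) for tokenized documents."""
    n_docs = len(docs)
    # Document frequency: number of documents in which each term appears.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    # Assumed simplified idf: log10(N / n).
    idf = {term: math.log10(n_docs / n) for term, n in df.items()}
    weights = []
    for tokens in docs:
        tf = Counter(tokens)  # raw term frequency within this document
        weights.append({term: tf[term] * idf[term] for term in tf})
    return weights, idf


# Hypothetical four-document collection, already tokenized.
docs = [
    ["vector", "model", "vector"],
    ["boolean", "model"],
    ["vector", "space"],
    ["model", "index"],
]
weights, idf = tf_idf(docs)
```

In this toy collection "model" appears in three of the four documents and so gets a lower idf than "boolean", which appears in only one; a term present in every document would get an idf of zero.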


tf x idf Example Calculations


Normalized tf x idf


Normalized tf x idf Example Calculations


Similarity in Vector Models
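A minimal sketch of cosine similarity between a query vector and document vectors, assuming the standard definition: the dot product divided by the product of the vector lengths. The vectors below are hypothetical term weights, not values from the lecture.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for an all-zero vector
    return dot / (norm_a * norm_b)


# Hypothetical weights over a three-term vocabulary.
query = [1.0, 0.0, 1.0]
doc_a = [2.0, 0.0, 2.0]  # same direction as the query, different length
doc_b = [0.0, 3.0, 0.0]  # shares no terms with the query

sim_a = cosine_similarity(query, doc_a)
sim_b = cosine_similarity(query, doc_b)
```

Because the cosine depends only on direction, doc_a scores as a perfect match even though its weights are twice the query's, while doc_b, which shares no terms with the query, scores zero.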


Cosine Similarity with Weighting Example Calculations


Similarity in Unnormalized Vectors


Latent Semantic Analysis -- Motivation


Latent Semantic Analysis -- Key Concepts


Latent Semantic Analysis -- Implementation


Readings for IO & IR Lecture #22