Current Topics in Information Access: IR Background

9/16/98



Table of Contents

Current Topics in Information Access: IR Background

Last Time

Today

Some IR History

Information Retrieval

Structure of an IR System

Steps in a “typical IR System”

Stemming and Morphological Analysis

Automated Methods

Errors Generated by Porter Stemmer (Krovetz 93)

Query Languages

Simple query language: Boolean

Boolean Queries

Boolean Queries

Boolean Queries

Boolean Logic

Boolean Searching

Boolean Problems

Advantages and Disadvantages of the Boolean Model

Pseudo-Boolean Queries

Boolean Extensions

Ranking Algorithms

Indexing and Representation: The Vector Space Model

Document Representation: What values to use for terms

Document Vectors

Vector Representation

Document Vectors

Assigning Weights

Assigning Weights

tf x idf

tf x idf normalization

Vector Space Similarity Measure: combine tf x idf into a similarity measure

Computing Similarity Scores

Documents in Vector Space

Computing a similarity score

Similarity Measures

Problems with Vector Space

Probabilistic Models

Probabilistic Retrieval

Probabilistic Models: Some Notation

Probabilistic Models: Logistic Regression

Probabilistic Models: Logistic Regression attributes

Probabilistic Models: Logistic Regression

Probabilistic Models

Vector and Probabilistic Models

Simple Presentation of Results

Problems with Vector Space

Evaluation

What to Evaluate?

What to Evaluate?

Relevance

Standard IR Evaluation

Precision/Recall Curves

Precision/Recall Curves

Precision/Recall Curves

Document Cutoff Levels

The E-Measure

TREC

Sample TREC queries (topics)

TREC

TREC Results

Blair and Maron 1985

Blair and Maron, cont.

Blair and Maron, cont.

Creating a Keyword Index

Inverted files

Inverted Files

How Inverted Files are Created

How Inverted Files are Created

How Inverted Files are Created

How Inverted Files are Created

An Example IR System

Next Time

Author: hearst

Email: hearst@sims.berkeley.edu

Home Page: http://sims.berkeley.edu/~hearst
