L23. STRUCTURE-BASED MODELS [1] (11/18)

 

18 November 2009

Structure-based IR models combine representations of terms with information about structure within documents (i.e., hierarchical organization) and between documents (i.e., hypertext links and other explicit relationships). This structural information tells us which documents and parts of documents are most important and relevant, and it provides additional evidence for determining relevance and ordering a result set. The nature and pattern of links between documents has been studied for almost a century by "bibliometricians," who measured patterns of scientific citation to quantify the influence of specific documents or authors. The concepts and techniques of citation analysis seem applicable to the web, since we can view it as a network of interlinked articles, and Google's PageRank algorithm is now the classic example.
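To make the link-analysis idea concrete, here is a minimal sketch of a PageRank-style computation by power iteration. It is an illustration under stated assumptions, not Google's production algorithm: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, the even redistribution of rank from pages with no outgoing links, and the three-page toy graph are all choices made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate link-based importance scores by power iteration.

    links: dict mapping each page to the list of pages it links to.
    damping and iterations are illustrative defaults (assumptions).
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with rank spread evenly over all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page divides its damped rank among the pages it cites.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads
                # its damped rank evenly over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: A and B both link to C, and C links back to A.
toy_links = {"A": ["C"], "B": ["C"], "C": ["A"]}
scores = pagerank(toy_links)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph, C is linked to by two pages and ends up ranked highest, while B, which nothing links to, ends up lowest. This mirrors the citation-analysis intuition that a document gains influence from being cited, especially by documents that are themselves influential.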

 

Download recorded lecture from http://courses.ischool.berkeley.edu/i202/f09/files/202-20091118.mp3