Information Retrieval in Dynamic Information Environments

Most IR research has focused on static information retrieval, as if we were retrieving information from physical or digital libraries, and those are the techniques we have learned in this course. Increasingly, however, researchers have become interested in the dynamic and temporal nature of information on the Web.

At the recent CIKM 2010 conference (Conference on Information and Knowledge Management), Sue Dumais of Microsoft Research gave an interesting keynote address about the temporal nature of information on the Web. She cites a study showing that

- 35% of Web pages change in 11 weeks.

- 66% of visited Web pages change in 5 weeks (of these, 63% change every hour).

As the current IR model revolves around the Web, are search engines considering these factors? What are we missing, and what do we need in order to understand and support information retrieval in this dynamic information environment (a.k.a. the Web)?

In her keynote speech, she argues that

* We need to characterize the dynamics because a) content changes over time, b) people revisit and re-find content, and c) relationships between pieces of content change.

* Based on these observations, we can improve the current IR model by

a) Building tools for understanding the changes - Help users understand what has changed since they last retrieved the information

b) Building models and systems that leverage the dynamics - Current IR algorithms look only at a single snapshot of a page, even though Web pages clearly change over time. How can we leverage this to improve current IR algorithms? She suggests the following observations and approaches for building better IR algorithms:

- Pages have different rates of change

- Terms have different longevity (staying power): some terms are always on the page; some are transient

- Language modeling approach to ranking
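
To make these three points concrete, here is a minimal Python sketch of how they might fit together. It is my own illustration and not the method presented in the keynote: the function names, the whitespace tokenizer, the 0.8 longevity threshold, the 0.7 mixing weight, and the small probability floor are all assumptions chosen for the example. It measures how often a page changes across stored snapshots, estimates each term's longevity, and scores a query with a simple mixture of a "stable term" language model and a "transient term" language model.

```python
import math
from collections import Counter


def change_rate(snapshots):
    """Fraction of consecutive snapshot pairs whose text differs."""
    if len(snapshots) < 2:
        return 0.0
    changed = sum(1 for a, b in zip(snapshots, snapshots[1:]) if a != b)
    return changed / (len(snapshots) - 1)


def term_longevity(snapshots):
    """Fraction of snapshots containing each term (1.0 = always on the page)."""
    counts = Counter()
    for text in snapshots:
        counts.update(set(text.lower().split()))
    return {term: c / len(snapshots) for term, c in counts.items()}


def query_log_likelihood(query, snapshots, stable_threshold=0.8,
                         stable_weight=0.7, floor=1e-6):
    """Log-likelihood of the query under a two-component mixture model.

    Terms whose longevity is at least stable_threshold feed the stable
    model; all other terms feed the transient model. All parameter
    values are illustrative assumptions, not tuned settings.
    """
    longevity = term_longevity(snapshots)
    stable, transient = Counter(), Counter()
    for text in snapshots:
        for tok in text.lower().split():
            if longevity[tok] >= stable_threshold:
                stable[tok] += 1
            else:
                transient[tok] += 1
    # Guard against division by zero when one component is empty.
    stable_total = sum(stable.values()) or 1
    transient_total = sum(transient.values()) or 1

    log_p = 0.0
    for tok in query.lower().split():
        p = (stable_weight * stable[tok] / stable_total
             + (1 - stable_weight) * transient[tok] / transient_total
             + floor)  # floor keeps unseen terms from producing log(0)
        log_p += math.log(p)
    return log_p


if __name__ == "__main__":
    page = [
        "ir course notes today exam",
        "ir course notes tomorrow lab",
        "ir course notes",
    ]
    print("change rate:", change_rate(page))
    print("longevity:", term_longevity(page))
    print("score('ir'):", query_log_likelihood("ir", page))
    print("score('exam'):", query_log_likelihood("exam", page))
```

With the stable model weighted more heavily, a query term that stays on the page across snapshots ("ir") scores higher than one that appears only briefly ("exam"). A real system would need a proper tokenizer, collection-level smoothing, and weights learned from data.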

Even though this area is very new, it is fascinating, and I believe that understanding the dynamics of the Web and creating models that support them has huge potential to advance current IR algorithms significantly.

References

ECDL 2010 Keynote Talk, Susan Dumais, Microsoft Research, Understand and Supporting People in Dynamic Information Environments, http://research.microsoft.com/en-us/um/people/sdumais/ECDL2010-Keynote-Dumais_Share.pdf

Sue Dumais at CIKM 2010, http://palblog.fxpal.com/?p=4873