SIMS 296a-3:
Current Topics in Information Access

Readings and Assignments


 


Project Schedule:
Sept. 23: One paragraph each describing project ideas; not a commitment
Sept. 30: Decide on topics (commitment), form project groups
Oct. 7: Turn in project goals, dates for milestones, definition of a successful project
Oct. 14: Turn in proposed reading list
Nov. 15: Turn in mid-project report (a short report showing what you've done so far and what you have to do before the end of the semester)
Dec. 9: Project presentations to class (during finals week)
Dec. 15: Project writeups due

For Dec 2:

For Nov 25:

    Topic: Image Retrieval I
    Discussion Leader: Ame Elliott

    This week's readings are available only on paper. Copies will be handed out in class and extras will be available outside my door.

    Optional:

    This paper provides a brief conceptual background of what precedents are and how they relate to case-based reasoning (CBR). Ame recommends looking over this one before looking at the Domeshek and Gross et al articles.

    Rivka Oxman "Precedents in Design: A Computational Model for the Organization of Precedent Knowledge" in Design Studies, 1994, vol 15, no 2, pp 141-17.

    Required:

    E. A. Domeshek and J. L. Kolodner, "A Case-Based Design Aid for Architecture" in Artificial Intelligence in Design '92, John S. Gero and Fay Sudweeks (editors). Kluwer Academic Publishers, Dordrecht, the Netherlands, 1992.

    Sara Shatford Layne, "Some Issues in the Indexing of Images" in Journal of the American Society for Information Science, 1994, 45(8) pp 583-588.

    M. Gross, C. Zimring, and E. Do, "Using Diagrams to Access a Case Base of Architectural Designs" in Artificial Intelligence in Design '94, John S. Gero and Fay Sudweeks (editors). Kluwer Academic Publishers, Dordrecht, the Netherlands, 1994.

    Roseanne Price, Tat-Seng Chua, and Suliman Al Hawamdeh, "Applying Relevance Feedback to a Photo Archival System" in Journal of Information Science, 1992, Vol 18, pp 203-215.

For Nov 18:
    Topic: TDM: Links and Associations
    Discussion Leader: Ketan Patel

    Don R. Swanson & Neil R. Smalheiser, An interactive system for finding complementary literatures: a stimulus to scientific discovery. Artificial Intelligence 91 (1997) 183-203.

    Ido Dagan and Ronen Feldman and Haym Hirsh. Keyword-Based Browsing and Analysis of Large Document Sets, Proceedings of the Fifth Annual Symposium on Document Analysis and Information Retrieval (SDAIR), Las Vegas, NV, 1996. (Handout in class.)

For Nov 11:

    Topic: Visualizing Large Text Collections
    Discussion Leader: Owen McGrath

    All but one of these papers are short; I am just adding a lot of commentary here.

    The following paper is not about text specifically, but is worth thinking about in terms of text:
    Thomas J. Ball and Stephen G. Eick, Software Visualization in the Large, Computer, 29(4), April 1996.

    James A. Wise and James J. Thomas and Kelly Pennock and David Lantrip and Marc Pottier and Anne Schur, Visualizing the Non-Visual: Spatial analysis and interaction with information from text documents, Proceedings of the Information Visualization Symposium 95, N. Gershon and S. G. Eick (editors), IEEE Computer Society Press, 51-58, 1995. (Handout in class; there is also a videotape.)

    Mukherjea, Foley and Hudson, Visualizing Complex Hypermedia Networks through Multiple Hierarchical Views, ACM SIGCHI 1995, May 1995, Denver, Colorado.

    Earl Rennison, Galaxy of News: An Approach to Visualizing and Understanding Expansive News Landscapes, Proceedings of UIST 94, ACM Symposium on User Interface Software and Technology, New York, 3-12, 1994
    The author of this paper has started a company with others, called Perspecta. However, the viz part seems to have gone away.
    A product doing similar things (apparently) can be found at: Semio. This is also similar to the Apple HotSauce project (now apparently defunct).

    Krista Lagus and Timo Honkela and Samuel Kaski and Teuvo Kohonen, Self-organizing maps of document collections: A new approach to interactive exploration, Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, AAAI Press, 1996, 238-243. (Handout in class.)
    Check out the demo of the WebSOM project (best to do this at a time of day when the network tends to be less loaded).
    This is related to the ET-Map project that we already read about (Chen et al., in JASIS).

For Nov 4:

    Topic: Text Data Mining: Intro to Data Mining
    Discussion Leader: Hao Chen

    Usama Fayyad, Gregory Piatetsky-Shapiro, and Padhraic Smyth, The KDD process for extracting useful knowledge from volumes of data, Communications of the ACM, Vol. 39, No. 11 (Nov. 1996), Pages 27-34.

    Tomasz Imielinski and Heikki Mannila, A database perspective on knowledge discovery, Communications of the ACM, Vol. 39, No. 11 (Nov. 1996), Pages 58-64.

    Marti Hearst, Untangling Text Data Mining (talk slides). The middle part of this talk is review of lectures from the first two weeks of class.

    Skim the following (handout in class):
    Glymour, C.; Madigan, D.; Pregibon, D.; Smyth, P. Statistical themes and lessons for data mining. Data Mining and Knowledge Discovery, 1997, vol.1, (no.1):11-28.

For Oct 28:
    Topic: Tradeoffs between Automated and User Control
    Discussion Leader: Barbara Stone
    Readings:

    This is a handout; you can get copies outside my door.

    Bates, Marcia J. "Where should the person stop and the information search interface start?" Information Processing & Management, 26(5), 1990.

    The following two papers discuss different aspects of the same system. The first is related to this topic and to data mining. The second is related to this topic and to supporting the process of activity (although not addressing information access per se; we need to use our imagination to see how it relates).

    St. Amant and Cohen, Interaction with a mixed-initiative system for exploratory data analysis (draft), Knowledge-Based Systems, 10(5). In press.

    St. Amant and Cohen, Navigation and planning in a mixed-initiative user interface, Fourteenth National Conference on Artificial Intelligence, 1997, pp. 64-69.

    Shneiderman, B. (January 1997) Direct Manipulation for Comprehensible, Predictable, and Controllable User Interfaces, Proceedings of IUI97, 1997 International Conference on Intelligent User Interfaces, Orlando, FL, January 6-9, 1997, 33-39.

For Oct 21:

Topic: (Agents for) Incorporating Personal Information
Discussion Leader: Brent Chun

For Oct 14:

Topic: Intro to Automated Assistants
Discussion Leader: Hao Chen

Read the following (handout in class):

    Brenner, Zarnekow, and Wittig, Intelligent Software Agents, Springer-Verlag 1998, Chapters 3 & 4.

    This is a very short critique of the agent-centric approach:
    Ben Shneiderman, Looking for the bright side of user interface agents, Interactions, ACM, 2(1), 1995. Click here and then search on "bright side"

For Oct 7:

Topic: Support for the Dynamic Process, History Mechanisms
Discussion Leader: Vijayshankar Raman

Read the following:

For Sept 30:

Decide on your project proposal; if it is different from what you've already sent me then email the proposal by Sept. 30.

Topic: Interfaces for search starting points.
Discussion Leader: Carol Butler

    Read the four papers listed below and come prepared to discuss them.

    You'll need to get a copy of the Chen et al. paper from me. I'm leaving copies in a folder outside my office door (212 South Hall).

      Sections 1-4 of User Interfaces and Visualization by Marti Hearst from Modern Information Retrieval, edited by Baeza-Yates and Ribeiro-Neto, Addison-Wesley Publishing Company, to appear.

      Adele Howe and Danielle Dreilinger, SavvySearch: A Metasearch Engine that Learns Which Search Engines to Query, AI Magazine, 18 (2), 19-25, 1997. postscript version of the paper

      Hsinchun Chen, Andrea L. Houston, Robin R. Sewell, Bruce R. Schatz, Internet Browsing and Searching: User Evaluations of Category Map and Concept Space Techniques, Journal of the American Society for Information Science (JASIS), 49(7), 1998.

      (You can skip the more technical parts of this one if you like.)
      D. Gibson, J. Kleinberg, P. Raghavan, Inferring Web communities from link topology, Proc. 9th ACM Conference on Hypertext and Hypermedia, 1998. (postscript version of the paper)

For Sept 23:

Added on Sept 22: I mentioned this in class but neglected to list it as part of the homework assignment. Before class tomorrow please email me the following for what you are currently thinking about doing for your class project (ascii only please):

  • a title
  • an indication of the project type (survey paper, research paper, or implementation of part of an existing project)
  • a one paragraph description
  • whether or not you would like to work with others on this

This is not a commitment, just a description. If you have more than one idea, write a separate title, project type, and paragraph for each.

(If I haven't allowed enough time for this, send it as soon after class as possible.)

Topic: Interfaces for subject codes.

Read the three and a half papers listed below and come prepared to discuss them. Alison Brandt will be discussion leader.

    Christine L. Borgman. Why are Online Catalogs Still Hard to Use?, Journal of the American Society for Information Science 47(7):493-503, 1996. (This will be handed out in class on Sept. 16.)

    These two cover much the same material; read one first, then read the other for any additional ideas.
    Both are available at this location.

    A Steven Pollitt. Interactive Information Retrieval based on Faceted Classification using Views. Knowledge Organization for Information Retrieval, Proceedings of the 6th International Study Conference on Classification, University College, London, 16-19 June 1997, FID/CR.

    A S Pollitt (1998) The key role of classification and indexing in view-based searching. International Cataloguing and Bibliographic Control Vol 27 No 2 April/June 1998 pp 37-40. http://poseidon.hud.ac.uk/external/research/CeDAR/pollifla.html

    Marti Hearst and Chandu Karadi, Cat-a-Cone: An Interactive Interface for Specifying Searches and Viewing Retrieval Results using a Large Category Hierarchy in the Proceedings of the 20th Annual International ACM/SIGIR Conference, Philadelphia, PA, July 1997. Online Version

For Sept 16:

Do more background reading, especially the Bates paper on berry picking and the Blair and Maron paper on evaluation.

For Sept 9:

Do as much background reading as you can (see prerequisites list).

Here are the results of people's top-level rankings of topics.

Rank the topics below according to which you would most like to cover in the course. First rank the six top-level topics from one to six (one is most preferred). Within each topic, indicate which, if any, of the subtopics you find particularly interesting. Also, indicate any special information I should know, for example, if you are actively doing research in one of the areas.

Print out the information and mark it up for class, or else email the list to hearst@sims.berkeley.edu

Topics marked in red are those for which I have proposed research projects.

User Interfaces

  • Support for the Dynamic Process
    • Source Selection
    • Supporting strategies instead of moves
    • Progressive revelation of relevant information
    • Maintaining History of Interactions
  • Better supporting subject access in the interface
    • Integrating existing thesauri
    • Integrating word sense disambiguation
    • Attempts to catalogue the WWW
  • Studies of why OPACs (online public access catalogs) are hard to use
Agents and AI
  • Trading off between automated control and user input
  • Incorporating "Personal" Information into search
  • Automated assistants working in the background
  • Cognitive modeling (understanding how people think as they search)
Quality Assessment (this is really a type of categorization)
  • Using "social" techniques (e.g., visitation patterns)
  • Using link and other structure
  • Using content
  • Integrating all three
Text Data Mining
  • Visualization of contents of large collections
  • Discovering patterns and anomalies
  • Creating hypotheses based on chains of associations
Information Integration/Natural Language Processing
  • Combining results from different databases and search engines
  • Synthesizing/Summarizing information from different sources
  • Question answering
Multi-Media and Structured Information Retrieval
  • Image retrieval
  • Search over hypertext
  • Integrating text, audio, video, and images
  • UIs for retrieval of geographic/cartographic information