296a1 Summaries - Seminar Information Access Fall 2002

School of Information Management & Systems
Previously School of Library & Information Studies

296a-1 Seminar: Information Access.
("The Friday Afternoon Seminar")
Summaries - Fall 2002.

Fridays 3-5. 107 South Hall. Schedule.

Summaries will be added as they become available.

Aug 30: Clifford LYNCH: Introduction. Digital Archiving.
Introduction to the Seminar. Also, an update on developments in digital archiving.

Sept 6: Marc DAVIS: Towards Computational Media: Metadata for Media Automation and Reuse.
Over the past five hundred years, we have seen the development of technologies and social practices that enable the educated populace to read and write text. However, with video (including motion pictures and television), millions of people "read" it every day, but very few are able to effectively "write" it. Changing this asymmetry will require research and innovation that more intimately integrate video and computation. This presentation will address the theoretical issues, core technologies, and applications that will enable video to become a computational data type that people can easily create, access, share, and reuse. Specifically, the research challenge is to develop technologies that create metadata about the semantic content and syntactic structure of video, and that use that metadata to manipulate and reuse video. Addressing this challenge requires a methodology that interleaves the construction and analysis of artifacts and theories, and that combines ideas and technologies from multiple disciplines: information science, computer science, film theory and production, media studies, and human-centered user interaction design. In addition to talking about my past, present, and future research in computational media, I will discuss my own path to SIMS as a way to foreground issues, challenges, and opportunities in interdisciplinary research and teaching in digital media.
Speaker Bio: Marc Davis is an Assistant Professor at the School of Information Management and Systems at the University of California at Berkeley. His work is focused on creating the technology and applications that will enable daily media consumers to become daily media producers. Prof. Davis' research and teaching encompass the theory, design, and development of digital media systems for creating and using media metadata to automate media production and reuse. Prof. Davis earned his B.A. in the College of Letters at Wesleyan University, his M.A. in Literary Theory and Philosophy at the University of Konstanz in Germany, and his Ph.D. in Media Arts and Sciences from the Media Laboratory at the Massachusetts Institute of Technology. At the MIT Media Laboratory, he developed Media Streams, an iconic visual language for annotating, retrieving, and repurposing digital video. From 1993 to 1999 at Interval Research Corporation, he led research and development teams in automatic media production technology for which a patent was awarded in 2001. From 1999 to 2002, Prof. Davis was Chairman and Chief Technology Officer at Amova, a developer of media automation and personalization technology. Prof. Davis is a co-founder of the Narrative Intelligence Reading Group, which innovated interdisciplinary work at the intersection of literary theory, artificial intelligence, and media technology. Prof. Davis was also an invited contributor to the 50th Anniversary Edition of the Communications of the ACM, for which he wrote a vision piece about the next 50 years of media technology.

Sept 13: Clifford LYNCH: Issues in Digital Archiving and Preservation, with a Focus on Current Project Initiatives.
A continuation of the Sept 6 presentation.

Sept 20: Mikael GLEONNOC: Supporting Strategic Alliances in a Network of Partners: a New Challenge for Information Systems.
In order to gain a competitive advantage without losing their flexibility and autonomy, more and more organizations prefer to develop strategic alliances with external partners rather than commit themselves to more static and constraining models of collaboration such as traditional joint ventures. Companies, private and public laboratories, governmental organizations, independent experts and other private or institutional entities tend to develop different types of formal and informal partnerships that link them together. These links create direct or indirect complementarity and interdependency between actors involved in the same social network, some of whom are in competition for the same markets. How to manage such relationships is one of the main questions that experts in management and organization science have been trying to answer during the last ten years. The view of this phenomenon presented in this seminar is that it contributes to a general evolution of the modes of collaboration not only between different organizations but also within organizations, e.g. between different departments or different working teams. I maintain that the development of horizontal and cross-boundary relationships inside organizations and among the network of external partners to which they belong contributes to the emergence of a new global "philosophy of collaboration." Two central concepts, widely explored in different scientific fields, permit us to understand the dynamic upon which this philosophy is based. The first concept is that of 'trust'. The second is the concept of 'network'. We will argue that the management of trust in a social network that includes both internal and external partners is the key to understanding the recent evolution of the modes of collaboration, and we will see that this is still a challenge in most organizations.
Then, we will examine the ways that the information systems supporting this collaboration are not well adapted to facilitate cross-boundary and horizontal relationships between interdependent and complementary partners, especially when some of them are competitors. Although my analysis is focused upon groupware information and communication tools, the question of how to support internal and external strategic alliances can be considered as a design challenge for the next generation of all types of information systems.

Sept 27: Clifford LYNCH: Digital Rights Management, Research, and the Higher Education Community.
I will discuss digital rights management from a very broad perspective, with particular emphasis on the needs of the research and higher education community.

Oct 4: Two Reports:
Avi RAPPOPORT, Search Engine Consultant, searchtools.com: Developments in the Enterprise Search Industry
A report on interesting developments in the enterprise search industry, including her research on spellchecking search queries, the leading open source search engines, connections with content management systems, and implementation of faceted metadata search.
and also
Clifford LYNCH: What's become of the digital library?
I'll revisit some of the material in the talk I just gave at Educause, "What's become of the digital library?", which deals with print and digital collections and public myths and expectations.

Oct 11: Daniel GREENSTEIN, University Librarian for Systemwide Library Planning and Scholarly Information and Director, California Digital Library, Office of the President, University of California: Building Towards the User's Vision of the 21st Century Research Library.
Analyzing data on how faculty and students perceive and use scholarly information, the talk will identify users' vision for the 21st century research library. It will then explore some of the challenges and opportunities that exist for the California Digital Library and its University of California co-libraries as they build collectively towards that vision. The talk will present data gathered recently in studies conducted variously by the Digital Library Federation and Outsell Inc., by the California Digital Library, and by the Office of the President's division of Systemwide Library Planning. It will also offer an early look into possible developmental trajectories for the California Digital Library.

Oct 18: Maggie EXON, Visiting Scholar from Curtin University of Technology, Australia:
Democratizing Metadata: Should Those who Create Documents Write their Own Metadata?
A connected set of developments in the world of information has led to an assumption: that those who create documents also create the metadata for them. This differs from practice in libraries where authors write books and cataloguers create descriptions of them.
The rise of author-created metadata has been particularly influential (and therefore instructive) in the implementation of corporate document management systems. As these have developed into, or influenced, knowledge management, enterprise portals, and all other so-called "seamless solutions" for corporate information management, the problems of how to create valid and useful metadata have become pressing. This talk will examine these issues and their implications for metadata generally.

Oct 25: Michael BUCKLAND: Events as Information.
What we know is influenced by signifying events as much as by signifying documents. Yet Information Management / Science has focused on documents. How could our understanding of Information Management be made more complete? Two different lines will be explored.
1. THE SOCIAL EPISTEMOLOGY OF PAST EVENTS: A CASE-STUDY. History, heritage, and the past. What is past is passed, and no longer directly knowable. History is composed of narratives, always multiple and always incomplete. Heritage, what we have today from the past, includes "received history" (aka "Social memory"), and is mythic, in the sense of being a powerful belief which may or may not be true. Historically, Information Management is a modernist undertaking originating in the late nineteenth century. A detailed reconstruction of the history of the first use of electronics for searching collections of documents shows that the received, mythic history is importantly determined by accidents and vested interests. This aspect of the creation, distribution, and flow of knowledge in society is a neglected research front in Information Management.
2. EVENTS AS A CHALLENGE TO INFORMATION SCIENCE THEORY. A convenient way to organize the highly varied discourse about "information" is to categorize use of the word "information" in terms of three different kinds of phenomena: "information-as-knowledge", "information-as-process", and "information-as-thing", where "thing" denotes anything physical that is regarded as signifying something. We are, however, also informed by events, not just documents about them. So how might "Event-as-information" fit? Taking events seriously seems to require major changes in how "information" is theorized: A move away from formal models borrowed from other fields, such as the conduit model (Shannon-Weaver signaling theory) and from cognitive models; and more attention to disciplines concerned with meaning, representation, and interpretation, i.e. towards language and the humanities.

Nov 1: Lewis LANCASTER, Director, Electronic Cultural Atlas Initiative; Emeritus Professor of Buddhist Studies & of East Asian Languages & Cultures:
Update on the Electronic Cultural Atlas Initiative.
An update and progress report on the work of the Electronic Cultural Atlas Initiative (ECAI), an international effort, based at Berkeley, to enhance scholarship through increased attention to time and place. The origins and mission of ECAI. Strategies adopted for international community building, software development, training institutes, and standards development. Recent efforts have focused on the design of online gazetteers, creating e-publications that include dynamic maps, and the incorporation of geo-temporal resources in teaching. The role of communities in cultural history and the use of time and space to provide new insights for research and analysis. Examples of how attention to time and space and map visualization can advance scholarship.

Nov 8: Students' Progress Reports.
Behrang MOHIT: Information Extraction using FrameNet and WordNet.
Information Extraction is the science of detecting specific types of data in raw text. I have been using the FrameNet project's (http://www.icsi.berkeley.edu/~framenet) annotations to provide a precise seed pattern set for information extraction. In my presentation, I will focus on:
1. The Information Extraction task in general.
2. The FrameNet project and the types of services that I have used for Information Extraction.
3. Future work, which includes the use of WordNet ontologies and Machine Learning techniques.
Emily LIGGETT: Interfaces Designed for the Visualization of Complex Relationships.
Last summer, I worked for Professor Hearst and Oracle designing an interface geared towards sales managers. This interface was designed as a way for sales managers to view and manage all the relationships that are important to them and that help them do their job (e.g. their relationships with sales reps, sales reps' relationships with the contacts and companies to which they sell, etc.). Each relationship is displayed as a direct line between people, and a person's (in this case, a sales manager's) entire network is viewed as a graph laid out on concentric circles. The end goal for this endeavor is to provide this interface as a complementary tool to existing sales software packages.
I will discuss this interface, as well as the results of the usability studies performed on it, along with studies that have been done on similar interfaces.
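As a rough illustration of the concentric-circle layout mentioned above, the following Python sketch places each person on a ring determined by their distance from a central person. The abstract does not say how people are assigned to circles, so the hop-distance rule, the function name `concentric_layout`, and the sample names are all assumptions made for this example, not a description of the actual interface.

```python
import math
from collections import deque


def concentric_layout(edges, center, ring_spacing=1.0):
    """Return {node: (x, y)} with nodes placed on rings around `center`.

    Assumption for this sketch: a node's ring is its hop distance from
    the center, found by breadth-first search over undirected edges.
    """
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)

    # Breadth-first search gives each reachable node its hop distance.
    depth = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor not in depth:
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)

    # Group nodes by ring, then spread each ring evenly around its circle.
    rings = {}
    for node, d in depth.items():
        rings.setdefault(d, []).append(node)

    positions = {}
    for d, nodes in rings.items():
        radius = d * ring_spacing
        for i, node in enumerate(nodes):
            angle = 2 * math.pi * i / len(nodes)
            positions[node] = (radius * math.cos(angle), radius * math.sin(angle))
    return positions


# Hypothetical network: a manager, two sales reps, and their contacts.
edges = [
    ("manager", "rep_a"),
    ("manager", "rep_b"),
    ("rep_a", "contact_1"),
    ("rep_b", "contact_2"),
]
positions = concentric_layout(edges, center="manager")
for name, (x, y) in positions.items():
    print(f"{name}: ({x:.2f}, {y:.2f})")
```

In this sketch the manager sits at the origin, sales reps fall on the first circle, and their contacts on the second; a real interface would also need to handle label overlap and edge crossings.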
Grace JEON: Applying MPEG7 to a DigitalChem Project.
Lecture Video/Audio can be searched and organized by using standard multimedia metadata. MPEG7, one of several multimedia description languages, will be explored as a way to allow users to search specific segments of a lecture video.
Luis VILLAFANA: Design of a Maintenance and Operations Recommender (MORE).
MORE uses information from computerized maintenance management systems (CMMS) and energy management and control systems (EMCS) to recommend what maintenance personnel should do in response to a maintenance service request or other event requiring a maintenance or control system action.

Nov 15: Doug OARD, University of Maryland: Searching Spoken Word Collections.
Spoken word collections promise access to unique and compelling content, and most of the needed technology to realize that promise is now in place. Decreasing storage costs, increasing network capacity, and easy availability of software to exchange digital audio make possible physical access to spoken word collections at a previously unimaginable scale. Effective support for intellectual access -- the problem of finding what you are looking for -- is much more challenging, however. In this talk I will review the work that has been done on this problem at the Text Retrieval Conferences and the Topic Detection and Tracking evaluations, and I will present some early results from a user study comparing present manual and automated approaches to indexing spoken word collections. I will then describe a unique resource, a collection of 116,000 hours of oral history interviews recorded in 32 languages in 67 countries, and explain how we are leveraging an unprecedented manual indexing effort to develop the ability to index similar materials automatically.
About the speaker: Doug Oard is an Associate Professor at the University of Maryland, with a joint appointment in the College of Information Studies and the Institute for Advanced Computer Studies. He is on sabbatical at USC-ISI through August, 2003. He holds a Ph.D. in Electrical Engineering from the University of Maryland, and his research interests center around the use of emerging technologies to support information seeking by end users. Dr. Oard's recent work has focused on cross-language information retrieval, retrieval from audio, data mining from text, and the exchange of ratings by networked users. Additional information is available at www.glue.umd.edu/~oard/

Nov 22: Fredric GEY, UCDATA, and Ray LARSON: Recent Developments in Information Retrieval.
We will summarize the research highlights of four conferences on information retrieval and, especially, cross-lingual retrieval: The ACM SIGIR meeting, August 2002, in Tampere, Finland, including the workshop on "Cross Language Information Retrieval: A Research Roadmap"; the 2002 Cross Language Evaluation Forum (CLEF) in Rome, September 2002, which dealt with European languages; the TREC 2002 English - Arabic retrieval track; and the NTCIR3 conference on Chinese, English, Japanese and Korean retrieval in Tokyo in October 2002.

Dec 13: Students' Final Progress Reports.
Grace JEON: Digitalchem - Applying MPEG7 to Lecture Video Search Engine.
The lecture browser streams videos to students accompanied by slides in Flash format. Due to the growing number of files, students will eventually need a search engine for these audio-visual documents. MPEG7 offers the framework for such a search engine, but the metadata can be simpler than typical audio-visual document metadata.
Behrang MOHIT: Information Extraction using FrameNet and WordNet.
In this presentation, I will talk about my approach to Information Extraction using two human-built knowledge bases (FrameNet and WordNet). I have used FrameNet annotations to build a pattern set and a lexicon for the information extraction task, and then improved my lexicon using WordNet. My most recent evaluation shows 66% precision and 76% recall for this system.
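For readers who want a single figure combining the two reported numbers, the short Python sketch below computes the F1 score (the harmonic mean of precision and recall). The abstract reports only precision and recall; the F1 value is derived here for illustration and is not a result stated by the speaker. The function name `f1_score` is chosen for this example.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures taken from the abstract above, expressed as fractions.
reported_precision = 0.66
reported_recall = 0.76

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.706"
```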
Luis VILLAFANA: Design of a Maintenance and Operations Recommender (MORE).
MORE uses information from computerized maintenance management systems (CMMS) and energy management and control systems (EMCS) to recommend what maintenance personnel should do in response to a maintenance service request or other event requiring a maintenance or control system action. MORE integrates text information from a CMMS database and sensor information from an EMCS to provide recommendations.
