 296a-1 Seminar: Information Access.
 ("The Friday Afternoon Seminar")
 Summaries - Spring 2006.

Friday, Jan 20: Ray LARSON: Time Period Directories; and HICSS.
    We will be discussing a paper just submitted to the Joint Conference on Digital Libraries (JCDL) entitled "Time Period Directories: A Metadata Infrastructure for Placing Events in Temporal and Geographic Context." Metadata is ordinarily used to describe documents, but it can also constitute a form of infrastructure for access to networked resources and for traversal of those resources. One problematic area for access to digital library resources has been the ability to interpret user statements of periods or eras as ranges of dates and to associate them with particular locations. For example, a user interested in the "Vietnam war", "Clinton Administration" or the "Elizabethan Period" must either know the corresponding dates, or rely on simple keyword matching for those period names. This paper describes the Time Period Directory, a metadata infrastructure for named time periods linking them with their geographic location as well as a canonical time period range. We describe the design and development of Time Period Directories and discuss a prototype version derived from the Library of Congress Subject Headings.
    Time periods and events are tightly associated, which leads us to consider how "events" should be described. Further, people's lives can be viewed as a series of episodes or events, which leads, in turn, to new possibilities for representing and marking up biographical records.
    In addition, we will discuss the papers presented in the Digital Media Track of the Hawaii International Conference on System Sciences (HICSS).
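    The period-name lookup described in the Time Period Directories abstract above can be illustrated with a minimal sketch. It is not the paper's design: the `TimePeriod` record, the `DIRECTORY` table, and the `resolve` and `periods_overlapping` functions are hypothetical names chosen for illustration, and the two entries are hand-entered examples rather than data derived from the Library of Congress Subject Headings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimePeriod:
    """One hypothetical directory entry: a named period, its place, and a year range."""
    name: str
    place: str
    start_year: int
    end_year: int


# Illustrative entries only (assumption): a real directory would be derived
# from an authority source rather than typed in by hand.
DIRECTORY = {
    p.name.lower(): p
    for p in [
        TimePeriod("Elizabethan Period", "England", 1558, 1603),
        TimePeriod("Clinton Administration", "United States", 1993, 2001),
    ]
}


def resolve(period_name):
    """Return the entry for a period name, or None if the name is unknown."""
    return DIRECTORY.get(period_name.strip().lower())


def periods_overlapping(start_year, end_year):
    """Return the sorted names of periods whose year range overlaps the given range."""
    return sorted(
        p.name
        for p in DIRECTORY.values()
        if p.start_year <= end_year and start_year <= p.end_year
    )
```

    With such a table, a query for "Elizabethan Period" can be turned into a date range and a place instead of relying on keyword matching against the period name.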

Jan 27: Timo SAARI, Helsinki Institute for Information Technology: Researching Mobile Media in Context.
    The talk will address the study of mobile media and its impacts, mostly from the psychological point of view. Brief examples in psychology of mobile media (making and receiving media content) as well as contextual and situational influences on mobile media creation and reception will be presented. Also, methodological challenges and opportunities for field-based mobile media research are addressed. Study examples include mobile attention allocation processes, mobile and collaborative spectator media for large-scale events and the impact of use context on mobile gaming.
    Timo Saari, D.Soc.Sc., is Senior Research Scientist in the User Experience Research Group, and the Director of Research at the Center for Knowledge and Innovation Research (CKIR) at the Helsinki School of Economics and Associate Director of M.I.N.D. Lab., USA & Finland. Currently Dr. Saari is a Visiting Scholar at Stanford University. In the fall semester of 2005 Dr. Saari was Visiting Professor at Michigan State University in Mobile Media Psychology.
    His research interests are i) psychology and adaptation of mobile interaction, ii) media-rich collaboration technologies, such as collaborative gaming applications for team-based knowledge work, and iii) design for emotion in entertainment gaming. His previous research on mobile interaction includes experiments on how adaptation of mobile interface form factors and design influence emotion, mood, presence and learning with news and entertainment content. Recent research investigates how mood influences behavior in mobile interaction.
    Research on media-rich collaboration technologies and collaborative gaming addresses issues such as how social awareness technologies and open content-based collaborative games and simulations can be used to facilitate performance of knowledge worker teams in select business processes, such as product development. Performance is seen as task-related but also concepts of social performance (group maintenance and cohesiveness) and emotional performance (mood management, well-being) are used. Research on entertainment gaming involves studying the emotional significance of various gaming stimuli, such as middle level generic elements inside gaming sequences (obstacles in driving games, for instance) across different genres of games or the emotional impact of multiplayer gaming and social interaction. Also, approaches to user regulated emotional controls for games have been developed.
    Dr. Saari was a visiting researcher at UC Berkeley School of Information Management and Systems in spring 2004 focusing on psychology of personalization systems.

Feb 3: Marc DAVIS, Founding Director, Yahoo! Research Berkeley: Yahoo! Research Berkeley: Designing the Future of Social Media.
    The challenges of the internet cannot be solved by technology alone. New methods of research and product innovation are needed that enable the iterative design, development, and analysis of "sociotechnical" systems and applications. The possibility of creating large scale internet services that connect billions of humans, computational devices, and media assets into a functional network requires us to rethink information science, social science, media studies, and design. This talk will discuss new ways of reconceptualizing the objects and methods of information technology research (especially in the area of multimedia information systems) as well as innovation processes for corporate and academic research and development designed to address the challenges and opportunities of large scale internet media systems and applications. We will discuss these issues in the context of Yahoo! Research Berkeley, an innovative corporate-academic collaboration for research and development begun this past summer by Yahoo! Inc. and UC Berkeley. Yahoo! Research Berkeley explores and invents social media and mobile media technology and applications at the intersection of media, technology, and people. Yahoo! Research Berkeley is focused on how we can gather and use media metadata to leverage context (especially from mobile devices) and the power of community (especially through tagging and sharing of media) to enable people to create, describe, find, share, and remix media content (especially photos, video, and audio) on the global internet.
    Biography: Marc Davis is the Founding Director of Yahoo! Research Berkeley. He is responsible for the technical and creative vision and leadership for the Lab. Yahoo! Research Berkeley is a new research partnership between Yahoo! Inc. and the University of California at Berkeley to explore and invent social media and mobile media technology and applications that will enable people to create, describe, find, share, and remix media on the web.
    Prof. Davis is on leave from the University of California at Berkeley School of Information where he directed Garage Cinema Research. Prof. Davis' work is focused on creating the technology and applications that will enable the billions of daily media consumers to become daily media producers. Prof. Davis earned his B.A. in the College of Letters at Wesleyan University, his M.A. in Literary Theory and Philosophy at the University of Konstanz in Germany, and his Ph.D. in Media Arts and Sciences at the Massachusetts Institute of Technology Media Laboratory. From 1993 to 1998 at Interval Research Corporation, he led research and development teams in creating automatic media production technology. In 1997, he was an invited contributor to the 50th Anniversary Edition of the Communications of the ACM. From 1999 to 2002, he was Chairman and Chief Technology Officer of Amova, Inc., a developer of media automation and personalization technology. Prof. Davis is also a Co-Founder and Executive Committee Member of the new interdisciplinary UC Berkeley Center for New Media (CNM).

Feb 10: Clifford LYNCH: Some Irresponsible Speculations on the Implications of Large Scale Digitization and Digital Literatures.
    There are a number of large scale digitization programs underway to convert a great deal of the published literature -- scholarly and otherwise -- to digital form. At the same time, at least for the scholarly literature, increasing amounts of material are being published digitally. In my talk, I'll briefly survey some of these developments, and conclude with a series of speculations about the implications of having disciplinary literatures in digital form.

Feb 17: Keith JOHNSON, Stanford Digital Repository: Digital Preservation: "In theory, there's no difference between theory and practice."
    In the process of designing institutionally scoped digital preservation services, I have been struck by a frequent disconnect between digital preservation theory and the existing practical and economic environment. My response, predictably, has been to attempt synthesis of more theory -- practical, that! Yet my goal was to put together something strategically informative for institutional digital preservation service design, and as the theory has indeed been informative, it is gradually proving itself practical.
    I will present a current snapshot of this evolving work, how it addresses what I perceive to be some impracticalities in common digital preservation models, and its impact on the evolving design of the Stanford Digital Repository.
    Keith Johnson is Product Manager at the Stanford Digital Repository.

Feb 24: Colin BURKE, Historian: Codebreaking and retrieval machines in the 1930s and 1940s; and the Historiography of Information Systems.
    Colin Burke will discuss his work on two subjects: the related development of machines for codebreaking and machines for information retrieval during the 1930s and 1940s; and his current work on the history of information science and the information industry, with a focus on his forthcoming ARIST article surveying the state of the art of the history of information.
    Colin Burke is an historian who has researched several different topics during his career, with the aid of grants from many organizations including the Social Science Research Council, The Ford Foundation, NSA, and the Chemical Heritage Foundation. He is currently working on information history, a follow-up on a statistical history of non-profit organizations, and the history of an American intelligence agent and policy maker whose career spanned World War II, the critical years of the cold war, and a politically-driven attempt to imprison him during the Kennedy years.

Mar 3: Clifford LYNCH: The Varieties of Data Curation.
    Data curation has received a great deal of emphasis recently in the context of cyberinfrastructure and e-research, yet it seems to mean very different things to different people and in different settings. I will look at a range of practices and activities that fall under the data curation umbrella, and will also offer some discussion of key challenges that need to be addressed.

Mar 10: Libby SMITH, Yun Kyung JUNG, and Ray LARSON.
    Libby SMITH: Mass Collection Digitization: Keeping Resources Trusted.

    Standards have been developed for trusted digital repositories. How do these attributes apply to mass digitization projects conducted by a commercial third party such as Google? Examined are such issues as quality, storage, and, most importantly, persistence of access. In other words, in the case of repositories like UCB, how can Google be trusted to provide an accurate and accessible collection forever?
    Yun Kyung JUNG: Reasons for Voluntary Information Sharing in Korean Cyberspace: The Uses and Gratification Approach.
    I will study the various motivations and reasons for voluntary user participation and information sharing among groups of Koreans who share similar interests in cyberspace. As the primary method to analyze this topic I have chosen the "Uses and Gratifications" approach developed by communication researchers such as Katz and Blumler. In the traditional study of media, the main object of study is the media. The Uses and Gratifications approach, developed in the 1970s, starts with people and explores how people use a certain medium in order to meet their needs. This theory was not designed for Internet media and may not apply well to the culture of Korean cyber society, and there might be other factors that are not covered by this theory. I will also look for reasons that have not yet been documented.
    Ray R. LARSON: Grid-based Digital Libraries and Cheshire3.
    Recent research in designing and developing digital library services has been focused on approaches to indexing and searching in a steadily increasing range of genres and materials. An important aspect of this research is concerned with providing effective and scalable IR services for digital libraries as these diverse collections grow to sizes measured in terabytes and petabytes. The Cheshire project has had a central research focus on large-scale digital library collections for more than a decade, with a current focus on supporting distributed digital libraries in a Grid environment. At the same time we have been prototyping systems for very long-term digital preservation, and examining how grid-scale information retrieval systems can interoperate with petabytes of diverse data stored over many years.
    In order for Information Retrieval (IR) in the evolving "Grid" parallel distributed computing environment to work effectively, there must be a single flexible and extensible series of "Grid Services" with identifiable objects and a known API to handle the IR functions needed for Digital Libraries or other retrieval tasks. The Cheshire3 system builds on the work of the Cheshire project over the past decade to define and implement an easy to use set of IR objects with precisely defined roles that can effectively provide a Grid Service for IR. I will discuss how distributed storage technologies like the SRB (Storage Resource Broker) are being used in Cheshire3, and the issues of efficiency in such a computing environment. (This talk is based on recent submissions to SIGIR and to INFOSCALE).

Mar 17: Kirsten NEILSEN, Project Manager, Tobacco Control Digital Library and
    Heidi SCHMIDT, Director, Academic Information Systems, UCSF.
    Remodeling a Digital Library: Planning for the Next Generation Legacy Tobacco Documents Library.

    UCSF Library hosts 2 large digital archives of corporate documents from the tobacco industry. The larger of the two collections, the Legacy Tobacco Documents Library (LTDL), launched in 2002 with about 24 million pages. The system was built using the University of Michigan's DLXS software.
    Two radical changes to LTDL's content compel us to redesign the system. First, the LTDL has almost doubled in size in 4 years; the Library now holds 41 million pages and more are added each month. Second, we have completed a project to OCR all of the documents. When LTDL launched, users could search metadata only; now we are able to provide full-text searching, but the DLXS system cannot support this function.
    Meanwhile, in October 2003 we launched a second, smaller archive, the British American Tobacco Documents Archive (BATDA), which currently holds 4 million pages of documents. The BATDA system, built in Java and based on the Lucene search engine, does support full-text searching. In addition, the system incorporates user suggestions and other knowledge, particularly about usability, gleaned from experience with LTDL.
    Our current plans are to build the new LTDL on the BATDA model using Java and Lucene. However, we would welcome advice, suggestions, or just validation of our plan from others in the community.
    We plan to discuss the creation of the current systems as a means of explaining what we have done and plan to do and, we hope, to explore what we have not done or even thought to do.

Mar 24: Michael BUCKLAND: Book Talk: Emanuel Goldberg and His Knowledge Machine.
    In the received history of information science, Vannevar Bush designed the first desktop search engine, his mythic "Memex"; J. Edgar Hoover revealed that microdots used in espionage were invented in Dresden by a Professor Zapp; and in the history of photography the design of the famous Contax 35 mm camera is attributed to Heinz Kueppenbender, head of Zeiss Ikon. In fact all three -- and much more -- were primarily the work of Emanuel Goldberg (born Moscow 1881, active in Germany 1900-1933, died in Tel Aviv 1970). Goldberg, internationally famous in his prime, disappeared into oblivion and was forgotten.
    My new book, Emanuel Goldberg and his Knowledge Machine: Information, Invention, and Political Forces, reconstructs Goldberg's life and work. I will talk briefly about why and how the book was written, give highlights of Goldberg's adventurous life, and discuss Goldberg as a case study in historiography and the mechanisms of historical amnesia.

Mar 31: No Seminar - Spring Break.

Apr 7: Niels LUND: A Document Turn Ahead?

    In connection with the development of ICT, many have envisioned having all "content" in one huge database, the world-brain, getting rid of the disturbing frames and borders in the analog world, a vision which is still kept alive by Google and several others. At the same time, the digital world becomes more and more diversified having an infinite number of virtual communities with a lot of different databases using many different media and media combinations.
    Several current research projects focus on framing and capturing content across all these distributed resources into something useful and relevant. The huge amount of data available is not only an advantage, but is also a source of very complex problems of dividing up and selecting from the huge amount of data.
    In 1964 the French philosopher Roland Barthes said: "meaning is above all a cutting-out of shapes" and he might have predicted the major challenge in a digital age. He said: "looking into the distant and perhaps ideal future, we might say that semiology and taxonomy, although they are not yet born, are perhaps meant to be merged into a new science, arthrology, namely, the science of apportionment." As I see it, this can be considered as a prediction of a document turn. On April 7 I will explain why and how.

Apr 14: Coye CHESHIRE: Current Experimental Research in the Use of Information for Trust-Building and Social Exchange Transitions.
    A social exchange system is a fundamental form of human interaction that consists of individuals who exchange social and material resources. In any given social exchange system, individuals have access to various kinds of information about their potential exchange partners (such as personal exchange experience or third-party reputations). A long history of social exchange experiments demonstrates that different forms of exchange yield different outcomes for cooperation, trust, affect, and other factors. To a large extent, this is a function of the differences that exist in levels of risk and uncertainty inherent in various forms of exchange (i.e., reciprocal, binding, or non-binding exchange). In this talk, I will present my current and forthcoming research on how available information within an exchange network is related to 1) building trust, and 2) transitioning between forms of exchange. Given the current interest in real-world systems of B2B and Internet-based exchange (which often challenge many assumptions about exchange processes, attributions, and outcomes), the opportunities for theoretical development and real-world applications of the study of trust-building and transitions in modes of exchange are substantial.
    Trust and Trust-Building. Research has consistently demonstrated that increased uncertainty in social exchange leads to an increased need for relations based on interpersonal trust. I will present current experimental research that shows how uncertainty and risk affect trust-building over repeated interactions and assessments of trustworthiness in one-shot interactions.
    Exchange Transitions. Prior work in social exchange generally begins with fixed networks in which only one type of exchange can occur; in other words, the type of exchange is fixed by the experimenter for purposes of comparison. There is little or no research on the process of transitioning between different modes of social exchange. I will present a set of theoretically driven arguments for social exchange systems that transition (or shift) between reciprocal exchange and binding or non-binding negotiated exchange (which is only one of the possible types of transition in modes of exchange). These shifts can be structurally determined (i.e. the form of exchange occurs exogenously, independent of the particular intentions or desires of the participants) or agent-based (i.e. individuals choose to move to a new mode of exchange based on their own experiences, available information, and dispositions).
    In collaboration with researchers at Stanford University, we make several predictions about how agent-based transitions occur and about the attributions and exchange outcomes that result from both structurally determined and agent-based transitions in mode of exchange. I will present our proposed set of social exchange experiments that will allow us to test various hypotheses about these social exchange transitions.

Apr 21: Charis KASKIRIS: Behavioral Economic Engineering: An Empirical Investigation of Time Preferences.
    "What information consumes is rather obvious: it consumes the attention of its recipients. Hence a wealth of information creates a poverty of attention, and a need to allocate that attention efficiently among the overabundance of information sources that might consume it." (Herbert Simon, 1971)
    Designing information services and systems that incorporate desired individual and social outcomes requires an understanding of the information, instruction, and support structures, as well as the incentives, motivations and psychological biases of users within the technological possibilities available. Information design and encoding of such systems incorporates models of human behavior whose empirical validity becomes a critical component in verifying the system's effectiveness.
    We investigate different explorative and predictive models of time discounting behavior using a unique data set from an online medical claim negotiation service. We examine negotiators' discounting patterns and time preference behavior, and construct different predictive models/technologies for improving negotiation outcomes. The data also provide a unique opportunity to study individual inter-temporal preferences outside the experimental laboratory and provide some empirical evidence towards the predictive accuracy of different models of time discounting. This understanding is crucial in behavioral modeling of users and in providing technologies which improve negotiation outcomes.
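    The abstract above does not say which models of time discounting were compared. As a hedged illustration only, the sketch below contrasts two common textbook forms, exponential and hyperbolic discounting. The function names, parameter values, and the 100-versus-110 choice are assumptions made for this example and are not taken from the talk or its data set.

```python
import math


def exponential_discount(delay, rate):
    """Exponential discount factor exp(-rate * delay)."""
    return math.exp(-rate * delay)


def hyperbolic_discount(delay, k):
    """Hyperbolic discount factor 1 / (1 + k * delay)."""
    return 1.0 / (1.0 + k * delay)


def present_value(amount, delay, discount, param):
    """Value today of an amount received after the given delay."""
    return amount * discount(delay, param)


def prefers_later(delay, discount, param, sooner=100.0, later=110.0):
    """True if `later` at delay + 1 is valued above `sooner` at delay.

    The amounts and the one-period gap are illustrative assumptions.
    """
    value_sooner = present_value(sooner, delay, discount, param)
    value_later = present_value(later, delay + 1, discount, param)
    return value_later > value_sooner
```

    Under these assumed parameters, a hyperbolic discounter with k = 1.0 takes the smaller, sooner amount when it is available immediately but switches to the larger, later amount when both are pushed ten periods into the future. An exponential discounter with rate 0.2 makes the same choice at either horizon. Differences of this kind are one way competing discounting models can be distinguished in observed behavior.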

*** Change of program ***     Visit of Dean AnnaLee Saxenian has been postponed.
Apr 28: Clifford LYNCH and Michael BUCKLAND: Reports and New Developments.
    Clifford LYNCH: I'll report on highlights from a range of recent conferences and meetings, including the CNI Spring Task Force Meeting, the Internet2 meeting, the EDUCAUSE Policy meeting, the CNI/Mellon/Microsoft/DLF/JISC invitational workshop on repository interoperability and other developments as time permits.
    Michael BUCKLAND: The Nature of Naming: (1) Mapping Names of Objects to Names of Topics: Preliminary report on a mapping of gazetteer feature types (the National Geospatial-Intelligence Agency Geographic Description Codes) to Library of Congress Subject Headings; (2) Subject Headings as Naming: Ideas for a paper on library subject cataloging and classification for a linguistics anthology on "naming". What "aboutness" is about. Tensions between the fluidity of language and libraries' need for stability. Why subject categorization is inherently obsolescent and unsatisfactory.

*** Last Friday of Semester ***
May 5: Students' Reports.
    Yun Kyung JUNG: What Clicks? Why do Korean Cyber Users Participate in Sharing Information?

    An examination of uses of Google in the USA and Naver in Korea found distinct differences. Naver users prefer to ask questions of other users instead of doing keyword searching, as in Google. Users are ranked by the quality of their answers. The more they offer, the more likely information-receivers are to get relevant information and the more likely information-givers are to gain a higher rank in "Naegong" ("inside-power of human being"). Ranks change in real time. People care a lot about their "Naegong" rank. Similarly, the web site called "wtv" allows anyone who has a webcam to share their files through Internet broadcasting and get "stars" from people who receive the service. People choose media based on their needs. To be used, media have to offer an expectation of rewards that they can provide. "Naegong" and "stars" click with Korean cyber users, motivating more active participation in information sharing. Why do these intangible rewards matter so much in Korean society and Internet culture? In Korea Google is not as popular as Naver and is ranked fourth. Koreans seem to trust personal answers rather than web pages.
    Elizabeth Ann SMITH: Mass Digitization: Is Google a Trusted Repository?
    Google is digitizing huge library collections. If Google will essentially become the steward for these collections, is it responsible for the collections and to the public to maintain accessibility and quality of the content? Necessary attributes of such trusted repositories have been developed and will be examined in the context of the Google Library Project. Questions and other issues will be explored to determine the feasibility of Google as a trusted resource.
    Vivien PETRAS: Translating Dialects in Search: Mapping between Specialized Languages of Discourse and Documentary Languages.
    The biggest problem in searching an information system is to find the appropriate search terms that not only represent the searcher's information need but also match the language used in the information system. This is a translation problem between a specialized dialect of discourse and the documentary language of the information system. Discourse dialects evolve within specialized communities. They differ from general language and other communities' dialects in terminology (e.g. terms of art, jargon) and grammatical patterns. A documentary language is the language used for document representation in an information system. The scope of a bibliographic database and its documentary language usually extends across more than one domain of discourse.
    This dissertation describes a mechanism that will provide a translation aid between specialized languages and the documentary language by suggesting appropriate search terms for a searcher's query in relation to the searcher's domain of discourse. With this kind of vocabulary support in the search process, the different specialized vocabularies within the information system can be disambiguated. Different perspectives on a topic can be represented to the searcher (based on the different discourses' discussion of the topic in the collection), which will help in navigating and exploring this information space more effectively.

Aug 18: Julian WARNER, Queen's University, Belfast.
    Forms of Mental Labor in the Feist Judgment.
    The Feist judgment by the Supreme Court, which denied copyright to telephone white pages, occurred in 1991 and is regarded as one of the most significant copyright decisions concerning information technology and as inordinately Delphic even by Supreme Court standards.
    This presentation attempts to clarify the judgment by distinguishing different forms of mental labor, and their relation to technology, which are implicit or covert in the judgment itself. The presentation is deliberately exploratory and the presenter encourages communal contributions.
    Julian Warner is a faculty member in information science at the Queen's University of Belfast, Northern Ireland, where he teaches courses in the human aspects of modern information and communication technologies and in information policy. He has been a visiting scholar here in South Hall and at the Universities of Illinois and Edinburgh, and a visiting professor at Indiana University. He has published a number of journal articles in information science and three books, the first of which was translated into Japanese and selected as a recommended reading by Microsoft Japan.

