School of Information
 Previously School of Library & Information Studies

 Friday Afternoon Seminar: Summaries.
  296a-1 Seminar: Information Access, Spring 2013.

Fridays 3-5. 107 South Hall. Schedule. Weekly mailing list.
Summaries will be added as they become available.

Friday, Jan 25 Clifford LYNCH: Introductions. Personal Digital Archiving. E-books
    1. Introduction, introductions, and brief reports around the room: Recent conferences & other announcements.
    2. More on Personal Digital Archiving: Late last semester I gave a presentation outlining my views on key research agendas in personal digital archiving and closely related areas as part of the final preparation of a book chapter on the same topic. While the chapter is now complete, I want to further explore two topics. The first is the part of a personal digital "life" that is represented by information in some sense held or owned by third parties -- typically commercial or governmental -- and the various social and legal frameworks surrounding this type of information. The second topic, which Michael Buckland has very helpfully illuminated in recent discussions, is the challenge of understanding why material is added to, and retained in, personal digital collections, and what this may imply about the potential ongoing value of such material when such a collection is passed on beyond its original creator.

Friday, Feb 1: Daniel PITTI, Ray R. LARSON, Adrian TURNER, & Brian TINGLE: Social Networks and Archival Context (SNAC) : Phase 2, Current Status and Developments.
    Daniel Pitti, associate director of the University of Virginia's Institute for Advanced Technology in the Humanities (IATH), in collaboration with Ray Larson at the School of Information here, and Adrian Turner and Brian Tingle at the California Digital Library, has received a grant from the Mellon Foundation to vastly expand Social Networks and Archival Context (SNAC), a research and demonstration project http://socialarchive.iath.virginia.edu. This continues work begun in 2010 with a grant from the National Endowment for the Humanities to use existing archival descriptions to make it easier to discover and locate distributed historic records, and at the same time, build an unprecedented resource that provides access to the socio-historical contexts (which include people, families, and corporate bodies) in which the records were created. This second phase will vastly expand the source data employed in the project as well as the research and development agenda. This presentation will describe the project, its current status and next steps.

Friday, Feb 8: Charles WANG and Natalie CADRANEL.
    Charles WANG: Nuclear Forensics---A Computer Directed-Graph Visualization Approach to Scientific Search Problem.
    Charles Wang will report on his work with Prof. Ray Larson, Fredric Gey and Electra Sutton, on an NSF-funded project entitled "Recasting Nuclear Forensics Discovery as a Digital Library Search Problem". It takes a computer science algorithmic approach to address the nuclear forensics search problem. Nuclear forensics discovery was recast as a digital library search problem and databases were developed. A dynamic nuclear decay-chain visualizer was implemented for various isotopes using existing web technologies. This work will be presented and compared with earlier work. Charles Wang is a Masters student in the School of Information.
    Natalie CADRANEL: Participatory Archiving: Three Case Studies.
    The prevalence of born digital cultural artifacts offers archivists the opportunity to be involved earlier on in the life-cycle of those records, before they enter the archive. As people engage with digital technologies in more nuanced ways, the networked web provides favorable circumstances for media creators to ensure provenance, authenticity, and custody. I will briefly discuss three case studies that exemplify creative ways archivists are engaging with the publics they serve. These include: the WITNESS archive, Howard Besser's work with the Activist Archivists group in New York, and Rick Prelinger's Participatory Archiving project at the Internet Archive. Natalie Cadranel is a Masters student in the School of Information.

Friday, Feb 15: Peter BRANTLEY, Director of Scholarly Communications, Hypothes.is: Designing Strategies for the Deployment of Open Annotation within the Scholarly Community.
    Peter Brantley was previously Director of Bookserver at the Internet Archive. He is the co-founder of the Open Book Alliance, an organization dedicated to ensuring an open market in digital book access. He serves on the board of the International Digital Publishing Forum, the standards setting body for digital books. Peter was previously the Executive Director of the Digital Library Federation. See http://hypothes.is/blog/welcoming-peter-brantley.

Friday, Feb 22: Michael BUCKLAND: What is New in Libraries? -- And How New?
    The techniques long familiar to librarians mainly originated in the 19th century and were implemented with spectacular success in the 20th century (MARC, online catalogs, OCLC, Medline, etc.). They were deeply based in the technologies used: books, cards, databases, and telecommunications. But now, rather suddenly in the 21st century, the familiar techniques begin to seem irrelevant as several quite new and different techniques have arisen, mostly from outside of librarianship and based on new technologies (the Web, artificial intelligence) and a closer integration of libraries with their communities: geo-referencing, external authority files, ontologies, FRBR and RDA, link-based technologies, visualization, the Semantic web, tagging, and others. How much of this is really new? How does the new relate to the old? This situation opens the possibility for new and better forms of library service, but also poses challenges.

Friday, March 1: Seminar meeting cancelled.

Friday, March 8: Tim STUTT and Catherine MARSHALL.
    Tim STUTT: Public Search Interfaces to Plant Collections.

    A brief initial progress report on my masters project on Search Interfaces for Plant Collections. I'll present the project overview, initial secondary research, comparative analysis, and ethnographic research for personae development.
    Catherine MARSHALL, Microsoft: Exploring Social Norms with the Crowd: A Reflection on Methods, Participation, and Reliability.
    Crowdsourcing services such as Amazon's Mechanical Turk (MTurk) provide new venues for recruiting participants and conducting studies; hundreds of surveys may be available to workers at any given time. I will reflect on seven related studies Frank Shipman and I performed on MTurk over a three-year period. The studies used a combination of open-ended questions and structured hypothetical statements about story-like scenarios (a technique borrowed from legal education) to engage the efforts of 1,493 participants. I'll describe the methods we used to elicit social norms and reflect on what we've learned about MTurk as a survey venue. I'll also talk about apparent trends in data reliability, especially when the surveys are seen from a worker's perspective. This talk describes work done in collaboration with Frank Shipman at Texas A&M University.
    Cathy Marshall is a Principal Researcher in Microsoft Research's Silicon Valley Lab. Lately she's been working on personal digital archiving, social media ownership, and file syncing and sharing.

Friday, March 15: Merrilee PROFFITT & Jim MICHALKO, OCLC: MOOCs and Libraries.

Friday, March 22: Discussion of Biographical Data. Clifford Lynch, with Ray Larson, Laurie Pearce, Patrick Schmitz, and Brian Tingle.

    Earlier this semester, on February 1, we discussed the evolving idea of a national archival name and identity infrastructure and its relationship to the Social Networks and Archival Context (SNAC) project; this also connects to a series of earlier presentations on SNAC, a talk I gave last year on names and lives in the cultural record, and several discussions of editors' notes. On October 12, 2012, we had a session on the Berkeley Prosopography System. It is clear that there are interesting and poorly explored similarities between such name infrastructures and prosopography systems at both conceptual and technical (systems) levels. Today, I will moderate a discussion to begin to examine these questions, and in particular:
  - What are the conceptual differences between the archival name and identity infrastructure and a prosopography?
  - Are there common system components between name infrastructure and prosopography?
  - Is it reasonable to envision prosopographies contributing to an archival name infrastructure, or to deriving prosopographies from such an infrastructure?
  - Are there missing data elements that would facilitate such interoperability?
    We'll begin the session with a short review of prosopography and the Berkeley Prosopography system; we will assume that participants have some minimal familiarity with SNAC and the national archival name infrastructure concepts, and will try to move quickly into discussion.
    For the Social Networks and Archival Context (SNAC) see the Feb 1 announcement. For the Berkeley Prosopographical Services (BPS) see the October 12, 2012 announcement at courses.ischool.berkeley.edu/i296a-ia/f12/summary.html.

Friday, March 29: Semester break. No seminar meeting.

Friday, April 5: Tim STUTT: Public Search Interfaces to Plant Collections. Progress report.

    This is a short progress report on my master's project on Search Interfaces for Plant Collections. I will present results from a round of usability testing as well as a low fidelity prototype that includes maps and images in the search experience. I welcome feedback on this project by email at tim@ischool.berkeley.edu. Tim Stutt is a second-year master's student in the School of Information's MIMS degree program.
    Fred GEY: Nuclear Forensics Search.
    Nuclear forensics search is an emerging sub-field of scientific search: Nuclear forensics plays an important technical role in international security. Nuclear forensic search is grounded in the science of nuclear isotope decay and the rigor of nuclear engineering. However, two elements are far from determined: Firstly, what matching formulae should be used to match between unknown (e.g. smuggled) nuclear samples and libraries of analyzed nuclear samples of known origin? Secondly, what is the appropriate evaluation measure to be applied to assess the effectiveness of search? Using a database of spent nuclear fuel samples we formulated a search experiment to try to identify the particular nuclear reactor from which an unknown sample might have come. This talk describes the experiment and also compares alternative evaluation metrics (precision at 1, 5 and 10 and mean reciprocal rank) used to judge search success. Recent directions of the project have been in visualization of nuclear decay chain dynamics.
    For more information see: metadata.berkeley.edu/nuclear-forensics.
    Fred Gey, PhD, is an Information Scientist at the Institute for the Study of Societal Issues, University of California, Berkeley. He was a Visiting Researcher (Dec 2011 and summer 2010) at the National Institute of Informatics, Tokyo, Japan.
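    The evaluation measures named in the abstract above (precision at 1, 5 and 10, and mean reciprocal rank) can be sketched as follows. This is a minimal illustration, not the project's code. The function names and the reactor identifiers are hypothetical. It assumes each query is an unknown sample, the ranked list holds candidate reactors in order of match score, and the relevant set holds the true reactor of origin.

```python
from typing import Iterable, Sequence, Set, Tuple


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked candidates that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / position of the first relevant candidate, or 0.0 if none is found."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(queries: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (ranked list, relevant set) pairs."""
    scores = [reciprocal_rank(ranked, relevant) for ranked, relevant in queries]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Hypothetical search result: the true reactor appears in second place.
    ranked = ["reactor_B", "reactor_A", "reactor_C"]
    truth = {"reactor_A"}
    for k in (1, 5, 10):
        print(f"P@{k} = {precision_at_k(ranked, truth, k):.2f}")
    print(f"RR = {reciprocal_rank(ranked, truth):.2f}")
```

    When each unknown sample has exactly one true origin, precision at 1 is either 0 or 1 per query, while reciprocal rank gives partial credit for a near miss. This is one reason the two kinds of measure can rank matching formulae differently.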

Friday, April 12: Clifford LYNCH and Michael BUCKLAND.
    Clifford LYNCH and Michael BUCKLAND: CNI Report.

    Highlights of the recent CNI Members Meeting in San Antonio.
    Michael BUCKLAND: The Changing Role of Documents in Society and the Role of Cultural Heritage "Memory" Institutions.
    I will present some ideas on three related themes: (1) The impact of changes in the mediating roles of documents in society and, especially, new forms of digital publication; (2) Library and Information Science approaches to culture and cultural heritage institutions; and (3) Document theory in relation to cultural heritage "memory" institutions.

Friday, April 19: Niklaus STETTLER, Michael BUCKLAND, Patrick GOLDEN, Marc BRON.
    Niklaus STETTLER, Visiting Scholar, is Head of the Swiss Institute for Information Science at the University of Applied Science, Chur, Switzerland. He is here as a visiting scholar for three months. He has a background in records management, cultural heritage, and digitization. He will briefly introduce himself, his research interests, and the status of education for information studies in Switzerland.
    Michael BUCKLAND and Patrick GOLDEN: Editorial Practices and the Web: Phase 2.
    Update and progress report. A newly-funded two-year second phase started on April 1 and will address the archiving of the working notes of completed editing projects, the scope for making archivists' working notes available, and incorporating techniques developed in digital humanities projects (e.g. maps, chronologies, and social network analyses) into the everyday work practices of editors. More at metadata.sims.berkeley.edu/en2narr.pdf.
    Marc BRON, OCLC Research, San Mateo.
    ArchiveGrid is a collection of nearly two million archival material descriptions, including MARC records from WorldCat and finding aids harvested from the web. ArchiveGrid provides access to detailed archival collection descriptions, making information available about historical documents, personal papers, family histories, and other archival materials. More at beta.worldcat.org/archivegrid/.
    Marc Bron, a PhD candidate at University of Amsterdam, is completing three months at OCLC Research focusing on archival data and incorporating his findings into ArchiveGrid, before returning home to The Netherlands and finishing his information retrieval (computer science) program. More at beta.worldcat.org/archivegrid/blog/?p=971.

Friday, April 26: **Change of program** Tom LEONARD, University Librarian: Sizing Up Libraries: Slackers or Over-Achievers.
    I will discuss current ways of looking at research libraries.
    Tom Leonard is University Librarian and Professor of Journalism. More at journalism.berkeley.edu/faculty/leonard/.

Friday, May 3: Last meeting of the Semester.
    Tim STUTT: Search Interfaces for Plant Collections.

    This is a report on my master's project on Search Interfaces for Plant Collections. I will present results from UX design, user testing, and a low fidelity prototype. Full project report, poster, and presentation available at ischool.berkeley.edu/programs/mims/projects/2013/herbaria.
    Tim Stutt is a second-year master's student in the School of Information's MIMS degree program.
    Clifford LYNCH: Summing Up on Names, Biographies and Identities.
    Over the past two semesters we have looked at many aspects of the use of names, biography and identities in archival description and discovery, large scale biographical databases, prosopography, editorial notes, and the construction and management of scholarly identity, among other scenarios. To conclude the semester, I'll try to summarize and generalize some of what we have discovered within a broad framework and outline some open questions that we might want to return to next semester.

** Summer special: Monday, June 10. **
    Tatsuki SEKINO, Research Institute for Humanity and Nature, Kyoto, Japan: Time information system and basic temporal information: The other side of the spatiotemporal information science.
    GIS are widely used in many scientific fields today and have greatly advanced the analysis of spatial data. Recently, some software has also begun to support spatiotemporal analysis. However, most of its analysis functions are not sufficient, because this software is derived from GIS and its handling of temporal information is very weak. A true spatiotemporal information science requires an analysis environment for temporal information.
    HuTime is software specialized for temporal information: a time information system with various functions to visualize and analyze temporal information. For visualization, it can display character data (chronological tables) and numeric data (line, bar, and plot charts) simultaneously on the same temporal axis, and a user can move and zoom the displayed temporal range. For analysis, the software has functions corresponding to merge, clip, and other analysis functions in GIS, performed on the temporal axis instead of on maps. When HuTime is linked with GIS software, spatiotemporal analysis is expected to progress, especially in fields, such as history, that work mainly with temporal data. I will talk about this software and related basic temporal information (e.g. event index and calendar conversion, which correspond to gazetteer and geo-coordinate conversion in GIS, respectively) in the presentation. More detailed information is available on the HuTime web site (http://www.hutime.org/), where the HuTime software is also available.
    Tatsuki Sekino has a PhD in Zoology and is an Associate Professor in the Research Promotion Center of the Research Institute for Humanity and Nature, Kyoto. More at www.chikyu.ac.jp/index_e.html.
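    The calendar conversion mentioned in the abstract, the temporal counterpart of geo-coordinate conversion, can be illustrated by routing every date through a common day count. The sketch below is not HuTime's implementation. It assumes the Julian Day Number as the pivot and covers only the Gregorian and Julian calendars, using the standard integer-arithmetic formulas. All function names are hypothetical. Years use astronomical numbering (1 BC is year 0).

```python
from typing import Callable, Dict, Tuple

Date = Tuple[int, int, int]  # (year, month, day)


def gregorian_to_jdn(year: int, month: int, day: int) -> int:
    """Julian Day Number of a (proleptic) Gregorian calendar date."""
    a = (14 - month) // 12          # 1 for January and February, else 0
    y = year + 4800 - a
    m = month + 12 * a - 3          # March = 0, ..., February = 11
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - y // 100 + y // 400 - 32045


def julian_to_jdn(year: int, month: int, day: int) -> int:
    """Julian Day Number of a Julian calendar date."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def _split_days(c: int, centuries: int) -> Date:
    """Turn a day offset within the shifted era into (year, month, day)."""
    d = (4 * c + 3) // 1461
    e = c - (1461 * d) // 4
    m = (5 * e + 2) // 153
    day = e - (153 * m + 2) // 5 + 1
    month = m + 3 - 12 * (m // 10)
    year = 100 * centuries + d - 4800 + m // 10
    return (year, month, day)


def jdn_to_gregorian(jdn: int) -> Date:
    a = jdn + 32044
    b = (4 * a + 3) // 146097       # whole Gregorian centuries
    c = a - (146097 * b) // 4
    return _split_days(c, b)


def jdn_to_julian(jdn: int) -> Date:
    return _split_days(jdn + 32082, 0)


# Each calendar is a pair of functions: (to day number, from day number).
CALENDARS: Dict[str, Tuple[Callable[..., int], Callable[[int], Date]]] = {
    "gregorian": (gregorian_to_jdn, jdn_to_gregorian),
    "julian": (julian_to_jdn, jdn_to_julian),
}


def convert(date: Date, source: str, target: str) -> Date:
    """Convert a date between calendars via the common day number."""
    to_jdn, _ = CALENDARS[source]
    _, from_jdn = CALENDARS[target]
    return from_jdn(to_jdn(*date))


if __name__ == "__main__":
    # The last Julian-calendar day before the 1582 papal reform.
    print(convert((1582, 10, 4), "julian", "gregorian"))  # (1582, 10, 14)
```

    With a pivot like this, adding another calendar means supplying one pair of functions rather than a converter for every pair of calendars.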
    Shoichiro HARA, Vice-Director, Center for Integrated Area Studies, Kyoto University: Information Infrastructures for Area Studies.
    Area Informatics is a new paradigm to integrate individual disciplines of area studies and to create new knowledge of areas using information technologies such as metadata, databases, ontology, GIS, GPS and so on. The Center for Integrated Area Studies Kyoto University (CIAS) has developed the information infrastructure to support area informatics. Among them, databases and spatiotemporal tools are the key information tools. This talk will introduce three interesting database systems developed by the CIAS:
    1. Resource Sharing System: During research activities, researchers collect materials and create databases. The metadata of these databases are inevitably heterogeneous, because strict standardization would restrict the future development of area studies. But heterogeneous metadata causes headaches when searching across multiple databases. To solve these metadata problems, the CIAS has developed a Resource Sharing System (RSS) designed to integrate databases on the Internet and to provide users with a uniform interface for searching them seamlessly in one operation. The key technology of the RSS is standard metadata, whose role is to define common names that relate data elements among databases (e.g., by using "creator" as a common name, the RSS can relate "writer" in database A and "author" in database B). The RSS is an innovative system for linking heterogeneous databases of different institutions and realizing so-called MLA (Museum, Library and Archives) cooperation.
    2. MyDatabase and APIs: The RSS is a server-side system, which makes complicated analysis logic and user interfaces difficult to implement. The CIAS introduced the latest Web and XML technologies and developed MyDatabase and a REST-like API. MyDatabase is designed to let researchers easily develop and publish databases without specialist knowledge of server and database system operations. The REST-like API is the specification for using MyDatabase. With the API, researchers can easily develop programs for their own analysis logic and user interfaces. These functions are necessary to turn ordinary information infrastructures into new knowledge infrastructures for the humanities.
    3. Ontology Database: The RSS uses an ontology function to relate heterogeneous data structures, but it is limited to data element names. The CIAS is developing ontology databases to realize more flexible and/or intelligent data operations. At present, three types of ontology databases have been developed. The first is the Gazetteer Database on Japanese Historical Places. This database organizes more than 300,000 place names with their attributes (rivers, lakes, mountains, shrines, temples, houses, monuments, villages, towns, counties, states, etc.) and locations (longitudes and latitudes). The second is the calendar table. All dates described by different calendars are grouped and ordered according to Julian dates. This simple table can be used to convert a date in one calendar to the date in another calendar. The third covers subjects and technical terms. At present, LCSH, BSH (Basic Subject Heading by The Japan Library Association) and AGROVOC (controlled vocabularies covering all areas of interest to FAO, including food, nutrition, agriculture, fisheries, forestry, environment etc.) are organized as ontology databases using Topic Maps.
    Professor Shoichiro Hara, MD, has long been active in the Electronic Cultural Atlas Initiative. He is Vice-Director of the Center for Integrated Area Studies, Kyoto University (CIAS). More at www.cias.kyoto-u.ac.jp/en/staff/hara.html.
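    The common-name idea behind the Resource Sharing System (relating "writer" in database A and "author" in database B through "creator") can be sketched as a simple field crosswalk. This is an illustration only, not the CIAS implementation. The database names, the "title" mappings, the sample records, and the function names are all hypothetical.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]

# Hypothetical crosswalks: local field name -> common element name.
CROSSWALKS: Dict[str, Dict[str, str]] = {
    "database_A": {"writer": "creator", "name": "title"},
    "database_B": {"author": "creator", "heading": "title"},
}


def to_common(source: str, record: Record) -> Record:
    """Rename a record's local fields to the common element names."""
    mapping = CROSSWALKS[source]
    return {mapping[field]: value for field, value in record.items() if field in mapping}


def search(
    records_by_source: Dict[str, List[Record]], common_field: str, term: str
) -> List[Tuple[str, Record]]:
    """Search every database through one common field in a single operation."""
    needle = term.lower()
    hits: List[Tuple[str, Record]] = []
    for source, records in records_by_source.items():
        for record in records:
            value = to_common(source, record).get(common_field, "")
            if needle in value.lower():
                hits.append((source, record))  # return the original local record
    return hits


if __name__ == "__main__":
    # Invented sample data for illustration.
    data = {
        "database_A": [{"writer": "Tanaka", "name": "Village survey"}],
        "database_B": [{"author": "Tanaka", "heading": "River maps"},
                       {"author": "Sato", "heading": "Temple records"}],
    }
    for source, record in search(data, "creator", "tanaka"):
        print(source, record)
```

    Each database keeps its own field names, so local metadata can keep evolving. Only the crosswalk needs updating when a database changes.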

The Seminar will resume next semester.

Spring 2013 schedule.   Fall 2012 schedule and summaries.