School of Information
 Previously School of Library & Information Studies

 Friday Afternoon Seminar: Summaries.
  296a-1 Seminar: Information Access, Fall 2018.

Fridays 3-5. 107 South Hall. Schedule. Weekly mailing list.
Details will be added as they become available.

Aug 24: Clifford LYNCH & Michael BUCKLAND: Introductions.
    Clifford LYNCH: The Cultural Record in the Information War Era.

    We are seeing several developments that are rapidly destabilizing the cultural record. These include the ability to fabricate very persuasive audio and visual material (sometimes called "deep fakes"), the exploitation of various mechanisms that include online news media and social media platforms to target attention, and the ability to exploit security breaches to corrupt rather than simply steal information. This discussion will summarize a number of these developments, and then focus on processes and infrastructure that the cultural memory sector might emphasize to help manage and contextualize these materials. The discussion will be based in part on a brief EDUCAUSE review article that should appear early this fall.

Aug 31: No Seminar meeting.

Sept 7: Luciana Corts MENDES: A Hermeneutical Metatheory for Information Science
    The aim of the research is to present a unifying framework for Information Science that has hermeneutics as its foundation. Hermeneutics, a theory and methodology of interpretation, will be briefly explained, as well as how it can underlie Information Science’s basis. The research argues that a unifying framework for Information Science that is based on hermeneutics posits human sense-making within society at the core of the field, establishing that the question of meaning is central to the development of information services. Preliminary results will be presented.
    Luciana Corts Mendes is here as a visiting student researcher for six months. She is a PhD candidate in the Graduate Programme in Information Science at the School of Communications and Arts of the University of São Paulo, Brazil.

Sept 14: Michael BUCKLAND: Our First 50 Years: 1918-1968.
    By 1900 there was an acute need for qualified librarians and no program in the western states to prepare them. At Berkeley, California’s land-grant university, President Wheeler accepted the need but prevented action. The dramatic expansion of public library services around 1910 induced the California State Library to start a school in Sacramento until Berkeley could take over. In Fall 1918 the UC Berkeley Library quietly began a one-year full-time program for students already on campus without Wheeler’s knowledge. A degree, a department, and a budget eventually followed. The program remained very stable until expansion after 1946 and doctoral programs in the 1960s. I will describe these developments and some of the individuals involved, especially State Librarian James Gillis, his charismatic organizer Harriet Eddy, and Sydney Mitchell, who led the school for nearly thirty years.

Sept 21: Matt BAYLEY: Towards a Distributed Web Crawling and Indexing Infrastructure that Facilitates Diverse Collection and Curation.
    Brief introduction of topic.
    Michael BUCKLAND: Information Access: Scope and Limitations.
    What are the limits to discovery and access in a digital context? Text printed on paper was the dominant medium for some 500 years and a wide range of techniques evolved to support description, discovery, and forensic analysis for authenticity, provenance, etc., mostly under the term "bibliography". But the dominance of printing (and indeed of writing) has receded with the rise of new media and new publication forms which urgently need comparable attention. After a brief historical account of concepts, terminology, and anomalous claims by Suzanne Briet, Donald McKenzie, and Karl Popper, I will address the limits of information access. How far are the limits a matter of custom, how far a matter of economics, and, more interesting, where are the inherent limits (i.e. what in principle can information systems not do and why) regardless of medium? This extends last semester's discussion which is available at

Sept 28: Marcia BATES: Designing Search Interfaces for our Inner Hunter-Gatherer: Getting Serious about Browsing.
    Seeking information and interacting with it is fundamental to the life of all animals on earth. Human beings are master information seekers, but our most native and natural ways of doing it can best be seen in neolithic times, before even writing was invented. These early hunter-gatherer behaviors still drive our most familiar and instinctive information seeking. What can we learn from these early patterns? In modern information system design, how can we make it easy to follow those instincts, while also enabling searchers to negotiate dense and complex text-rich resources?
    Marcia J. Bates is Professor Emerita in the UCLA Department of Information Studies. A Fellow of the American Association for the Advancement of Science, she is a leading authority on information search, human-centered design of information systems, and information practices. She was Editor-in-Chief of the 7-volume Encyclopedia of Library and Information Sciences, 3rd ed., and has received awards for research and leadership. She has been active as a technical consultant to numerous organizations. She is a graduate of Pomona College (B.A.) and of this School (M.L.S., Ph.D.). She served in the Peace Corps in Thailand. More at

Oct 5: Zachary BLEEMER: The University of California ClioMetric History Project.
    I will discuss the data-scientific methods with which the UC ClioMetric History Project has reconstructed a near-complete database of all students enrolled, faculty teaching, and courses taught at UC Berkeley since the late 19th century, much of which has been made publicly available. Particular attention will be devoted to characteristics of students who enrolled at Berkeley's 100-year-old School of Information. I will show a selection of visualizations from these data, and then highlight a few new findings related to the history of Berkeley's faculty salaries and the end-of-career wages of Berkeley graduates from the 1960s and 1970s.
    Zachary Bleemer is the Director of the University of California ClioMetric History Project at the Center for Studies in Higher Education at UC Berkeley, where he is also a PhD candidate in economics. His research examines the long-run consequences of young Americans' post-secondary education and specialization decisions, with a side-interest in the computational analysis of structured text. For more information, see his website

Oct 12: Peter BRANTLEY, UC Davis: "I don't want to be a publisher!" Regulating liability for the sticky parts: An exploration of CDA Section 230, sex work, and user content.
    Over 20 years ago, the U.S. Congress passed Section 230 of the Communications Decency Act, a landmark piece of legislation which protected Internet platforms from liability for user generated content -- a distinction from the editorial determinations made by publishers. This year, Congress passed the Fight Online Sex Trafficking Act (FOSTA), reducing liability protections in Section 230 for certain types of speech. Targeted at sex trafficking, the new law not only immediately threatens the safety of sex workers, but also encroaches on the protections afforded online archives that host third party content, leading both the Electronic Frontier Foundation and the Internet Archive to file suit to block the law. Further, the rise of fake news and partisan manipulators of platform content places additional pressure on Internet hosts to take a more active editorial role, threatening the safe harbor of Section 230. We'll discuss the threats to information sharing, users, and free speech in this open conversation.
    Peter Brantley is the Director of Online Strategy for the University of California Davis Library. Previously, he was the Director of Digital Development at New York Public Library, and before that, the Director of Scholarly Communication at the open source not-for-profit, Mr. Brantley worked at the Internet Archive on policy issues and open standards, and has managed technology groups at a variety of academic research libraries. More at

Oct 19: Howard BESSER, New York University: Digital Privacy for Librarians (and Others).
    With almost weekly revelations of massive privacy attacks (on email providers, health care companies, governmental agencies, universities, political campaigns, etc.), the public has developed a heightened awareness of the vulnerability of their private information. But there is a large gap between knowing that data breaches and hacks take place, and changing one's behavior as a protective measure.
    This talk reports on an IMLS-funded project to train librarians to go out into their communities and make those communities more aware of privacy threats, and to train community members in tools and habits that will offer protection against various types of threats. The talk will cover methods for training these Privacy Advocates in technology-based tools, in discourse and advocacy, and in community engagement. It will also discuss the various types of threats, and a variety of tools designed to mitigate some of those threats. And it will raise some concerns about conflicts between privacy and preservation. The seminar will close with a vigorous public discussion of digital privacy issues and concerns.
    After a dozen years as an LIS professor, Howard Besser became Professor of Cinema Studies at NYU, and Founding Director of the Moving Image Archiving & Preservation MA Program. His work over the past 35 years has emphasized policy issues (copyright, privacy), technology issues (image and multimedia databases), metadata (Dublin Core, METS, PREMIS), media archiving and preservation (Personal Digital Archiving, museum time-based media conservation), and teaching with technology (distance learning). He is a graduate of South Hall. More at

Oct 26: Patrick GOLDEN, University of North Carolina, Chapel Hill: Florilegia: Organizing Scholarly Annotations in PDFs.
    Annotation techniques that developed over centuries of reading paper documents have persisted with the advent of digital publishing. Highlighting, underlining, marking with symbols, scribbling lines, adding notes in margins, and bookmarking pages all remain common and important practices for interacting with digital documents. Yet while tools for authoring and reading digital documents have proliferated, the way that researchers are able to interact with annotations has not generally improved. Given that annotations are such a crucial part of the scholarly research process, more systems should be available that treat annotations themselves as documents worthy of being described, recalled, and connected in their own right. I will present a system I am currently developing, called Florilegia, which is intended to combine the representational capacities of PDF, RDF, the Web Annotation Data Model, and common annotation practices, towards the hope of re-centering the annotation in the scholarly process.
    Patrick Golden is a doctoral student at the University of North Carolina at Chapel Hill. His research focuses on the history and cultivation of scholarly research infrastructure. Prior to moving to North Carolina, Patrick was a researcher here at the Electronic Cultural Atlas Initiative, where he worked on the project Editorial Practices and the Web. More at

Nov 2: Double Program: Matt BAYLEY and Mark GRAHAM and also David S. H. ROSENTHAL.
    Matt BAYLEY and Mark GRAHAM: Facilitating Diverse Collection and Curation in Web Crawling and Indexing.
    We propose to create an open and publicly available index of the public Web. Building on the 22-year history of Internet Archive’s effort to archive, and make available, web pages (URLs) we will construct a publicly accessible list of web sites (hosts). We will provide a variety of ways for people to interact with the data, with two key areas of focus being efforts to support more and better web archiving as well as general research about the Web. In addition to indexing about 2 billion URLs for web hosts we plan to create and associate various metadata including language, genre and last observed HTTP status codes. We consider this project to be foundational to an ongoing and expanding effort to map resources available via HTTP. Obvious additional enhancements (beyond the scope of this initial project phase) might include adding link graph data and user-generated metadata.
    Matt Bayley is a MIMS student at the I School with a background in data engineering and an interest in software, infrastructure, and tech policy.
    Mark Graham has created and managed innovative online products and services since 1984. As Director of the Wayback Machine he is responsible for capturing, preserving and helping people discover and use, more than 1 billion new web captures each week.
    David S. H. ROSENTHAL: Blockchain: What's Not To Like?
    We're in a period when blockchain or "Distributed Ledger Technology" is the Solution to Everything™, so it is inevitable that it will be proposed as the solution to problems in academic communication and digital preservation. These proposals typically assume, despite the evidence, that real-world blockchain implementations actually deliver the theoretical attributes of decentralization, immutability, security, anonymity, lack of trust, etc. The proposers appear to believe that Satoshi Nakamoto revealed the infallible Bitcoin protocol to the world on golden tablets; they typically don't appreciate or cite the nearly three decades of research and implementation that led up to it. This talk will discuss the mismatch between theory and practice in blockchain technology.
    David S. H. Rosenthal is retired from Stanford Libraries. He was a team member of CMU's "Andrew Project"; an early employee and Distinguished Engineer at Sun Microsystems; Employee #4, first Chief Scientist, and first sysadmin at Nvidia; and Co-founder 20 years ago of the LOCKSS Program. He has been blogging since 2007, about blockchains and cryptocurrencies since November 2013.

Nov 9: Cathryn CARSON, Dept of History: Data in the Undergraduate Curriculum.
    Undergraduate institutions nationally and internationally are increasingly grappling with how to provide data analytic competencies to their students. This talk offers three lines of sight into this development, reflecting on drivers internationally (looking at the case of a recent German national initiative), nationally (taking a synoptic look at recent U.S. efforts), and at UC Berkeley.
    Cathryn Carson is the faculty lead of the undergraduate data science program at Berkeley. She is a historian of science and technology. Her research has dealt with the intellectual, cultural, and political history of the twentieth-century sciences, especially physics; the integration of social scientific and humanistic perspectives into engineering education; the organization and management of contemporary research universities; and the history and ethnography of data science. More at

Nov 16: *POSTPONED. TO BE RESCHEDULED* Günter WAIBEL and John CHODACKI, California Digital Library: Community-Owned Data Publishing: CDL’s new partnership with Dryad.
    The California Digital Library (CDL) has invested considerable effort researching and building exemplars in research data management and data publishing. Like most institutions, we have had varying levels of success, especially when it comes to adoption and reach. In many instances, University of California researchers have taken advantage of tools that are offered to a much broader community, and are better integrated into their workflows. CDL’s strategic vision acknowledges that in many instances, to best serve the University of California, we now need to think and act in a context that is broader than our institutional home. To meet researchers where they are, CDL entered into a formal partnership with Dryad. This partnership will make it easier to integrate data publishing into researcher workflows, and to be focused on building a sustainable product that is a credible alternative to commercial offerings within the research data space. With both CDL and Dryad’s expertise, we will be able to offer:
- Researchers: a higher level of service and integration into their established workflows;
- Publishers: direct integrations and more comprehensive curation services; and
- Institutions: a globally-accessible, community-led, low-cost infrastructure and service that focuses on breaking down silos between publishing, libraries and research.
    For more info about the Dryad partnership, see:
    For CDL’s strategic vision, see:
    Günter Waibel is Associate Vice Provost & Executive Director of the California Digital Library. He has extensive experience in the digital library and broader cultural heritage communities and is well-known for his work in promoting cross-domain collaboration. In his previous position he oversaw the strategic plan for creating a digital Smithsonian out of the institution’s 19 museums and 9 research centers. More at
    John Chodacki is UC Curation Center Director, California Digital Library. He has a background in product management within digital publishing and scholarly communication organizations. More at

Nov 23: Thanksgiving: No Seminar meeting.

Nov 30: Matt BAYLEY and Clifford LYNCH.
    Matt BAYLEY: Facilitating Diverse Collection and Curation in Web Crawling and Indexing.

    Matt Bayley will briefly summarize his work this semester with the Internet Archive on collaborative web crawling, archiving, and indexing. This will include a survey of existing techniques and initiatives as well as an exploration of new protocols for crowd-sourcing these data and representing them within a shared infrastructure.
    Clifford LYNCH: Developments in 2018 and Prospects for 2019.
    Every December, I give a plenary talk at the member meeting of the Coalition for Networked Information (CNI), where I serve as the director. Among other things, this talk summarizes what I see as key developments in the previous year and critical prospects for the coming year across a very broad landscape of technology and networked information. Recently, we've established a tradition at Berkeley where the final session of the Fall seminar has been used for a somewhat more leisurely exposition and exploration of these developments and prospects in preparation for my plenary talk. Please join us for the 2018 version of this survey.

    The Seminar will resume on January 25.

Spring 2018 schedule and summaries. Spring 2019 schedule and summaries.