School of Information
 Previously School of Library & Information Studies

 Friday Afternoon Seminar: Summaries.
  296a-1 Seminar: Information Access, Spring 2019.

Fridays 3-5. 107 South Hall. Schedule. Weekly mailing list.
Details will be added as they become available.

Jan 25: Clifford LYNCH: Climate Change and Some Possible Surprises.
    1. Introduction to Seminar and plans for the Semester.
    2. Introductions.
    3. Brief Discussion: Stewardship and Climate Change.
Climate Change -- and particularly more intense storms and rising sea levels -- raises a number of challenges to the enterprise of cultural stewardship. I'll briefly enumerate some of these and solicit thoughts about priorities, strategies, and aspects that I've overlooked. This is very early thinking.
    4. Some Possible Surprises in Scholarly Communication.
There's a tradition of welcoming the new year with lists of possible surprises that may take place in various areas. Here I'll raise several fairly near term (next few years) scenarios that may seem unlikely but are worth some consideration. Additional suggestions for possible surprises are welcome.

Feb 1: Catherine MARSHALL: The Times They Are A-Changin': The Influence of Scandal and Experience on Users’ Attitudes to Social Media Data Control.
    Social media has become an entrenched function of today's Internet. Has widespread news of abuse--e.g. the Cambridge Analytica scandal--changed people’s perceptions of how corporations and public institutions can use personal data? We compare two datasets, one collected through an October 2013 study, and the other collected via an augmented version of the same study performed in May 2018. Overall, participants in 2018 are more willing to cede control of their personal data to others (including public institutions) than they were in 2013. Participants with greater awareness of the Cambridge Analytica scandal’s details show an increased desire for data mobility, more skepticism about targeted advertising and news, and an increased willingness for social media sites to demand corrections to inaccurate content. This work was done in collaboration with Frank Shipman, Center for the Study of Digital Libraries at Texas A&M University.
    Cathy Marshall is an independent researcher and writer and Adjunct Professor at the Center for the Study of Digital Libraries (CSDL) at Texas A&M University. For many years she was a principal researcher at Microsoft Research, Silicon Valley. Before joining Microsoft, Cathy was a hypertext researcher at Xerox PARC at the dawn of the Internet era.

Feb 8: Michael BUCKLAND: Information Flows and Cultural Disruption.
    Every society depends on a pattern of communication, coordination, trust, and coercion. A change to its pattern is culturally disruptive. Currently attention is on ‘fake news’ in social media and the invasive use of data on activity. Disruption induces cultural change and/or is mitigated by regulation. Innovations in information services have social consequences. Authoritarian regimes resist open information sources, a free press and freedom of speech and travel. Efforts to change library services during the Allied Occupation of Japan (1945-1952) provide a starting point for asking what “American librarianship” (compared with, say, Soviet or Japanese librarianship) could mean -- and why and how information access matters -- and, more generally, relationships between technology choices, resource allocation, and cultural differences. Join us for a discussion.

Feb 15: Mark GRAHAM, The Internet Archive: What's New with the Internet Archive’s Wayback Machine?
    Mark Graham will review various projects related to the missions of "Universal Access to All Knowledge" and efforts to "Help Make the Web More Useful and Reliable". He will share updates about work with Wikipedia sites to fix broken links and provide direct access to hundreds of thousands of books from Wikipedia articles. He will review new features of the Wayback Machine, new related API and browser extension functionality, and new projects to archive news and social media. In addition, he wants to hear requests and suggestions from everyone about how the Wayback Machine can be made more useful, along with ideas about new collaborations and projects.
    Mark Graham, Director of the Wayback Machine at the Internet Archive, has been inventing, building and operating pioneering online services for more than 30 years. Most recently he was SVP with NBC News Digital. His earlier work includes co-founding the NGO Association for Progressive Communications (APC.org), creating AOL’s interface to the Internet (Gopher and WAIS), leading technology and business development at The WELL, and building a platform to crowd-source and distribute live-video citizen reporting from mobile devices.

Feb 22: Rosalie LACK, UCLA: UCLA Library – International Partnerships and Projects.
    Libraries and archives play a critical role in preserving and providing access to the collective memory of communities. Unfortunately, the resources for enabling archive holders to preserve cultural, social, political, and historical evidence are scarce and urgently required. This talk will highlight UCLA Library’s international initiatives aimed at addressing these challenges, with a focus on the International Digital Ephemera Project (IDEP) and Documenting Global Voices (DGV). The goal of these programs is to preserve and provide broad, public access to at-risk materials. The presentation will provide an overview of both programs including the benefits and challenges of the different post-custodial models employed for each.
    Rosalie Lack is a Project Manager at the UCLA Digital Library and is responsible for grant-funded initiatives, including the Documenting Global Voices (DGV), International Digital Ephemera Project (IDEP), the Syriac and Arabic Manuscript project, and PRL (Pacific Rim Library). Rosalie’s experience also includes working as Director of Digital Special Collections at the California Digital Library and as Deputy Director for Electronic Information for Libraries (EIFL). Rosalie holds a Masters in Information Management and Systems from University of California, Berkeley. More at www.linkedin.com/in/rosalielack

Mar 1: Wayne de FREMERY: Computational Bibliography and the Sociology of Data.
    The speaker's current book project, "Computational Bibliography and the Sociology of Data", reinvigorates analytical bibliography by expanding the scope of what bibliography describes and by diversifying the forms used in bibliographic description. As etymologies of the word bibliography suggest, bibliographers have used bibliographic forms -- books -- to document books. Analytical bibliographers have typically investigated the materials and technologies used to create and circulate texts. Computational Bibliography and the Sociology of Data suggests expanding the scope of analytical bibliography to include the computational systems currently creating and circulating data. It suggests using computational methods to document computational systems in order to illuminate the materials and technologies expressing data, as well as to describe the socio-historical constraints within which people have worked to make and share data. This talk will outline the broad arguments presented by Computational Bibliography and the Sociology of Data and then narrow its focus to reveal the deep relationship between traditional forms of bibliographic description and newer forms of artificial intelligence, especially those related to machine learning. The talk proposes that the inductive approaches of bibliographers such as W.W. Greg and of those creating machine learning frameworks are homologous. It also suggests that the critiques of bibliography as an inductive science leveled by scholars such as D.F. McKenzie and Jerome McGann are isomorphic with critiques of current machine learning methods.
    Wayne de Fremery teaches Korean literature and bibliography at Sogang University in Seoul, where he develops new technologies for investigating Korean literature and documentary traditions, as well as information systems as cultural systems.

Mar 8: Hany FARID: The Accuracy, Fairness, and Limits of Predicting Recidivism.
    Joint work with Julia Dressel. Algorithms for predicting recidivism are commonly used to assess a criminal defendant’s likelihood of committing a crime. These predictions are used in pretrial, parole, and sentencing decisions. Proponents of these systems argue that big data and advanced machine learning make these predictions more accurate and less biased than humans. Opponents, however, argue that predictive algorithms may lead to further racial bias in the criminal justice system. I will discuss an in-depth analysis of one widely used commercial predictive algorithm to determine its appropriateness for use in our courts.
    Hany Farid, a specialist in digital forensics, image analysis, and human perception, will join the faculty next summer in a joint appointment with EECS. He has degrees in computer science and mathematics and has worked at MIT and Dartmouth College. He is also the Chief Technology Officer and co-founder of Fourandsix Technologies and a Senior Adviser to the Counter Extremism Project. More at www.cs.dartmouth.edu/farid/.

Mar 15: Günter WAIBEL and John CHODACKI, California Digital Library: Community-Owned Data Publishing: CDL’s new partnership with Dryad.
    The California Digital Library (CDL) has invested considerable effort researching and building exemplars in research data management and data publishing. Like most institutions, we have had varying levels of success, especially when it comes to adoption and reach. In many instances, University of California researchers have taken advantage of tools that are offered to a much broader community, and are better integrated into their workflows. CDL’s strategic vision acknowledges that in many instances, to best serve the University of California, we now need to think and act in a context that is broader than our institutional home. To meet researchers where they are, CDL entered into a formal partnership with Dryad. This partnership will make it easier to integrate data publishing into researcher workflows and will focus on building a sustainable product that is a credible alternative to commercial offerings within the research data space. With both CDL and Dryad’s expertise, we will be able to offer:
- Researchers: a higher level of service and integration into their established workflows;
- Publishers: direct integrations and more comprehensive curation services; and
- Institutions: a globally-accessible, community-led, low-cost infrastructure and service that focuses on breaking down silos between publishing, libraries and research.
    For more info about the Dryad partnership, see: https://uc3.cdlib.org/2018/10/24/community-owned-data-publishing-infrastructure/.
    For CDL’s strategic vision, see: www.cdlib.org/cdlinfo/2018/04/12/introducing-cdls-strategic-vision/.
    Günter Waibel is Associate Vice Provost & Executive Director of the California Digital Library. He has extensive experience in the digital library and broader cultural heritage communities and is well-known for his work in promoting cross-domain collaboration. In his previous position he oversaw the strategic plan for creating a digital Smithsonian out of the institution’s 19 museums and 9 research centers. More at www.cdlib.org/contact/staff_directory/gwaibel.html
    John Chodacki is UC Curation Center Director, California Digital Library. He has a background in product management within digital publishing and scholarly communication organizations. More at www.cdlib.org/contact/staff_directory/jchodacki_profile.html.

Mar 22: Daniel KLUTTZ: AI, Professionals, and Professional Work: The Practice of Law with Automated Decision Support Technologies.
    A report on work being done jointly with Deirdre Mulligan. Technical systems employing algorithms are shaping and displacing human decision making in a variety of fields. As technology reconfigures work practices, researchers have documented potential loss of human agency and skill, confusion about responsibility, diminished accountability, and both over- and under-reliance on decision-support systems. The introduction of predictive algorithm systems into professional decision making compounds general concerns with bureaucratic inscrutability and opaque technical systems as well as specific concerns about encroachments on expert knowledge and (mis-)alignment with professional liability frameworks and ethics. To date, however, we have little empirical data regarding how automated decision-support tools are being debated, deployed, used, and governed in professional practice.
    The objective of our ongoing empirical study is to analyze the organizational structures, professional rules and norms, and technical system properties that shape professionals’ understanding and engagement with such systems in practice. As a case study, we examine decision-support systems marketed to legal professionals, focusing primarily on technologies marketed for “e-discovery” purposes. Commonly referred to as “technology-assisted review” (TAR) or “predictive coding,” these systems increasingly rely on machine-learning techniques to classify and predict which of the voluminous electronic documents subject to litigation should be withheld or produced to the opposing side. We are accomplishing our objective through in-depth, semi-structured interviews of experts in this space: the technology company representatives who develop and sell such systems to law firms and the legal professionals who decide whether and how to use them in practice. We argue that governance approaches should seek to put lawyers and decision-support systems in deeper conversation, not position lawyers as relatively passive recipients of system wisdom who must rely on out-of-system legal mechanisms to understand or challenge those systems. This requires attention to both the information demands of legal professionals and the processes of interaction that elicit human expertise and allow humans to obtain information about machine decision making.
    Daniel N. Kluttz is a Postdoctoral Scholar at the UC Berkeley School of Information. There, he helps organize and lead the Algorithmic Fairness and Opacity Working Group (AFOG), an interdisciplinary group that brings together UC Berkeley faculty, postdocs, and Bay Area technology professionals to develop research and policy recommendations regarding fairness and transparency, governance, professional ethics, and social impacts of emerging technologies and practices, particularly as applied to artificial-intelligence-based systems, algorithmic decision making, and data science. Drawing from intellectual traditions in organizational theory, law and society, economic sociology, social psychology, and technology studies, Kluttz’s research is oriented around two broad lines of inquiry: 1) the formal and informal governance of economic and technological innovations, and 2) the organizational and legal environments surrounding such innovations. His current projects include studies of the psychological, organizational, and cultural underpinnings of personal data exchange in the digital economy, the effects of automated decision-support technologies on professional work practices, and the construction and implementation of data science ethics in the tech industry and higher education. He has employed both qualitative and quantitative methods in his work, including in-depth interviews, longitudinal and multi-level modeling techniques, surveys, geospatial analyses, and historical/archival methods. Kluttz’s research has appeared in a variety of peer-reviewed publications, including the Law & Society Review, Socio-Economic Review, and Handbook of Contemporary Sociological Theory. He holds a PhD in sociology from UC Berkeley, a JD from the UNC-Chapel Hill School of Law, and dual bachelor's degrees in sociology and psychology from UNC-Chapel Hill. Prior to pursuing his PhD, he practiced law in Raleigh, NC. More at www.danielkluttz.net.

Mar 29: Spring Break: No Seminar meeting.

April 5: Michael BUCKLAND & Clifford LYNCH: Basic Needs, Access, and Marketplace Structures.
    Michael BUCKLAND: Connecting Needs, Documentation, and Evidence.
    A hermeneutic approach suggests a possible conceptual bridge between the most basic need for information and designs for the organization of access to recorded evidence. A brief continuation of our discussion on February 8.
    Clifford LYNCH: New Marketplace Structures for Cultural Products and Implications for Stewardship.
    The structures of the public marketplace and exchange under the doctrine of first sale worked very well for cultural memory institutions such as libraries. Unfortunately, these are now being rapidly eclipsed by complex and opaque new market structures that incorporate (compulsory, large scale) license structures. The effects of these changes for memory institutions are a potential disaster, and are poorly understood. As time permits in this discussion, I'll begin an exploration of how the marketplaces in music, books, and moving image (video) materials are changing, and highlight some of the areas that seem particularly opaque to me.

Apr 12: Double program: Augmented Reality and a Film Premiere.
    3:10 pm: Nicole HADASSAH-VALDEZ: Augmented Reality and The Public.
    We will explore how Augmented Reality affects image, movement, and consciousness by analyzing the effects of Instagram filters, Google Maps’ live directions, and Pokémon Go’s gamification on publics. First, I will take us through the popularity and functions of Augmented Reality in these three case studies, and then I will explain AR’s implications for group experience. These implications will then be tied back to the problem of chronicling hybridized experience as an imposition of truths onto a plastic web that structures simulated information. Lastly, I will contrast the possibilities of AR’s strengths as enhancements on “real life” by introducing real time editing as an empowering tool that shapes the image of historical narratives that account for the public.
    Nicole Hadassah-Valdez is a senior studying Interdisciplinary Studies and Rhetoric. Born and raised in the Bay Area, she grew up appreciating multiple cultures and forms of thought, which brought her to concerns regarding the authority of information and the power of big data. As tools for argumentation, numerical values quantify experiences that each carry a story to compel an audience. She hopes to work for a quasi-public or entirely public organization to serve local communities and streamline access to (financial and developmental) information.
    4:10 pm: In DOCAM'S Footsteps. North American premiere of a new documentary film by Sabine Roux on the origins of the Document Academy featuring scenes shot in South Hall. The Document Academy is an informal international collaboration that has enlivened Information Studies by exploring the physical, cognitive, and social dimensions of objects perceived as signifying. It began in 2001 when Niels W. Lund, founding director of the program in Documentation Studies at the University of Tromso, Norway, came to this School as a Visiting Professor. With narratives by Niels W. Lund, Roswitha Skare, and Michael Buckland. For more on the Document Academy see www.documentacademy.org/?about.
    Sabine Roux is a film-maker and high school librarian living near Toulouse, France.

Apr 19: Jeffrey MACKIE-MASON: Moving towards open scholarship: UC, Elsevier and all the rest.
    The movement to make new scholarship freely available to all readers began at least by 1994 with Stevan Harnad's "Subversive proposal". In 2013 the UC Academic Senate adopted one of the first mandatory OA policies in the US, requiring that a copy of all newly authored research be deposited in an open archive regardless of where it is published. In Winter 2018 the University Libraries published an action roadmap, Pathways to OA. Since then, the Libraries and the Academic Senate have worked closely together to pursue some of the actions discussed in "Pathways". Most visibly to date, the University sought to negotiate a "transformative" contract with Elsevier (the world's largest scholarly publisher) that would publish all UC-authored articles as open access, retain full reading rights to all Elsevier publications, and reduce the total cost of reading plus publishing for the University. Negotiations failed, but rather than sign a business-as-usual contract, the University canceled its agreement with Elsevier on 28 Feb 2019. At the same time, the University is negotiating for transformative agreements with several other publishers, and leading a coalition seeking to help non-profit scholarly society publishers to flip their journals to open access.
    Jeff MacKie-Mason is the University Librarian and Chief Digital Scholarship Officer at UC Berkeley and has joint appointments as a professor in the School of Information and in the Department of Economics. He has played a leading role in building the UC-wide faculty-administration coalition, and is co-chair of the Publisher Negotiations Team, which carried out the negotiations with Elsevier. He will discuss the University's goals, the process leading to the new strategies, the negotiations with Elsevier, and other OA efforts currently underway.

Apr 26: AnnaLee SAXENIAN: The I School in 2019: Where we've been and where we're going.
    I'll look backward and review the changes in the school since I became Dean in 2004, as well as forward, to discuss scenarios for the future. Video recording.

May 3: Michael BUCKLAND: The Concept of Context; and Yasunori SAITO.
    Visiting scholar Yasunori SAITO will briefly introduce his work on information seeking behavior, the role of reference librarians, and the logic of knowledge and belief (epistemic logic). Prof. Saito is professor of Clinical Sociology in the School of Arts and Letters, Meiji University, Tokyo, and formerly a Vice Director of Meiji University Library.
    Michael BUCKLAND: The Concept of Context.
    Information is inevitably created in a context and, whenever used, is necessarily used in some context. Intermediaries, too, have their own contexts. The literature on information-related behavior mentioning context is vast and varied. Nevertheless the concept of “context” itself seems underdeveloped in information studies beyond the simple case of spatial and temporal metadata. Formal models of systems exist independently of contexts. Information system design ordinarily recognizes inputs, outputs, and boundaries, but neglects contexts. The large literature on “Information seeking in context” is much more about seeking than about context. I will argue, however, that components have long been available, in hermeneutics, social constructivism, bibliography, information science, and elsewhere, which, if combined, can support theorizing both context and contextualizing. Join us for a discussion.

May 10: Clifford LYNCH: Changing Production and Distribution Systems for Mass-Market Cultural Materials and Implications for Stewardship: The case of Video Materials.
    The way in which video (including "film") materials are produced and the pathways by which they are distributed have changed radically from the days of VHS or even DVD. This has broad implications for our cultural memory institutions and also for efforts to attempt to even understand patterns of availability of material for libraries, or the stewardship status of materials. I'll present what I believe are a number of open research questions. This seminar talk and discussion will continue and complement an earlier seminar session focusing on music.

May 17: Muhammad Raza KHAN: Machine Learning for the Developing World using Mobile Communication Metadata.
    A report on PhD dissertation research. Researchers working on the problems associated with the developed world generally have access to rich and diverse datasets like social media activity, sensor data, etc. However, the same is not true of the developing world, where access to comprehensive datasets is one of the most significant issues in research. Social networks and digital sensors have not been that common in the developing world with one big exception, i.e. mobile phones. More than 95% of the world population today has mobile phone coverage, and even in some of the most under-developed places of the earth, the penetration of mobile phones is much higher than other measures of human development like literacy or access to the financial infrastructure. As a result, researchers have been increasingly using the meta-data collected by the mobile phone companies in these developing countries as an alternative to the more conventional data sources. However, the mobile phone data may not be very well suited for machine learning algorithms in its raw form. In other words, there is a need for algorithms to convert the raw mobile communication meta-data into features suited for machine learning algorithms.
    In this talk, I am going to describe my work on extracting features from mobile communication logs using techniques like Deterministic Finite Automata (DFA). I will also show how this approach outperforms other methods for problems like product adoption. I further show that by using DFA-based features and spectral analysis of the multi-view nature of mobile communication networks, advanced neural network algorithms can be developed that beat the current state-of-the-art methods for problems like poverty prediction and gender prediction. In the last part of this talk, I will describe the value of communication networks data for research questions related to social network analysis, such as the salient differences between the behavioral patterns of men and women in the developing world as exhibited in the communication networks data.
    Muhammad Raza Khan recently completed his PhD in the School of Information where, as a member of the Data Intensive Development Lab, he worked on problems related to machine learning for social good. The insights resulting from his work on feature generation using mobile communication metadata have been used by the International Finance Corporation (a subsidiary of the World Bank) to improve financial inclusion in countries like Ghana and Zambia. In Ghana, this approach resulted in better targeting of the customers of mobile money products by a margin of 30% as compared to the existing methods. Raza was also one of the grantees of the UN Big Data for Gender Challenge - Data2X. Raza's work has been published in venues like the ACM SIG Knowledge Discovery and Data Mining and the Association for the Advancement of Artificial Intelligence. Prior to UC Berkeley, Raza completed his Masters in Computer Science at Georgia Tech as a Fulbright Scholar. For more see www.linkedin.com/in/razarehman.

    The Seminar will resume in the Fall Semester.

Fall 2018 schedule and summaries.