Previously School of Library & Information Studies
Friday Afternoon Seminar: Summaries.
296a-1 Seminar: Information Access, Fall 2010.
Fridays 3-5. 107 South Hall.
Friday, Aug 27: Clifford LYNCH: Introductions.
New Infrastructure and Instruments for the Human Sciences, and Implications for the Locus of Research and the Evolution of Disciplines in the Academy.
Introduction to the seminar and plans for coming weeks. Participant introductions,
with comments invited on interesting things that they have seen or done over the summer.
This talk, based on a paper currently in draft, examines several developments.
I explore a broad view of the prospects for observational and analytic instrumentation and
accompanying cyberinfrastructure to support new kinds of inquiry in a wide range of
disciplines, including the humanities and the social and behavioral sciences, that are practiced
both inside and outside the traditional academy. A key point is that in contrast to the physical
and life sciences, where most instrumentation and cyberinfrastructure is intentionally
purpose-built, here we find that many key systems depend on, or are the result of, commercial activity,
of the transition of other media and formerly ephemeral human interactions to more persistent
and computationally accessible digital forms, of government needs, or of citizen humanities
activities. Finally, I'll speculate about where research using such tools is likely to take place,
and how this relates to the evolution of the traditional academic disciplines.
Friday, Sept 3: No seminar meeting.
Friday, Sept 10: Ryan SHAW: Events and Periods as Concepts for Organizing
Summary of dissertation research.
Events and periods are not objectively existing phenomena, but
concepts we use to organize our knowledge of history. They make
historical change comprehensible and help us orient ourselves with
respect to the history of the culture in which we participate. Thus
they are indispensable for describing both the content of history
scholarship and the context of documents that serve as evidence for
that scholarship. As historical discourse shifts its emphases and new
aspects of the past come to be considered significant, periods and
events are subject to constant change. Despite this change, we can
model periods and events in systems of knowledge organization because
it is possible to discern and formally describe relatively stable
recurrent patterns in their narration.
Friday, Sept 17: Charles van den HEUVEL, Regents' Lecturer:
Interface as Thing:
Annotation and Visualization of Historical Evidence in e-Research.
In 1991 Michael Buckland coined the phrase "information as thing"
and discussed this concept in relation to evidence. Buckland explored his notion that
information-as-thing can be seen as evidence further by analyzing types of information:
data, objects and documents from a historical perspective. One of the key figures in
his historical exploration of the term document was the Belgian pioneer of knowledge
organization Paul Otlet (1868-1944). Inspired by Buckland's concept of
information-as-thing, I will discuss the role of Otlet's multidimensional
representations of knowledge in the development of "interface as thing."
Some of the hundreds of visualizations of interfaces that Otlet made or commissioned
and that are kept in the Archives of the Mundaneum in Mons (Belgium) will be analyzed
to demonstrate the importance of "materialization of knowledge" in e-research for recent
discussions on the provenance and evidence of information in Web 2.0 and Semantic Web
solutions. The hypothesis will be put forward that the visibility of Otlet's struggle
with tensions of representation, incompatibility, and of inoperability of interfaces,
can lead to new questions that would not come to mind in current representations of
interfaces that connect data at first sight seamlessly into a unified but ultimately
problematic homogeneous whole. Differences in the heterogeneity, ambiguity and
interpretative character of data, between e-research in the natural sciences, humanities
and social sciences should be acknowledged and made visible in e-research rather than
be reduced or ignored. I will focus particularly on the role of annotation and
visualization in the creation of evidence as part of e-research by discussing some
digital humanities projects I am involved in. Since Buckland stated that "evidence
implies passiveness [and] like information-as-thing, does not do actively"
several claims have been put forward by Berners-Lee and others about (semi-)automatically
generated refinement of concepts that can be seen as a more active form
of evidence creation. I will refer to some historical antecedents of such claims
of automated hypothesis generation in the works of Otlet, Ostwald, Ranganathan,
Iverson and Miksa and explore the implications for information science to make
these manually and automatically generated "chunks of evidence" accessible for future e-research.
Charles van den Heuvel works in the
Virtual Knowledge Studio for Humanities and Social Sciences in the Royal Netherlands Academy
of Arts and Sciences (KNAW), Amsterdam. He has been appointed to a short-term visiting
appointment in the School of Information as Regents' Lecturer.
Friday, Sept 24: Lewis LANCASTER, Tim TANGHERLINI (UCLA) and Michael BUCKLAND:
Network Pattern Recognition in Large Humanities Corpora.
A new grant to the Electronic Cultural Atlas Initiative will support the application of
techniques developed for the analysis of very large science datasets to newly
available very large textual datasets in the humanities. In collaboration with
Tina Eliassi-Rad (Rutgers) and Christos Faloutsos (Carnegie Mellon & Google), we will
draw on recent developments in network analysis that address complex problems
including visual query systems, topic discovery, anomaly detection, and rapid mining
of complex time-stamped data as a means for extending these approaches to noisy humanities
data using Buddhist canonical texts (Chinese and Sanskrit); Irish studies journals (English
and Gaelic); and Danish folklore (English and Danish). We propose to begin by tuning
the visual query system for large graphs (GRAPHITE).
Friday, Oct 1: Ray R. LARSON, Krishna JANAKIRAMAN, and Brian TINGLE
(California Digital Library).
The Social Networks and Archival Context Project.
Archivists have a long history of describing the people who, acting individually,
in families, or in formally organized groups, create and collect primary sources.
Archivists research and describe the artists, political leaders, scientists, government
agencies, soldiers, universities, businesses, families, and others who create and are
represented in the items that are now part of our shared cultural legacy. Because archivists
have traditionally described records and their creators together, this information is tied to
specific resources and institutions. Currently there is no system in place that aggregates and
interrelates those descriptions.
Leveraging the new standard Encoded Archival Context -- Corporate Bodies, Persons,
and Families (EAC-CPF), the Social Networks and Archival Context Project (SNAC) will use
digital technology to "unlock" descriptions of people from descriptions of their records and
link them together in exciting new ways. It will create an efficient open-source tool that
allows archivists to separate the process of describing people from that of records and will
create an integrated portal to creator descriptions--linked to resource descriptions in
archives, libraries and museums, online biographical and historical databases, and other
diverse resources--thereby providing more effective access and robust historical context to
a broad array of humanities materials. The prototype access system will demonstrate that
descriptions of persons, families, and organizations can be used as access points to
archive, library, and museum resources.
The Institute for Advanced Technology in the Humanities at the University of
Virginia will lead the SNAC project in partnership with the California Digital Library and
Berkeley's School of Information. EAC-CPF records will be derived from existing archival
finding aids from the Library of Congress, the Online Archive of California, the Northwest
Digital Archive, and Virginia Heritage; and also from name authority files supplied by the
Library of Congress, Getty Vocabulary Program, and OCLC Research.
Friday, Oct 8: Catherine MARSHALL, Microsoft Research:
Testing the Limits of Social Media Ownership.
Social media, by its very nature, introduces questions about content ownership.
Content ownership comes into play most crucially when we design services and applications to
archive, reuse, remix, or remove social media. We have been investigating social media
ownership issues using a series of Mechanical Turk surveys that probe respondents' current
attitudes and practices; the surveys combine open-ended questions about use with realistic
scenarios that test respondents' attitudes in specific situations. (This talk will describe
joint work with Frank Shipman at Texas A&M).
Friday, Oct 15: Tuukka RUOTSALO, Visiting Scholar:
Knowledge Management for Digital Cultural Heritage.
Dr Ruotsalo is a Fulbright Scholar in the School of Information
for the current academic year. He has extensive experience in the use of digital
techniques in cultural heritage institutions. He will summarize recent experience from
large national projects in Finland (FinnONTO) on using ontologies and semantic knowledge
management technologies to overcome interoperability and information access problems.
He will also discuss the related project he is working on at the iSchool. For more see
Friday, Oct 22: Jeanette ZERNEKE, Electronic Cultural Atlas Initiative:
The Early California Cultural Atlas (ECCA).
The Early California Cultural Atlas (ECCA) is a collaborative research
project led by Professor Steven Hackel at UC Riverside in collaboration with Jeanette
Zerneke of ECAI. ECCA is developing a digital atlas of historical data related to the
colonization and settlement of early California. European settlement in North America
and the establishment of missions to Indians initiated dramatic demographic, environmental,
religious, and social change. In the first phase of the project we constructed a website of
historical change in the region of Monterey, California. Embedded Google Earth
visualizations show changes by year and allow the user to interact with the data layers and
time bar. The project has chosen to address ambiguity intentionally, has developed an ambiguity
characterization methodology, and experimented with methods to visualize characteristic
land use patterns. In the process, we encountered significant new historical questions.
ECCA integrates multiple types of data such as:
- California Mission records from the Early California Population Project
based at the Huntington Library in San Marino;
- Historical maps from the Library of Congress and David Rumsey Collection; and
- Hand drawn maps, images, and texts from the Online Archive of California.
For further information see:
Friday, Oct 29: Clifford LYNCH:
Some Initial Thoughts on Data Retention Lifecycles and Data Lifespans.
Determining what data to keep and what to discard, and most critically when to
discard it, is essential to any sustainable approach to research data curation.
Yet we seem to have almost no practical measures to help establish specific lifespans for
data. In this talk and discussion, I'll try to outline some initial thinking that may serve
as a point of departure and also identify some hypotheses that might be useful in advancing
further work on the topic.
Friday, Nov 5: Krishna JANAKIRAMAN and Michael BUCKLAND.
Krishna JANAKIRAMAN: Report on Matching and Clustering Entities in Large
Collections of Encoded Archival Context (Corporate Bodies, Persons, and Families) Records.
I will be reporting my progress towards implementing techniques that match and
merge entities in collections of Encoded Archival Context (Corporate Bodies, Persons, and Families)
records. I will discuss cases where our initial simple techniques, based on
exact matches using name authority files as a reference, failed to identify matches.
I also plan to discuss my experiments on using probabilistic graphical models to cluster
entities based on the information present in these records.
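As a rough illustration of why exact matching against an authority file can miss matches, here is a minimal Python sketch. It is not the speaker's implementation: the authority entries, identifiers, function names, and the 0.85 threshold are all hypothetical, and the similarity measure is simply the one provided by the standard library's difflib.

```python
from difflib import SequenceMatcher

# Hypothetical authority file: authorized heading -> identifier (illustrative only).
AUTHORITY = {
    "Otlet, Paul, 1868-1944": "auth-001",
    "Ranganathan, S. R.": "auth-002",
}

def normalize(name):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(kept.split())

def exact_match(name):
    """Return the identifier only if the string equals an authorized heading."""
    return AUTHORITY.get(name)

def approximate_match(name, threshold=0.85):
    """Return the identifier of the most similar heading, if similar enough."""
    target = normalize(name)
    best_id, best_score = None, 0.0
    for heading, ident in AUTHORITY.items():
        score = SequenceMatcher(None, target, normalize(heading)).ratio()
        if score > best_score:
            best_id, best_score = ident, score
    return best_id if best_score >= threshold else None

variant = "Otlet, Paul (1868-1944)"
print(exact_match(variant))        # punctuation differs, so no exact match
print(approximate_match(variant))  # normalization recovers the match
```

The threshold trades missed matches against false merges, so in practice it would need tuning against records with known matches.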
Also Michael BUCKLAND: (Re-)Using Other People's Data.
Join us for a discussion of the issues and impediments involved in the
use of data created by other people, especially
when the new use of old data is for a different purpose. How could the relative
importance of different barriers and the probable cost-effectiveness of alternative
remedies be assessed?
If we set aside the difference between digital and non-digital media,
what can be learned from our pre-digital experience?
Friday, Nov 12: John WILBANKS, Vice President, Science Commons, Creative Commons.
The Work of the Science Commons.
For John Wilbanks see
For the Science Commons see
Friday, Nov 19: Ryan SHAW & Patrick GOLDEN: Editorial Practices and the Web.
An initial progress report on the "Editorial Practices and the Web" project.
Scholarly editions of historically significant texts are important in the Humanities.
However, expert editorial work is difficult and funding is scarce. Current Web technology
can be used to improve the return on investment by making editors' work available more quickly,
more fully, and more widely. Additional objectives are to avoid duplicative effort among different
projects and explore a closer relationship between scholarly editing and library special
Friday, Nov 26: Thanksgiving: No Seminar meeting.
Friday, Dec 3: Last meeting of the semester.
Krishna JANAKIRAMAN: Matching and Merging and
Megan FINN: Californians and Their Earthquakes.
Krishna JANAKIRAMAN: Matching and Merging Entities in
Collections of Archive Description Records.
I will present my final report on the progress made towards matching and
merging entities in collections of archive description records. I will discuss techniques
that use exact and approximate string matching algorithms, and
how information from name authority files can be used to improve matching results. Experiments
with clustering algorithms and nearest-neighbor algorithms will be reported. I also plan
to discuss efforts towards linking data from DBpedia into the existing data and the
possibilities such linkages may provide.
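One simple way to group name variants, sketched here in Python, is single-link clustering: any two names whose string similarity exceeds a threshold are placed in the same cluster. This is an illustrative stand-in, not the method reported in the talk; the function names, sample names, and the 0.8 threshold are assumptions.

```python
from difflib import SequenceMatcher
from itertools import combinations

def similarity(a, b):
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def cluster_names(names, threshold=0.8):
    """Single-link clustering of names using a union-find structure."""
    parent = list(range(len(names)))

    def find(i):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the clusters of every pair that is similar enough.
    for i, j in combinations(range(len(names)), 2):
        if similarity(names[i], names[j]) >= threshold:
            parent[find(j)] = find(i)

    clusters = {}
    for i, name in enumerate(names):
        clusters.setdefault(find(i), []).append(name)
    return list(clusters.values())

names = ["Paul Otlet", "Paul Otlett", "Wilhelm Ostwald", "Wilhelm Ostwalt"]
for group in cluster_names(names):
    print(group)
```

Comparing every pair is quadratic in the number of names, so a large collection would need some form of blocking or indexing before a pass like this.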
Megan FINN: Californians and Their Earthquakes.
I will present a chapter of my dissertation research about Californian
information practices after earthquakes. In this talk I will discuss
the 1868 Hayward Fault earthquake, the last time an earthquake
originating on the Hayward fault shook the Bay Area. The presentation
will focus on the circulation of documents about the earthquake, with
an eye towards the telegraph and the circulation of reproduced images.
Upon the completion of the telegraph, the Sacramento Daily Union
presented a view of the telegraph that was not usual for the day: "the
lightning has annihilated a continent as an obstacle to intellectual
communication." I argue that the relatively new cross-continental
telegraph does not alone constitute an infrastructural epistemology,
but what Californians learn about the earthquake can be understood in
light of existing goals of several groups of people. Specifically, I
examine the documentary activities of the powerful San Francisco
Chamber of Commerce, the accountability of San Francisco's
government, the newspapers' analysis of the quality of reports
available, and the authority of the California Academy of Sciences.
The Seminar will resume in the Spring Semester on Friday, January 21, 2011.