School of
Information
Previously School of Library & Information Studies
Friday Afternoon Seminar: Summaries.
296a-1 Seminar: Information Access, Spring 2009.
Fridays 3-5. 107 South Hall.
Schedule.
Summaries will be added as they become available.
Friday, Jan 23: Clifford LYNCH: Introduction to the Seminar.
Michael BUCKLAND: Remaking Access to Reference Resources.
Understanding depends on knowing the context and background of any topic,
so in a print and paper world a reference library is an excellent place to
read, to write, and to learn. But non-digital solutions are obsolescent and
the special amenity of the reference collection has not yet made an effective
transition to the digital library environment. In hindsight,
the print-on-paper, codex-based reference collection was less than perfect in several ways.
Research on reference service
has focused on empowering the reference librarian rather than the even more
important task of empowering library users.
So how could we combine the
convenience of Google and the Wikipedia with
the selectivity and trustworthiness of the library reference collection?
Our projects supported by IMLS and NEH suggest how
we might achieve the best of both worlds. The talk will discuss the issues,
demonstrate prototypes being developed, discuss the possibilities, and invite
comments.
Background reading:
Library reference service in a digital environment, Library and Information
Science Research 30, no 2 (2008): 81-85. http://people.ischool.berkeley.edu/~buckland/libref.pdf.
Friday, Jan 30: Clifford LYNCH: Networked Instrumentation for the
Humanities and Social Sciences.
One of the key ideas of e-science cyberinfrastructure is the
integration of large scale observational instrumentation into the network,
data storage and computation fabric to support scientists; data collected or
derived from these instruments might be shared among large numbers of
researchers and in fact also repurposed to support teaching and learning at
all levels. Recently, I have been thinking about the implications of having
large corpora of literature and discourse -- both historic and current --
available on the net, and what this might offer for the creation of new networked
instrumentation in the humanities and social sciences, building on enabling
technologies like text mining. In this discussion, I'll outline what I believe
are some of the possibilities here, and also some of the issues involved in
access to data sources.
Friday, Feb 6: Fredrik WALLENBERG: Judging a Book by its Cover:
Online Previews and Book Sales.
There has been a continuing debate about the benefits and costs associated with
providing free samples of information goods on the Internet. Some argue that the
samples lead to increased sales through increased awareness of the good while others
claim that the previews and samples cannibalize sales. In this paper I present a
unifying model showing that information about the good (specifically samples, trial
versions, previews, and the like) has both a sales-promoting and a cannibalizing effect,
and that either of the two can dominate. I then set up an experiment in which I
look at the impact on the sales of a specific set of books from the enabling of searching
the contents and previewing of pages relevant to the search query. I find no significant
impact on sales from the previews. The sample available to me is, however, on the small
side and also from a very specific genre, both of which affect the results.
Friday, Feb 13: David GREENBAUM, Steve MASOVER, and Rich MEYER:
The Bamboo Planning Project - Developing "Cyberinfrastructure" for the Arts,
Humanities, and Interpretive Social Sciences.
We will give an overview and update on the Bamboo Planning Project,
www.projectbamboo.org. Bamboo is a
multi-institutional, interdisciplinary, and inter-organizational effort that brings together
researchers in the arts and humanities, computer scientists, information scientists,
librarians, and campus information technologists to tackle the question: How can we advance
arts and humanities research through the development of shared technology services?
The Bamboo Planning Project, funded by the Andrew W. Mellon Foundation and led by UC
Berkeley and the University of Chicago, is halfway through its 18-month planning and
community design process. Over 100 universities, colleges, and organizations concerned
with digital humanities have participated in the planning process to date. In this
presentation we will discuss some of the cultural, organizational, and technical
opportunities and challenges in attempting to develop cyberinfrastructure across a highly
diverse set of scholarly practices and institutions.
David Greenbaum, Steve Masover, and Rich Meyer are
staff members in the campus's Information Services and Technology division.
Friday, Feb 20: Students' Progress Reports and 2009 i-School Conference report.
Megan FINN: Information Practices after Modern California Earthquakes.
This work will be a chapter of my dissertation, which situates the
information practices of early internet users after the 1989 earthquake in the
context of California earthquakes. My hope is to make an argument that information
practices after an earthquake are more enduring than the latest information
technology.
Patrick RILEY: The Metadata of Live Television.
What if you could search for what is being mentioned on TV? What if
you wanted to get an email alert as soon as the Berkeley iSchool is
mentioned on live TV? Is there information we can gain from indexing
everything ever said on TV at any given time, in terms of linguistics
research, media monitoring, fact-checking, social media, and search
query trends? Is TV metadata even important compared to the enormous
production of social, user-generated media?
A new search engine created at the Berkeley iSchool for indexing the
metadata of live TV will be demonstrated by Patrick Riley, a Ph.D.
student at the iSchool, and a discussion on metadata, copyright, media
monitoring, and data mining will follow.
Report and Discussion on the 2009 i-School Conference by Daniela
Rosner, Christo Sims and maybe others.
The Fourth iSchools Conference brought together scholars and
professionals who come from diverse backgrounds and share interests in working
at the nexus of people, information, and technology. "The conference celebrates
and engages our multidisciplinary efforts to understand the scholarly, educational
and engagement dimensions of the iSchool movement."
A report from people who attended. See www.ischools.org/iconferences/participation.
Friday, Feb 27: Tuukka RUOTSALO and Mikko VILLI, Finland.
Mikko VILLI, University of Art and Design Helsinki UIAH:
Visual mobile communication. Camera phone photographs in mobile messaging.
[Brief presentation only]: I concentrate on photo messages, i.e.
photographs taken with a camera phone
and sent to another mobile phone. A salient aspect is the convergence of phone and
camera -- the phone being a communication device intended mainly for interpersonal
communication, and, on the other hand, the camera being a device devoid of any
means to directly communicate with other people. From this disparity arises the central
question in my research: How does the convergence of photography and mobile phone
communication affect our communicational and photographic practices? More at www2.uiah.fi/~villi/tekstit/Researchplanvilli.pdf.
Tuukka RUOTSALO, Finnish CultureSampo, Helsinki University of Technology (TKK):
Cultural Heritage on the Semantic Web.
Cultural Heritage has recently become an important application area
for semantic technologies. The current semantic technologies enable
powerful ontology-based search and browsing capabilities for digital
collections. However, many bottlenecks of semantic systems can be
identified: 1) quality of ontologies, 2) mediation of heterogeneous
content and 3) information visualization and access. I will present the
publication concept and the online semantic portal CultureSampo, a system for
creating a collective semantic memory of cultural heritage on a national
level:
www.seco.tkk.fi/applications/kulttuurisampo
The system addresses the challenge of aggregating highly heterogeneous,
cross-domain cultural heritage into a semantically rich
intelligent system for human and machine users.
In addition to the CultureSampo system, I will present
methods and tools for collaborative ontology development, search, and
natural language processing developed within the National Semantic Web
Ontology Project in Finland (2003-2007, 2008-2010)
(www.seco.tkk.fi/projects/finnonto/),
and ongoing work on
context-aware mobile search/recommending in the EU FP7 project SmartMuseum
(www.smartmuseum.eu).
More information about the Semantic Computing Research Group
at the Helsinki University of Technology and University of Helsinki:
www.seco.tkk.fi.
Friday, Mar 6: Group tour of remodeled Bancroft Library.
This week we will take a guided group tour of the newly remodeled
Bancroft Library. Assemble as usual in South Hall 107 promptly at 3:10 p.m.
and we will go over as a group.
See http://blogs.lib.berkeley.edu/whats-new.php/2009/02/13/bancroft-in-the-news.
Friday, Mar 13: Michael BUCKLAND and Ryan SHAW: Editing Historical Papers
in a Digital Environment.
The editing of historical papers in projects such as the
Emma Goldman Papers Project here on campus is a challenging undertaking
in several ways. Such projects are hard to fund.
The traditional product is a set of printed volumes
which constrain the amount of editorial research that can be
published. Relatively little use is made of digital technology and
with current methods substantial duplication of effort appears to
be unavoidable.
In recent months we have been exploring ways in which
some efficiency and return on investment might be improved. Some
efficiencies might result from improved search support and editing tools.
A new genre of web-published "Editors Notes" might very beneficially complement
the print volumes, improve access, and reduce duplicative effort.
I will lead a discussion of the challenges and
how they might be addressed.
Friday, Mar 20: Paul DUGUID: The World According to grep:
What Have We Been Searching For?
In recent years the Internet has increasingly been defined by
search, its resources reached primarily through a search box. While the
Internet is new, search of course is not. And though modern search may
appear to endorse the idea that we have always been foraging for information,
and that progress has involved shrugging off old encumbrances in order to
make information increasingly "free" and autonomous, this discussion hopes
to put the history of search in an alternative light and in so doing clarify
some of what is and is not new and perhaps what is and is not possible
for the developing world of digital search.
Also Clifford LYNCH: An Update on Institutional Repositories
and Development of International Repository Infrastructure.
We have looked several times at the evolving role of
institutional repositories. In this discussion, which follows on an
international meeting earlier this week in Amsterdam looking at the roadmap
for inter-repository infrastructure, I will look at some of the goals that
we might hope to achieve in the evolution of inter-repository infrastructure
and inter-repository interoperability of various kinds, and some of the
technologies and standards that may be helpful in achieving these goals.
Friday, Mar 27: Spring Break. No Seminar meeting.
Friday, Apr 3: Jeanette ZERNEKE: Update and Issues in Digital Humanities.
This presentation will outline some recent developments in digital
humanities infrastructure, humanities computing, visualization tools and
techniques, and digital scholarship. Three primary themes will be outlined:
infrastructure; scholarly processes; and visualization.
Information from several recent conferences will be incorporated,
including the Electronic Cultural Atlas Initiative /
Pacific Neighborhood Consortium joint fall conference (Hanoi, Dec 2-6, 2008)
ecai.org/activities/2008-Hanoi/08hanoi.html,
the Visualizing the Past workshop (University of Richmond, Feb 20-21, 2009) dsl.richmond.edu/workshop,
and CAA 2009 (Computer Applications and Quantitative Methods in Archaeology, Williamsburg,
March 22-26, 2009)
http://www.caa2009.org.
We hope to spark a lively discussion of each theme.
Jeanette Zerneke is Director of Information Technology for
the Electronic Cultural Atlas Initiative (ECAI) and was until recently Director,
Information Systems and Services, for International and Area Studies.
Friday, Apr 10: Interim Progress Reports: Ryan SHAW, Patrick RILEY, and Megan FINN.
Ryan SHAW: Providing Context for Historical Documents.
Tremendous resources are being invested in digitizing historical
documents. These investments promise to dramatically improve our
access to the documents of the past. Yet simply finding and accessing
a document does not, in itself, enable understanding. Effective use
requires understanding a document's context. Traditionally a library's
reference collection has provided various tools for assembling such
contextual understanding. How might we repurpose and augment these
tools for a networked environment? In this talk I will present my
research into representing and providing in a networked environment
contextual information of the kind typically found in reference works.
In particular, I will discuss my ongoing dissertation research, which
focuses on "events" as a topical category and examines the feasibility
of developing "event directory" services that aggregate and provide
basic information about historical events and their relationships. I
will argue that the design of such services should be grounded in a
clear understanding of the nature of historical events and how they
function as concepts for organizing our understanding of the past, and
that work in the critical philosophy of historiography can aid such
understanding.
Megan FINN: History of Post-earthquake Communication in California.
An update on the archival research I have been doing for
my dissertation chapter that examines Californians' communications related to
earthquakes. This chapter will help to situate my study of
information practices after the 1989 Loma Prieta Earthquake.
Patrick RILEY:
The Necessity and User Expectations of Real-time Indexing.
With so much live information being shared on the internet through
various communication channels, internet users are starting to expect
completely up-to-date search results. However, truly current searches
require real-time updating of the indexes and document weighting (if used)
as well as real-time execution of searches.
Various attempts at real-time indexing already exist:
Twitter provides search.twitter.com for searching tweets, and Google
includes news feeds in its search results. But how effective are these,
and how can we make them better?
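As an illustration only, and not the system discussed in the talk, the idea of updating an index in real time can be sketched as an in-memory inverted index in which each document becomes searchable the moment it is added. The class name RealTimeIndex, its methods, and the newest-first ordering are assumptions made for this sketch; document weighting is left out.

```python
import re
from collections import defaultdict


class RealTimeIndex:
    """Hypothetical in-memory inverted index, updated on every insert."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of documents containing it
        self.docs = {}                    # document id -> original text
        self.next_id = 0                  # ids grow with arrival order

    def add(self, text):
        """Index one document immediately; no batch rebuild is needed."""
        doc_id = self.next_id
        self.next_id += 1
        self.docs[doc_id] = text
        for term in set(re.findall(r"[a-z0-9]+", text.lower())):
            self.postings[term].add(doc_id)
        return doc_id

    def search(self, query):
        """Return documents containing every query term, newest first."""
        terms = re.findall(r"[a-z0-9]+", query.lower())
        if not terms:
            return []
        ids = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return [self.docs[i] for i in sorted(ids, reverse=True)]
```

A document added with add() is returned by the very next search() call, which is the property the abstract calls truly current search. A production system would also have to update any document weights incrementally, which this sketch does not attempt.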
Friday, Apr 17: Mari MILLER and Avi RAPPOPORT.
Mari MILLER: The Economics of Open Access.
Mari Miller, Librarian and Liaison to the I School,
will review issues in the economics of open access and resources she has
been exploring for a bibliography on the open access
movement (books, articles, blogs, websites, etc.). She will show where
they are located online and in the Library, and propose a collaborative
model for developing it further.
See updated webliography at http://www.lib.berkeley.edu/doemoff/sims/webliography.html.
Avi RAPPOPORT: Metadata, Sex, and Amazon.
Amazon failed in a big way on Easter weekend, and because
it is responsible for about a third of all electronic commerce in the
United States, it matters. If Amazon won't sell a book, or will sell it
but will "de-list" it, the book practically disappears.
The ways Amazon failed are many: it did not (and still doesn't) have a
clear policy on adult (sex and sexuality) content, and there is evidence
that it deals with adult materials in special ways. It placed too great
a reliance on metadata. The technical infrastructure was too flexible,
allowing changes without approval. Its communications to its customers,
authors and the media were worse than nothing. And it had the bad luck
to make a significant mistake regarding people who are highly articulate
and communicative, at a moment when there are technology tools to support
them, and the bad judgment to stay silent hoping it would go away.
Avi Rappoport is a metadata and search engine
consultant with Search Tools Consulting
<http://www.searchtools.com>.
Friday, Apr 24: Clifford LYNCH: The Scholarly Journal and the 4th
Paradigm of Science.
I've been asked to write a short chapter for a book being
published in celebration of the work of the late Jim Gray (who has spoken
in the Seminar in the past), particularly his vision of a 4th paradigm of
scientific inquiry based on data intensive science within which the earlier
approaches of theory, experiment and simulation might be unified. My talk
will present and test the main theses of this chapter. After very briefly
summarizing some of Jim's ideas, I'll explore what this 4th paradigm might
mean for the evolution of scholarly communication and the scholarly record,
both looking backwards at the evolution of the traditional scholarly journal
article and also into more speculative ideas such as open notebook science.
Friday, May 1: Students' Final Progress Reports.
Ryan SHAW: Mining Events from Wikipedia.
Last semester I presented progress on mining texts for descriptions of
events by looking for statistically significant co-occurrences of
dates and names. This semester I will present progress on mining
descriptions of events from a rather more structured source: Wikipedia
chronologies. Wikipedia has a great many chronology or timeline
articles that are rich sources of 1 or 2 sentence event descriptions.
By scraping these articles and parsing the individual chronology
entries into event representations, using the Wikipedia links as a
high-quality form of named entity detection, I can quickly assemble
databases of events. I have been experimenting with making these
events available on the web as Linked Data and queryable via SPARQL.
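The parsing step described above might look roughly like the following sketch. It is not the code used in this research: the entry format (a bullet, a year, a separator, then a sentence with wiki-style [[Target|label]] links), the Event class, and the parse_entry function are all assumptions made for illustration, and the scraping step is omitted.

```python
import re
from dataclasses import dataclass, field

# Assumed wiki-style link syntax: [[Target]] or [[Target|label]].
LINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")
# Assumed entry layout: "* 1906 - sentence describing the event".
ENTRY = re.compile(r"^\*\s*(\d{1,4})\s*[-:]\s*(.+)$")


@dataclass
class Event:
    year: int
    description: str                              # sentence with link markup removed
    entities: list = field(default_factory=list)  # link targets, used as named entities


def parse_entry(line):
    """Parse one chronology entry into an Event, or return None if it does not match."""
    match = ENTRY.match(line.strip())
    if match is None:
        return None
    year, body = int(match.group(1)), match.group(2)
    entities = [target.strip() for target, _label in LINK.findall(body)]
    # Replace each link with its visible label, or with its target if no label is given.
    description = LINK.sub(lambda m: m.group(2) or m.group(1), body)
    return Event(year, description, entities)
```

Because the links are supplied by Wikipedia editors, the link targets stand in for named entity detection, so no separate recognizer is needed in this sketch.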
Megan FINN: The History of Communication After California Earthquakes.
I will present a final outline for my chapter on the history of
communication after California earthquakes paying particular attention
to information practices of ordinary people.
Patrick RILEY: Giving Relevancy to Twitter:
A New Approach to Real-time Search.
Twitter search is an interesting "pulse" of what people are thinking
and talking about, and offers a new source of live, and most
importantly, public information. However, the search engine
purchased by Twitter provides only chronological search results, with
no offer or consideration of search priority based on relevancy. A
popular and successful way of determining "relevancy" on the internet
is HITS or the similar PageRank algorithm, which depends on web
publishers to show their level of support by linking to the most
important pages, which isn't possible in a "real-time" source of
information. In this presentation, I will present a search algorithm
for assigning a factor of relevancy to Twitter search results.
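For background, the link-based notion of relevancy mentioned above can be sketched as a basic PageRank power iteration. This is the textbook algorithm, not the Twitter algorithm presented in the talk; the function name, the damping value of 0.85, and the fixed iteration count are assumptions made for this sketch.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Basic PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Returns a dict mapping every page to its rank; the ranks sum to 1.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share regardless of links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # A page passes its rank on equally to the pages it links to.
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[p] / n
                for q in pages:
                    new_rank[q] += share
        rank = new_rank
    return rank
```

The sketch makes the dependency visible: a page's rank comes entirely from links that other publishers have had time to create, which is the signal a stream of brand-new messages lacks.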
Friday, May 8: No Seminar Meeting.
The Seminar will resume on August 28.
Spring 2009 schedule.
Fall 2008 schedule and summaries.