California Digital Library

Assignment 9: Pilot User Study


INTRODUCTION

CDL Prototype #2 concentrates on clarifying the user's orientation to the digital library's sources while illustrating that the user should expect a two-level search in order to find specific resources. We have created a working website of 10 pages -- 3 main pages and 7 placeholders that illustrate the overall proposed information architecture for the site. While we have filled the main pages -- Home, Choose a Source, and Browse Sources -- with enough functional content to work through the scenarios, it is important to note that the Browse Sources and Choose a Source pages work directly with the live CDL website to retrieve the actual sources, source lists, and "more info" pages. Consequently, the user will inevitably encounter live CDL pages containing UI problems that are outside the scope of our interface redesign. This implementation choice imposes some unavoidable constraints on the usability test.
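
For concreteness, the sketch below shows the kind of pass-through retrieval involved. It is our illustration only: the base URL and page path are assumptions, not the actual endpoints the prototype links to.

from urllib.request import urlopen

LIVE_CDL_BASE = "http://www.cdlib.org"  # assumed base URL, for illustration only

def fetch_live_source_page(path):
    """Fetch a page (e.g., a source list or "more info" page) from the
    live CDL site. Because the prototype hands users off to live pages
    like this, any UI problems on those pages are inherited by it."""
    with urlopen(LIVE_CDL_BASE + path) as response:
        return response.read().decode("latin-1", errors="replace")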

To facilitate testing and analysis, we attempted to distinguish between a source (a high-level, searchable collection) and a resource (roughly, an individual publication), but this distinction turned out to be less clear to our users than we had hoped.
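
To make the intended distinction concrete, here is a minimal sketch of the two-level model we had in mind; the class names and fields are ours, for illustration, and are not part of the prototype itself.

from dataclasses import dataclass, field

@dataclass
class Resource:
    """An individual publication, e.g., a single article or book."""
    title: str
    citation: str = ""

@dataclass
class Source:
    """A high-level, searchable collection (a database, e-journal,
    or catalog) that contains many individual resources."""
    name: str
    resources: list[Resource] = field(default_factory=list)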

The purpose of this informal study is to determine how close we are to meeting the goals of our Prototype and to identify any other major problems that must be addressed at this final stage of development. In general, our goal for the Prototype was to improve a user's understanding of the CDL collection and to support more effective searching of the Collections so that a desired resource can be found more directly. To help isolate the UI during testing, we modified our scenarios to more closely match the domain knowledge of our targeted participants while still representing the tasks to be tested. To test the effectiveness of the UI, we outlined specific objectives for our interface (below) and chose to measure a selection of them (see the Test Measures section).

Specifically, does a user:

Likewise, does the system:

METHOD

Participants
Each participant was a graduate student in the School of Information Management and Systems (SIMS) with no previous direct exposure to this project. This homogeneous sampling was important for holding the user-experience and prior-knowledge variables as constant as possible in this setting. Likewise, the tasks were reworked to match the participants' domain knowledge more closely rather than asking them to imagine outside their discipline of study: more SIMS-related content searches were used instead of the Science disciplines from the previous scenarios (see more on this under Tasks). This allowed the participants to use the Prototype in a more realistic manner and therefore test the UI more accurately.

Apparatus
The participants were seated in front of a computer in the SIMS lab, flanked by CDL team testing members. The CDL Prototype #2 website was already loaded in their browser and presented for their use. One CDL team member read the script and facilitated the testing. Paper handouts containing the salient details of each scenario were provided to help the participant recall the tasks to be accomplished during the testing. Another team member took notes on a per-scenario critical incident log to track the course of actions taken by the participant. No video or audio tapes were used: there would not have been sufficient time to analyze them, and we felt the process we used during the lo-fi testing was reasonably appropriate for this small sample.
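
In effect, the critical incident log is a simple per-scenario table of timestamped actions. A minimal sketch follows, assuming fields of our own choosing; the actual paper log sheet may differ.

import csv
import time

LOG_FIELDS = ["scenario", "elapsed_s", "action", "note"]

def log_incident(writer, scenario, start_time, action, note=""):
    """Record one observed action or incident for the current scenario."""
    writer.writerow({"scenario": scenario,
                     "elapsed_s": round(time.time() - start_time, 1),
                     "action": action,
                     "note": note})

with open("incident_log.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=LOG_FIELDS)
    writer.writeheader()
    start = time.time()
    log_incident(writer, 1, start, "clicked Browse Sources",
                 "paused over source vs. resource wording")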

Tasks
We have rewritten the scenarios for this testing process, as noted under Participants above, to better match our participants' domain of knowledge and to ensure that we were testing the interface rather than the participants. However, we made sure that the important tasks were still represented in the new scenarios. For example, the original scenarios #2 and #3 [from our lo-fi testing] were designed to test two specific, important tasks: (1) the known-item search; and (2) the comprehensive search over a very narrow topic, i.e., user X needs to know everything published about such-and-such in the past year.

Scenarios #1 and #2 are now designed to cover the known-item search and also to test the users' understanding of the system -- not just a two-level search (first identify the database, then search for the specific item), but also a two-part search (each part with two levels). That is, first use a database to find the citation of the specific item sought, then find the actual item, using either an online electronic journal or the physical library (and its on-line catalog). Of course, we did not require that the participants go to the physical library, but it is important to recognize that it is part of the complete process, and to examine whether they consider the digital library a substitute for the physical library or an assistant in using it. A toy sketch of this search structure appears below.
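
The sketch is a runnable toy model of the two-part, two-level structure; the data and helper functions are hypothetical stand-ins for steps the user performs by hand in the UI, with sample content borrowed from Scenario #2.

# Each part of the search has two levels: choose a source, then search it.
SOURCES = {
    "citation database": {
        "lost puppet plays, Spanish Civil War": "McCarthy 1998 (theatre research journal)",
    },
    "electronic journal": {
        # Empty here: no on-line full text, forcing the physical-library fallback.
    },
}

def choose_source(kind):
    return SOURCES[kind]          # level 1: identify a source

def search_source(source, query):
    return source.get(query)     # level 2: search within it

def known_item_search(topic):
    # Part 1: find the citation of the specific item sought.
    citation = search_source(choose_source("citation database"), topic)
    # Part 2: find the actual item, or fall back to the physical library.
    full_text = search_source(choose_source("electronic journal"), citation)
    return full_text or f"locate '{citation}' via the on-line catalog / physical library"

print(known_item_search("lost puppet plays, Spanish Civil War"))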

Scenario #1: Imagine that you are a UC Berkeley student writing a comparative literature paper. You want to compare one of your class readings with Dr. Seuss's famous book Green Eggs and Ham. You want to find a copy of this book at UCB.

Scenario #2 is a variation of the known-item search designed to test the use of the Electronic Journals as well as searching with only partial knowledge of pertinent search criteria.

Scenario #2: Imagine that you are a UC faculty member studying 20th century theatre history. You are looking for a specific journal article that was recommended by a colleague; unfortunately, you are unable to remember the exact citation. You know that the article concerns the discovery of two "lost" puppet (marionette) plays from the Spanish Civil War (1936-39). The article was published in a leading English-language theatre research journal in late 1997 or early 1998 and was written by J McCarthy.

Please attempt to identify this article. If it is possible, you would like to obtain the full text of the article on-line; if not, you would like to identify libraries where you could view the physical journal.

Scenario #3 is designed to test the narrow-subtopic search -- finding everything possible about a topic. Although that is a large task for our testing situation, it is important, as it is a common task in live use of the CDL. We chose not to implement the Profile or Update services offered by the current CDL, because these work only for the Melvyl portion of the live CDL. Instead we designed the scenario to have users search at the top level for sources they deem applicable.

Scenario #3: You are a UC Berkeley graduate student in the School of Information Management and Systems. You are expected to write a report on the National Information Infrastructure (NII). You want to find out about the history of NII and where it is headed. Your arguments about the future must be supported by past and current trends seen in the information industry. You have already browsed the WWW and now you want to see if there are any published materials in this area.

Please identify a few (top level) CDL sources that might be relevant to your research on this topic. Then choose a couple of these sources and delve deeper to find specific resources and papers on the history and development of the NII.

Procedure
The setup was as described under Apparatus: participants were seated in front of a computer in the SIMS lab, flanked by CDL team testing members, with the CDL Prototype #2 website already loaded in their browser, and one CDL team member reading the script and facilitating the session. Participants were given a consent form to read and sign, which was also explained in the script. They were then interviewed using the pre-test questions to ascertain their status as students and their familiarity with the live CDL and other library systems such as Melvyl.

In lieu of a demonstration of our prototype (we determined that the library-search UI was common enough and the scenarios self-explanatory), the participants were allowed a moment to review the interface if they wished. Interestingly, each chose a different path at this point: Participant #1 simply read the home page; Participant #2 declined to review the site, preferring to begin the scenarios immediately; and Participant #3 clicked on all the links from the home page, presumably to glean an overview of the site architecture.

Each scenario was read by the facilitator, and paper handouts containing its salient details were provided to the participant, as described under Apparatus. The participants were not helped in any way, even when they were a bit lost and needed help with orientation. Another team member used the per-scenario critical incident log to track the steps taken and other actions or thoughts (spoken aloud) by the participant. As noted above, no video or audio tapes were used; the lo-fi prototype testing, run the same way, had yielded a significant amount of information that led to an effective redesign. Scenarios #1 and #2 had specific planned outcomes; Scenario #3 was considered complete when the participant deemed it so (even if it fell short of our planned outcome).

After each scenario the participant was asked whether they needed to rest before moving on. At the end of all the scenarios, follow-up interview questions were asked of the participant, and the responses were recorded by a team member. After each participant, the team discussed the testing session before moving on to the next.


TEST MEASURES

Quantifiable measures we attempted to study in this pilot test:
