Introduction
Our user study tested the interface of the ReadingTree website. The test
was designed to evaluate the perceived usefulness and usability of the
system, as judged by our target user group: elementary school-age children
who like talking about and finding out about books. We wanted to observe
children interacting with the system, performing both structured and
unstructured tasks, in order to learn which features are most attractive
to our users and which present difficulties. We also wanted to confirm
that we had sufficiently addressed the problems identified during the
heuristic evaluation, and to see how child users responded to the member
rewards structure (newly added to our prototype).
Method
Participants
We tested five
children, ages 6 to 11 (2nd through 5th grade). There were four boys and
one girl. Three testers were children known to us through family or
classmates, while the other two were recruited through the Berkeley
Public Library. We did
not perform any formal screening, other than to ask the children whether
they enjoyed both books and computers. None of the children were non-readers
(i.e. all of them self-selected as enjoying reading to some degree) and
each of them had prior computer and web browsing experience.
Apparatus
We conducted three of the tests in the upstairs computer lab in South
Hall. These participants used Netscape Navigator 4.75 running on a Windows
NT workstation and a 19" monitor. The other two tests were conducted at
the test subjects' home in Castro Valley. These participants accessed
the Internet via the AOL browser, running on a home PC with 56k modem
and 13" monitor. The computer was on a desk in the corner of the
kitchen, next to but facing away from the family room.
Environment
Since we lacked a formal usability lab in either place, there were distractions
in each location. In the South Hall computer lab, other students were
talking and working nearby, and the Campanile bells made it difficult
to hear during at least one test. Additionally, the parents of each child
returned about 45 minutes after the beginning of the test, which seemed
to signal to the child that the test was over. This contributed in part
to the paucity of responses to the post-test questions.
Tests conducted in
the home also had distractions. Since the computer was located in one
end of the kitchen, dinner preparations and other household sounds occasionally
diverted the users' attention. The computer was also adjacent to the family
room, the door to which was usually closed. However, siblings and friends
playing video games quietly in the next room would periodically observe
the proceedings for a few minutes, which also distracted the users to
a degree.
Tasks
We had six primary tasks that we hoped to have the user perform. These
were:
- Sign up to become
a member of ReadingTree. (Designed to test signup procedure.)
- Find a book you
have read and enjoyed. (When users were unable to find any books
in our limited database, we directed them to Harry Potter. Designed
to test search capabilities and assess how kids were likely to search
for a known item.)
- Say what you think
about it. (While this task was designed to lead to rating and reviewing,
any feedback was permissible -- we allowed the users to determine when
the task had been completed, more or less. Designed to test rating,
reviewing and possibly asynchronous communication.)
- Find out what
other kids think about this book. (This task was often completed before
the above. Designed to see when/if/how other kids' opinions influenced
their decision to read a book or not.)
- Find a book you
haven't read but think you might enjoy. (Designed to test how the system
supports kids' unknown-item searches, including concept of the Bookshelf,
book suggestions, Featured Books, What's Hot and The Yuck List.)
- Remember this
book for later. (Designed to test how our interface supports asynchronous
book choice, i.e. how a kid remembers an interesting book when ee is
not able to immediately acquire it. Includes concept of the Bookshelf.)
While we tried as
much as possible to lead each user through each task, we were also interested
in the unstructured browsing of the site, since a user could conceivably
quite happily use the site often without using all its features. During
these tasks and the free-form exploration, we were concerned with five
general areas:
- Content (amount
of writing, readability)
- Navigation (easy
for users to get where they want to go, intuitive for beginners)
- System structure
(users know what their options are)
- Usefulness (of
system, of specific features)
- Appeal (graphics,
color, level of interactivity)
Specifically, we
were hoping to gain insights into the following questions:
- What find-a-book
methods do children prefer? (searching for an item, browsing by subject,
looking at a hand-picked list)
- What say-what-you-thought
methods do children prefer (rating, reviewing, posting a message)?
- What save-book-info
methods do children prefer (save, print, write it down, tell someone)?
- When/where do
users want a help/site search/site map, if ever?
- Do our terms make
sense: Treehouse? Bookshelf? What's New?
- Do users understand
how book recommendations are generated/improved (and do they want to
know?)
- How could search
function be improved?
- Is the site interactive
enough? (Was there enough to do here?)
- Under what circumstances
would children use this site? What would make them more likely to use
it?
Procedure
We attempted to stick with the script
as much as possible. After a brief introduction to the project members,
the purpose of the site and the assurance that "we're not testing
you," we asked the user a number of questions about ers experience
with computers and the Internet, and ers opinion about books and reading.
The user was then permitted to explore the site for a few minutes before
beginning the formal tasks. This free-form exploration occasionally overlapped
with one or more of the tasks. Since this test iteration was not concerned
with deriving statistically valid data, the order in which the tasks were
completed (and indeed whether the task was completed at all in some cases)
was not enforced.
Rather than ask the
users to think aloud, which seemed artificial and difficult for kids,
the facilitator attempted to ask questions during the testing about why
certain tasks were completed in certain ways and whether completion seemed
difficult or confusing. The facilitator tried to strike a balance between
maintaining a constant flow of communication between tester and user and
distracting the user with continual questions.
Since a large part
of what we hoped to discover pertained to overall satisfaction with the
site and an understanding of what's available and how to use it, we decided
not to curtail site exploration in favor of task completion. Wherever
possible, the facilitator attempted to remind the user of the task at
hand, or ask about the user's interest in some aspect unrelated to the
current task.
Once the tasks had
been completed, or about 30 minutes of interaction with the system had
elapsed (whichever came first), we stopped the testing and asked several follow-up
questions pertaining to overall satisfaction with the system, appropriateness
for younger kids and perception of the relationship between kids who like
computers and kids who like to read. We then solicited open feedback about
anything they wanted to tell us about their experience with ReadingTree.
Testers received
a $10 gift certificate to either a local bookstore (for children who had
indicated previously that they loved to read) or to a local movie theater.
Test Measures
We were interested in observing two main types of user/system interaction:
users' ease in performing tasks (e.g., could they sign up, or obtain book
suggestions?), and their ability to notice and understand system
functionality (e.g., could they figure out what to do with a Bookshelf,
or understand the results of a poll?).
Test measures included
the time it took to complete each task, the path taken, the number of
questions asked, the number of times the facilitator had to help the user,
as well as user-stated ease of use and satisfaction ratings.
We collected data
in two ways:
- One observer used
an event logging spreadsheet, developed by Anoop Sinha from the GUIR
group of Berkeley's Computer Science Department. This tool allowed the
observer to quickly track user movements and basic interactions. (A
sketch of this style of logging appears after this list.)
- Another observer
took notes on a hardcopy version of the task list and questionnaire.
This observer concentrated on recording user comments and user/facilitator
interactions.
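To make the kind of record we captured more concrete, here is a minimal
event-logger sketch in Python. It is illustrative only, with event names
and fields of our own invention; it is not the schema of the GUIR spreadsheet.

```python
import csv
import time

class EventLogger:
    """Minimal observer-driven event log (hypothetical; not the GUIR tool)."""

    def __init__(self, user_id):
        self.user_id = user_id
        self.start = time.time()
        self.events = []  # (seconds_elapsed, event, detail)

    def log(self, event, detail=""):
        # Timestamp every observation so task durations can be computed later.
        self.events.append((round(time.time() - self.start, 1), event, detail))

    def task_duration(self, task):
        # Time between the task's "start" and "done" events, if both were logged.
        times = {e: t for t, e, d in self.events if d == task and e in ("start", "done")}
        if "start" in times and "done" in times:
            return times["done"] - times["start"]
        return None

    def save(self, path):
        with open(path, "w", newline="") as f:
            csv.writer(f).writerows([("t", "event", "detail"), *self.events])

# Example session: the observer keys in events as the child works.
log = EventLogger("user2")
log.log("start", "sign-up")
log.log("question", "asked what the Treehouse is")
log.log("help", "facilitator pointed out the Sign Up link")
log.log("done", "sign-up")
print(log.task_duration("sign-up"))  # seconds spent on sign-up
log.save("user2_events.csv")
```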
Results
Because our tasks
were somewhat open-ended, we were not able to observe every user performing
every task that ReadingTree supports. Furthermore, during the tests we
recorded these evaluations of understanding, ease of use, and enjoyment
anecdotally (through observations and recording answers to questions),
rather than systematically. However, some general patterns did arise.
User typing and
spelling skills strongly affected usability
Though all of our users reported using computers at home and school, they
seemed more comfortable with the mouse than the keyboard. During the test
they all typed slowly and deliberately, and were very careful not to make
mistakes. While this meant that the user (usually) produced an error-free
document, it also led to some user frustration, and activities like sign-up
and book reviews took a long time to complete (see User Two, Task B notes).
Spelling was also
a concern for the kids - slightly misspelled author names returned no
hits during search and a couple of users were hesitant while typing reviews
and warned us throughout the tests that they were bad spellers. Fear of
posting reviews with misspelled words could keep some kids from posting
- something we need to watch closely in further tests.
Users who liked
points REALLY liked points
(Note - We added a bare-bones Member Rewards section - consisting
of member levels, certificates and bookmarks - after submitting but before
testing this version of Reading Tree.)
While all of the users
noticed they received points for rating/reviewing books or answering polls,
and most visited the member rewards page from the link on their Bookshelf,
two of our five users became very interested in points - what they could
get, what level they were at, etc. For one user, getting and monitoring
his points was the primary concern, and he was interested in completing
tasks only if they would help him move to the next member level.
Introducing points
was intended to increase and reward Reading Tree member participation.
While it motivated all users to varying degrees, we may have to accept
that for some users acquiring points may be a bigger motivator than a
simple love of reading.
Screen size mattered
Though we modified our screens to be usable on an 800 x 600 monitor,
we found that small screen size had some adverse effects on usability.
Important functions (such as new member sign up) displayed completely
below the fold and pages became more difficult to read as column width
narrowed.
Kids read content
- especially if it's from other kids
All of the users carefully read and had comments about kid reviews
and bulletin board posts. Users wrote their own reviews carefully and
read others' reviews critically, with clear ideas of the kind of information
they wanted (i.e. recommendations like "If you liked Book A you'll
like Book B"). They placed a lot of value on what other kids had
to say.
A variety of search
options are used (if not always useful)
Different users had
different preferences for searching and navigating. Most users favored
either the title/author search or the subject search, but all were able
to use both during the test. Users were less likely to use the letter-based
search, though it is unclear whether they didn't see that option or didn't
understand it.
[Figure: visual representation of each user's performance during the study.
Additional data: see the event-logger and pages-visited spreadsheets in the
Appendices.]
Discussion
We learned a great deal from our pilot study.
1. The site appeals
to children who like to read. Four of our five users said that they would
use this site again and would recommend it to a friend. (We are aware,
however, that these results may reflect the children's desire to tell
us what they think we want to hear, rather than their true feelings.)
2. The basic navigational
structure supports most users in achieving primary goals of our design
personae, Danny and Jenny: finding a book they would like to read and
letting other kids know about a book they have read. One of our users,
however, experienced a great deal of trouble in accomplishing the tasks
and requested that the test be concluded early. This could mean that our
design is somewhat frustrating for younger, less Web-savvy and less book-loving
kids. This would need to be investigated further through more user testing.
3. There are several
issues with our page layout and form design that must be addressed if
the website is to succeed with this user group. Many of these changes
are simple, and are necessitated by variation in monitor sizes and by
our users' limited ability to type and spell.
Sign Up
Sign Up option
should appear above the sign-in boxes on the home page.
On the "Sorry,
you're not a member" page, Sign up needs to appear above the fold.
When sent to this page in error, kids did not know what to do to correct
the problem, because the sign-up forms were not immediately visible.
Also, if a user makes a mistake in signing up, ee should not be required
to enter all of the information again, only the problematic items. This
was an error-handling element that we meant to implement but decided
to postpone. We saw in the user tests that it is an essential feature;
it took one child 15 minutes to complete the sign-up process. (A sketch
of one approach follows this list.)
Sign-up field labels
should align with the fields. Even the small misalignment that resulted
when the site was displayed on a smaller monitor led to unnecessary
confusion.
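One lightweight way to handle sign-up errors gracefully is to validate each
field independently and return per-field error messages, so the form can be
redisplayed with the child's valid entries preserved. A minimal sketch in
Python follows; the field names and rules are illustrative assumptions, not
our actual sign-up form.

```python
def validate_signup(form):
    """Check each field independently; return a dict of per-field errors.

    Field names and rules are illustrative, not ReadingTree's actual form.
    """
    errors = {}
    if not form.get("username"):
        errors["username"] = "Please pick a screen name."
    if len(form.get("password", "")) < 4:
        errors["password"] = "Your password needs at least 4 letters or numbers."
    if form.get("password") != form.get("password_again"):
        errors["password_again"] = "The two passwords don't match."
    return errors

form = {"username": "dannyb", "password": "owls", "password_again": "owl"}
errors = validate_signup(form)
# Redisplay the form pre-filled with `form`, flagging only the keys in
# `errors`, so the child corrects one field instead of retyping everything.
for field, message in errors.items():
    print(f"{field}: {message}")
```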
Find a Book
Title search and
author search need to be moved farther apart. Having them grouped together
so closely caused every user who attempted a title search to enter the
author's name as well.
Kids were very
frustrated when they searched for a book and received no results. Until
our database is more complete, we should consider allowing them to add
books. Also, when no results are returned the system should send the
user back to the Find Book page automatically and perhaps provide tips
for improving the search results.
Spelling is also
a challenge. A future version of our system should offer search results
that include "near misses" -- titles and authors that differ from
the search terms by only one or two letters. (One possible approach is
sketched below.)
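A simple way to produce such near misses is to rank candidate titles and
authors by edit distance to the query and return anything within a letter
or two. A rough sketch in Python, with placeholder book data:

```python
def edit_distance(a, b):
    """Classic Levenshtein distance between two strings (case-insensitive)."""
    a, b = a.lower(), b.lower()
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def near_misses(query, candidates, max_distance=2):
    """Return candidates within a letter or two of the query, closest first."""
    scored = sorted((edit_distance(query, c), c) for c in candidates)
    return [c for d, c in scored if d <= max_distance]

# A misspelled author name still finds the right entry:
print(near_misses("Rowlling", ["Rowling", "Dahl", "Blume"]))  # ['Rowling']
```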
Review a
Book
"Review
a book" link needs to be made more prominent on the book information
page.
On the "review
book" form, we should consider how to revise the labels to make
it clear that a) each box is optional and b) the writing does not need
to be formal.
Member Rewards
Because the point
system clearly had a strong appeal, our design needs to make it easy
for kids to find out how to earn and spend their points. Links to information
about member rewards should appear on every page of the site, including
the "thanks for rating/reviewing/answering a poll" pop-up
page.
Answer Polls
We either need
to clarify that the vote does not get submitted until the user clicks
Rate it, or we need to change the system so that the vote is submitted
automatically, on the first click, perhaps with a confirmation message
providing the option to cancel it.
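If we chose the submit-on-first-click route, the flow might look like the
following sketch, which uses Python with Flask purely as a stand-in for our
actual server setup; the routes, form fields, and in-memory vote store are
all assumptions for illustration.

```python
from flask import Flask, request

app = Flask(__name__)
votes = {}  # poll_id -> {user_id: choice}; stand-in for a real database

@app.route("/poll/<poll_id>/vote", methods=["POST"])
def vote(poll_id):
    # Record the vote on the first click, with no separate "Rate it" step...
    user = request.form["user_id"]
    votes.setdefault(poll_id, {})[user] = request.form["choice"]
    # ...then confirm immediately and offer a way to cancel.
    return (f"Thanks, your vote is counted! "
            f'<a href="/poll/{poll_id}/cancel?user_id={user}">Changed your mind?</a>')

@app.route("/poll/<poll_id>/cancel")
def cancel(poll_id):
    votes.get(poll_id, {}).pop(request.args["user_id"], None)
    return "OK, your vote was cancelled."
```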
Changes for the
"Real" Experiment
We would like to
address the issue of self-selection by testing with users who did not
self-identify as "kids who love to read." More generally, we
would like to test a broader range of user types, of different genders,
ethnicities and cultures, as well as different levels of computer skills.
We found this pilot
study useful for learning how the site's basic features and structure
would be received. We did not require users to complete every task on
the list. For our real experiment, we would take a slightly more formal
and structured approach, asking users to complete every task in a specific
order. We might also use a grid similar to the one shown in the Results
section, to be sure to capture the same information for each task and
user.
In gathering feedback
from the kid testers, we would design a form to collect quantitative data
on satisfaction and ease of use. Following recommendations from Allison
Druin and others, we would use graphical images such as smiling and frowning
faces as anchor points on vertical rating scales. We could read our questions
out loud and ask the children to mark the point on the scale to indicate
"how much" of something is true. According to some researchers
(Risden, Hanna, and Kanerva 1997), children respond more reliably to this
sort of representation, which incorporates meaningful images and the concepts
of more and less (and uses a vertical rather than a horizontal scale).
Formal Experiment Design
We propose to study
two variations of our page design. Our current prototype has a horizontal
orientation, with two or three columns of content, so that the pages are
wide, rather than long. We believed that this design would allow children
to more easily discover and access ReadingTree's features. We are curious
to know whether our current design succeeds in this goal or whether a
vertical page orientation, which requires more up-down scrolling but which
is also more conventional for Web interaction, would be more usable and
satisfying. We
also wonder whether a user's goal in visiting the site will make a difference
to the user's design preference.
Hypotheses: Our first hypothesis is that users will prefer
the horizontal page design, regardless of their goals in visiting the
ReadingTree site. Our second hypothesis is that users will spend more
time at the site and discover more features with the horizontal design.
Factors and Levels:
Our tests would focus
on the most content-intensive pages (Find Book, Treehouse, and What's
New) as these pages would be most dramatically affected by the page reorientation.
Independent
Variables:
- Page design: horizontal
(as in current prototype) vs. vertical (would need to be implemented--here
is an example page).
- User goals. We
would select users of two kinds: children who read for pleasure frequently
and children who read only for school assignments. A teacher would
make the first assessment, which we would validate with a "reading
habits" questionnaire.
- Gender. We would
hope to have an equal number of boys and girls in each test group, so as
to minimize (and identify) any gender-based differences.
Dependent Variables:
- Ease-of-use ratings
- Subject satisfaction ratings
- Success at completing tasks (able to find a book they would want to read?)
- Length of time using the system (at what point do they become bored?)
Plan to Control
Confounding Variables:
- We would limit
our user base to those in a single grade, or better still, at the same
reading level (as assessed by a teacher) to minimize the confounding
variable of reading ability.
- We would also
seek to recruit users with a minimum level of Web experience, so that
difficulties resulting from lack of knowledge of basic Internet conventions
(for example, underlining to indicate a hyperlink) would not confound
the results. Level of experience would be assessed through a screening
questionnaire asking about the current number of hours of Web use per
week and the number of years of Web use.
Blocking and
Repetitions: We
would use a 2 x 2, between-groups design, for three reasons. First, users
in our target age group have a limited attention span, so asking them to
test two systems in a single session is not realistic unless we abbreviate
the tasks and interview sessions. Second, asking users to participate in
two separate sessions is also not workable, as it is difficult to recruit
minors for even one experimental session. Third, learning effects are a
concern: the second interface tested would most likely be received more
favorably because it is more familiar.
                      Reads for Fun        Does Not Read for Fun
                      Boys     Girls       Boys     Girls
  Horizontal Layout    5        5           5        5
  Vertical Layout      5        5           5        5
Ideally, and provided
that we had the full cooperation of an elementary school (or schools),
we would test 10 users in each cell, 5 boys and 5 girls, to allow us to
complete a thorough statistical analysis of the results.
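As one concrete possibility for that analysis, the sketch below runs a
factorial ANOVA on one dependent measure using Python and statsmodels. It
assumes the results have been collected into a CSV with one row per child;
the file name and column names are our assumptions for illustration.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# One row per child: layout (horizontal/vertical), reads_for_fun (yes/no),
# gender (boy/girl), plus the dependent measures collected during the test.
data = pd.read_csv("readingtree_experiment.csv")

# Two-way ANOVA with interaction: does layout preference depend on whether
# the child reads for fun? Gender could be added as a third factor.
model = ols("satisfaction ~ C(layout) * C(reads_for_fun)", data=data).fit()
print(sm.stats.anova_lm(model, typ=2))

# Refit the same model for the other dependent measures (ease-of-use
# ratings, task success, minutes spent on the site).
```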
Appendices
Materials
- Recruitment flyer
- Consent form
- Script, with pre- and post-test questionnaire
Raw Data
- List of design problems and programming bugs
- User Data (user responses and our observations)
- Event logger (Excel spreadsheet--includes observer comments)
- Summary of pages visited (Excel spreadsheet)
- Fix list for Prototype 3