Assignment Eight:

Introduction

Method

   Participants
   Apparatus
   Tasks
   Procedure

Test Measures

Results

Discussion

Formal Experiment Design

Appendices

 

Introduction

VERN is a web-based meeting scheduling tool for the academic community at SIMS. (VERN is the cousin of eD, a web-based democratic decision-making tool that tabulates votes and sends out the results via email.) The purpose of VERN is to eliminate the tedious, inefficient process of sending numerous emails or making phone calls to schedule meeting times among many users. VERN automates the scheduling process so that the amount of effort and time required of each group member is reduced, while enabling group leaders, such as professors, to visualize the general availability of members within the group.

VERN automates the meeting scheduling process by providing an interface where the user can specify the range of time in which a meeting needs to occur and the people who are invited to attend. The program then sends an email to the attendees, who can use different criteria to 'vote' for a meeting time. Depending on the results of the vote, the meeting initiator can decide on a meeting time and use VERN to notify all attendees. VERN also features the ability to input a recurring class schedule so users can see at a glance the times they are unavailable.
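This propose-vote-finalize workflow can be sketched as a minimal data model. All class and method names below are illustrative assumptions, not VERN's actual implementation:

```python
# Minimal sketch of a propose-vote-finalize meeting flow.
# Names (Meeting, vote, tally) are hypothetical, not VERN's real code.
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class Meeting:
    title: str
    candidate_times: list          # times proposed by the initiator
    invitees: list
    votes: dict = field(default_factory=dict)   # invitee -> chosen time

    def vote(self, invitee, time):
        if time not in self.candidate_times:
            raise ValueError(f"{time} is not a candidate time")
        self.votes[invitee] = time

    def tally(self):
        # The initiator reviews this tally and finalizes a time.
        return Counter(self.votes.values()).most_common()

m = Meeting("Kickoff", ["Mon 10:00", "Tue 14:00"], ["ann", "bob", "carol"])
m.vote("ann", "Mon 10:00")
m.vote("bob", "Mon 10:00")
m.vote("carol", "Tue 14:00")
print(m.tally()[0])  # most-voted candidate time and its vote count
```

In a real system the votes would arrive via the emailed links rather than direct method calls, but the tally-then-finalize shape is the same.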


Method

The purpose and rationale of our pilot usability study was to observe the ease with which our test subjects navigated through the various functions of our second interactive prototype. This was especially important since VERN is a complex system with a great deal of functionality, so clarity of function is imperative. Three proposed user task scenarios were used to test VERN's various features. We were particularly interested in how intuitive our users perceived the language, icons, and design to be, and what complications or confusion might arise.

The prototype used in the pilot usability study is located here: http://www.sims.berkeley.edu/academics/courses/is213/s05/projects/vern/prototype/


Participants

We chose three MIMS students who are highly technical and familiar with principles of good user interfaces. Because we had developed our prototype extensively since the first prototype, we wanted to take advantage of members of our SIMS community who could offer us more detailed and thorough feedback. We felt that we had already demonstrated VERN to more novice users who were able to convey to us the difficulties in using a new system. Our more advanced users were able to provide us extremely useful information regarding terminology and layout, especially with our drag-and-drop interface.

User Demographics

  Participant    Age        Technical Skills   Prior Familiarity with VERN
  Dave Hong      mid 20's   Medium-High        Low
  Scott Fisher   mid 30's   High               Medium
  Helen Kim      late 20's  Medium-High        Low


Apparatus

We tested with a Windows XP laptop running Mozilla Firefox version 1.0.3 at a screen resolution of 1600x1200. While it may have been preferable for participants to test on their own laptops so that we could see how our interface appeared on a variety of screens, our purpose during this usability test was not to test browser functionality, but the actual prototype itself. Previous test results have shown some differences in screen appearance that we plan to address for the third prototype; however, since they were not on our critical task list, we have not implemented those fixes yet. We wanted to make sure our testers could evaluate the interface itself, without being bogged down by browser-dependent issues.

The tests took place around the conference table in the basement of South Hall.

We recorded the audio using Audacity, free recording software found at http://audacity.sourceforge.net/

One group member acted as the facilitator and one as the timer and note taker. In some cases, multiple members were present to take notes. We used a stopwatch to measure the time needed to perform various tasks.


Tasks

Because our previous scenarios were based on our paper prototype and a hard-coded version of our system, we amended them to fit the current prototype. The prototype now connects to the database and actually tracks meeting information, so we needed to update the scenario tasks to reflect that. The scenario updates reflect the following changes:

  • Added cancel and finalize meeting buttons
  • Check history of prior meetings
  • Use history page to schedule new meeting
  • Implemented functional login screen
  • User can scroll to prior and future weeks
  • We removed Scenario #3 from this test because we have not yet implemented organizer functionality, so we felt the test would not be thorough enough to be beneficial

Links to prior scenarios are in Assignment 4. Our new scenarios are below.

Scenario #1
You are a professor at SIMS who is trying to work out your schedule for the upcoming week using VERN.

  1. Sign up to the system
  2. Login to the system
  3. Check your meetings for this week
  4. Check your meetings for next week
  5. Look for any meetings you have scheduled this week
    1. If any of the meetings can be pinned down, inform all the meeting attendees of the final meeting time
    2. Cancel one of the meetings that has already been confirmed
  6. Schedule a project meeting with your new Nobel Prize Project group
    1. Schedule a 1 hour kickoff meeting for this group next week
    2. (optional) Create a group called Nobel Prize and add 3 members (Forrest Gump, Papa Smurf and Lil Kim)
    3. Make sure that your admin, Cookie Monster, has delegated authority to finalize the meeting
  7. Logout of the system

Scenario #2
You are a student at SIMS who has received an email invitation from a professor trying to schedule a kickoff meeting for a potentially Nobel Prize-winning research project.

  1. Login to the system
  2. Find the meeting invitation from your professor
  3. Respond to the professor's request for meeting times
  4. Put your class schedule into the "Class Schedule" screen.
  5. Check the history of your meeting schedules
  6. Logout of the system


Procedure

  1. Each interview took one hour
  2. User sat in front of the computer with the facilitator watching from one side and the timer and observers watching from the other.
  3. Introduced the system and explained the problem we were trying to solve
  4. Explained to the user that we would be recording them
  5. Explained the first scenario to the user
  6. Instructed the user to create a new login
  7. Facilitator read the first task of the scenario
  8. User executed the task
  9. Facilitator read the next task
  10. User executed it
  11. Etc.
  12. Timer recorded time taken to accomplish key tasks
  13. Note-taker recorded user’s comments while performing tasks
  14. We repeated some tasks multiple times, or came back to them at a later time in the test, especially the drag and drop interface, to compare the time differential between the user’s first interaction with the interface and his subsequent interactions
  15. Facilitator asked for any additional feedback


Test Measures

Our primary test measure was time: we recorded how long it took each user to complete a task. In addition, if the user made an error, we recorded their choices and behavior. We chose to measure time because VERN is supposed to make meeting scheduling easy and fast, with a minimal number of clicks; thus a fast completion time would indicate progress toward VERN's goals. We decided to record a user's choices and behaviors when they made an error because that would tell us which features needed clarification and how to approach the problem.
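As a rough illustration of the timing protocol, the stopwatch procedure amounts to starting a timer when the facilitator reads a task and stopping it when the user finishes. The `TaskTimer` class below is a purely hypothetical stand-in (the study used a physical stopwatch):

```python
# Hypothetical stand-in for the stopwatch protocol: start when the
# facilitator reads a task, stop when the user completes it.
import time

class TaskTimer:
    def __init__(self):
        self.records = {}          # task name -> elapsed seconds

    def start(self, task):
        self._task, self._t0 = task, time.perf_counter()

    def stop(self):
        self.records[self._task] = time.perf_counter() - self._t0

timer = TaskTimer()
timer.start("Sign-Up")
# ... participant performs the task here ...
timer.stop()
print(sorted(timer.records))  # tasks timed so far
```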


Results

During the usability study, we asked the testers to describe what they were doing, thinking, and observing. While it may have been against the ideal protocol, we would also pause the test to further explore problems and issues that came up, as well as ask our testers for suggestions for improvements. The consequence of this approach is that we received enormous amounts of qualitative information and were able to engage in a dialog with users to explore further design possibilities, but did not gather as much purely quantitative information. While all three testers were timed on Scenario 1, we gathered timing information from only one tester on Scenario 2.

The timings for all three testers on Scenario 1 are presented below, and the errors and problems they came across are described in a narrative, qualitative manner. We appreciate that a more formal study would quantify more of these observations; however, that would require recording video and properly coding all the observations. We did not have the resources to video the study, but it is clear from our experience that this would be required for a rigorous, quantitative study. In addition, our method involved a lot of discussion and interruptions, which affected the timings.

Timings for Scenario 1

  Task           Tester 1   Tester 2   Tester 3
  Sign-Up        0:45       1:30       0:23
  Login          0:45       3:30       0:22
  Schedule Mtg   4:00       4:30       2:41

Observations:

Several tasks were repeated across the scenarios, so we do not include redundant sections for observations of repeated tasks; all observations about the same type of task are merged into a single section.

Sign-up and Login:
  • One tester found it hard to identify the "new account" link. This was due to the small font used on a high-resolution screen
  • Several testers commented that the 2-step login for new users (register, then log in again) was tedious and unnecessary. This caused one tester to have an unusually long login time because of confusion after the initial registration.

Checking meetings:
  • This was the landing page and users generally scanned the page and gave us feedback. This was not generally timed, because it often turned into an open-ended discussion about the interface. It was especially difficult for users to see their meetings because they were new users, and had no meetings whatsoever.
  • The layout of meetings on the side panel was hard for testers to understand. The categories and visual layout need to be made more intuitive
  • Users were sometimes confused about the meaning of the colors on the calendar screen ("What's with the gray things - I thought it meant not available?")

Schedule a Project Meeting:
  • Many testers had trouble with the "Meeting occurs before" option; few of the testers understood its purpose.
  • Testers asked why the year selection and location weren't dropdowns
  • Delegated organizer was also an unclear concept that needed brief explanation
  • GMail style name completion seemed to be intuitive for all testers
  • The 2-step meeting scheduling turned out to be very confusing: users must set up the meeting and are then sent to a page that allows them to input their own preferred times. This is very different from most (non-democratic) calendaring systems such as Outlook.
  • For selecting times, there are 3 colors that are user selectable, but there are an additional 2 colors on the calendar that were unclear to users: the gray background indicated time slots in the past, and the default background color indicated a slot that was neither preferred, possible, nor unavailable.
  • One user expected the click-and-drag semantics to be more like text selection, and not like painting; i.e., if you drag past a location, you can back up and the overshot area will be deselected.
  • Testers generally found the week view starting on Sunday to be intuitive
  • The location of the "Create Meeting" button at the bottom of the left side panel was an issue for all testers. The phrasing "Create Meeting" was also an issue, with the phrase "Propose Meeting" preferred by our testers.
Logout:
  • No significant observations
Class Schedule Screen:
  • It became clear that the real semantics of this page was more along the lines of "recurring meetings", and that "Class Schedule" was something of a misnomer. This brought up the general issue of how to schedule a recurring meeting (not addressed in the current interface).
  • Once again, the semantics of the different colors on the applet screen proved to be confusing. Testers expected there to be some meaning to the default background color.
  • Users expected that times entered on this screen would be reflected in the main landing page (eventually they will be)
Contacts and Groups:
  • The contacts page is only a prototype, and several users clicked around expecting to be able to edit fields
  • One tester suggested that a pick list combined with a text entry field would be best for group creation. Others suggested being able to drag and drop from the contacts table to create groups.
  • Testers also asked for the ability to create a meeting directly from the contacts/groups page.


Discussion

Observations about the study fall into 2 categories: the process of performing a usability study, and changes to the VERN application.

Performing a usability study:

Based on our experiences, it is difficult to train all the interviewers in a fixed protocol for timings and data gathering in advance of the interviews. It is also next to impossible for a person taking notes by hand to capture all the necessary data. Videotaping the process, and then coding the results afterwards (or some similar approach) seems like the only viable way to approach this.

During the interview process, timers and interviewers would sometimes ask questions that resulted in brief discussions that affected timings - in a rigorous experiment, these would have been delayed until after the timings were done. However, we were actually more interested in a qualitative assessment of the interface (and the assignment is explicitly informal, not formal).

Having users actually think aloud and describe their observations and mental processes is invaluable. It brings out the expectations they bring to the task, highlights vocabulary problems and serves as a fertile ground for discovering potential changes to the UI.

Changes to VERN:

Login:
The 2-step registration/login is cumbersome, so we will send the user into the app immediately after registration.

Contacts:
Currently, the contacts page doesn't really do anything. We plan on adding genuine functionality and working on either drag and drop or a pick list for groups. We will also provide a way for meetings to be created directly from the contacts page.

Side Panel:
The layout of meetings on the side panel needs work to become easier to parse.

Creating Meetings:
The button on the lower left of the side panel for creating meetings is badly located, and its labeling is misleading. A new tab called "Propose Meeting" will be created so that this action is easier to find.

Colors on scheduling applet:
The semantics of the different colors on the applet proved to be a problem. It became especially obvious that the default state of a time block needed to be clarified. We have decided that the default state of a block of time should be changed to unavailable, reducing the number of selections on the applet and providing a meaningful default value. In addition, we will add a radio button called "repeating class" that can be used from the applet at any time to block out a period of time as repeatedly unavailable. There were also comments that the layout of the radio buttons and colors on the calendar needed work - these are also on the to-do list.
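The revised state model can be sketched with a simple enum. The names and "painting" helper below are our own illustrative assumptions, not VERN's applet code:

```python
# Sketch of the revised time-slot states: the default is now "unavailable",
# so unpainted slots carry a meaningful value. Names are illustrative.
from enum import Enum

class SlotState(Enum):
    UNAVAILABLE = "unavailable"   # proposed new default
    POSSIBLE = "possible"
    PREFERRED = "preferred"

DEFAULT = SlotState.UNAVAILABLE

def paint(calendar, slots, state):
    # The user drags across slots to "paint" them with a state.
    for slot in slots:
        calendar[slot] = state
    return calendar

cal = paint({}, ["Mon 10:00", "Mon 10:30"], SlotState.PREFERRED)
print(cal.get("Tue 09:00", DEFAULT).value)  # unpainted slots read as "unavailable"
```

With unavailable as the default, the applet only needs user-selectable states for possible and preferred, which is the simplification described above.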

Two step meeting creation:
It became crystal clear that the 2-step meeting creation was a source of confusion: users fill out a form with basic meeting information, and are then sent to a screen to fill in their own proposed meeting times. This is problematic in several ways:
  • Users are often familiar with the MS-Outlook approach, where a meeting time is chosen first and then meeting information is entered on a form. VERN is democratic, so the MS-Outlook approach is not entirely appropriate.
  • There are at least 2 steps in the process, and the user's location in the process is not clear. Some form of navigational aid is required.

The changes we are making to the interface are:

  • Create a navigation diagram that shows the procedure for creating a meeting, and display the user's location in the process prominently in the various screens (stolen shamelessly from the Sylvia design)
  • Shorten the initial form into a smaller set of required fields, and include the time selection applet on the same page

Finalize Meeting:
This screen has not yet been designed (it is actually the most difficult problem we've come across); we have several design ideas and will present our results at or before the final presentation.

General Bug Fixes:
We have started to work back-end database functionality into the application. Predictably, a lot of the VERN application is now considered "broken" because we have the new expectation that UI actions actually do something. This has also forced us to test areas of interaction that had only been hand-waving before, and put us in a position where we really need to think through how the application behaves. We hope to have an initial VERN application with working database functionality before the end of term.


Formal Experiment Design

Based on extensive user feedback centering on the desire for a “unified interface” for meeting scheduling, as well as confusion around the usage of the “Weekly Classes” feature, our group proposed combining the “Weekly Classes” generic calendar view into all of the “Vote for a meeting time” views. A repeating conflict is represented as a new color in the voting view, and marking a time of the week (Monday June 25th 10:00 AM) as a “repeating conflict” marks all Mondays at 10:00 AM as unavailable.

This change would potentially promote the usage of the Weekly Classes feature and accelerate data entry. However, it has the potential for user confusion, with users not expecting a change in one week’s availability to impact the availability for that specific time slot for all weeks in all meetings.
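The repeating-conflict semantics can be sketched as expanding one marked weekday/time slot into a blocked slot in every week of a date range. The helper name and dates below are our own illustrative assumptions, not VERN code:

```python
# Illustrative expansion of a "repeating conflict": marking one weekday/time
# blocks that slot for every week in the range.
from datetime import date, timedelta

def expand_repeating_conflict(slot_day, slot_time, start, end):
    blocked, d = [], start
    while d <= end:
        if d.weekday() == slot_day.weekday():
            blocked.append((d, slot_time))
        d += timedelta(days=1)
    return blocked

# Mark Monday 2005-06-20 at 10:00 as repeating over a four-week window:
blocked = expand_repeating_conflict(date(2005, 6, 20), "10:00",
                                    date(2005, 6, 20), date(2005, 7, 17))
print(len(blocked))  # one blocked slot per Monday in the range
```

This also makes the confusion risk concrete: a single click fans out into every week of the semester, which users may not anticipate.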

Hypotheses

  1. Combining the “Weekly Classes/Conflicts” into every meeting voting page will increase weekly conflict usage while reducing data entry and speeding up the voting process.
  2. Users with a busier schedule are more likely to use features like the repeating class indicator.

Factors (Independent Variables):

  • Weekly schedule page represented in an independent tab vs. represented as a third radio button present in every meeting voting page.
  • Average number of meetings/week each user attends over the time period of the test

Measures (Dependent Variables):

  • Overall usage of the “weekly classes” indicator – per user, what percent of the calendar is marked as repeating class
  • "Delta" or Rate of Change of the “weekly classes” indicator – How much do users alter their weekly schedule (do they fill it in and leave it in the first week, or do they continue to add items to flesh out their weekly schedule)
  • Time to fill in an average meeting request

Experiment Design:

Our experiment would test whether the “Weekly Class” availability indicator has a greater adoption rate when merged with the meeting voting screens. It would also test whether users “flicker” back and forth, marking and unmarking a repeating time slot, or generally fill it in and leave it constant for the duration of a semester. Because of the rate-of-adoption question, the experiment would necessarily have to track a group of users over a multi-week time period. In order to accurately estimate adoption rates, users would be asked to use VERN to coordinate all meetings that they would normally coordinate over email for a period of one month. This month would have to be an average month during the semester, and not overlap with finals, midterms, or vacation.

User Participants:

A group of 40 users of similar background (all masters students at SIMS) would be sufficient to draw a comparison between the integrated-repeating and separate-tab design options. They would be divided into two groups of 20 users each. A between-group experiment would be employed, with one group using the Weekly Schedule tab version and the other using the radio button version.

The rate of usage across both groups would be compared to the usage of the repeating class feature to prove or disprove hypothesis #2. After deciding if hypothesis #2 is correct, the usage of the repeating class option could optionally be normalized against the number of meetings each student attends.
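One plausible analysis for this between-group comparison is a two-sample (Welch's) t-test on per-user usage rates. The data values below are invented purely for illustration, and the analysis choice is our assumption rather than part of the original design:

```python
# Welch's t-statistic for comparing mean "weekly classes" usage between
# the two groups. Data values are invented for illustration only.
import math
import statistics

def welch_t(a, b):
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances
    se = math.sqrt(va / len(a) + vb / len(b))                # standard error
    return (statistics.mean(a) - statistics.mean(b)) / se

# Hypothetical fraction of each user's calendar marked as repeating class:
integrated = [0.30, 0.42, 0.35, 0.28, 0.40]   # merged radio-button design
separate   = [0.12, 0.18, 0.10, 0.15, 0.20]   # separate-tab design
print(round(welch_t(integrated, separate), 2))
```

A large t-statistic would support hypothesis #1; the per-user meeting counts collected as a factor could then be used to normalize the rates, as described above.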


Appendices

Interview One

Interview Two

Interview Three

 
