Assignment Eight: Participants, Apparatus, Tasks, Procedure
Introduction
VERN is a web-based meeting scheduling tool for the academic community at SIMS. (VERN is the cousin of eD, a web-based democratic decision-making tool that tabulates votes and sends out the results via email.) The purpose of VERN is to eliminate the tedious, inefficient process of sending numerous emails or making phone calls to schedule meeting times among many users. VERN automates the scheduling process so that the amount of effort and time required of each group member is reduced, while enabling group leaders, such as professors, to visualize the general availability of members within the group. VERN automates the meeting scheduling process by providing an interface where the user can specify the range of time in which a meeting needs to occur and the people who are invited to attend. The program then sends an email to the attendees, who can use different criteria to 'vote' for a meeting time. Based on the results of the vote, the meeting initiator can decide on a meeting time and use VERN to notify all attendees. VERN also lets users enter a recurring class schedule so they can see at a glance the times they are unavailable.

Method
The purpose of our pilot usability study was to observe the ease with which our test subjects navigated the various functions of our second interactive prototype. This was especially important since VERN is a complex system with a great deal of functionality, so clarity of function is imperative. Three proposed user task scenarios were used to test VERN's various features. We were particularly interested in how intuitive our users found the language, icons, and design, and what complications or confusion might arise. The prototype used in the pilot usability study is located here: http://www.sims.berkeley.edu/academics/courses/is213/s05/projects/vern/prototype/

Participants
We chose three MIMS students who are highly technical and familiar with principles of good user interface design. Because we had developed our prototype extensively since the first prototype, we wanted to take advantage of members of our SIMS community who could offer us more detailed and thorough feedback. We felt that we had already demonstrated VERN to more novice users, who were able to convey to us the difficulties of using a new system. Our more advanced users were able to provide us with extremely useful information regarding terminology and layout, especially with our drag-and-drop interface.
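Before moving on to the study setup, here is a minimal sketch, in Python, of the propose/vote/notify flow that the introduction describes. All class, field, and method names are hypothetical illustrations chosen for this sketch; they are not VERN's actual implementation.

```python
# Minimal sketch of the propose/vote/finalize flow described above.
# All names and structures are hypothetical, not VERN's actual code.
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class MeetingProposal:
    title: str
    initiator: str                    # person scheduling the meeting
    attendees: list[str]              # invited participants (email addresses)
    candidate_times: list[datetime]   # times within the initiator's chosen range
    votes: dict[str, list[datetime]] = field(default_factory=dict)

    def invite(self) -> None:
        """Stand-in for the email notification VERN sends to attendees."""
        for person in self.attendees:
            print(f"Emailing {person}: please vote on times for '{self.title}'")

    def record_vote(self, attendee: str, preferred: list[datetime]) -> None:
        """Each attendee 'votes' for the candidate times that work for them."""
        self.votes[attendee] = [t for t in preferred if t in self.candidate_times]

    def tally(self) -> dict[datetime, int]:
        """The initiator reviews vote counts per candidate time before finalizing."""
        counts = {t: 0 for t in self.candidate_times}
        for chosen in self.votes.values():
            for t in chosen:
                counts[t] += 1
        return counts
```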
Apparatus
We tested on a Windows XP laptop running Mozilla Firefox 1.0.3 at a screen resolution of 1600x1200. While it may have been preferable for participants to test on their own laptops so that we could see how our interface appeared on a variety of screens, our purpose during this usability test was not to test browser functionality, but the actual prototype itself. Previous test results have shown some differences in screen appearance that we plan to address for the third prototype. However, since these were not on our critical task list, we have not implemented those fixes yet. We wanted to make sure our testers were able to evaluate the interface itself, without being bogged down by browser-dependent issues. The tests took place around the conference table in the basement of South Hall. We recorded the audio using Audacity, free recording software available at http://audacity.sourceforge.net/. One group member served as facilitator and another as timer and note taker; in some cases, additional members were present to take notes. We used a stopwatch to measure the time needed to perform various tasks.

Tasks
Because our previous scenarios were based on our paper prototype and a hard-coded version of our system, we amended them to fit the current prototype. The prototype now hits the database and actually tracks meeting information, so we needed to update the scenario tasks to reflect that. The scenario updates reflect the following changes:
Links to prior scenarios are in Assignment 4. Our new scenarios are below.
Scenario #1
You are a professor at SIMS who is trying to work out your schedule for the upcoming week using VERN.
Scenario #2
You are a student at SIMS who has received an email invitation from a professor who is trying to schedule a kickoff meeting for their potentially Nobel Prize-winning research project.
Procedure
Test Measures
The primary test measure we used was time: we recorded how long it took each user to complete a task. If a user made an error, we also recorded their choices and behavior. We chose to measure time because VERN is supposed to make meeting scheduling easy and fast, with a minimal number of clicks, so a fast completion time would indicate that the interface supports VERN's goals. We recorded a user's choices and behavior when they made an error because that would tell us which feature needed clarification and how to approach the problem.

Results
During the usability study, we asked the testers to describe what they were doing, thinking, and observing. Although this departs from the ideal protocol, we also paused the test to further explore problems and issues that came up, as well as to ask our testers for suggested improvements. The consequence of this approach is that we received an enormous amount of qualitative information and were able to engage in a dialog with users to explore further design possibilities, but did not gather as much purely quantitative information. While all three testers were timed on scenario 1, we only gathered timing information from one tester on scenario 2. The timings for all three testers on scenario 1 are presented; the errors and problems they came across are described in a narrative, qualitative manner. In a more formal study, we appreciate that more would be quantified; however, that would require reviewing videos and properly coding all the observations. We did not have the resources to video the study, but it is clear from our experience that this would be required for a rigorous, quantitative study. In addition, our method involved a lot of discussion and interruptions, which affected the timings.
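As an illustration of how the note taker's stopwatch timings could be summarized, the short sketch below computes per-task averages. The task names follow the observation sections later in this report, but the numeric values are placeholders for illustration only, not the measurements collected in this study.

```python
# Hypothetical aggregation of per-task completion times (seconds).
# The values below are illustrative placeholders, not the study's actual timings.
from statistics import mean, median

timings = {
    "Sign-up and login": [95, 120, 80],
    "Schedule a project meeting": [240, 310, 275],
    "Logout": [10, 15, 12],
}

for task, seconds in timings.items():
    print(f"{task}: mean={mean(seconds):.0f}s, median={median(seconds):.0f}s, "
          f"range={min(seconds)}-{max(seconds)}s")
```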
Timings Scenario 1
Observations:
Several tasks were repeated across the scenarios, so we do not include redundant observation sections for repeated tasks; all observations about the same type of task are merged into a single section.
Sign-up and Login:
Checking meetings:
Schedule a Project Meeting:
Logout:
Class Schedule Screen:
Contacts and Groups:
Discussion
Observations about the study can be broken into two categories: the process of performing a usability study, and changes to the VERN application.
Performing a usability study:
Based on our experience, it is difficult to train all the interviewers in a fixed protocol for timings and data gathering in advance of the interviews. It is also next to impossible for a person taking notes by hand to capture all the necessary data. Videotaping the process and then coding the results afterwards (or some similar approach) seems like the only viable way to handle this. During the interview process, timers and interviewers would sometimes ask questions that resulted in brief discussions that affected timings; in a rigorous experiment, these would have been delayed until after the timings were done. However, we were actually more interested in a qualitative assessment of the interface (and the assignment is explicitly informal, not formal). Having users think aloud and describe their observations and mental processes is invaluable: it brings out the expectations they bring to the task, highlights vocabulary problems, and serves as fertile ground for discovering potential changes to the UI.
Changes to VERN:
Login: The two-step registration/login is cumbersome, so we will send the user into the application immediately after registration.
Contacts: Currently, the contacts page doesn't really do anything. We plan to add genuine functionality and work on either drag-and-drop or a pick list for groups. We will also provide a way for meetings to be created directly from the contacts page.
Side Panel: The layout of meetings on the side panel needs work to become easier to parse.
Creating Meetings: The button on the lower left of the side panel for creating meetings is badly located, and its labeling is misleading. A new tab called "Propose Meeting" will be created so that this action is easier to find.
Colors on scheduling applet: The semantics of the different colors on the applet proved to be a problem. It became especially obvious that the default state of a time block should be clarified. We have decided that the default state of a block of time should be changed to unavailable, reducing the number of selections on the applet and providing a meaningful default value. In addition, we will add a radio button called "repeating class" that can be used from the applet at any time to block out a period of time as repeatedly unavailable. There were also comments that the layout of the radio buttons and colors on the calendar needed work; these are also on the to-do list.
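A minimal sketch of what these revised block semantics might look like is shown below, assuming hypothetical state names (AVAILABLE, UNAVAILABLE as the default, and a repeating-class state) and a simple weekly-propagation helper. This is illustrative only, not VERN's actual applet code.

```python
# Hypothetical model of the applet's time-block states.
# UNAVAILABLE is the proposed default; "repeating class" marks the same
# weekday/time as unavailable in every week. Names are illustrative only.
from enum import Enum
from datetime import datetime, timedelta

class BlockState(Enum):
    UNAVAILABLE = "unavailable"        # proposed default for untouched blocks
    AVAILABLE = "available"            # user explicitly marked the time as free
    REPEATING_CONFLICT = "repeating"   # e.g. a weekly class

def mark_repeating_class(schedule: dict[datetime, BlockState],
                         slot: datetime, weeks: int = 15) -> None:
    """Marking one slot as a repeating class blocks the same weekday/time
    for every following week (e.g. a 15-week semester)."""
    for week in range(weeks):
        schedule[slot + timedelta(weeks=week)] = BlockState.REPEATING_CONFLICT

# Example: marking 2005-04-25 10:00 (a Monday) blocks every Monday at 10:00.
schedule: dict[datetime, BlockState] = {}
mark_repeating_class(schedule, datetime(2005, 4, 25, 10, 0))
```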
Two-step meeting creation: It became clear that the two-step meeting creation was a source of confusion: users fill out a form with basic meeting information, and are then sent to a second screen to fill in their own proposed meeting times. This is problematic in several ways:
The changes we are making to the interface are:
Finalize Meeting: This screen has not yet been designed (it is actually the most difficult problem we've come across); we have several design ideas and will present our results at or before the final presentation.
General Bug Fixes: We have started to work back-end database functionality into the application. Predictably, a lot of the VERN application is now considered "broken" because we have the new expectation that UI actions actually do something. This has also forced us to test areas of interaction that had only been hand-waving before, and put us in a position where we really need to think through how the application behaves. We hope to have an initial VERN application with working database functionality before the end of term.

Formal Experiment Design
Based on extensive user feedback centering on the desire for a "unified interface" for meeting scheduling, as well as confusion around the usage of the "Weekly Classes" feature, our group proposed to combine the "Weekly Classes" generic calendar view into all of the "Vote for a meeting time" views. A repeating conflict is represented as a new color in the voting view, and marking a specific time slot (e.g., Monday, June 25th, 10:00 AM) as a "repeating conflict" marks every Monday at 10:00 AM as unavailable. This change would potentially promote usage of the Weekly Classes feature and accelerate data entry. However, it has the potential for user confusion, since users may not expect a change in one week's availability to affect that time slot for all weeks in all meetings.

Hypothesis
Factors (Independent Variables):
Dependent Variables:
Experiment Design:
Our experiment would test whether the "Weekly Class" availability indicator has a greater adoption rate when merged with the meeting voting screens. It would also test whether users "flicker" back and forth, marking and unmarking a repeating time slot, or whether they generally fill it in and leave it constant for the duration of a semester. Because of the rate-of-adoption question, the experiment would necessarily have to track a group of users over a multi-week period. In order to accurately estimate adoption rates, users would be asked to use VERN to coordinate all meetings that they would normally coordinate over email for a period of one month. This month would have to be an average month during the semester, not overlapping with finals, midterms, or vacation.

User Participants:
A group of 40 users of similar background (all masters students at SIMS) would be sufficient to draw a comparison between the integrated-repeating and separate-tab design options. They would be divided into two groups of 20 users each. A between-groups design would be employed, with one group using the Weekly Schedule tab version and the other using the radio-button version. Usage of the repeating class feature would be compared across the two groups to support or refute hypothesis #2. After deciding whether hypothesis #2 holds, the usage of the repeating class option could optionally be normalized against the number of meetings each student attends.
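As a sketch of how the between-groups comparison might be analyzed, the snippet below compares adoption of the repeating-class feature in the two groups with a two-proportion z-test and also shows usage normalized by meetings attended. The group sizes match the design above, but the adopter counts and usage numbers are hypothetical placeholders, not collected data.

```python
# Hypothetical analysis of the between-groups adoption comparison.
# Counts and usage figures are illustrative placeholders, not real data.
from math import sqrt

def two_proportion_z(adopters_a: int, n_a: int, adopters_b: int, n_b: int) -> float:
    """Two-proportion z statistic for adoption rate in group A vs. group B."""
    p_a, p_b = adopters_a / n_a, adopters_b / n_b
    p_pool = (adopters_a + adopters_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# 20 users per group, as in the design above; adopter counts are made up.
z = two_proportion_z(adopters_a=14, n_a=20, adopters_b=8, n_b=20)
print(f"z = {z:.2f}")  # compare against ~1.96 for a 5% two-sided test

# Optional normalization: repeating-class uses per meeting attended, per user.
uses = [3, 0, 5, 2]          # hypothetical repeating-class uses per user
meetings = [10, 8, 12, 9]    # hypothetical meetings each user attended
normalized = [u / m for u, m in zip(uses, meetings)]
print(normalized)
```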
Appendices