The interface of the Gotcha system has gone through several iterations based on feedback from low-fi testing and the heuristic evaluation. Both methods exposed UI problems and led to important changes. However, the low-fi testing was performed on a paper-based prototype, so some interaction aspects could not be evaluated. The heuristic evaluation also had limitations, since the evaluators only measured violations against prescribed heuristics; if a problem did not fall within any of Nielsen's heuristics (e.g., poor content), it was not recorded. This pilot study addresses those weaknesses by being interactive, allowing users to demonstrate "unusable" aspects of the system and to express their personal opinions on Gotcha's strengths and weaknesses.
Participant 1 is a master's student in Engineering. He had heard of knowledge management before and had a minimal understanding of KM. Participant 2 is a graduate student at SIMS. She had heard of "knowledge management" but was not clear what it was. Participant 3 is a KM professional. He showed a profound understanding of the KM discipline. All three participants are frequent Web users and are experienced with search engines.
We kept the first two task scenarios from the low-fi testing with only a slight modification. We eliminated part two of scenario one, which asked participants to find books by an author; since our UI no longer supports bibliographic searches, we already knew this task would be inherently more difficult to accomplish. The UI changes also made the original third scenario too easy: with a Products tab, users could simply look for products there, so we created a more challenging third scenario.
Scenario 1 – You work as a project manager in a company. You are new to the knowledge management domain and want to find some general information about the subject. Our guess:
Scenario 2 – You are a graduate student in the Business school working on a class project. A book briefly describing a branch of knowledge management called intellectual capital seems likely to have a lot of information for your project. Find information about intellectual capital. Our guess:
Scenario 3 – Your sales staff has not been meeting its quotas and customers are complaining about poor customer service. You heard through the grapevine that a lot of knowledge management projects fix just these kinds of problems, but you’re not sure how. You want to find information about these kinds of projects that other companies have set into motion and their results. Our guess:
As participants tried to achieve their tasks, group members observed their behavior and actions, checking whether participants followed the thought processes we expected; these expected paths are labeled "our guess" in the demo script. If users followed unexpected paths, we asked them why. If users expected different content when they viewed a selected page, we sought out their rationale; for instance, was it a confusing label on the page? Approaching the testing this way provided insight into user thinking to improve this iteration. It also provided tangible user experience to draw from when discussing new iterations.
Pre-test
The participant was escorted to the room where the testing would be held. A verbal pre-test questionnaire was given to determine the participant's understanding of knowledge management and his Web experience. The facilitator read the statements on the demo script explaining the testing process, the participant's responsibilities, and our inability to provide help during the testing. Time was also allotted to answer any questions the participant might already have. The participant was asked to sign the informed consent form. Then we gave a brief introduction and demo of the system.
In-test
After the demo, the participant was handed a strip of paper with the goal he needed to achieve. In most cases, the facilitator had to prompt the participant to "think aloud". Throughout the testing, group members observed the participant's behavior. The note-taker kept a log of critical incidents, both positive and negative, regarding the participant's actions and thought process. When the participant stated that he had achieved the task, the next scenario was handed to him. We also set up a clock to record the time the participant spent on each task.
Post-test
When the testing was over, a follow-up interview was conducted. The facilitator interviewed the participant using the questions on the post-test questionnaire, and the note-taker recorded the participant's responses. The group members then spent time asking the participant about his decisions: for instance, why he pursued one course of action over another, why he did not use certain features, and what general comments he had. Group members also answered questions the participant had raised during the testing. We ended the test by thanking the participant and inviting him to visit our site again, if interested.
Overall, the results of the pilot study cast the Gotcha UI in a positive light. Users were able to satisfy their information needs successfully, and their subjective satisfaction ratings were good. The major issues users mentioned concerned minor details and some content.
The pilot study was an appropriate forum for identifying future changes in upcoming prototypes as well as modifications that should be incorporated into a "real experiment".
For the next prototype, the major UI changes that need to be incorporated predominantly focus on the search page, the search engine, and the search results page. They include:
If we had more time, we would have incorporated other novel suggestions, but because the site is so content-intensive, the value of implementing these features was difficult to justify. They included incorporating more non-textual ways of communicating information (e.g., charts, graphs, sound bites, or multimedia presentations) and adding resources that enable human-to-human interaction (e.g., an events calendar, discussion forums, listservs). There were also suggestions to improve or add more content on the case studies, products, and About KM pages.
If we were going to conduct a "real" experiment, the pilot user study showed some areas in need of improvement. First, we should develop more complex and specific scenarios. Participant 3 suggested that a greater level of detail would have let him understand his information need better instead of making too many assumptions. Second, we should develop scenarios that require participants to use the query enhancement feature; because none of our scenarios did, we were unable to acquire any real input on user behavior or reactions to it. Finally, we would require users to actually read the Web pages. In the pilot, users simply scanned Gotcha's pages and the search results pages and conjectured what they would do. If they had read the pages, they could have built iterative queries based on page content, which is more representative of real user behavior.
Query expansion using the KM thesaurus generates more precise search results and higher user satisfaction.
The search page enables users to enhance their queries by comparing their query against a proprietary KM thesaurus. The thesaurus suggests synonyms, broader terms, and narrower terms, which users can select to modify their query. Given proper terminology, we expect users to modify their queries and generate better search results. If search results are more precise, user satisfaction should increase correspondingly.
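As an illustration of this interaction, the sketch below shows how thesaurus suggestions could be offered and attached to a query. The thesaurus entries, terms, and function names are hypothetical stand-ins for Gotcha's proprietary KM thesaurus, not its actual implementation.

```python
# Minimal sketch of thesaurus-based query expansion (illustrative only).
# The entries below are hypothetical; Gotcha's KM thesaurus is proprietary.

# Each entry maps a query term to related terms a user could select.
KM_THESAURUS = {
    "intellectual capital": {
        "synonyms": ["intangible assets"],
        "broader": ["knowledge management"],
        "narrower": ["human capital", "structural capital"],
    },
}

def suggest_terms(query):
    """Return synonym/broader/narrower suggestions for terms found in the query."""
    suggestions = {}
    for term, related in KM_THESAURUS.items():
        if term in query.lower():
            suggestions[term] = related
    return suggestions

def expand_query(query, selected_terms):
    """Append the terms the user selected to the original query."""
    return " ".join([query] + selected_terms)

# Example: the user accepts one narrower term suggested by the thesaurus.
print(suggest_terms("intellectual capital case studies"))
print(expand_query("intellectual capital case studies", ["human capital"]))
```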
For this test, we ask the user to do two searches based on the same query: one with query expansion using thesaurus terms, the other without. This yields two result sets. We then ask the user to read the two result sets and judge the relevance of the retrieved documents. Based on the user's relevance judgments, we compute the precision measure for the two searches (a sketch of this computation appears after the table). According to our hypothesis, the search with query expansion should achieve higher precision than the one without. After the test, we ask the user about his satisfaction with the two searches. The precision measure and user satisfaction are combined in our analysis. The following table shows some possible results and our corresponding interpretations.
Precision with query expansion | User satisfaction with query expansion | Interpretation
Relatively lower | Relatively higher satisfaction score | UI is good, but we must improve the thesaurus descriptors or make them better understood.
Relatively lower | Relatively lower satisfaction score | Bad UI design, and the thesaurus needs improvement or we need to make the categories better understood.
Relatively higher | Relatively higher satisfaction score | Good system.
Relatively higher | Relatively lower satisfaction score | UI problem. Determine methods for improvement.
About the same | Relatively higher satisfaction score | Good UI design. Seek ways (if any) to improve performance.
About the same | Relatively lower satisfaction score | Improve UI and thesaurus.
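To make the precision measure concrete, the sketch below shows how it could be computed from a user's relevance judgments (precision = relevant documents retrieved / total documents retrieved). The document identifiers and judgments are invented for illustration and are not data from the pilot study.

```python
# Illustrative precision comparison; the result sets and relevance judgments
# below are made-up examples, not data from the study.

def precision(result_set, judged_relevant):
    """Precision = relevant documents retrieved / total documents retrieved."""
    if not result_set:
        return 0.0
    relevant_retrieved = sum(1 for doc in result_set if doc in judged_relevant)
    return relevant_retrieved / len(result_set)

# Hypothetical result sets for the same information need.
results_plain    = ["d1", "d2", "d3", "d4", "d5"]   # without query expansion
results_expanded = ["d1", "d2", "d6", "d7", "d8"]   # with query expansion
user_relevant    = {"d1", "d2", "d6", "d7"}         # user's relevance judgments

p_plain = precision(results_plain, user_relevant)        # 0.40
p_expanded = precision(results_expanded, user_relevant)  # 0.80

# Under our hypothesis we expect p_expanded > p_plain; these figures are then
# read together with the satisfaction scores from the post-test interview.
print(f"precision without expansion: {p_plain:.2f}")
print(f"precision with expansion:    {p_expanded:.2f}")
```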
The analysis of the test results may lead us to conduct further tests on issues such as "Does the presentation of the suggested terms affect the use of query expansion?".