Pilot User Study

By Haydee Hernandez, Qun Liang, and Hailing Jiang


Introduction

Knowledge management (KM) is an emerging field in business studies that has seen explosive growth in recent years. Despite the many KM resources available online, several kinds of information needs remain poorly served in this field at present. There is strong user demand for a web site that offers sufficient novice support and serves as a KM portal. The Gotcha system is meant to meet this demand.

The Gotcha interface has gone through several iterations based on feedback from low-fi testing and the heuristic evaluation. Both methods exposed UI problems and led to important changes. But low-fi testing was performed on a paper-based prototype, where some interaction aspects could not be evaluated. Heuristic evaluation also had limitations, since the evaluator only measured violations against prescribed heuristics: if a problem did not fall within any of Nielsen's heuristics (e.g., poor content), it was not included. This pilot study addresses those weaknesses by being interactive, allowing users to demonstrate "unusable" aspects of the system and to express their personal opinions on Gotcha's strengths and weaknesses.


 

Method

Participants

Two of the participants were chosen because they had reasonable experience surfing online and little familiarity with knowledge management (KM). Because the UI is directed at domain novices, it was important that participants have little domain familiarity. The third participant was a domain expert with extensive professional experience in KM; he enabled us to see how this class of users might interact with our interface. Although the interface is designed to facilitate the KM novice's information-seeking process, we hoped that it could serve expert users as well.

Participant 1 is a master's student in Engineering. He had heard of knowledge management before and had a minimal understanding of it. Participant 2 is a graduate student at SIMS. She had heard of "knowledge management" but was not clear what it was. Participant 3 is a KM professional. He showed a profound understanding of the KM discipline. All three participants are frequent Web users and are experienced with search engines.

Apparatus

Other than computers, we did not use any special equipment in this testing.

Task Scenarios

We kept the first two task scenarios from the low-fi testing with only slight modification. We eliminated part two of scenario one, which asked participants to find books by an author: since our UI no longer supports bibliographic searches, we already knew this task would be inherently more difficult to accomplish. Because of UI changes, the third scenario also changed, as it had become too easy to accomplish: with a Products tab, users could just look for products there. We therefore created a more challenging third scenario.

Scenario 1 -- You work as a project manager in a company. You are new to the knowledge management domain and want to find some general information about the subject. Our guess:

Scenario 2 -- You are a graduate student in the Business school working on a class project. A book that briefly describes a branch of knowledge management called intellectual capital seems likely to have a lot of information for your project. Find information about intellectual capital. Our guess:

Scenario 3 -- Your sales staff has not been meeting its quotas and customers are complaining about poor customer service. You heard through the grapevine that a lot of knowledge management projects fix just these kinds of problems, but you're not sure how. You want to find information about these kinds of projects that other companies have set into motion and their results. Our guess:

As participants worked to achieve their tasks, group members observed their behavior and actions, watching for whether participants followed the thought processes we expected (labeled "our guess" in the demo script). If users followed unexpected paths, we asked them why. If users expected different content when they viewed a selected page, we sought out their rationale: for instance, was a label on the page confusing? Approaching the testing this way provided insight into user thinking to improve this iteration, and it gave us tangible user experience to draw on when discussing new iterations.

Procedure

Pre-test
The participant was escorted to the room where the testing would be held. A verbal pre-test questionnaire was given to determine the participant's understanding of knowledge management and his web experience. The facilitator read the statements on the demo script explaining the testing process, the participant's responsibilities, and our inability to provide help during the testing. Time was also allotted to answer any questions the participant already had. The participant was then asked to sign the informed consent form, and we gave a brief introduction and demo of the system.

In-test
After the demo, the participant was handed a strip of paper with the goal he needed to achieve. In most cases, the facilitator had to prompt the participant to "think aloud". Throughout the testing, group members observed the participant's behavior. The note taker kept a log of critical incidents, both positive and negative, regarding the participant's actions and thought process. When the participant stated that he had achieved the task, the next scenario was handed to him. We also set up a clock to record the time the participant spent on each task.

Post-test
When the testing was over, a follow-up interview was conducted. The facilitator interviewed the participant with the questions on the post-test questionnaire, and the note taker recorded the responses. The group members then asked the participant about his decisions: for instance, why he pursued one course of action over another, why he did not use certain features, and what general comments he had. Group members also answered questions the participant had raised during the testing. We ended the test by thanking the participant and inviting him to visit our site again, if interested.
 

Test Measures

In this testing, we measured the time the participant spent on each task, critical incidents (both positive and negative), and the participant's subjective satisfaction. With users' feedback, we will be able to fine-tune our interface by eliminating unnecessary functions, adding new user-desired features, and modifying the design based on negative feedback.

Results

Overall, the results of the pilot study cast the Gotcha UI in a positive light. Users were able to satisfy their information needs successfully, and their subjective satisfaction ratings were good. The major issues users mentioned dealt with minor details and some content concerns.

Discussion

The pilot study was an appropriate forum for identifying future changes in upcoming prototypes as well as modifications that should be incorporated into a "real experiment".

For the next prototype, the major UI changes that need to be incorporated focus predominantly on the search page, the search engine, and the search results page. They include:

If we had more time, we would have incorporated other novel suggestions, but because the site is so content-intensive, the value of implementing these features was hard to justify. They included more non-textual ways of communicating information (e.g., charts, graphs, sound bites, or multimedia presentations) and resources enabling human-to-human interaction (e.g., an events calendar, discussion forums, listservs). There were also suggestions to improve or add content on the case studies, products, and about-KM pages.

If we were going to conduct a "real" experiment, the pilot user study showed some areas in need of improvement. First, we should develop more complex and specific scenarios. Participant 3 suggested that a greater level of detail would have let him understand his information need better instead of making too many assumptions. Second, we should have developed scenarios requiring participants to use the query enhancement feature; because none did, we were unable to acquire any real input on user behavior or reactions to it. Finally, we would require users to actually read web pages. In the pilot, users simply scanned Gotcha's pages and the search results pages and conjectured about what they would do. Had they read the pages, they could have built iterative queries based on page content, which is more representative of a real user's behavior.

 

Formal Experiment Design

Hypothesis

Query expansion using the KM thesaurus generates more precise search results and higher user satisfaction.

The search page enables users to enhance their queries by comparing them against a proprietary KM thesaurus. Matched terms can point to synonyms, broader terms, and narrower terms, which users can select to modify their query. Given proper terminology, we expect users to modify their queries and generate better search results. If the search results are more precise, there should be a corresponding increase in user satisfaction.
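
As a rough illustration of this interaction, the sketch below shows a thesaurus lookup followed by query modification. The thesaurus entries, function names, and the OR-combination strategy are hypothetical placeholders for illustration, not Gotcha's actual implementation:

    # Minimal sketch of thesaurus-based query expansion.
    # The entries below are invented examples, not the real KM thesaurus.
    THESAURUS = {
        "intellectual capital": {
            "synonyms": ["intangible assets"],
            "broader": ["knowledge management"],
            "narrower": ["human capital", "structural capital"],
        },
    }

    def suggest_terms(query: str) -> dict:
        """Look up the user's query and return related terms to offer."""
        entry = THESAURUS.get(query.lower().strip())
        return entry or {"synonyms": [], "broader": [], "narrower": []}

    def expand_query(query: str, selected: list[str]) -> str:
        """Combine the original query with the thesaurus terms the user selected."""
        return " OR ".join(f'"{t}"' for t in [query] + selected)

    print(suggest_terms("Intellectual Capital")["narrower"])
    # ['human capital', 'structural capital']
    print(expand_query("intellectual capital", ["human capital"]))
    # "intellectual capital" OR "human capital"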

Factors and levels

The factor altered in the course of the experiment would be the search options made available to the user. The levels relate to whether a user is given the ability to expand his query or not; without query enhancement, the user gets just a simple search box for query entry.

Response variables

The response variables are the precision of the search results, computed from the user's relevance judgments, and the user's subjective satisfaction with each search.

Blocking and repetitions

For this test, we ask the user to run two searches based on the same query: one with query expansion using thesaurus terms, the other without. This yields two result sets. We then ask the user to read both result sets and judge the relevance of the retrieved documents. Based on these relevance judgments, we compute a precision measure for each search (a sketch of this computation follows the table below). According to our hypothesis, the search with query expansion should achieve higher precision than the one without. After testing, we ask the user about his satisfaction with the two searches. The precision measure and user satisfaction are combined in our analysis. The following table shows some possible results and our corresponding interpretations.


 

Precision with query expansion | User satisfaction with query expansion | Interpretation
Relatively lower               | Relatively higher                      | Good UI, but the thesaurus descriptors need improvement or need to be better understood
Relatively lower               | Relatively lower                       | Bad UI design, and the thesaurus needs improvement or the categories need to be better understood
Relatively higher              | Relatively higher                      | Good system
Relatively higher              | Relatively lower                       | UI problem; determine methods for improvement
About the same                 | Relatively higher                      | Good UI design; seek ways (if any) to improve performance
About the same                 | Relatively lower                       | Improve both the UI and the thesaurus
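
As a concrete illustration of the precision computation described above, the following sketch computes precision (relevant documents retrieved divided by total documents retrieved) for the two searches; the relevance judgments shown are invented for illustration:

    # Precision = relevant documents retrieved / total documents retrieved.
    def precision(judgments: list[bool]) -> float:
        """judgments[i] is the user's relevance judgment for the i-th result."""
        if not judgments:
            return 0.0
        return sum(judgments) / len(judgments)

    # Hypothetical relevance judgments for the same query under both conditions.
    with_expansion = [True, True, False, True, True]       # 4 of 5 judged relevant
    without_expansion = [True, False, False, True, False]  # 2 of 5 judged relevant

    print(f"precision with expansion:    {precision(with_expansion):.2f}")    # 0.80
    print(f"precision without expansion: {precision(without_expansion):.2f}") # 0.40

Under our hypothesis, we would expect the first number to be consistently higher across participants.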

The analysis of test results may lead us to do further tests regarding issues like "does the presentation of the suggested terms affect the use of query expansion?".


 

Appendices

Informed consent form
Demo Script
Task descriptions
Follow-up questionnaire
Test raw log