The interface of the Gotcha system has gone through several iterations based on feedback from low-fi testing and the heuristic evaluation. Both methods exposed UI problems and led to important changes. However, the low-fi testing was performed on a paper-based prototype, so some interaction aspects could not be evaluated. The heuristic evaluation also had limitations, since the evaluators only measured violations against prescribed heuristics; if a problem did not fall within any of Nielsen's heuristics (e.g., poor content), it was not recorded. This pilot study addresses those weaknesses by being interactive, allowing users to demonstrate "unusable" aspects of the system and to express their personal opinions on Gotcha's strengths and weaknesses.
Participant 1 is a master's student in Engineering. He had heard of knowledge management before and had a minimal understanding of KM. Participant 2 is a graduate student at SIMS. She had heard of "knowledge management" but was not clear what it was. Participant 3 is a KM professional. He showed a profound understanding of the KM discipline. All three participants are frequent Web users and are experienced with search engines.
We kept the first two task scenarios from the low-fi testing with only a slight modification. We eliminated part two of scenario one, which asked participants to find books by an author; since our UI no longer supports bibliographic searches, we already knew this task would be inherently more difficult to accomplish. The UI changes also made the original third scenario too easy: with a Products tab, users could simply look for products there, so we created a more challenging third scenario.
Scenario 1 – You work as a project manager in a company. You are new to the knowledge management domain and want to find some general information about the subject. Our guess:
Scenario 2 – You are a graduate student in the Business school working on a class project. A book briefly describing a branch of knowledge management called intellectual capital seems likely to have a lot of information for your project. Find information about intellectual capital. Our guess:
Scenario 3 – Your sales staff has not been meeting its quotas and customers are complaining about poor customer service. You heard through the grapevine that a lot of knowledge management projects fix just these kinds of problems, but you’re not sure how. You want to find information about these kinds of projects that other companies have set into motion and their results. Our guess:
As participants tried to achieve their tasks, group members observed their behavior and actions, checking whether participants followed the thought processes we expected; these expected paths are labeled "our guess" in the demo script. If users followed unexpected paths, we asked them why. If users expected different content when they viewed a selected page, we sought out their rationale; for instance, was it a confusing label on the page? Approaching the testing this way provided insight into user thinking to improve this iteration. It also provided tangible user experience to draw from when discussing new iterations.
Pre-test
The participant was escorted to the room where the testing would be held. A verbal pre-test questionnaire was given to determine the participant's understanding of knowledge management and his Web experience. The facilitator read the statements on the demo script explaining the testing process, the participant's responsibilities, and our inability to provide help during the testing. Time was also allotted to answer any questions the participant might already have. The participant was asked to sign the informed consent form. Then we gave a brief introduction and demo of the system.
In-test
After the demo, the participant was handed a strip of paper with the goal he needed to achieve. In most cases, the facilitator had to prompt the participant to "think aloud". Throughout the testing, group members observed the participant's behavior. The note-taker kept a log of critical incidents, both positive and negative, regarding the participant's actions and thought process. When the participant stated that he had achieved the task, the next scenario was handed to him. We also set up a clock to record the time the participant spent on each task.
Post-test
When the testing was over, a follow-up interview was conducted. The facilitator interviewed the participant using the questions on the post-test questionnaire, and the note-taker recorded the participant's responses. The group members then spent time asking the participant about his decisions: for instance, why he pursued one course of action over another, why he did not use certain features, and what general comments he had. Group members also answered questions the participant had raised during the testing. We ended the test by thanking the participant and inviting him to visit our site again, if interested.
Overall, the results of the pilot study cast the Gotcha UI in a positive light. Users were able to satisfy their information needs successfully, and their subjective satisfaction ratings were good. The major issues users mentioned concerned minor details and some content.
The pilot study was an appropriate forum for identifying future changes in upcoming prototypes as well as modifications that should be incorporated into a "real experiment".
For the next prototype, the major UI changes that need to be incorporated predominantly focus on the search page, the search engine, and the search results page. They include:
If we had more time, we would have incorporated other novel suggestions, but because the site is so content-intensive, the value of implementing these features was difficult to justify. They included incorporating more non-textual ways of communicating information (e.g., charts, graphs, sound bites, or multimedia presentations) and adding resources that enable human-to-human interaction (e.g., an events calendar, discussion forums, listservs). There were also suggestions to improve or add more content on the case studies, products, and About KM pages.
If we were going to conduct a "real" experiment, the pilot user study showed some areas in need of improvement. First, we should develop more complex and specific scenarios. Participant 3 suggested that a greater level of detail would have let him understand his information need better instead of making too many assumptions. Second, we should develop scenarios that require participants to use the query enhancement feature; because none of our scenarios did, we were unable to acquire any real input on user behavior or reactions to it. Finally, we would require users to actually read the Web pages. In the pilot, users simply scanned Gotcha's pages and the search results pages and conjectured what they would do. If they had read the pages, they could have built iterative queries based on page content, which is more representative of real user behavior.
Query expansion using the KM thesaurus generates more precise search results and higher user satisfaction.
The search page enables users to enhance their queries by comparing their query against a proprietary KM thesaurus. The thesaurus suggests synonyms, broader terms, and narrower terms, which users can select to modify their query. Given proper terminology, we expect users to modify their queries and generate better search results. If search results are more precise, user satisfaction should increase correspondingly.
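As an illustration of this interaction, the sketch below shows how thesaurus suggestions could be offered and attached to a query. The thesaurus entries, terms, and function names are hypothetical stand-ins for Gotcha's proprietary KM thesaurus, not its actual implementation.

```python
# Minimal sketch of thesaurus-based query expansion (illustrative only).
# The entries below are hypothetical; Gotcha's KM thesaurus is proprietary.

# Each entry maps a query term to related terms a user could select.
KM_THESAURUS = {
    "intellectual capital": {
        "synonyms": ["intangible assets"],
        "broader": ["knowledge management"],
        "narrower": ["human capital", "structural capital"],
    },
}

def suggest_terms(query):
    """Return synonym/broader/narrower suggestions for terms found in the query."""
    suggestions = {}
    for term, related in KM_THESAURUS.items():
        if term in query.lower():
            suggestions[term] = related
    return suggestions

def expand_query(query, selected_terms):
    """Append the terms the user selected to the original query."""
    return " ".join([query] + selected_terms)

# Example: the user accepts one narrower term suggested by the thesaurus.
print(suggest_terms("intellectual capital case studies"))
print(expand_query("intellectual capital case studies", ["human capital"]))
```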
For this test, we ask the user to do two searches based on the same query: one with query expansion using thesaurus terms, the other without. This yields two result sets. We then ask the user to read the two result sets and judge the relevance of the retrieved documents. Based on the user's relevance judgments, we compute the precision measure for the two searches (a sketch of this computation appears after the table). According to our hypothesis, the search with query expansion should achieve higher precision than the one without. After the test, we ask the user about his satisfaction with the two searches. The precision measure and user satisfaction are combined in our analysis. The following table shows some possible results and our corresponding interpretations.
Precision with query expansion | User satisfaction with query expansion | Interpretation
Relatively lower | Relatively higher satisfaction score | UI is good, but we must improve the thesaurus descriptors or make them better understood.
Relatively lower | Relatively lower satisfaction score | Bad UI design, and the thesaurus needs improvement or we need to make the categories better understood.
Relatively higher | Relatively higher satisfaction score | Good system.
Relatively higher | Relatively lower satisfaction score | UI problem. Determine methods for improvement.
About the same | Relatively higher satisfaction score | Good UI design. Seek ways (if any) to improve performance.
About the same | Relatively lower satisfaction score | Improve UI and thesaurus.
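To make the precision measure concrete, the sketch below shows how it could be computed from a user's relevance judgments (precision = relevant documents retrieved / total documents retrieved). The document identifiers and judgments are invented for illustration and are not data from the pilot study.

```python
# Illustrative precision comparison; the result sets and relevance judgments
# below are made-up examples, not data from the study.

def precision(result_set, judged_relevant):
    """Precision = relevant documents retrieved / total documents retrieved."""
    if not result_set:
        return 0.0
    relevant_retrieved = sum(1 for doc in result_set if doc in judged_relevant)
    return relevant_retrieved / len(result_set)

# Hypothetical result sets for the same information need.
results_plain    = ["d1", "d2", "d3", "d4", "d5"]   # without query expansion
results_expanded = ["d1", "d2", "d6", "d7", "d8"]   # with query expansion
user_relevant    = {"d1", "d2", "d6", "d7"}         # user's relevance judgments

p_plain = precision(results_plain, user_relevant)        # 0.40
p_expanded = precision(results_expanded, user_relevant)  # 0.80

# Under our hypothesis we expect p_expanded > p_plain; these figures are then
# read together with the satisfaction scores from the post-test interview.
print(f"precision without expansion: {p_plain:.2f}")
print(f"precision with expansion:    {p_expanded:.2f}")
```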
The analysis of the test results may lead us to conduct further tests on issues such as "Does the presentation of the suggested terms affect the use of query expansion?".