Test Measures
In conducting user testing, we looked at how easy or difficult it was for users to complete assigned annotation tasks. Because annotation is a complex task and completion speed therefore varies widely among users, we did not time the users but relied on qualitative measures. More specifically, we studied the flow for accessing the system from different entry points; whether the flow felt natural to the users; whether the information they needed for making an annotation was readily available to them; whether the proposed "prefill" functionality was helpful and/or intuitive; what they needed in order to assess the work of others; what role the annotation history, discussion, user rating of the annotation, or "batting average" of the previous annotator played in that assessment; how they would go about expressing a negative opinion about an existing annotation; and whether they could comfortably make "batch" discussion comments for a list of genes.
Additionally, we asked users for their impressions of the member/fellow access control system and whether they would be interested in IMG supporting a "batch annotation" functionality. We also tried to get a sense of how likely users would be to vote in approval of an annotation. The relative likelihood of voting "I agree" versus "I disagree" is important for determining the "batting average" criteria for automatic fellowship and for judging the usefulness of reporting user ratings and batting averages. Our hypothesis was that users would not click "I agree" on an annotation even when they agreed with it, but would click "I disagree" when they disagreed.
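To make the stakes of that hypothesis concrete, the sketch below shows one plausible way a "batting average" could be computed. This is an illustrative assumption on our part, not the system's documented formula: it treats the average as the fraction of an annotator's assessed annotations that received an "I agree" vote, which is exactly the ratio that under-reported agreement would depress.

```python
def batting_average(agree_votes: int, disagree_votes: int) -> float | None:
    """Hypothetical metric: the fraction of assessed annotations that
    received an "I agree" vote. Returns None if nothing has been assessed."""
    total = agree_votes + disagree_votes
    if total == 0:
        return None
    return agree_votes / total

# Under our hypothesis, agreement goes largely unrecorded while
# disagreement is voiced, so an annotator's average falls as the
# "I agree" clicks vanish, even for work of identical quality:
print(batting_average(agree_votes=8, disagree_votes=2))  # 0.8
print(batting_average(agree_votes=0, disagree_votes=2))  # 0.0
```

If the hypothesis holds, a raw ratio of this kind would systematically understate annotator quality, which is why the agree/disagree balance matters both for setting any automatic-fellowship threshold and for deciding whether user ratings and batting averages are worth reporting at all.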