Computer Essay Graders vs. Human Essay Graders

Whoops, meant to post this when we were talking about LSA. As I mentioned in class, my mom's occasionally graded papers for various standardized testing services. She once mentioned to me that most of the guidelines for the human readers actually focused on structure, not content — things like, "Does this essay have a topic sentence? Are there supporting details? Is there an introduction and a conclusion?" (Her words: "Very formulaic, very predictable, and, in most cases, very boring.") So her thoughts about LSA, which would presumably be able to read a paper and compare its content with another paper's content, were pretty interesting ...

She said she occasionally ran across essays that were innovative content-wise but didn't follow the same structure as the rest, so she was nervous about giving them a high score (in case she'd be reprimanded by the expert graders who could pass final judgment on the papers). So she was interested to hear more about how LSA would actually process content. The computer grading systems she'd heard about before mostly analyzed grammar and sentence structure, not anything with content/words/vocabulary.

One of her big complaints about the emphasis on formula was that she'd occasionally get "nonsense essays" that didn't answer the question but nonetheless did properly follow a standard essay format. With LSA, she wondered if there would be the opposite sort of nonsense problem: If students (or test-prep centers) figured out what sorts of words needed to appear in essays that followed certain prompts, they could write an essay that included those words but really didn't follow any sort of structural principles of good writing. So, just like the human graders might prioritize formula over content (for better or worse), LSA might prioritize words and vocabulary over structure or sense (again, for better or worse).
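To make that gaming worry a little more concrete, here's a rough sketch of an LSA-style content comparison in Python with scikit-learn. This is just my guess at how you'd prototype the idea, not how any actual testing service scores essays, and the essays themselves are made up. The point is that the bag-of-words step throws away word order entirely, so a structure-free keyword dump can still land close to real essays in the latent space.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

# Reference essays the grader has already scored (invented for illustration).
reference_essays = [
    "Photosynthesis lets plants convert sunlight, water, and carbon dioxide into glucose.",
    "Plants use chlorophyll to capture light energy and produce sugars through photosynthesis.",
]

# A "keyword dump": the right vocabulary, but no sentences or structure at all.
submission = "glucose chlorophyll sunlight carbon dioxide water plants energy photosynthesis"

# Bag-of-words representation: word order is discarded at this step,
# which is exactly why a keyword dump can still look like a real essay.
vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(reference_essays + [submission])

# LSA step: project the term vectors into a low-dimensional "semantic" space.
svd = TruncatedSVD(n_components=2)
latent = svd.fit_transform(tfidf)

# Score the submission by how close it sits to the already-scored essays.
similarity = cosine_similarity(latent[-1:], latent[:-1])
print(similarity)  # likely quite high, even though the submission isn't prose
```

A real system would presumably add plenty of checks on top of this (length, grammar, coherence), but the core similarity measure really does ignore sentence structure, which is what she was getting at.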

To her, at this point, the advantages of computer grading basically boil down to: "It's consistent, it's fast, it doesn't get tired, and it doesn't get angry" (i.e. offended by the content of an essay or annoyed at the opinions of other graders around the table). With human grading, whatever the criteria, there can be a tradeoff between accuracy and speed; presumably, a well-trained computer program can be equally fast and equally accurate every time. For now, though, she doesn't seem convinced that either computers *or* humans can reliably measure all of the right things that go into making a quality essay.