Computer vs. Human Essay Graders

After Bob's LSA lecture last year, I sent an alarmed e-mail to my mom, who's occasionally worked as a grader for the SAT and other standardized tests. I blogged parts of her thoughts about computer grading, and I'm copying the best of it below:

***

In my mom's experience, human graders were encouraged to focus on the structure of the essays (topic sentences, introductions, conclusions, etc.) and spent less time evaluating content. One of her big complaints about the emphasis on formula was that she'd occasionally get "nonsense essays" that didn't answer the question but nonetheless properly followed a standard essay format. With LSA, she wondered if there would be the opposite sort of nonsense problem: if students (or test-prep centers) figured out what sorts of words needed to appear in essays for certain prompts, they could write an essay that included those words but didn't follow any of the structural principles of good writing. So, just as the human graders might prioritize formula over content (for better or worse), LSA might prioritize words and vocabulary over structure or sense (again, for better or worse).
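To see why that worry seems plausible, here's a minimal sketch in Python of the basic idea behind word-based scoring. This isn't anything like the actual LSA engine (real systems build their word vectors from large training corpora and do dimensionality reduction); the prompt vocabulary and both "essays" below are made up for illustration. The point is just that a bag-of-words similarity score can't tell a coherent essay from the same words shuffled into word salad.

```python
# Sketch: a purely word-based similarity score ignores word order entirely,
# so a coherent essay and a scrambled version of it get identical scores.
# The reference vocabulary and example sentences are hypothetical.
from collections import Counter
import math
import random
import re

def bag_of_words(text: str) -> Counter:
    """Count word occurrences, ignoring case and punctuation."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Words a grader (human or machine) might expect for a hypothetical prompt.
reference = "economy trade conflict treaty alliance causes war nations"

coherent = ("The war's causes included a collapsing economy, broken trade, "
            "and failed treaty negotiations between rival nations.")
words = coherent.split()
random.shuffle(words)
word_salad = " ".join(words)  # same vocabulary, no structure at all

ref_vec = bag_of_words(reference)
print(cosine_similarity(ref_vec, bag_of_words(coherent)))
print(cosine_similarity(ref_vec, bag_of_words(word_salad)))  # same score
```

Both print statements produce the same number, because shuffling the words doesn't change the bag-of-words vector at all. Structure, which is exactly what the human graders were told to reward, is invisible to this kind of score.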

To her, at this point, the advantages of computer grading basically boil down to: "It's consistent, it's fast, it doesn't get tired, and it doesn't get angry" (i.e., it isn't offended by the content of an essay or annoyed at the opinions of other graders around the table). With human grading, whatever the criteria, there can be a tradeoff between accuracy and speed; presumably, a well-trained computer program can be equally fast and equally accurate every time. For now, though, she doesn't seem convinced that either computers *or* humans can fully measure all of the things that go into making a quality essay.