Who’s influencing whom? Is content analysis changing the way we write?

In Bob’s lecture last Wednesday he talked about the use of Latent Semantic Analysis (LSA) for essay grading. This got me thinking about the ways that we are increasingly using computers for content analysis, which in turn is shaping the content that we create, so that this content can be more easily “understood” by computers. As Kimra notes in her blog post on LSA, human graders may emphasize structure over content while computer graders may emphasize vocabulary over structure. As we move toward LSA-graded tests, people’s writing will be shaped by these considerations. For example, on my GRE essays I was instructed to use large vocabulary words and clear transitions to ensure that the computer would grade my essay highly. Keeping this in mind certainly changed the way I wrote my essays for the exam.
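To make the “vocabulary over structure” point concrete, here is a minimal sketch of the word-count representation that LSA starts from. It is not a real essay grader: actual LSA also applies a dimensionality-reduction step (singular value decomposition) to a large term-document matrix, which this sketch skips. The example sentences and the function names `bag_of_words` and `cosine_similarity` are made up for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count each word, discarding word order."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sentences, invented for this illustration.
reference = "The airline industry faces new security rules."
same_words_scrambled = "Rules security new faces industry airline the."
different_vocabulary = "Carriers must now cope with fresh screening requirements."

# Same words in a nonsensical order: the counts are identical.
sim_scrambled = cosine_similarity(
    bag_of_words(reference), bag_of_words(same_words_scrambled)
)

# Similar meaning, but no words in common with the reference.
sim_different = cosine_similarity(
    bag_of_words(reference), bag_of_words(different_vocabulary)
)

print(f"scrambled word order: {sim_scrambled:.2f}")
print(f"different vocabulary: {sim_different:.2f}")
```

Under this representation the scrambled sentence looks essentially identical to the reference, while the sensible paraphrase shares nothing with it. The decomposition step in full LSA is meant to soften the second problem by relating words that appear in similar contexts, but word order is still discarded.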

Changing the vocabulary and structure of written content in order to get a “higher mark” is also prevalent on the web. A high rank in a search query depends on the keywords and page rank (among other factors) of a given page. In 2008 companies spent $1.4 billion on Search Engine Optimization (SEO) in an attempt to incorporate keywords and links on their sites that would ensure a top position in search results. While this is beneficial for these companies because they are now more prominently displayed in relevant search results, some subtlety of sentiment has certainly been lost. You can see this loss of subtlety in things like newspaper headlines. A search for “airline industry” on the New York Times Archive for 1995 – 1997 returns headlines such as “An I.R.S. Ruling Ruffles Airline Industry Feathers” and “In Bid for Airline Security, Echoes of Unmet Promises.” The same query for 2009 – 2010 returns much more factual and less colorful results such as “Security Protest Could Disrupt Thanksgiving Travel” and “Airline Unions Seek a Share of the Industry Gains.”

Google Instant has also started to modify the way people phrase their searches in real time, potentially reducing subtlety and cutting off the long tail of search. This feature gives you results before you finish your query and predicts what your search will be as you type. While this certainly gives you faster search results, it can also steer you away from the more specific query you set out to make. For example, if my intended search is “chicken recipes quick”, I will start by typing “chi.” These three letters immediately give me results for Chipotle. At this point I might abandon my initial search in favor of a quick chicken meal I can order in. If I do decide to continue with my initial search and type “chicken recipes,” Google Instant suggests “chicken recipes easy.” At this point I might abandon my full intended search of “chicken recipes quick” and click on “chicken recipes easy,” reasoning that this will give me better results anyway because Google suggested it.

The bigger question is: does any of this matter? Does a change in the structure and style of writing change the meaning or the content of what is written? Does Google Instant really prevent people from more nuanced searching? Or will it help them find the most relevant documents more quickly and with less effort?