Blogs

Buffalo

The sentence “Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo” is a grammatically and semantically valid sentence in English and a great example of the challenges homophony presents for IR. Although a search engine would index this as 8 instances of the same word, there are actually three variations of “buffalo”:

Page Rank for Social Web?

Google revolutionized search when it introduced the concept of PageRank for web pages. Pages that were "legitimate," so to speak, were given more weight than others, and this started the prolific trend of SEO companies that we see today. Making a website more "friendly" so that the spider takes notice of it, and increasing the number of backlinks to it through numerous link-building measures, only seemed to add to the choreographed process of Search Engine Optimization.

Dimensionality Reduction - Reduce this...

While reading "Patterns in Unstructured Data," one of the readings for the topic Dimensionality Reduction, I remembered the sentence: "James while John had had had had had had had had had had had a better effect on the teacher." I had read about this sentence many years back, and reading about Latent Semantic Indexing brought it back.

Computer vs. Human Essay Graders

After Bob's LSA lecture last year, I sent an alarmed e-mail to my mom, who has occasionally worked as a grader for the SAT and other standardized tests. I blogged parts of her thoughts about computer grading, and I'm copying the best of them below:

***

Grammatical Structures

In today's lecture, Bob mentioned the differences between traditional linguistic and statistical approaches to language processing. In my undergraduate linguistics major, I studied both syntax and "grammar engineering," for which we had to construct representations like the one below. If anyone ever had to diagram sentences in middle or high school, it's kind of like that on steroids:

Feature Structure

Automated essay scoring

Bob mentioned the use of Latent Semantic Analysis (LSA) for automated essay scoring. Surprisingly, the scores of the tool (IEA) agreed with human experts (that is, were as good and as bad) "as accurately as expert scores agreed with each other". Had I known this before, I probably would have spent more time doing text analysis of "good essays" than writing sample essays for my GRE test!

If you're interested in this stuff, take a look at the paper here: http://www-psych.nmsu.edu/~pfoltz/reprints/Edmedia99.html

How Caesar encrypted his messages...

Having talked about the Enigma encryption algorithm and the analysis of language, it is interesting to note that Caesar already used a character-shift encryption method to secure his messages from unauthorized readers. In response, his enemies started to analyze language and figured out the frequency of characters in the language (as Bob has mentioned several times). In that sense, the analysis of language goes all the way back to Caesar... http://en.wikipedia.org/wiki/Caesar_cipher
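The character shift and the frequency analysis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `caesar_shift` and `guess_shift` are made up for this example, the alphabet is limited to the 26 unaccented English letters, and the attack assumes the most frequent ciphertext letter stands for "e", which tends to hold only for reasonably long English text.

```python
import string
from collections import Counter


def caesar_shift(text, shift):
    """Shift each ASCII letter by `shift` positions, wrapping around the alphabet."""
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            # Spaces, digits and punctuation pass through unchanged.
            out.append(ch)
    return "".join(out)


def guess_shift(ciphertext):
    """Guess the shift by assuming the most frequent letter encodes 'e' (an assumption)."""
    letters = [ch for ch in ciphertext.lower() if ch in string.ascii_lowercase]
    if not letters:
        return 0
    most_common = Counter(letters).most_common(1)[0][0]
    return (ord(most_common) - ord("e")) % 26


plaintext = "meet me here at seven"
ciphertext = caesar_shift(plaintext, 3)
recovered = caesar_shift(ciphertext, -guess_shift(ciphertext))
```

On a message this short the guess works only because "e" happens to dominate; on other short texts the most frequent letter may well be something else, and an attacker would simply try all 25 possible shifts.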

Beginning of the end for XML?

 http://blog.programmableweb.com/2010/11/10/twitter-goes-json-only-with-one-api-more-to-come/

Twitter has gone JSON-only for one of its APIs, and the blogosphere is abuzz with speculation that this is the start of Twitter's transition from XML/JSON to JSON only.

Goggles - Google Smartphone app for image search

Source: http://www.nytimes.com/2010/11/16/business/media/16adco.html?_r=1&ref=te...

Going beyond the traditional search box for information retrieval with richer user interfaces, such as voice search or visual search, is not a new idea.

A year ago, Google introduced a smartphone application that lets users take photos of objects and get search results in return. Now it is testing the waters by working with five national brands to see how consumers interact with a brand.

Substituting Information for Interaction

I recently wrote (with Karen Nomorosa, Ischool 2010 grad) a  paper called "Substituting Information for Interaction" that some of you have talked to me about so I thought I'd share it with everyone.  (The paper has some relevance to the "tradeoffs" we've talked about a lot in 202 but is more relevant to the other course I teach this semester on "Information Systems and Service Design.")  The big idea in the paper is to reframe a lot of design decisions in information systems and services as tradeoffs between interacting with a user/customer to obtain needed information or us
