Jessica did a wonderful job with the Shazam / Midomi demo.
For those interested, here's a good article that compares the feature sets of Shazam and Midomi...
http://www.theiphoneblog.com/2008/11/05/app-app-shazam-midomi/
Something mentioned in passing near the end of today's lecture reminded me of something I saw demonstrated at an Apple keynote a decade or so ago. The code name was V-Twin (technically the Apple Information Access Toolkit); it allowed you to paste in text and then use a slider to whittle down the text all the way to "key terms". It even showed the results live as you moved the slider -- pretty cool.
The magic behind it? Just what we've been learning: tf, idf, dimensionality reduction...
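For anyone who wants to play with the idea, here is a minimal sketch of a tf-idf "slider", assuming a tiny made-up background corpus for the idf part. This is not how V-Twin actually worked (I don't know its internals), and it skips dimensionality reduction entirely; the function names and the `keep` parameter are my own inventions for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_scores(doc, corpus):
    """Score each distinct term in doc as tf * idf.

    idf is computed over the background corpus plus doc itself,
    so every term in doc has a document frequency of at least 1.
    """
    docs = [tokenize(d) for d in corpus + [doc]]
    n_docs = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    tf = Counter(docs[-1])
    return {t: count * math.log(n_docs / df[t]) for t, count in tf.items()}


def key_terms(doc, corpus, keep=0.5):
    """Return doc's words, in original order, restricted to the
    top `keep` fraction of distinct terms by tf-idf score.

    `keep` plays the role of the slider: 1.0 keeps everything,
    values near 0 keep only the highest-scoring term(s).
    """
    scores = tfidf_scores(doc, corpus)
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    n_keep = max(1, round(len(ranked) * keep))
    kept = set(ranked[:n_keep])
    return [w for w in tokenize(doc) if w in kept]


# Hypothetical toy data, just to show the slider in action.
corpus = ["the cat sat on the mat", "the dog chased the cat"]
doc = "the cat played the piano"
for keep in (1.0, 0.5, 0.25):
    print(keep, key_terms(doc, corpus, keep))
```

Words that show up in every document ("the", "cat") get an idf of zero, so they are the first to disappear as the slider moves down.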
Whoops, meant to post this when we were talking about LSA. As I mentioned in class, my mom's occasionally graded papers for various standardized testing services. She once mentioned to me that most of the guidelines for the human readers actually focused on structure, not content: things like, "Does this essay have a topic sentence? Are there supporting details?"
Turns out this quote seems to have been misquoted all along. It's a couple thousand years older than that...
"For the mind does not require filling like a bottle, but rather, like wood, it only requires kindling to create in it an impulse to think independently and an ardent desire for the truth." —Plutarch, "On Listening to Lectures". (link)
And a closer translation from Penguin Classics:
I find it ironic that while the key to good IO is separating the content from the presentation, it seems that conversely, the key to good IR is combining the content with the presentation. That explains why it's so hard and frustrating to make IO and IR work in harmony for documents in the wild - if Shakespeare were alive today, this is what he would be writing about. :-)
Sarah Palin's book, Going Rogue, came out last week, but the blogosphere quickly noticed one missing element: an index. Not willing to give up a golden opportunity to carefully evaluate the content of Ms. Palin's book, several blogs and publications created their own indexes for her masterpiece, including Huffington Post, The New Republic, and Slate.
Developed by Karen (2010), Julian (2011), Satish (2011)
Just announced: speedi.ly, a real-time text/URL classifier. The API isn't available yet, but you can play with it via the web site. It looks like it's based on vector analysis against a set of generic topic documents, and could be handy for spring semester projects. I just threw the first page of the 202 blog at it with the following result:
Just heard about this project, the total scope of which seems outside of 202 (but is still fascinating): the MediaBugs project (http://www.pbs.org/idealab/2009/11/how-do-we-categorize-all-journalistic...) hopes to be a fact- and system-checking process similar to bug tracking in software development.
The most 202-ish aspect of this is their call for help in categorization:
"One of the challenges of info pros has been to use the structured information-retrieving and -filtering tools, which really do require sequential, left-brained thinking, while simultaneously thinking creatively and intuitively about the entire spectrum of information sources and features, which requires right-brained analysis. It sort of feels like I'm trying to solve a quadratic equation while playing the piano."
When reading this article I had a hard time with the author's assumption that search-engine users should be engaging in "deliberative debate" by default. Many searches are conducted with the simple goal of answering a question, or getting very general information on a topic. If I search for "flowers," I'm asking about a very, very big topic. We've previously learned that users rarely pass the first page of results. In the ten hits my search engine returns, should several of them be devoted to fringe controversies involving flowers? I think not.
Tobacco companies are avoiding hundreds of millions of dollars a year in taxes by altering the categories their products fall under. Please refer to the following link for further information:
Link: http://www.huffingtonpost.com/2009/11/17/tobacco-companies-using-l_n_360...
Doesn't it look similar to the "potato chips" case? (reading for L2)
- Dhawal
Officials at the New Oxford American Dictionary, in an armchair moment, have captured some of our popular words of 2009. "Twitterisms," unsurprisingly, is a 'notable word cluster' but fails to win the top word. Sure, language isn't static, but who should be the authority to rank 'tweeting' above 'unfriend'? *grumble*
Top Word of 2009: Unfriend, But Twitterisms Abound
-joan
A recent article from the BBC, "Great Writers 'Fail' Online Test," notes how famous literary pieces by the likes of Winston Churchill and Ernest Hemingway scored poorly when they were graded by computers. The article doesn't get into the details of how the automatic grading system works, but it does mention that the system cannot pick up on human emotion and the subtleties of language.