i290-16: Research Seminar on Technology for Search and Social Media

Fall 2007
Prof. Marti Hearst

Course Description

Graduate seminar on research topics surrounding computational analysis of social media such as blogs, online profiles, and recommendation systems, and technology for advanced search engine applications. Students will suggest, read and discuss research papers in the field and will complete a substantial research project.

Projects

Coming soon!

Readings

For Sept 5: Personalization in Search

    Personalization in Search, Marti Hearst, to appear in Modern IR 2E, 2008.

For Sept 12: Tagging

For Sept 19: Wikipedia

    Cosley, D., Frankowski, D., Terveen, L., and Riedl, J. 2007. SuggestBot: using intelligent task routing to help people find work in wikipedia. In Proceedings of the 12th international Conference on intelligent User interfaces (Honolulu, Hawaii, USA, January 28 - 31, 2007). IUI '07. ACM Press, New York, NY, 32-41. http://doi.acm.org/10.1145/1216295.1216309

    Adler, B. T. and de Alfaro, L. 2007. A content-driven reputation system for the wikipedia. In Proceedings of the 16th international Conference on World Wide Web (Banff, Alberta, Canada, May 08 - 12, 2007). WWW '07. ACM Press, New York, NY, 261-270. DOI= http://doi.acm.org/10.1145/1242572.1242608

    Bellomi and Bonato, Network Analysis for Wikipedia

    wikirage.com - Wikirage lists the 100 most actively edited Wikipedia articles.

    Wiki Stats - A ton of Wikipedia statistics from different languages

For Sept 26: Recommender Systems

    Adomavicius, G. and Tuzhilin, A. Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. Knowledge and Data Engineering, IEEE Transactions on, 17(6):749, 2005. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1423975

    Good, N., Schafer, J.B., Konstan, J., Borchers, A., Sarwar, B., Herlocker, J., and Riedl, J., Combining Collaborative Filtering with Personal Agents for Better Recommendations. Proceedings of the 1999 Conference of the American Association of Artificial Intelligence (AAAI-99), pp 439-446. http://www.grouplens.org/papers/pdf/aaai-99.pdf

    McDonald, D. W. and Ackerman, M. S. 2000. Expertise recommender: a flexible recommendation system and architecture. In Proceedings of the 2000 ACM Conference on Computer Supported Cooperative Work (Philadelphia, Pennsylvania, United States). CSCW '00. ACM Press, New York, NY, 231-240. DOI= http://doi.acm.org/10.1145/358916.358994

For Oct 10: Anchor Texts and Tags

    Craswell, N., Hawking, D., and Robertson, S. 2001. Effective site finding using link anchor information. In Proceedings of the 24th Annual international ACM SIGIR Conference on Research and Development in information Retrieval (New Orleans, Louisiana, United States). SIGIR '01. ACM Press, New York, NY, 250-257. http://research.microsoft.com/users/nickcr/pubs/craswell_sigir01.pdf

    Eiron, N. and McCurley, K. S. 2003. Analysis of anchor text for web search. In Proceedings of the 26th Annual international ACM SIGIR Conference on Research and Development in information Retrieval (Toronto, Canada, July 28 - August 01, 2003). SIGIR '03. ACM Press, New York, NY, 459-460. http://www.mccurley.org/papers/anchor.pdf

    Kraft, R. and Zien, J. 2004. Mining anchor text for query refinement. In Proceedings of the 13th international Conference on World Wide Web (New York, NY, USA, May 17 - 20, 2004). WWW '04. ACM Press, New York, NY, 666-674. http://portal.acm.org/citation.cfm?id=988672.988763

    Optionally, read section 2 of this paper:

    Chakrabarti, S., Dom, B., Raghavan, P., Rajagopalan, S., Gibson, D., and Kleinberg, J. 1998. Automatic resource compilation by analyzing hyperlink structure and associated text. In Proceedings of the Seventh international Conference on World Wide Web 7 (Brisbane, Australia). P. H. Enslow and A. Ellis, Eds. Elsevier Science Publishers B. V., Amsterdam, The Netherlands, 65-74. http://www.cs.cornell.edu/Info/People/kleinber/www98-arc.pdf

For Oct 17: Question-Answering System

For Oct 24: Email and Chat Summarization

    Summarizing email conversations with clue words. Giuseppe Carenini, Raymond T. Ng, and Xiaodong Zhou. Proceedings of the 16th International Conference on World Wide Web, Banff, Alberta, Canada, 2007. http://portal.acm.org/citation.cfm?id=1242586

    Generating overview summaries of ongoing email thread discussions. Stephen Wan and Kathy McKeown. Proceedings of the 20th International Conference on Computational Linguistics, Geneva, Switzerland, 2004. http://portal.acm.org/citation.cfm?id=1220434

    Digesting virtual "geek" culture: the summarization of technical internet relay chats. Liang Zhou and Eduard Hovy. Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, Ann Arbor, Michigan, 2005. http://portal.acm.org/citation.cfm?id=1220434

For Oct 31: Text Similarity

    Hatzivassiloglou, V., Klavans, J.L., Eskin, E. (1999). Detecting text similarity over short passages: exploring linguistic feature combinations via machine learning. In Proceedings of empirical methods in natural language processing and very large corpora EMNLP'99. MD, USA. http://www1.cs.columbia.edu/~vh/Papers/1999/SimFinder-EMNLP.pdf

    T. Pedersen, S. Patwardhan, and J. Michelizzi. WordNet::Similarity - measuring the relatedness of concepts. In Proceedings of the Nineteenth National Conference on Artificial Intelligence (AAAI-04), 2004. http://acl.ldc.upenn.edu/N/N04/N04-3012.pdf

    C.D. Corley and R. Mihalcea. Measuring the Semantic Similarity of Texts, in Proceedings of the ACL 2005 Workshop on "Empirical Modeling of Semantic Equivalence and Entailment", Ann Arbor, MI. June, 2005. http://acl.ldc.upenn.edu/W/W05/W05-1203.pdf

For Nov 7: Active Learning

For Nov 21: Tagging

    Farooq et al. (2007). Evaluating tagging behavior in social bookmarking systems: metrics and design heuristics. ACM Proceedings of the International GROUP Conference on Supporting Group Work (Sanibel Island, Florida, November 4-7, 2007), In Press. http://cscl.ist.psu.edu/public/users/ufarooq/Publications/GROUP07-EvaluationOfCiteulike.pdf

    Noll, M. G. and Meinel, C. 2007. Authors vs. readers: a comparative study of document metadata and content in the www. In Proceedings of the 2007 ACM Symposium on Document Engineering (Winnipeg, Manitoba, Canada, August 28 - 31, 2007). DocEng '07. ACM, New York, NY, 177-186. http://doi.acm.org/10.1145/1284420.1284465

    Boydell, O. and Smyth, B. 2007. From social bookmarking to social summarization: an experiment in community-based summary generation. In Proceedings of the 12th international Conference on intelligent User interfaces (Honolulu, Hawaii, USA, January 28 - 31, 2007). IUI '07. ACM, New York, NY, 42-51. http://doi.acm.org/10.1145/1216295.1216311

For Nov 28: Large Scale Systems

    J. Dean and S. Ghemawat, MapReduce: Simplified Data Processing on Large Clusters, Proc. 6th Symp. Operating System Design and Implementation (OSDI), Usenix Assoc., 2004, pp. 137-150.

    Q. Su, D. Pavlov, J. Chow, and W. Baker. Internet-Scale Collection of Human-Reviewed Data. Proceedings of WWW 2007, May 2007