i290 Social Computing

Breaking News

  • November 30, 2011.
  • November 29, 2011.
    • Project presentations on 12/9
    • Project reports due at midnight on 12/16
  • October 28, 2011.
    • Midterm review report updated on Class Projects page.
    • The list of class projects is available here. Please use the test credentials to edit your team's page and upload relevant documents, using the namespace “teamX:<File>” when uploading.
  • October 17, 2011. The midterm class project report is due on Nov. 1. More information about the report will be provided shortly.
  • September 23, 2011
    • Yelp datasets are available for academic use. See the links on the “Data” page.
    • Project proposal presentations completed.
  • September 21, 2011
    • Google Doc updated with project presentation time slots for 09/23.
  • September 13, 2011
    • Project group list Google Doc link updated under the “Class Projects” section of the website. Please ensure you are part of a group by Friday (Sep 16, 2011) before 1 pm.
    • New section added to the blog: Data (useful reference links relevant to the course).
  • September 8, 2011
  • September 7, 2011
  • September 6, 2011
  • September 1, 2011.
    • Download R and the related packages described here.
    • Mr. Natarajan Chakrapani is the course tutor. His office hour is Friday at 4:00 pm (immediately after class) in classroom 202, South Hall.
    • The regular class will start at 1:10 pm in 202 South Hall.
  • August 27, 2011. O'Reilly books are freely available online for Berkeley students through safaribooksonline. You must be on AirBears or connected to the campus VPN to have access.
  • August 27, 2011. Please check the Forums section for more tips on installing Python packages that were used in the tutorial.
  • August 16, 2011. The course website is now live.

Course Description

This course introduces fundamental and applied computational techniques for studying the collaborative and collective intelligence of group behavior on the Internet. The emphasis is on data mining and knowledge discovery over the social interactions, signals, and data that are byproducts of social media services such as search engines, social network sites, blogs, micro-blogs, and wikis. Course topics include, but are not limited to: web data mining, knowledge discovery on the web, web analytics, web information retrieval, ranking algorithms, recommender systems, human computation, models and theories of social networks, large-graph and link-based algorithms, social marketing, monetization of the web, and security/privacy issues related to social computing.

Course Information

  • Term: Fall 2011
  • Units: 2 to 3 units depending on the project
  • Instructor: Irwin King
  • Teaching Assistant: Natarajan Chakrapani
  • Time: F 1-3 (Lab: 3-4); F 4-5 (TA's office hour)
  • Location: 202 South Hall
  • CCN: 42619
  • Class notes: here

Prerequisites

Basic computer science principles and skills. Python programming skills. Familiarity with basic statistics, graph theory, and linear algebra.

Requirements

The coursework consists of reading assignments, homework assignments, and a class project. The class project is the major component of the course; you are expected to prepare and present an initial proposal, give a final presentation, and submit your code and a final report.

Syllabus

| Week | Date | Topics | Tutorials |
| 1 | 8/26/11 | Introduction to Social Computing | Python basics and APIs |
| 2 | 9/2/11 | Social Network Theory | R Basics, set-up, basic operations (Ritesh Agrawal) |
| 3 | 9/9/11 | Graph Theory and Mining | Web Crawler with Python, basic concepts, issues, examples (Nate Murray) |
| 4 | 9/16/11 | Community Detection (Lei Tang, Yahoo! Research) | R Advanced, packages, graphics, statistics, machine learning, data sets, etc. (Ritesh Agrawal) |
| 5 | 9/23/11 | Project Discussion | Project Discussion |
| 6 | 9/30/11 | Learning and Learning to Rank (Jean-Francois Paiement and David Grangier, ATT Labs Research) | GBDT and other learning methods using R and Python (Jean-Francois Paiement and David Grangier) |
| 7 | 10/7/11 | Sentiment Analysis and Opinion Mining (Bo Pang, Yahoo! Research) | NLTK |
| 8 | 10/14/11 | Recommender Systems, Social Recommendation, Query Recommendation | Recommender Systems |
| 9 | 10/21/11 | Social Media in Education (Bebo White, Stanford Linear Accelerator Lab) | PageRank, HITS, etc. |
| 10 | 10/28/11 | Human Computation/Crowdsourcing | Crowdsourcing/Human Computation |
| 11 | 11/4/11 | Facebook (Lars Backstrom, Facebook) | Midterm project updates |
| 12 | 11/11/11 | Public Holiday | Public Holiday |
| 13 | 11/18/11 | Q&A, cQA, DeepQA, etc. | Information Extraction |
| 14 | 11/25/11 | Public Holiday | Public Holiday |
| 15 | 12/2/11 | Social Monetization | Course Review |
| 16 | 12/9/11 | Wrap-up/Presentations | Project Presentation |

References

  1. H. Marmanis and D. Babenko, Algorithms of the Intelligent Web, 1st ed. Manning Publications, 2009.
  2. S. Alag, Collective Intelligence in Action. Manning Publications, 2008.
  3. L. Tang and H. Liu, “Community Detection and Mining in Social Media,” Synthesis Lectures on Data Mining and Knowledge Discovery, vol. 2, pp. 1-137, Jan. 2010.
  4. D. J. Cook and L. B. Holder, Mining Graph Data, 1st ed. Wiley-Interscience, 2006. [2-hour loan from UCB Library]
  5. M. A. Russell, Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites, 1st ed. O’Reilly Media, 2011. [Available via Safaribooksonline for Cal students]
  6. S. Chakrabarti, Mining the Web: Discovering Knowledge from Hypertext Data, 1st ed. Morgan Kaufmann, 2002. [2-hour loan from UCB Library]
  7. T. G. Lewis, Network Science: Theory and Applications, 1st ed. Wiley, 2009. [2-hour loan from UCB Library]
  8. D. Easley and J. Kleinberg, Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press, 2010.
  9. B. Pang and L. Lee, Opinion Mining and Sentiment Analysis (Foundations and Trends in Information Retrieval). Now Publishers Inc, 2008.
  10. C. M. Bishop, Pattern Recognition and Machine Learning, 1st ed. 2006. Corr. 2nd printing ed. Springer, 2007.
  11. T. Segaran, Programming Collective Intelligence: Building Smart Web 2.0 Applications, 1st ed. O’Reilly Media, 2007. [Available via Safaribooksonline for Cal students]
  12. J. Adler, R in a Nutshell: A Desktop Quick Reference, 1st ed. O’Reilly Media, 2010. [Available via Safaribooksonline for Cal students]
  13. B. Liu, Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, 2nd ed. Springer, 2011. [2-hour loan from UCB Library]

Notes

  1. O'Reilly books are freely available online for Berkeley students through safaribooksonline. You must be on AirBears or connected to the campus VPN to have access.
Except where otherwise noted, content on this wiki is licensed under the following license: GNU Free Documentation License 1.3