Data Science and Analytics
Assignment 1. Handed out: 1/23/2012. Due: 1/24/2012.
A. INFO296A-2: Thought Leaders in Data Science and Analytics Seminar
1. The Speaker List and Schedule are posted on the course website: http://courses.ischool.berkeley.edu/i296a-dsa/s12/ (we may make a few changes or adjustments later). Firms represented by executives/researchers:
   - Internet/Social Networks: Google, Facebook, Yahoo, eBay, Claritics, Salesforce.com, BlueKai Data Exchange
   - Enterprise/Internet: TCS, SAP, Procter & Gamble, Deloitte, Wells Fargo, Bank of America
   - In process: Microsoft, AOL, Kaiser
2. Catching up on Data Mining:
   i) PhDs: I will assume that you have some prior background in machine learning, data mining, and/or probability, statistics, etc. If you do not, follow the Masters guidance below.
   ii) Masters - Detailed learning:
      - Please consider INFO290 Data Mining and Analytics (Fri 1-4 pm): http://courses.ischool.berkeley.edu/i290-dma/s12/doku.php
      - Review on your own the book by Tan, Steinbach, Kumar (PPTs, Weka software, etc.): http://www-users.cs.umn.edu/~kumar/dmbook/index.php
   iii) Masters - Less detailed learning:
      - Review on your own the book by Tan, Steinbach, Kumar (PPTs, Weka software, etc.): http://www-users.cs.umn.edu/~kumar/dmbook/index.php and/or
      - Attend the first half of INFO290 Data Mining and Analytics (clustering, prediction, classification, etc.) and/or
      - Contact Jimi Shanahan for assistance with basic material, code, etc.
3. Homework: Recall the requirement of a weekly submission, due by Tuesday, 11 pm, of a given week:
   i) Bullet points on the talk just past, and/or related to the monthly and reports you will submit, building towards the logic and conclusions of the report.
   ii) Other bullet points should cover: interesting points or highlights of the lecture or discussion, and what is new. Identify the user need for analytics (in a given area), and the mix and emphasis of solutions based on:
      1. "Commodity" analytics
      2. Novel combinations of "commodity" analytics
      3. Development of new theory based on need
      4. New directions and approaches
      5. New business models
   iii) Answers to a few simple questions Jimi will ask based on Lecture 1.
B. INFO296A-3: Advanced Project Course on Data Science and Analytics
   - The objective is to develop a research paper or methodology for a conference or journal paper.
   - Please send an email saying whether you would like to be supplied with a real-world problem from Silicon Valley or would prefer to discuss a problem which you have already considered.
   - Topics include data/text/image/video mining, event detection, information retrieval, relevance feedback, active learning, recommender systems, content-based retrieval, multimedia information extraction and retrieval, optimization, stochastic dynamic programming, and reinforcement learning.
   - Please submit a two-paragraph verbal description of your problem (unless you would like one from industry).
   - Please submit any modeling which looks reasonable, to start with, and the associated analysis; the verbal model, the mathematical model, and the mathematical analysis must all be carefully thought out.
   - Please make sure that your inputs are sent via email to me by Tuesday night, 10 pm. We will set up a Dropbox to share research papers etc.
ISchool 296A Spring 2012