Data Mining, Analytics, and Information Extraction in Intelligent Business Services: Online Ads, Healthcare, and Service Centers

Date

Topic

Related Techniques

Required Readings

Optional Readings

Homework

Guest Speakers

Lecture 1

Jan 19

Introduction

Lecture 1a

Lecture 1b

Lecture 1-2011-01-19-Final

 

 

 

 

 

Lecture 2

Jan 26

Lecture 2

Financial Services

Prediction of House Prices

Health Services: Weight Prediction

- Linear Prediction

- Bayesian Logistic Regression (if time permits)

TSK Ch 4

Linear Regression

LGC Ch 3

Prediction methods for babies' birth weight using linear and nonlinear regression analysis (Etikan and Kazim)

 

 

Lecture 3

Feb 2

Lecture 3

R_Examples

Financial Services: Fraud Detection

Online Purchase Probabilities

 

- Linear Prediction

- Logistic Regression

 

Logistic Regression

BB Ch 1,2,3,4

LGC Ch 4

Using Latent Semantic Indexing to Filter Spam 

 

 

Lecture 4

Feb 9

Lecture 3

 

- Logistic Regression

 

 

 

 

Lecture 5

Feb 16

Lecture 4

Loan Approval

Social Networks Eigenrumor Detection

Health Services: Cancer Identification

- Nearest Neighbor

- Naïve Bayes

 

 

 

 

Lecture 6

Feb 23

Lecture 4

CrowdScience Project

- Naïve Bayes

The Optimality of Naive Bayes

 

 

CrowdScience.com

Lecture 7

Mar 2

Jenny Presentation

Why Naïve Bayes works

- Naïve Bayes

 

 

 

 

Lecture 8

Mar 9

Lecture 4b

Text Mining using SVD

Search Engines

Claritics Research Project

- Indexing

- Space based search engine

- SVD

- LSI

 

Probabilistic Principal Component Analysis (Tipping & Bishop)

Sensitivity of PCA for Traffic Anomaly Detection (Ringberg)

Using Latent Semantic Indexing to Filter Spam

Topic Identification with soft clustering using PCA and ICA (Zhukouv)

PCA EX1

PCA EX2

 

LawPivot.com

Lecture 9

Mar 16

Lecture 6a

Market Segmentation

- Clustering

 

 

 

Claritics.com

Mar 23

Spring Break

 

 

 

 

 

Lecture 10

Mar 30

Lecture 7a

Lecture 7b

Lecture 7c

(attributed to William Cohen, CMU)

Entity Extraction

- Introduction

- Named Entity Recognition (NER)

Information Extraction, S. Sarawagi, FnT Databases, 1(3), 2008.

Information Extraction: Distilling Structured Data from Unstructured Text, McCallum, ACM Queue 2005.

 

 

- Introduction

- Named Entity Recognition (NER)

Lecture 11

Apr 6

Lecture 6b

Lecture 6c

Recommender System

Item-based

User-based

Hybrid

Nearest Neighbor

 

Factor in the Neighbors: Scalable and Accurate Collaborative Filtering

 

 

Lecture 12

Apr 13

Recommender System

Latent Semantic Analysis

Recommender System

Collaborative Filtering

Stochastic Gradient Descent

 

Matrix Factorization Techniques for Recommender Systems

Collaborative Filtering with Temporal Dynamics

 

 

Lecture 13

Apr 20

 

 

 

 

 

 

 

Lecture 14

Apr 27