Resources:
Google
Google Scholar
CiteSeer
DBLP Bibliography
Course Information:
Instructor: Chengkai Li
TA: Ning Yan
Course Description: This is an introductory course on data mining. Data mining refers to the automatic discovery of patterns and knowledge from large data repositories, including databases, data warehouses, the Web, document collections, and data streams. We will study the basic topics of data mining, including data preprocessing, data warehousing and OLAP, data cubes, frequent pattern and association rule mining, correlation analysis, classification and prediction, and clustering, as well as advanced topics covering the techniques and applications of data mining for Web and text data.
Prerequisites: CSE 3330/5330 Database Systems I or CSE 4331/5331 Database Systems II or similar courses or consent of instructor
Announcements: Stay tuned and make sure to check WebCT frequently. Important announcements will be posted there.
Regrading: Regrading requests must be made within 7 days after we post scores on WebCT. The TA will handle regrade requests. If you are not satisfied with the regrading results, you have 7 days to request again. The instructor will then regrade, and that decision is final.
WebCT: Log in to the WebCT page http://www.uta.edu/webct with your NetID and password. We use WebCT for: (1) Announcements; (2) Assignment Submission; (3) Discussion Group; (4) Releasing materials, assignments, scores and grades. Follow these steps exactly during electronic assignment submission.
Ethics Policies and Academic Integrity: The College cannot and will not tolerate any form of academic dishonesty by its students. This includes, but is not limited to, cheating on examinations, plagiarism, or collusion (explained in the document below). Students are required to read the following document carefully, sign it, return the signed copy to the instructor, and keep a copy for their own records. Hard copies of this document will be provided to students in the first class and can also be picked up in the instructor's office. If you print it yourself, please print it double-sided.
Statement on Ethics, Professionalism, and Conduct for Engineering Students
Miscellaneous: If you require an accommodation based on a disability, I would like to meet with you in the privacy of my office during the first week of the semester to ensure that you are appropriately accommodated. Please read the web page of the Office for Students with Disabilities.
Schedule:
Date | # | Lecture | Assignment Out | Assignment Due | Lecture Notes
08/26 | 1 | Course Overview | | | [PDF]
08/31 | 2 | Introduction (Chapter 1) | | | [PDF]
Data Warehousing, OLAP, Data Cube (Chapter 3, 4)
09/02 | 3 | Data Warehousing and OLAP | HW1 | | [PDF]
09/07 | 4 | Data Cube | | |
Classification and Prediction (Chapter 6)
09/09 | 5 | Decision Tree | | | [PDF]
09/14 | 6 | Decision Tree | | |
09/16 | 7 | Evaluating Classification Models | P1 | HW1 | [PDF]
09/21 | 8 | Evaluating Classification Models | | |
09/23 | 9 | Bayesian Classifiers | | | [PDF]
09/28 | 10 | | | | [PDF]
09/30 | 11 | Support Vector Machine | HW2 | | [PDF]
Frequent Pattern and Association Rule Mining (Chapter 5)
10/05 | 12 | Association Rule Mining | | | [PDF] [PPT]
10/07 | 13 | Correlation Analysis | | | [PDF] [PPT]
10/12 | 14 | Data, Data Quality, Data Preprocessing | | HW2 |
10/14 | | Midterm Exam (Thursday, Oct. 14th, 2:00pm-3:20pm, WH210) | | |
Clustering (Chapter 7)
10/19 | 15 | Overview of Clustering, Similarity/Dissimilarity Measure | | | [PDF] [PPT]
10/21 | 16 | | | P1 (Due at Oct. 24) |
10/26 | 17 | K-means | | | [PDF] [PPT]
10/28 | 18 | K-means | | |
11/02 | 19 | Hierarchical | | | [PDF] [PPT]
11/04 | 20 | Hierarchical | P2, HW3 | |
Text and Web Mining
11/09 | 21 | Vector Space Model | | | [PDF]
11/11 | 22 | | | | [PDF]
11/16 | 23 | Document Clustering | | | [PDF]
11/18 | 24 | MapReduce | P3 | HW3 (Due at Nov. 19) | [PDF] [PPT]
11/23 | 25 | MapReduce | | P2 |
11/25 | | Thanksgiving Holidays | | |
11/30 | 26 | MapReduce | | |
12/02 | 27 | Link Analysis: PageRank | | | [PDF]
12/07 | 28 | Link Analysis (cont'd) | | |
12/09 | 29 | Final Review | | P3 (Due at Dec. 11) | [PDF]
12/14 | | Final Exam (Tuesday, Dec. 14th, 2:00pm-4:30pm, WH210) | | |
University calendar: Fall 2010