Fall 2014 CSE 5334/4334 Data Mining

Course Information

Instructor: Naeemul Hassan
  • Office hours: Tue/Thu 10:00am-12:00pm
  • Office: ERB 509
  • Phone: (817) 437-4518
  • E-mail: naeemul DOT hassan AT mavs DOT uta DOT edu
  • Homepage: http://idir.uta.edu/~naeemul

TA: TBA
  • Office hours: TBA
  • Office: TBA
  • Phone: TBA
  • E-mail: TBA
  • Homepage: TBA

Course Description

This is an introductory course on data mining. Data mining refers to the automatic discovery of patterns and knowledge from large data repositories, including databases, data warehouses, the Web, document collections, and data streams. We will study the basic topics of data mining, including data preprocessing, data warehousing and OLAP, data cubes, frequent pattern and association rule mining, correlation analysis, classification and prediction, and clustering, as well as advanced topics covering the techniques and applications of data mining for the Web, text, big data, social networks, and computational journalism.

Student Learning Outcomes

A solid understanding of the basic concepts, principles, and techniques in data mining; an ability to analyze real-world applications, to model data mining problems, and to assess different solutions; an ability to design, implement, and evaluate data mining software.

Prerequisites

Textbook

Grades

The final letter grades will be based on a curve of students' performance.

Attendance

Students are highly encouraged to attend lectures.

Announcements

Stay tuned and make sure to check Blackboard frequently. Important announcements will be posted there.

Assignments and Deadlines

Regrading

Regrade requests must be made within 7 days after scores are posted on Blackboard. The TA will handle regrade requests. If you are not satisfied with the regrading result, you have 7 days to request another regrade. The instructor will then regrade, and that decision is final.

Drop Policy

Students may drop or swap (adding and dropping a class concurrently) classes through self-service in MyMav from the beginning of the registration period through the late registration period. After the late registration period, students must see their academic advisor to drop a class or withdraw. Undeclared students must see an advisor in the University Advising Center. Drops can continue through a point two-thirds of the way through the term or session. It is the student's responsibility to officially withdraw if they do not plan to attend after registering. Students will not be automatically dropped for non-attendance. Repayment of certain types of financial aid administered through the University may be required as the result of dropping classes or withdrawing. For more information, contact the Office of Financial Aid and Scholarships (http://wweb.uta.edu/ses/fao).

Americans with Disabilities Act

The University of Texas at Arlington is on record as being committed to both the spirit and letter of all federal equal opportunity legislation, including the Americans with Disabilities Act (ADA). All instructors at UT Arlington are required by law to provide "reasonable accommodations" to students with disabilities, so as not to discriminate on the basis of that disability. Any student requiring an accommodation for this course must provide the instructor with official documentation in the form of a letter certified by the staff in the Office for Students with Disabilities, University Hall 102. Only those students who have officially documented a need for an accommodation will have their request honored. Information regarding diagnostic criteria and policies for obtaining disability-based academic accommodations can be found at www.uta.edu/disability or by calling the Office for Students with Disabilities at (817) 272-3364.

Academic Integrity

All students enrolled in this course are expected to adhere to the UT Arlington Honor Code:

I pledge, on my honor, to uphold UT Arlington’s tradition of academic integrity, a tradition that values hard work and honest effort in the pursuit of academic excellence. I promise that I will submit only work that I personally create or contribute to group collaborations, and I will appropriately reference any work from other sources. I will follow the highest standards of integrity and uphold the spirit of the Honor Code.

Instructors may employ the Honor Code as they see fit in their courses, including (but not limited to) having students acknowledge the Honor Code as part of an examination or requiring students to incorporate the Honor Code into any work submitted. Per UT System Regents’ Rule 50101, §2.2, suspected violations of the university’s standards for academic integrity (including the Honor Code) will be referred to the Office of Student Conduct. Violators will be disciplined in accordance with University policy, which may result in the student’s suspension or expulsion from the University.

Student Support Services

UT Arlington provides a variety of resources and programs designed to help students develop academic skills, deal with personal situations, and better understand concepts and information related to their courses. Resources include tutoring, major-based learning centers, developmental education, advising and mentoring, personal counseling, and federally funded programs. For individualized referrals, students may visit the reception desk at University College (Ransom Hall), call the Maverick Resource Hotline at 817-272-6107, send a message to resources@uta.edu, or view the information at www.uta.edu/resources.

Electronic Communication

UT Arlington has adopted MavMail as its official means to communicate with students about important deadlines and events, as well as to transact university-related business regarding financial aid, tuition, grades, graduation, etc. All students are assigned a MavMail account and are responsible for checking the inbox regularly. There is no additional charge to students for using this account, which remains active even after graduation. Information about activating and using MavMail is available at http://www.uta.edu/oit/cs/email/mavmail.php.

Student Feedback Survey

At the end of each term, students enrolled in classes categorized as lecture, seminar, or laboratory shall be directed to complete a Student Feedback Survey (SFS). Instructions on how to access the SFS for this course will be sent directly to each student through MavMail approximately 10 days before the end of the term. Each student’s feedback enters the SFS database anonymously and is aggregated with that of other students enrolled in the course. UT Arlington’s effort to solicit, gather, tabulate, and publish student feedback is required by state law; students are strongly urged to participate. For more information, visit http://www.uta.edu/sfs.

Final Review Week

A period of five class days prior to the first day of final examinations in the long sessions shall be designated as Final Review Week. The purpose of this week is to allow students sufficient time to prepare for final examinations. During this week, there shall be no scheduled activities such as required field trips or performances; and no instructor shall assign any themes, research problems or exercises of similar scope that have a completion date during or following this week unless specified in the class syllabus. During Final Review Week, an instructor shall not give any examinations constituting 10% or more of the final grade, except makeup tests and laboratory examinations. In addition, no instructor shall give any portion of the final examination during Final Review Week. During this week, classes are held as scheduled. In addition, instructors are not required to limit content to topics that have been previously covered; they may introduce new concepts as appropriate.

Emergency Exit Procedures

Should we experience an emergency event that requires us to vacate the building, students should exit the room and move toward the nearest exit, which is located right outside the door. When exiting the building during an emergency, one should never take an elevator but should use the stairwells. Faculty members and instructional staff will assist students in selecting the safest route for evacuation and will make arrangements to assist handicapped individuals.

Schedule

As the instructor for this course, I reserve the right to adjust this schedule in any way that serves the educational needs of the students enrolled in this course. –Naeemul Hassan

Date, #, Lecture (Assignment: Out / Due)

Thu, Aug 21, 2014 1 Course Overview
Tue, Aug 26, 2014 2 Introduction (Chapter 1)

Data Warehousing, OLAP, Data Cube (Chapter 3, 4)
Thu, Aug 28, 2014 3 Data Warehousing, OLAP, Data Cube
Tue, Sep 02, 2014 4 Cancelled, To be Rescheduled
Thu, Sep 04, 2014 5 Cancelled, To be Rescheduled
Tue, Sep 09, 2014 6 Data Warehousing, OLAP, Data Cube
Thu, Sep 11, 2014 7 Data Warehousing, OLAP, Data Cube

Classification and Prediction (1) (Chapter 6)
Tue, Sep 16, 2014 8 Decision Tree
Thu, Sep 18, 2014 9 Decision Tree (continued) (Out: HW1)
Tue, Sep 23, 2014 10 Course Project
Thu, Sep 25, 2014 11 Bayesian Classifiers
Tue, Sep 30, 2014 12 Bayesian Classifiers (continued)

Text and Web Mining (1)
Thu, Oct 02, 2014 13 Vector Space Model (Due: HW1)
Tue, Oct 07, 2014 14 Document Classification/Document Clustering (Out: P1)
Thu, Oct 09, 2014 Midterm Exam

Classification and Prediction (2) (Chapter 6)
Tue, Oct 14, 2014 15 Nearest Neighbor Classifiers
Thu, Oct 16, 2014 16 Evaluating Classification Models
Tue, Oct 21, 2014 17 Evaluating Classification Models (continued) (Out: HW2)
Thu, Oct 23, 2014 18 Support Vector Machine

Clustering (Chapter 7)
Tue, Oct 28, 2014 19 Overview of Clustering, Similarity/Dissimilarity Measure
Thu, Oct 30, 2014 20 K-means (Out: P2; Due: P1)
Tue, Nov 04, 2014 21 K-means (continued)
Thu, Nov 06, 2014 22 Hierarchical clustering
Tue, Nov 11, 2014 23 Hierarchical clustering (continued) (Out: HW3; Due: HW2)

Frequent Pattern and Association Rule Mining (Chapter 5)
Thu, Nov 13, 2014 24 Association Rule Mining
Tue, Nov 18, 2014 25 Association Rule Mining (continued)
Thu, Nov 20, 2014 26 Correlation Analysis

Text and Web Mining (2)
Tue, Nov 25, 2014 27 Link Analysis: PageRank (Due: P2)

Research Projects
TBD Prominent Streak Discovery (guest lecture by Gensheng Zhang)
TBD Incremental Discovery of Prominent Situational Facts (guest lecture by Afroza Sultana)

Review
Tue, Dec 02, 2014 28 Review for final exam (Due: HW3)
Tue, Dec 09, 2014 Final Exam