CSC8740 - Advanced Data Mining
Administrative Info
Instructor: Berkay Aydin
Email: baydin2@gsu.edu
Course Webpage: This webpage, which can also be reached via iCollege
Office Location: 25 Park Pl NE - Room 720
Course Overview
Credit Hours: 4.0
Class Policies: General class policies can be accessed from iCollege.
Prerequisites: CSC 4740, 4780, 6740, or 6780 (or equivalent) with a grade of “C” or higher
Textbooks: [Main] Introduction to Data Mining, 2nd edition, by Tan et al. (ISBN: 978-0133128901); [Recommended] Data Mining: Concepts and Techniques, 4th edition, by Han et al. (ISBN: 978-0128117606); [Recommended] Interpretable Machine Learning: A Guide for Making Black Box Models Explainable, by Molnar (ISBN: 979-8411463330)
Description: This course presents, in detail, the following advanced foundational concepts of data mining: data preprocessing, classification and cluster analysis, pattern mining, and anomaly detection, as well as contemporary ones, including spatial and temporal data mining and learning model interpretability. The lectures are designed to give graduate students a sufficient foundation to conduct their own supervised research in the field of data mining. Students will gain hands-on experience with a chosen aspect of data mining methods by completing a graduate research project. The course has three parts. The first part introduces the following data mining components: data cleaning, curation, transformations, advanced supervised and unsupervised learning algorithms, frequent pattern mining, and anomaly detection. The second part covers selected contemporary data mining concepts: time series mining, spatial and spatio-temporal data mining, and interpretability/explainability methods for learning models. The third part, which falls in the latter part of the second half of the semester, is devoted to students’ research, conducted through work on graduate projects. During this part, students are encouraged to extend the presented material through their own study and to develop projects that match their research interests and build on the data mining expertise they have gained. Work on the projects will be closely supervised by the instructor.
Outcomes: At the end of the course, students should be able to:
LO1: Understand fundamental data mining processes and tasks such as classification, clustering, frequent pattern mining, association analysis, and anomaly detection.
LO2: Critically review, investigate, and evaluate key aspects of contemporary advanced data mining techniques and assess when to apply such techniques.
LO3: Contextualize, research, and utilize current data mining approaches, applications, and technologies to develop a scientific project that involves designing and implementing a novel data mining solution and communicating the results of their projects.
Requirements: Students are expected to have at least moderate programming skills in Python. They are also expected to be well-versed in foundational data exploration and data science concepts.
Topics Covered
Introduction to Data Mining and Knowledge Discovery
Data
Supervised Learning
Unsupervised Learning
Frequent Pattern Mining
Anomaly Detection
Time Series Mining
Spatial/Spatio-temporal Data Mining
Interpretability
Policies
Late Policy: No late assignments are accepted. Exceptions will be given only in extreme cases. I grant leniency in cases of serious personal tragedy, extreme illness, etc. Please be warned that this leniency is fully at the instructor’s discretion.
Academic Honesty: I take academic honesty very seriously. Academic honesty is a core value of the university, and all members of the university community are responsible for abiding by the tenets of the policy. GSU’s policy on academic dishonesty is in the code of conduct (pg. 23). Please also take a look at the other linked resources. Lack of knowledge of this policy is not an acceptable defense to any charge of academic dishonesty. Examples of academic dishonesty include, but are not limited to, plagiarism, cheating on examinations, unauthorized collaboration, falsification, multiple submissions, and unauthorized public posting and distribution of instructor-prepared course material. If academic dishonesty is proven, the student or students will receive an immediate and final grade of F. Disciplinary penalties will also be sought in addition to academic penalties. The names of the persons involved will be reported to the Dean of Students. This includes all parties involved, who will be treated equally; I will not attempt to determine who actually developed the solution and who copied it.
Attendance Policy: Students are expected to attend all the lectures.
Make-up Examination Policy: No make-up exams or quizzes will be given by default. Exceptions will be given only in extreme cases. I grant leniency in cases of serious personal tragedy, extreme illness, etc. This leniency is fully at the instructor’s discretion. You are expected to notify me before the exam/quiz or, if that is not possible, as early as you reasonably can.
Accommodations: Students who wish to request accommodation for a disability may do so by registering with the Access and Accommodation Center. Students may only be accommodated upon issuance by the Access and Accommodation Center of a signed Accommodation Plan and are responsible for providing a copy of that plan to instructors of all classes in which accommodations are sought.
Sharing of Instructor-Generated Material: The selling, sharing, publishing, presenting, or distributing of instructor-prepared course lecture notes, videos, audio recordings, or any other instructor-produced materials from any course for any commercial or non-commercial purpose is strictly prohibited unless explicit written permission is granted in advance by the course instructor. This includes posting any materials on websites such as Chegg, Course Hero, OneClass, Stuvia, StuDocu, and other similar sites. Unauthorized sale or commercial distribution of such material is a violation of the instructor’s intellectual property and the privacy rights of students attending the class and is prohibited.
Inclusiveness: It is my intent that students from diverse backgrounds and perspectives be well served by this course, that students’ learning needs be addressed both in and out of class, and that the diversity that students bring to this class be viewed as a resource, strength and benefit. It is my intent to present materials and activities that are respectful of all diversity including, but not limited to, gender, sexuality, disability, age, socioeconomic status, ethnicity, race, and culture. Your comments (in the discussion posts and in person) related to the class and content will be encouraged and appreciated. I expect all members of the class to contribute to a caring, inclusive learning environment that promotes empathetic listening, encourages productive participation and sharing, and engenders growth among us all. Please let me know ways to improve the effectiveness of the course for you personally or for other students or student groups.