MGS 8040: Data Mining
Syllabus for Summer 2019
Instructor: Dr. Satish Nargundkar
Office Hours: By appointment
E-Mail: snargundkar@gmail.com
CRN: 53355
Tower Place 200, Room 404
Thursdays 5:30 – 9:45 PM
Prerequisites: You must already have knowledge of basic statistics, including
Regression Analysis, to succeed in this course.
Optional Textbooks:
2. Data Science for Business: What you need to know about data mining and data-analytic thinking, by Foster Provost and Tom Fawcett, O'Reilly Media, July 2013. Print ISBN: 978-1-4493-6132-7, ISBN-10: 1-4493-6132-3. Ebook ISBN: 978-1-4493-6131-0, ISBN-10: 1-4493-6131-5.
ISBN-13: 978-0470650936, Wiley.
4. Making Sense of Data II by Glenn Myatt & Wayne Johnson, John Wiley & Sons, 2009.
5. Multivariate Data Analysis by Hair,
6. http://statsoft.com/textbook/stathome.html.
7. The Little SAS Book by Delwiche and Slaughter.
Course Catalog Description
This course covers various
analytical techniques to extract managerial information from large data
warehouses. A number of well-defined data mining tasks such as classification,
estimation, prediction, affinity grouping and clustering, and data
visualization are discussed. Design and implementation issues for corporate
data warehousing are also addressed.
Detailed Course Description
Data mining supports
decision making by detecting patterns, devising rules, identifying new decision
alternatives and making predictions. This course is organized around a number
of well-defined data mining tasks: description,
classification, estimation, prediction, and affinity grouping and clustering. Students will learn to use techniques such as
Rule Induction (classification trees), Logistic Regression, and Discriminant
Analysis. Data visualization techniques will be used whenever possible to
reveal patterns and relationships.
Students will use commercially available software tools to mine large
databases.
The course is organized into
3 broad areas as follows:
1) Context/Data: Decision Support for Strategic Decision-making. Preliminary data analysis
2) Predictive Analytics: Predictive models and evaluation: Discriminant
Analysis, Trees
3) Segmentation/Association: Techniques like Clustering and Market Basket
analysis.
Learning Outcomes/Course Objectives
Upon completion of the course,
students will be able to work on real-life projects using relatively large
datasets, to build and evaluate prediction and classification models, and
segment populations.
Specifically, students will
learn to:
1. Apply analytics techniques
within a general framework for decision support within organizations.
2. Interpret business
requirements and organization structure, and translate them into data mining
projects that help an organization meet its decision support needs.
3. Collect data, perform
preliminary analyses including data aggregation, variable
creation/transformation, and data cleaning.
4. Split data into training and
validation samples. Use visual techniques to describe data.
5. Create Cross-tabulations for
bivariate analysis.
6. Explain in your own words the
assumptions of various techniques such as Cluster Analysis, Multiple
Regression, Discriminant Analysis, Logistic Regression, and Artificial Neural
Networks.
7. Build multiple regression,
discriminant analysis, and logistic regression models for forecasting.
8. Validate classification
models using the Kolmogorov-Smirnov (K-S) test.
9. Interpret the output of
Classification tree algorithms like CART and CHAID.
10. Segment data using Cluster
Analysis, and interpret the output.
11. Discuss issues of
implementation of the results of various techniques.
12. Develop methods to monitor
the ongoing performance of implemented models.
13. Present an analytics project
report to top management in plain language, with implications for business
decision making clearly stated.
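Outcomes 4 and 8 above can be illustrated with a small sketch. Assuming a binary target coded 1/0 and a model that outputs a numeric score, the code below holds out a validation sample and computes the K-S statistic as the largest gap between the cumulative score distributions of the two classes. The function names, the 30% validation fraction, and the choice of Python are assumptions made for illustration only; the course may compute K-S differently (for example, in SAS).

```python
import random


def train_validation_split(rows, validation_fraction=0.3, seed=42):
    """Shuffle rows and return (training, validation) lists.

    The 30% default and the fixed seed are illustrative assumptions.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_validation = int(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


def ks_statistic(scores, labels):
    """Largest gap between the cumulative score distributions of two classes.

    scores: numeric model scores; labels: 1 for the event, 0 for the non-event.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    pairs = sorted(zip(scores, labels))  # lowest score first
    cum_pos = cum_neg = 0
    largest_gap = 0.0
    i = 0
    while i < len(pairs):
        # Process all observations that share the same score together,
        # so ties do not create an artificial gap.
        current_score = pairs[i][0]
        while i < len(pairs) and pairs[i][0] == current_score:
            if pairs[i][1] == 1:
                cum_pos += 1
            else:
                cum_neg += 1
            i += 1
        gap = abs(cum_pos / n_pos - cum_neg / n_neg)
        largest_gap = max(largest_gap, gap)
    return largest_gap


# Hypothetical scores on a tiny validation sample.
example_scores = [0.1, 0.35, 0.4, 0.8]
example_labels = [0, 1, 0, 1]
print(ks_statistic(example_scores, example_labels))  # 0.5
```

A K-S value near 1 means the scores separate the two classes almost completely, while a value near 0 means the score distributions of the two classes are nearly identical.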
Methods of Instruction:
Students will be walked
through an entire real-life project in the financial services industry as the
course progresses. This will be done through a combination of lectures and
discussion of cases, plus guest lectures from industry experts. The team-based
project will help you put most of the concepts together and apply them to
another dataset in an industry of your choice.
Grading:
Component | Weight
Assignments | 20%
Tests (2) | 60%
Team Project | 20%

Course Average | Grade
97+ | A+
94-96 | A
90-93 | A-
87-89 | B+
83-86 | B
80-82 | B-
77-79 | C+
73-76 | C
70-72 | C-
60-69 | D
Less than 60 | F
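As a worked illustration of the weights and grade bands listed above, the following sketch computes a weighted course average and looks up the letter grade. The function names are hypothetical. The syllabus also does not say how fractional averages (such as 89.5) are treated; rounding half up to the nearest whole number is an assumption here, not course policy.

```python
import math

# Component weights in percent, as listed in the grading section.
WEIGHTS = {"assignments": 20, "tests": 60, "team_project": 20}

# (minimum course average, letter grade), highest band first.
GRADE_BANDS = [
    (97, "A+"), (94, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (60, "D"),
]


def course_average(scores):
    """Weighted average of component scores, each on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(average):
    """Map a course average to a letter grade.

    Assumption: the average is rounded half up before the lookup.
    """
    rounded = math.floor(average + 0.5)
    for minimum, letter in GRADE_BANDS:
        if rounded >= minimum:
            return letter
    return "F"


# Hypothetical student: 85 on assignments, 90 on tests, 95 on the project.
average = course_average({"assignments": 85, "tests": 90, "team_project": 95})
print(average, letter_grade(average))  # 90.0 A-
```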
Late work receives partial
credit only: 10% is deducted for each day of delay.
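The late-work rule can be read in more than one way. The sketch below assumes the simplest reading: each day of delay removes 10% of the score the work would otherwise have earned, so nothing remains after ten days. The function name and this interpretation are assumptions; the instructor's actual calculation may differ.

```python
def late_score(earned, days_late):
    """Score after the late penalty.

    Assumption: 10% of the earned score is deducted per day of delay,
    and the result never drops below zero.
    """
    if days_late < 0:
        raise ValueError("days_late cannot be negative")
    remaining_percent = max(0, 100 - 10 * days_late)
    return earned * remaining_percent / 100


print(late_score(80, 3))  # 56.0
```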
Software: Students are encouraged (not
required!) to do project work in SAS in order to develop a marketable skill.
You may choose other software: SPSS is available at GSU, and R is free online.
General Policies:
1. Students are expected to
attend each class (who knows, you may actually enjoy the class!), arrive on
time and participate in class discussions.
2. Turn off cell phones, pagers, stereos, TVs, etc. when in class. Treat the instructor and each other with courtesy.
Course Assessment:
Your constructive assessment
of this course plays an indispensable role in shaping education at