EMBA 8150 Project
- Think of something that you may want to predict or classify in your business
  (the business of one of the group members). If you can find data for that
  variable and potential predictor variables, that is ideal. If not, use a
  dataset available online. A couple of sources are:
  - UCI Machine Learning Repository
  - Kaggle
- Find a suitable dependent variable (numeric or categorical) to predict from
  the dataset you pick. Numeric is ideal, since that is what we did in class.
  If you pick a categorical dependent variable, use one with only two
  categories and follow the posted classification example. This will be more
  difficult, since we did not explicitly discuss how to do it in class.
- Build a prediction (numeric dependent) or classification (categorical
  dependent) model and interpret it.
Written Report Guidelines
- Introduction: what is the goal of the project?
- Data
  - Source, variables (put data dictionary in appendix)
  - Sample Size
  - Dependent Variable, Outcome Period, Sample Time Frame (be aware that the
    outcome period can be 0 if the prediction is for right now rather than
    the future. For example, predicting a home price based on area, number
    of bedrooms, etc., would have an outcome period of 0, since the predicted
    price is estimated for the current moment).
  - Data Preparation, if any was needed (aggregation, variable creation, data
    cleaning)
- Methodology
  - Preliminary Analysis: compute Means and Standard Deviations of each
    variable in the dataset so you get a sense of what the data look like.
  - Show scatterplots of each variable against the dependent (if the dependent
    is categorical, do a pivot table instead, to show how the categories are
    spread across the independent variable values).
  - Do a Regression to predict or classify
- Results
  - Show scorecard (results of final regression in plain English: write out
    the model and interpret it)
  - (put actual regression results only in appendix)
  - Evaluate using R-square and SE, or using the tabulation of classification
    results by score group.
- Implementation
  - Discuss how the client should implement your model. For a classification
    model, state what cutoff scores you recommend and what
    strategies/decisions go with the cutoffs.
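
For a classification model, the tabulation by score group and the effect of a cutoff can be sketched as follows. Everything here is hypothetical: the ten scored records, the 0 to 100 score scale, the band width of 20, and the cutoff of 60 are made up to show the mechanics, and the function names are illustrative rather than taken from the posted example.

```python
from collections import Counter

# Hypothetical scored records, made up for illustration:
# (model score from 0 to 100, actual outcome), where 1 means the event occurred.
records = [
    (12, 0), (25, 0), (38, 0), (41, 1), (55, 0),
    (63, 1), (71, 1), (78, 0), (86, 1), (94, 1),
]

def tabulate_by_score_group(records, width=20):
    """Return {band_start: (count, events)}; assumes width divides 100 evenly."""
    counts, events = Counter(), Counter()
    for score, outcome in records:
        # A score of exactly 100 is placed in the top band.
        band = min(score // width * width, 100 - width)
        counts[band] += 1
        events[band] += outcome
    return {band: (counts[band], events[band]) for band in sorted(counts)}

def classify_with_cutoff(records, cutoff):
    """Tally (predicted, actual) pairs when scores at or above the cutoff are classified as 1."""
    tally = Counter()
    for score, outcome in records:
        predicted = 1 if score >= cutoff else 0
        tally[(predicted, outcome)] += 1
    return tally

table = tabulate_by_score_group(records)
tally = classify_with_cutoff(records, cutoff=60)
accuracy = (tally[(1, 1)] + tally[(0, 0)]) / len(records)

for band, (count, event_count) in table.items():
    print(f"scores {band}-{band + 19}: {count} records, {event_count} events")
print(f"accuracy at cutoff 60: {accuracy:.0%}")
```

Trying several cutoffs and comparing the tallies is one way to support the recommendation: each cutoff trades missed events against false alarms, and the strategy attached to it should reflect which error is more costly for the client.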
Oral Presentation
Change the order for the oral presentation:
Present the project as you might to a client. As with the written report,
start with an introduction (even though the client knows what their problem
is, discuss it anyway). However, immediately follow that with results and
recommendations. Go through data analysis details only if asked, or if some
aspect of the data is needed to understand the recommendations.