EMBA 8150 Project

 

 

 

  1. Think of something that you may want to predict or classify in your business (the business of one of the group members). If you can find data for that variable and potential predictor variables, that is ideal. If not, use a dataset available online. A couple of sources are:
    1. UCI Machine Learning Repository
    2. Kaggle

 

  1. Find a suitable dependent variable (numeric or categorical) to predict from the dataset you pick. Numeric is ideal, since we covered that in class. If you pick a categorical dependent variable, use one with only two categories and follow the posted classification example. This will be more difficult, since we did not explicitly discuss how to do it in class.

 

  1. Build a prediction (numeric dependent) or classification (categorical dependent) model and interpret it.

 

Written Report Guidelines

 

  1. Introduction – what is the goal of the project?

 

  1. Data
    1. Source, variables (put data dictionary in appendix)
    2. Sample Size
    3. Dependent Variable, Outcome period, Sample Time Frame (be aware that the outcome period can be 0 if the prediction is for right now rather than the future. For example, predicting a home price based on area, number of bedrooms, etc., would have an outcome period of 0 since the predicted price is estimated for the current moment).
    4. Data Preparation, if any was needed (aggregation, variable creation, data cleaning)

 

  1. Methodology
    1. Preliminary Analysis – compute Means and Standard Deviations of each variable in the dataset so you get a sense of what the data look like.
    2. Show scatterplots of each independent variable against the dependent variable (if the dependent variable is categorical, use a pivot table instead, to show how the categories are spread across the independent variable values).
    3. Run a regression to predict or classify.
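
As an illustration of the preliminary analysis and regression steps above, here is a minimal Python sketch. It assumes a single predictor and uses made-up home-price numbers; the variable names (`area`, `price`) and the helper functions (`summarize`, `fit_line`) are hypothetical and not part of the assignment. Any tool that computes means, standard deviations, and a regression (a spreadsheet, for instance) serves the same purpose.

```python
import statistics

# Hypothetical toy data: home area (sq ft) and sale price ($1000s).
area = [1000, 1500, 2000, 2500, 3000]
price = [200, 290, 410, 500, 600]

def summarize(name, values):
    """Return (name, mean, sample standard deviation) for one variable."""
    return name, statistics.mean(values), statistics.stdev(values)

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Preliminary analysis: mean and standard deviation of each variable.
for name, values in (("area", area), ("price", price)):
    print(summarize(name, values))

# Regression: fit the line and write the model out.
intercept, slope = fit_line(area, price)
print(f"price = {intercept:.1f} + {slope:.3f} * area")
```

With more than one predictor the same idea applies, but the fit requires multiple regression rather than the one-variable formula used here.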

 

  1. Results
    1. Show scorecard (results of final regression in plain English – write out the model and interpret it)
    2. (put the actual regression output only in the appendix)
    3. Evaluate the model using R-square and the standard error (SE), or using the tabulation of classification results by score group.
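
For a numeric dependent variable, the two evaluation measures named above can be sketched as follows. This is a minimal illustration under stated assumptions: the `actual` and `predicted` values are made up, the function names are hypothetical, and SE is taken to mean the standard error of the regression, computed as the square root of the sum of squared errors divided by (n - k - 1), where k is the number of predictors.

```python
import math

# Hypothetical actual and predicted prices ($1000s) from a one-predictor model.
actual = [200, 290, 410, 500, 600]
predicted = [198, 299, 400, 501, 602]

def r_squared(y, y_hat):
    """Share of the variation in y explained by the model."""
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained variation
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    return 1 - sse / sst

def standard_error(y, y_hat, n_predictors):
    """Standard error of the regression: typical size of a prediction error."""
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    dof = len(y) - n_predictors - 1  # degrees of freedom
    return math.sqrt(sse / dof)

print(f"R-square = {r_squared(actual, predicted):.4f}")
print(f"SE = {standard_error(actual, predicted, 1):.2f}")
```

R-square is unitless, while SE is in the units of the dependent variable, so SE is often the easier number to explain to a client.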

 

  1. Implementation
    1. Discuss how the client should implement your model. For a classification model, state which cutoff scores you recommend and which strategies or decisions go with each cutoff.
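
For a classification model, the tabulation by score group and the effect of a cutoff can be sketched as below. Everything here is an assumption for illustration: the scores and outcomes are made up, the score is assumed to lie between 0 and 1, the groups are equal-width bands, and the function names are hypothetical. The posted classification example may define score groups differently.

```python
from collections import defaultdict

# Hypothetical scored records: (model score between 0 and 1, actual outcome 0/1).
scored = [(0.05, 0), (0.15, 0), (0.25, 0), (0.35, 1), (0.45, 0),
          (0.55, 1), (0.65, 0), (0.75, 1), (0.85, 1), (0.95, 1)]

def tabulate_by_score_group(records, n_groups=5):
    """Count records and positive outcomes within equal-width score bands."""
    table = defaultdict(lambda: {"n": 0, "positives": 0})
    for score, outcome in records:
        group = min(int(score * n_groups), n_groups - 1)  # band index 0..n_groups-1
        table[group]["n"] += 1
        table[group]["positives"] += outcome
    return dict(sorted(table.items()))

def classify(score, cutoff):
    """Assign class 1 when the score reaches the cutoff, else class 0."""
    return 1 if score >= cutoff else 0

def accuracy_at_cutoff(records, cutoff):
    """Fraction of records whose predicted class matches the actual outcome."""
    hits = sum(classify(score, cutoff) == outcome for score, outcome in records)
    return hits / len(records)

for group, counts in tabulate_by_score_group(scored).items():
    print(group, counts)
print("accuracy at cutoff 0.5:", accuracy_at_cutoff(scored, 0.5))
```

Comparing the positive rate across score groups shows where a cutoff would separate the two categories most cleanly, which is the basis for pairing each cutoff with a decision.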

 

 

Oral Presentation

 

Change the order for the oral presentation:

Present the project as you might to a client. As with the written report, start with an introduction (even though the client knows what their problem is, you discuss it anyhow). However, follow that immediately with results and recommendations. Go through data analysis details only if asked, or if some aspect of the data is needed to understand the recommendations.