Stats SA3 Flashcards
What does the data say will happen?
Predictive Analytics
What has happened or what is happening now?
Descriptive Analytics
Why did it happen?
Diagnostic Analytics
What will likely happen?
Predictive Analytics
Predictive Analytics Process Order
Project Design, Data Sampling, Data Exploration, Data Modification, Model Validation, Model Development
Kickoff meeting
Understand modeling objective
Define acceptance criteria
Document data and deployment requirements
Project Design
Data extraction
Apply filters and exclusions
Identify external data sources
Data Sampling
Exploratory data analysis
Identify data dependencies and correlations
Identify trends or anomalies in the data
Data Exploration
Data Cleaning
Data augmentation and transformation
Feature selection
Data Modification
Model performance review
Feedback based on business knowledge and inputs from subject matter experts (SMEs)
Model Validation
Apply different modeling techniques and select final methodology
Model Development
Linear Regression Analysis Formula
y = βx + α + ε
Dependent Variable (Value to be predicted)
y
Beta coefficient (Rate multiplied by x)
β
Independent variable (Value driving prediction)
x
Alpha intercept (Baseline figure for y)
α
Error term (Balancing figure)
ε
Reason for Including the Error Term (1):
To account for unexplained variability in the dependent variable caused by other relevant independent variables that were not included in the model
Reason for Including the Error Term (2):
To capture measurement error in both the dependent and independent variables
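A minimal sketch of fitting the simple linear regression formula by ordinary least squares, in plain Python. The data values and the function name `fit_simple_regression` are made up for illustration; the residuals computed at the end are the sample estimates of the error term ε.

```python
def fit_simple_regression(xs, ys):
    """Estimate alpha (intercept) and beta (slope) by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


# Hypothetical sample data (not from the flashcards)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]

alpha, beta = fit_simple_regression(xs, ys)

# Residuals: the part of y that alpha + beta * x does not explain
residuals = [y - (alpha + beta * x) for x, y in zip(xs, ys)]

print(f"alpha = {alpha:.2f}, beta = {beta:.2f}")
print("residuals:", [round(r, 2) for r in residuals])
```

With an intercept in the model, the least-squares residuals sum to approximately zero, which is why the error term is sometimes described as a balancing figure.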
You can have more than one predictor variable (x1 - xn)
Multiple Linear Regression
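A rough sketch of multiple linear regression with two predictors, solved through the normal equations in plain Python. The helper names, the sample rows, and the "true" coefficients (1, 2, 3) are assumptions chosen for illustration; in practice a statistics library would normally do this step.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_regression(rows, ys):
    """Return [alpha, beta_1, ..., beta_n] from the normal equations."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data generated without noise from y = 1 + 2*x1 + 3*x2
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]

coefs = fit_multiple_regression(rows, ys)
print([round(c, 6) for c in coefs])
```

Because the sample data contains no error term, the fit recovers the intercept and both beta coefficients up to floating-point rounding.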
You still need to investigate the model’s _______
goodness-of-fit
You need to test whether your predictors are _______
significant
The _________, R^2, is a goodness-of-fit measure
coefficient of multiple determination
_____ is a figure of merit; the higher the ____, the more successful the model is at explaining the variation in the response using the set of predictors
R^2
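A small sketch of how R^2 can be computed from observed and predicted values, using the standard definition R^2 = 1 - SS_res / SS_tot. The function name `r_squared` and the sample numbers are hypothetical and only illustrate the arithmetic.

```python
def r_squared(ys, predictions):
    """Share of the variation in ys explained by the predictions."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))  # unexplained variation
    ss_tot = sum((y - mean_y) ** 2 for y in ys)                  # total variation
    return 1 - ss_res / ss_tot


# Hypothetical observed values and fitted values from some regression model
ys = [2.1, 3.9, 6.2, 7.8, 10.0]
predictions = [2.06, 4.03, 6.00, 7.97, 9.94]

r2 = r_squared(ys, predictions)
print(f"R^2 = {r2:.4f}")
```

A model that predicts every observation exactly scores 1, while a model that always predicts the mean of y scores 0, so values closer to 1 indicate a better fit.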