P1.F.4.3 Data Analytics - Analytic Tools Flashcards
Types of Data Analytics
- Descriptive: Who, what, when and where?
- Diagnostic: Why something happened?
- Predictive: What will happen? (forecast)
- Prescriptive: What should happen? Offers the greatest value because it can lead directly to decisions that create value.
Predictive Analytic Techniques
- Find explanatory variables that correlate with the dependent variable.
Example: calculate regression equations
- Decide which data to include or exclude
Example: outliers (observations outside the norm)
- Derive a regression line supported by backtesting
- Validate the fit: split the data into two groups, one to derive the model and one to test it
- Compare with other models
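The derive-and-test split above can be sketched in a few lines. This is a minimal illustration only: the data is made up (y is roughly 5 + 2x plus noise), and the helper names `fit_line` and `mean_abs_error` are hypothetical, not part of the course material.

```python
import random

def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y = a + b*x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def mean_abs_error(a, b, xs, ys):
    """Average absolute gap between observed y and the line's prediction."""
    return sum(abs(y - (a + b * x)) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical data: y is roughly 5 + 2x plus random noise.
rng = random.Random(0)
data = [(x, 5 + 2 * x + rng.gauss(0, 1)) for x in range(40)]
rng.shuffle(data)

# One group to derive the line, one group to test it.
derive, test = data[:30], data[30:]
a, b = fit_line([x for x, _ in derive], [y for _, y in derive])
holdout_error = mean_abs_error(a, b, [x for x, _ in test], [y for _, y in test])

print(f"a={a:.2f}  b={b:.2f}  holdout mean abs error={holdout_error:.2f}")
```

A holdout error close to the error on the derive group suggests the line generalizes; a much larger one suggests overfitting or unrepresentative data.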
Exploratory Data Analysis
- An exercise undertaken without an existing hypothesis regarding the data.
- Goal is to find a new and useful relationship among variables
Limitations of Data Analytics
- Doesn’t explain causation or address motives
- Lacks qualitative measures
- May encourage transactional focus instead of relationships
- Doesn’t lead to perfect decisions
- Confirmation bias must be overcome
Data Analytic Model Challenges
- Will never reconcile exactly
- Employing the right level of detail
- Increasing the number of variables increases cost and complexity
- Randomness always seems present
- Choosing and sampling population
Data Analytic Model Types
- Clustering: defines variables, groups observations by similarity and displays the groups visually
- Classification: puts observations into categories
- Regression: study of relationships among variables
- Multiple regression: more than one explanatory variable
Sensitivity Analysis
- Refers to the degree to which changes in input variables affect output.
- Shows which variables are critical and how to measure them.
- Demonstrates overall quality and data sufficiency
- Models should be built to accommodate sensitivity testing
- End result demonstrates model trustworthiness
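A one-at-a-time sensitivity check can be sketched as follows. The profit model, its inputs and the 10% bump are all assumptions chosen for illustration; they do not come from the flashcards.

```python
def profit(units, price, unit_cost, fixed_cost):
    """Hypothetical model: contribution margin minus fixed cost."""
    return units * (price - unit_cost) - fixed_cost

# Made-up base case.
base = {"units": 1000, "price": 20.0, "unit_cost": 12.0, "fixed_cost": 5000.0}
base_profit = profit(**base)

# Raise each input by 10% on its own and record the relative change in output.
sensitivity = {}
for name in base:
    bumped = dict(base)
    bumped[name] = base[name] * 1.10
    sensitivity[name] = (profit(**bumped) - base_profit) / base_profit

# Largest absolute effect first: these are the critical variables.
for name, change in sorted(sensitivity.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>10}: {change:+.1%}")
```

Inputs near the top of the list deserve the most measurement effort and control; inputs near the bottom barely move the result.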
Sensitivity Analysis Benefits & Limitations
Benefits
- Demonstrates model veracity
- Spotlights important variables to control
Limitations
- Only shows what to discard
- Overhead cost that doesn’t add to the value chain
Simulation Models
- Systematic way of dealing with uncertainty
- Repeatedly test model with randomized inputs
- Demonstrates range and probability of outputs
- Vast applications
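Repeated testing with randomized inputs is often done as a Monte Carlo simulation. The sketch below assumes a made-up profit model and made-up input distributions; only the Python standard library is used.

```python
import random
import statistics

def simulate_profit(rng):
    """One trial of a hypothetical profit model with randomized inputs."""
    units = rng.triangular(800, 1200, 1000)   # low, high, most likely
    price = rng.uniform(18.0, 22.0)
    unit_cost = rng.gauss(12.0, 0.5)
    return units * (price - unit_cost) - 5000.0

rng = random.Random(42)                       # fixed seed for repeatability
outcomes = sorted(simulate_profit(rng) for _ in range(10_000))

# Range and probability of outputs.
p5 = outcomes[int(0.05 * len(outcomes))]      # pessimistic case
p95 = outcomes[int(0.95 * len(outcomes))]     # optimistic case
prob_loss = sum(o < 0 for o in outcomes) / len(outcomes)

print(f"mean profit: {statistics.mean(outcomes):,.0f}")
print(f"5th to 95th percentile: {p5:,.0f} to {p95:,.0f}")
print(f"probability of a loss: {prob_loss:.1%}")
```

The percentiles give the worst-case and best-case bands, and the share of negative trials estimates the probability of a loss.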
Simulation Model Benefits & Limitations
Benefits
- Makes decisions in the face of uncertainty.
- Helpful in replacing intuition, prejudice and flat out guessing
- Creates confidence around best-case, worst-case and most likely scenarios
Limitations
- Can’t predict human responses or behaviors to changes
- Can’t model causal links that affect a particular result in the real world
- Accuracy depends on input quality
What-if & Goal-seeking
- Both are tools for running scenarios to understand possibilities
- Prepare for best/worst case
What-if
- Starts with changes in input
- What will happen if we change this?
Goal-seeking
- Starts with output goal
- If we want to change the result, what needs to happen?
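The two directions can be contrasted in a short sketch. The profit model and its numbers are hypothetical, and `goal_seek` is an illustrative bisection search that assumes the output rises as the input rises; it is not the algorithm of any particular spreadsheet tool.

```python
def profit(units, price=20.0, unit_cost=12.0, fixed_cost=5000.0):
    """Hypothetical model: contribution margin minus fixed cost."""
    return units * (price - unit_cost) - fixed_cost

# What-if: start from a changed input and read off the output.
what_if = profit(units=1200)

def goal_seek(func, target, low, high, tol=1e-6):
    """Find the input where func(input) is close to target.

    Assumes func is increasing on [low, high] and the target lies
    between func(low) and func(high).
    """
    while high - low > tol:
        mid = (low + high) / 2
        if func(mid) < target:
            low = mid      # too little output: need a larger input
        else:
            high = mid     # enough output: try a smaller input
    return (low + high) / 2

# Goal-seeking: start from the desired output and solve for the input.
units_needed = goal_seek(profit, target=10_000.0, low=0.0, high=10_000.0)

print(f"What-if: 1,200 units gives profit {what_if:,.0f}")
print(f"Goal-seeking: profit of 10,000 needs about {units_needed:,.0f} units")
```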
Regression - Simple & Multiple
- Estimates the dependent variable
- From one or more independent (explanatory) variables
- Equation contains constants (the intercept and the variable coefficients)
Simple: one explanatory
Multiple: more than one explanatory
Least Squares Line
The line that minimizes the sum of the squared vertical distances between itself and the data points.
Least Squares Line Equation
Observed value = Fitted value + Residual
- Fitted value: the y value on the line at a given x (the vertical distance between the x-axis and the line)
- Observed value: the actual data point
- Residual: observed value minus fitted value (the vertical distance between the point and the line)
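The decomposition can be checked numerically. The five data points below are hypothetical; a = 2.2 and b = 0.6 are the least-squares intercept and slope for exactly this data, and the "nudged" line is an arbitrary alternative used for comparison.

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]      # observed values (made up for illustration)
a, b = 2.2, 0.6           # least-squares intercept and slope for this data

def sse(a, b):
    """Sum of squared residuals for the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

fitted = [a + b * x for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]   # observed minus fitted

for x, y, f, r in zip(xs, ys, fitted, residuals):
    print(f"x={x}  observed={y}  fitted={f:.1f}  residual={r:+.1f}")

# Any other line gives a larger sum of squared residuals.
print(f"SSE on the least-squares line: {sse(a, b):.2f}")
print(f"SSE on a nudged line (a=2.0, b=0.7): {sse(2.0, 0.7):.2f}")
```

Each observed value equals its fitted value plus its residual, and the residuals of a least-squares line with an intercept sum to zero.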
Regression Equation Calculations
y = a + bx
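y = the dependent variable (use the mean y value when solving for a)
a = the optimal y-intercept
b = the optimal slope (variable coefficient)
x = the explanatory variable (use the mean x value when solving for a)
- b = numerator / denominator
- b (numerator) = sum of (mean x value - x value) × (mean y value - y value)
- multiply each x difference by its y difference, then add the values
- b (denominator) = sum of (mean x value - x value) squared
- square each x difference, then add the values
- a = mean y value - (b × mean x value), because the line passes through the point of means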
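The calculation can be worked step by step on a small data set. The five points are hypothetical and the variable names are illustrative only.

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]      # made-up observations

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# Numerator: multiply each x difference by its y difference, then add.
numerator = sum((mean_x - x) * (mean_y - y) for x, y in zip(xs, ys))

# Denominator: square each x difference, then add.
denominator = sum((mean_x - x) ** 2 for x in xs)

b = numerator / denominator      # optimal slope
a = mean_y - b * mean_x          # intercept: the line passes through the means

print(f"mean x = {mean_x}, mean y = {mean_y}")
print(f"b = {numerator} / {denominator} = {b:.2f}")
print(f"a = {a:.2f}, so y = {a:.2f} + {b:.2f}x")
```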