Lecture 7 Flashcards
Prescriptive analytics
What is prescriptive analytics?
Prescriptive analytics is concerned with what foresight can be obtained after a predictive model has been built.
What is the general framework for prescriptive analytics?
Although a general framework can be built, the details of the specific methodology are domain- and technology-specific.
The general framework is:
1. Identify alternative decisions and objectives
2. Model and simulate alternative decisions
3. Select an optimal decision
4. Perform Analysis
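The four steps can be sketched in a few lines of code. This is only an illustration under invented assumptions: the decision is a price, the objective is profit, and `simulate_profit` is a hypothetical stand-in for a fitted predictive model. None of these names or numbers come from the lecture.

```python
# Hypothetical example: choose a price that maximises simulated profit.

UNIT_COST = 4.0  # assumed cost per unit (illustrative)


def simulate_profit(price, base_demand=100.0, sensitivity=8.0):
    """Stand-in for a predictive model: demand falls linearly with price."""
    demand = max(base_demand - sensitivity * price, 0.0)
    return (price - UNIT_COST) * demand


# 1. Identify alternative decisions and the objective (maximise profit).
candidate_prices = [5.0, 6.0, 7.0, 8.0, 9.0, 10.0]

# 2. Model and simulate each alternative.
simulated = {price: simulate_profit(price) for price in candidate_prices}

# 3. Select the decision with the best simulated objective.
best_price = max(simulated, key=simulated.get)

# 4. Perform analysis: does the choice hold if demand is more price-sensitive?
stressed = {p: simulate_profit(p, sensitivity=10.0) for p in candidate_prices}
best_price_stressed = max(stressed, key=stressed.get)

print("best price:", best_price, "profit:", simulated[best_price])
print("best price under stressed demand:", best_price_stressed)
```

In this toy setting the preferred price shifts when the demand assumption changes, which is the kind of finding step 4 is meant to surface.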
How does prescriptive analytics vary in practice?
In practice, as highlighted by the literature review, there is no single approach to prescriptive analytics.
The developed approaches are often domain-specific and technique-specific.
Usually, a combination of multiple approaches is employed.
The focus of the course is on improving the robustness of the model.
What are some techniques for prescriptive analytics?
The techniques are:
* What-If analysis
* Simulation (and ML)
What is a What-If analysis?
It’s a qualitative approach to determine:
* Exceptional situation
* Reaction
* Likelihood of happening
* Consequence of happening
* Recommendation
For each of the previous 4, the What-If questions have to be developed and answered
What is Simulation (and ML)?
Machine Learning is by definition an inductive process: an ML algorithm defines the optimal hypothesis (i.e. the predictive model) by learning from the existing data.
On the other hand, in the conventional modelling approach, a human defines a model, tunes its parameters and produces new data (i.e. simulation results).
How do you integrate simulations and ML?
Several integrations of ML and simulation models can exist, depending on the integration point.
The focus of this course is on the ML model as generator:
Once an ML model has been fit on the existing data and has shown good performance, it can also be used to make predictions on unseen data (i.e. counterfactual analysis).
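A minimal sketch of the "model as generator" idea, assuming a one-variable linear model fitted by ordinary least squares. The data, the variable names and the choice of model are invented for illustration and are not taken from the course.

```python
# Observed data (invented for illustration): advertising spend -> sales.
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.1, 4.9, 7.2, 8.8, 11.0]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept, slope = fit_line(spend, sales)


def predict(x):
    """Prediction of the fitted model for a single input."""
    return intercept + slope * x


# Check the fit on the existing data before trusting it as a generator.
mae = sum(abs(predict(x) - y) for x, y in zip(spend, sales)) / len(spend)

# Use the fitted model as a generator: "what if we spent 6.5?"
what_if_spend = 6.5
print("fit error (MAE):", round(mae, 3))
print("predicted sales at spend 6.5:", round(predict(what_if_spend), 2))
```

The what-if input lies outside the observed range, so the generated value is only as trustworthy as the assumption that the fitted relationship continues to hold there.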
What is the machine learning pipeline in practice?
ML in practice goes through the following steps:
1. Raw Data
* Collection
* Download
* Scraping
2. Data Preprocessing
* Data quality (cf. Diagnostic)
* Missing data
* Categorical variables
3. Train-test split
* Single validation
* Cross-validation
4. Model fit
* Fit on training data
* Test on testing data
5. Performance Evaluation
* Performance metric choice
* Evaluation on validation data
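The five steps can be walked through on a toy example. The sketch below uses only the Python standard library; the dataset, the 1-nearest-neighbour model and the accuracy metric are illustrative assumptions, not choices made in the lecture.

```python
import random

# 1. Raw data (invented): (age, plan, churned). None marks a missing age.
raw = [
    (25, "basic", 1), (47, "premium", 0), (None, "basic", 1), (52, "premium", 0),
    (23, "basic", 1), (44, "premium", 0), (31, "basic", 1), (None, "premium", 0),
    (58, "premium", 0), (29, "basic", 1), (35, "basic", 1), (49, "premium", 0),
]

# 2. Preprocessing: impute missing ages with the mean, one-hot encode the plan.
known_ages = [age for age, _, _ in raw if age is not None]
mean_age = sum(known_ages) / len(known_ages)
plans = sorted({plan for _, plan, _ in raw})


def to_features(age, plan):
    """Turn one raw record into a numeric feature vector."""
    age = mean_age if age is None else age
    return [age / 100.0] + [1.0 if plan == p else 0.0 for p in plans]


rows = [(to_features(age, plan), label) for age, plan, label in raw]

# 3. Train-test split (single validation, fixed seed for reproducibility).
random.Random(0).shuffle(rows)
split = int(0.75 * len(rows))
train, test = rows[:split], rows[split:]


# 4. Model fit: a 1-nearest-neighbour classifier "fits" by storing the training set.
def predict(features):
    def sq_dist(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], features))
    return min(train, key=sq_dist)[1]


# 5. Performance evaluation: accuracy on the held-out rows.
accuracy = sum(predict(x) == y for x, y in test) / len(test)
print("held-out accuracy:", accuracy)
```

The imputation mean is computed on all rows to mirror the step order above; a stricter setup would compute it on the training rows only, so that no information from the test rows leaks into preprocessing.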
How do you make a model more robust in the ML pipeline?
You change the pipeline by adding dimensionality reduction and ensembling, and by moving cross-validation to the evaluation step:
1. Raw Data
* Collection
* Download
* Scraping
2. Data Preprocessing
* Data quality (cf. Diagnostic)
* Missing data
* Categorical variables
* Dimensionality reduction
3. Train-test split
* Single validation
4. Model fit
* Fit on training data
* Test on testing data
* Ensembling
5. Performance Evaluation
* Performance metric choice
* Evaluation on validation data
* Cross-validation
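Two of these changes, ensembling and cross-validation, can be sketched together. The example below assumes bagged decision stumps with majority voting, evaluated by 5-fold cross-validation; the data, the base model and all function names are hypothetical and serve only to illustrate the idea.

```python
import random
from statistics import mean

rng = random.Random(42)

# Invented 1-D data: the label is 1 when x > 5, with two labels flipped as noise.
data = [(float(x), int(x > 5)) for x in range(1, 11)] * 2
data[3] = (4.0, 1)
data[16] = (7.0, 0)


def fit_stump(rows):
    """Pick the threshold t that best separates labels with the rule x > t."""
    thresholds = sorted({x for x, _ in rows})
    return max(thresholds, key=lambda t: sum((x > t) == bool(y) for x, y in rows))


def fit_ensemble(rows, n_models=15):
    """Bagging: fit each stump on a bootstrap resample of the training rows."""
    return [fit_stump(rng.choices(rows, k=len(rows))) for _ in range(n_models)]


def predict(ensemble, x):
    """Majority vote over the stumps."""
    votes = sum(x > t for t in ensemble)
    return int(votes * 2 > len(ensemble))


def cross_validate(rows, k=5):
    """k-fold cross-validation: every row is used for testing exactly once."""
    rows = rows[:]
    rng.shuffle(rows)
    folds = [rows[i::k] for i in range(k)]
    scores = []
    for i, test in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        ensemble = fit_ensemble(train)
        correct = sum(predict(ensemble, x) == y for x, y in test)
        scores.append(correct / len(test))
    return scores


scores = cross_validate(data)
print("fold accuracies:", scores)
print("mean accuracy:", mean(scores))
```

Reporting the spread of the fold accuracies alongside their mean gives a rough sense of how much the score depends on one particular split.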
What is dimensionality reduction?
Dimensionality reduction is the process of transforming the original dataset from n columns/features to k < n columns/features, through feature extraction or feature selection. The original features might or might not be preserved:
Feature extraction: The original variables are lost, and replaced by projected/compressed equivalents.
Feature selection: A subset of the original variables is chosen for the model.
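The difference between the two routes can be shown on a tiny table with n = 3 and k = 2. Everything in this sketch is invented for illustration, including the column meanings, the selected indices and the projection weights; in practice the weights would be learned from the data (for example by PCA) rather than written by hand.

```python
# Invented dataset with n = 3 features per row: height_cm, weight_kg, shoe_size.
rows = [
    [170.0, 65.0, 42.0],
    [182.0, 80.0, 45.0],
    [158.0, 52.0, 38.0],
]

# Feature selection: keep a subset of the original columns (k = 2).
selected_columns = [0, 1]  # hypothetical choice, e.g. the result of a scoring step
selected = [[row[j] for j in selected_columns] for row in rows]

# Feature extraction: replace the columns with k = 2 weighted combinations.
# The weights below are made up; the original columns no longer appear as such.
projection = [
    [0.5, 0.5, 0.0],  # new feature 1 mixes height and weight
    [0.0, 0.0, 1.0],  # new feature 2 passes shoe size through
]
extracted = [
    [sum(w * x for w, x in zip(weights, row)) for weights in projection]
    for row in rows
]

print("selected: ", selected)
print("extracted:", extracted)
```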
What is feature construction in dimensionality reduction?
Manual feature extraction.
What is feature learning in dimensionality reduction?
Automatic feature extraction.
What is feature transformation in dimensionality reduction?
It usually denotes less sophisticated transformations of the features, such as re-scaling data, bucketing, etc.
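As a small illustration of the two transformations named above, the sketch below applies min-max re-scaling and bucketing to a list of ages. The values, bucket boundaries and labels are arbitrary assumptions.

```python
ages = [18, 25, 40, 62, 75]  # invented values

# Re-scaling: min-max scaling maps the values onto the interval [0, 1].
lo, hi = min(ages), max(ages)
scaled = [(a - lo) / (hi - lo) for a in ages]


# Bucketing: replace each value with the label of the interval it falls in.
def bucket(age):
    if age < 30:
        return "young"
    if age < 65:
        return "adult"
    return "senior"


bucketed = [bucket(a) for a in ages]

print("scaled:  ", [round(s, 2) for s in scaled])
print("bucketed:", bucketed)
```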
What is feature engineering in dimensionality reduction?
It is sometimes used as a synonym for feature extraction. In contrast to extraction, however, there is a fairly broad consensus that engineering covers not only creative constructions but also pre-processing tasks and naïve transformations.
What is feature selection?
A subset of the original variables is chosen for the model. Different methods are used:
* Filter methods
* Wrapper methods
* Embedded methods
Examples of filter methods: chi-squared test, information gain and correlation coefficient scores.
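A filter method scores each feature independently of any model and keeps the best-scoring ones. The sketch below assumes the absolute Pearson correlation with the target as the score and keeps the top k = 2 features; the feature names and values are invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


# Invented data: three candidate features and a target (house price).
features = {
    "rooms": [2, 3, 3, 4, 5, 6],
    "distance": [9, 7, 8, 5, 4, 2],
    "house_no": [14, 3, 27, 8, 21, 5],
}
price = [100, 150, 140, 210, 250, 310]

# Score every feature on its own, without fitting a model.
scores = {name: abs(pearson(column, price)) for name, column in features.items()}

# Keep the k features with the highest score.
k = 2
selected = sorted(scores, key=scores.get, reverse=True)[:k]

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("selected:", selected)
```

Because each feature is scored in isolation, a filter of this kind is cheap to compute but cannot detect features that are only useful in combination.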