11 - Validating simulation models Flashcards
Motivation: Why validate simulations?
- when not rigorously calibrated and validated, simulations are neither a reliable research method nor a reliable tool for practical decision support
Definition: Verification, Validation, Calibration
(steps subsumed under validation)
Verification
Does the code do what the specification asks?
- code could include faults
Definition: Verification, Validation, Calibration
(steps subsumed under validation)
Structural Validation
Does the model correctly represent the problem space?
- are all relevant relationships included in the model, etc.?
Definition: Verification, Validation, Calibration
(steps subsumed under validation)
Input/Output validation
Do the simulation's inputs and outputs match empirical observations?
- some kind of description of the status quo based on data
- parametrise the shape of the distribution
Calibration: adjusting input values to obtain valid output values (see the sketch below)
-> danger: overfitting the model while calibrating it
-> agent-based models are prone to over-parametrization, e.g. customer learning: we can make assumptions about how customers learn, but we do not know the parameters for sure (they cannot be observed as input parameters)
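A minimal calibration sketch in Python, assuming a hypothetical simulate() stand-in for the actual simulation model and an invented empirical indicator; it only illustrates the idea of tuning one input value until the simulated output matches the observed output.

```python
# Hypothetical calibration sketch: tune one input parameter (arrival rate)
# until a simulated indicator matches its empirical counterpart.
import numpy as np

rng = np.random.default_rng(42)

def simulate(arrival_rate: float, n_runs: int = 50) -> float:
    """Toy stand-in for a stochastic simulation: returns a mean queue length."""
    return float(np.mean(rng.poisson(arrival_rate * 2.0, size=n_runs)))

empirical_queue_length = 9.0                    # observed in the real system
candidate_rates = np.linspace(1.0, 10.0, 91)    # input values to try

# grid search: pick the input value whose output is closest to the observation
errors = [abs(simulate(r) - empirical_queue_length) for r in candidate_rates]
best_rate = candidate_rates[int(np.argmin(errors))]
print(f"calibrated arrival rate: {best_rate:.2f}")
# overfitting danger: with many free parameters, some combination will always
# reproduce the status quo, even if the model structure is wrong
```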
Simulation Data
Different layers of data
Input data
Process data
Output (result) data
Simulation Data
Different layers of data
Input data
- collected or inferred from the empirical status quo
- e.g. customer arrival rate, range of products a company offers
Simulation Data
Different layers of data
Process data
- generated during the simulation run; provides insights into the simulation model that are not empirically available or relevant for the simulation purpose
- can be compared to process or event logs from the real world
Simulation Data
Different layers of data
Output (Results) data
- indicators calculated for validation or what-if analysis, matching empirical indicators
- in the status quo, these are the indicators we are interested in -> what happens to them when we change the inputs?
Simulation data
Input data
Assumption:
- Structure of relevant simulation components and parameters has been determined
Direct observation:
- for transparent systems
- example: machine run times
- or: the actual layout of the shop
Indirect inference:
- based on empirical process and result indicators
- example: customer choice
-> we know what happened but not why, so we apply data analysis to find a model of how things work in the real world (see the sketch below)
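A sketch of such indirect inference, using invented purchase data and a standard choice model (logistic regression); the fitted coefficient is the kind of parameter that would then feed the simulated customers.

```python
# Illustrative indirect inference: we observe which offers were bought, but not
# the customers' decision rule, so we fit a simple choice model and use its
# parameters as simulation input. All data here is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
prices = rng.uniform(10, 100, size=500)                      # observed offer prices
true_prob = 1 / (1 + np.exp(0.05 * (prices - 60)))           # hidden decision rule
bought = (rng.uniform(0, 1, 500) < true_prob).astype(int)    # observed outcomes

model = LogisticRegression().fit(prices.reshape(-1, 1), bought)
print("estimated price sensitivity:", model.coef_[0][0])
# the estimate becomes an input parameter for simulated customers, even though
# the real decision process was never observed directly
```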
Simulation data
Input data: scenarios
Stochastic scenarios
Worst- and best case scenarios
Qualitatively discrete scenarios
Simulation data
Input data: scenarios
Stochastic scenarios
- follow empirical distributions
- stochastic input scenarios lead to stochastic process and result data
-> when we are uncertain about the data, consider stochastic scenarios (see the sketch below)
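A small sketch of a stochastic input scenario, assuming made-up inter-arrival observations and an exponential distribution fitted to them; each simulation run then draws its own input values.

```python
# Illustrative stochastic scenario: parametrise a distribution from an
# empirical sample and sample fresh input values for each simulation run.
import numpy as np
from scipy import stats

observed_interarrivals = np.array([2.1, 0.8, 3.4, 1.2, 0.5, 2.9, 1.7, 0.9, 2.2, 1.4])

# fit the shape of the distribution to the empirical sample
loc, scale = stats.expon.fit(observed_interarrivals, floc=0)

# each run gets its own stochastic input scenario
scenario = stats.expon.rvs(loc=loc, scale=scale, size=100, random_state=1)
print("mean inter-arrival time in this scenario:", scenario.mean())
```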
Simulation data
Input data: scenarios
Worst- and best-case scenarios
- model extreme cases
- results indicate the modeled systems’ robustness
-> robustness = does the system behave the same in each of these cases?
Simulation data
Input data: scenarios
Qualitatively discrete scenarios
- model discrete alternative cases
- test robustness and the necessity of individualized strategies
Simulation data
Example: railway ticket as simulation input
Directly observable:
- supply: products, capacity, price categories, availabilities
- demand: historical sales
Indirect inference (also: estimation):
- customer loyalty (e.g. we have the name on the ticket -> we can analyse how often they bought in the past)
- reference prices
- willingness to pay
-> these estimates can be used to calibrate the customers in the model
Simulation data
Input data on the future
Insights:
- analyzing empirical data to parametrize the simulation input data is based on insights, the pre-condition for predictive analytics
Forecast:
- to simulate future scenarios, the future values of input data have to be forecasted
Simulation data
Process data
Simulation systems are fully transparent - all data that is generated in the process of the simulation can be observed
We may look at: empirically-available process data
- forecasts
- plans
- documented events
And we can compare it to: simulation-exclusive process data (= process data that would not have been available in the real world):
- decisions
- learned experiences
- communication
Simulation data
Result data
Usually empirically-available
- system reports
- transaction data
Simulation-exclusive
- long-term developments
- what-if developments
- internal agent- and system-states: e.g. customer satisfaction, emergent strategies
Simulation data
Result data
Usage
- Output validation: compares result data to empirical data
- sensitivity analysis and meta-modeling: analyse relationships inside the "black box"
- data farming: simulations generate artificial transaction data -> when real-world data is not sufficient, we generate more
Simulation data
Data farming
= simulation models run thousands of times can provide insights into the different consequences of different options
- generates as much reproducible data as desired (the amount is limited only by time and storage space)
- success depends on validity - which is difficult to determine for human, social, cultural and behavioral modeling
Simulation data
Data farming
Which types of data farming are there?
Simple: Monte Carlo
- Create data from known distributions
Mechanic: Discrete event-based white box
- create data given varying scenarios
- look at everything that happens in the real world and use it as input data
Emergent: Agent-based black box
- create data based on assumptions and theories
- we don’t know what’s actually going on in the empirical agents
Simulation data
Data farming with Monte-Carlo Simulations
Problem:
- empirical data set is not large enough to allow for significant statements about variable relationships
Idea:
- create additional data based on distributions fitted to the empirical sample (fill in potential gaps; see the sketch after this card)
Chance:
- draw more meaningful conclusions from the enriched data set
Risk:
- enriched data set is “tainted” by a priori assumptions about distributions (gaps are filled based on assumptions)
Example:
- survey responses on the influence of consultative committees
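A minimal Monte Carlo data-farming sketch under stated assumptions: the empirical sample, its size, and the normality assumption are all invented for illustration, and the caveat from the Risk item applies to the farmed data.

```python
# Illustrative Monte Carlo data farming: enrich a small empirical sample by
# drawing additional values from a distribution fitted to it. The farmed data
# inherits the a priori distributional assumption (here: normality).
import numpy as np
from scipy import stats

empirical_sample = np.array([3.2, 4.1, 2.8, 5.0, 3.7, 4.4, 3.1, 4.8])  # too small on its own

mu, sigma = stats.norm.fit(empirical_sample)
farmed_data = stats.norm.rvs(mu, sigma, size=10_000, random_state=7)

print(f"empirical mean {empirical_sample.mean():.2f}, farmed mean {farmed_data.mean():.2f}")
```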
Verification
aims to ensure that the code does what the specification asks - identify and eliminate “bugs”
Problem: when to stop testing?
- testing takes a creative and destructive mind
- avoid testing your own code
- test cases should be formulated by subject matter experts (don't use the same people to develop and test -> they won't notice their own mistakes)
- test cases only prove the absence of those errors that they were designed to test
Structural validation
Does the model correctly represent the problem space?
- systematically explicate model components and relationships
- expert walkthrough with model stakeholders
Questions:
- Are all concepts and structures relevant to the problem included?
- is the model structure consistent with relevant knowledge of the system?
Structural validation
ODD protocol
- used to describe individual-based models, agent-based models, and simulation models
Overview:
- Purpose
- Entities, state variables, and scales
- Process overview and scheduling
Design:
- Design concepts (basic principles, emergence, adaptation, objectives, learning, prediction, ...)
Details:
- Initialization
- Input data
- Submodels
Input data validation
Input validation
- sometimes defined as an aspect of structural validation
- design input probability distributions that match the empirical observations
- design input parameter values that match expected future scenarios
Questions:
- Do inputs correspond to empirical observations?
- Are the parameter values consistent with descriptive numerical knowledge?
-> parametrizing can reveal a lack of information
-> e.g. building a simulation to test different settings for the shop floor, you may find out that the manufacturer does not know the machine run times
-> if the input variables cannot be validated, the output cannot be validated either (see the sketch below)
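A short sketch of checking whether a designed input distribution is consistent with empirical observations, using invented machine run times and a standard Kolmogorov-Smirnov goodness-of-fit test; the distribution and its parameters are assumptions for illustration.

```python
# Illustrative input validation: does the designed input distribution
# correspond to the empirical observations?
import numpy as np
from scipy import stats

observed_runtimes = np.array([4.9, 5.3, 5.1, 4.7, 5.6, 5.0, 4.8, 5.2, 5.4, 5.1])

# designed input assumption: machine run times ~ Normal(5.1, 0.3)
result = stats.kstest(observed_runtimes, "norm", args=(5.1, 0.3))
print(f"KS statistic {result.statistic:.3f}, p-value {result.pvalue:.3f}")
# a very small p-value would indicate that the chosen input distribution
# does not match the empirical observations
```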
Output validation
- Does the simulation output match empirical observations?
- also called behavioral validation
-> run the simulation model and generate output data -> output validation can only happen at the very end, once the model actually runs
-> compare whether the simulation reproduces what we observe in the real world
-> e.g. are queue lengths modelled correctly?
Output validation
Questions of output validation
What is the average distance between empirical observations and simulation results?
- percentage: MAPE (mean absolute percentage error)
- absolute: RMSE (root mean squared error)
- bias
-> How much error is acceptable?
Do confidence intervals overlap?
-> How much confidence is enough?
Does a meta-model fitting the empirical observations also fit the simulation? (E.g. a regression model)
- for this, empirical input and output information needs to be available
-> What fit suffices for the empirical model, anyway?
-> How closely does the regression model match? (see the metrics sketch below)
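A compact sketch of these output-validation checks with invented numbers: MAPE, RMSE, and bias between empirical and simulated indicator values, plus a rough normal-approximation check of whether the 95% confidence intervals overlap.

```python
# Illustrative output-validation metrics on made-up indicator values.
import numpy as np

empirical = np.array([102.0, 98.5, 110.2, 95.0, 101.3])
simulated = np.array([100.1, 99.0, 108.0, 97.5, 103.0])

mape = np.mean(np.abs((empirical - simulated) / empirical)) * 100  # percentage error
rmse = np.sqrt(np.mean((empirical - simulated) ** 2))              # absolute error
bias = np.mean(simulated - empirical)                              # systematic over-/underestimation
print(f"MAPE {mape:.2f}%  RMSE {rmse:.2f}  bias {bias:.2f}")

def ci95(x: np.ndarray) -> tuple[float, float]:
    """Rough 95% confidence interval for the mean (normal approximation)."""
    half = 1.96 * x.std(ddof=1) / np.sqrt(len(x))
    return x.mean() - half, x.mean() + half

(e_lo, e_hi), (s_lo, s_hi) = ci95(empirical), ci95(simulated)
print("confidence intervals overlap:", e_lo <= s_hi and s_lo <= e_hi)
```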
Output validation
Cross-validation
- the data set is split into training, validation, and test sets
- to calibrate the simulation, use the training and validation sets
- then use the test set to assess validity (see the sketch below)
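A minimal sketch of such a split, assuming 100 empirical observation periods and a 60/20/20 split; calibration would use the first two subsets, and validity would be reported only on the held-out test set.

```python
# Illustrative train/validation/test split for simulation calibration.
import numpy as np

observations = np.arange(100)          # e.g. 100 empirical observation periods
rng = np.random.default_rng(3)
shuffled = rng.permutation(observations)

train, validation, test = np.split(shuffled, [60, 80])   # 60 / 20 / 20 split

# calibrate the simulation against `train`, tune parameters against `validation`,
# and judge output validity only on `test`, which the model has never "seen"
print(len(train), len(validation), len(test))
```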