6. Alignments Flashcards
What is an optimal alignment?
A possible explanation for the differences between a model and event log
If the model has been discovered then we can use alignments to measure the quality of the algorithm used for discovery (based on the match in behaviour)
If the model is man-made then we can use alignments to find real-world problems in the process
Cost 0 = perfect fitness
There may be more than one optimal alignment
What is conformance checking for?
To compare a model and event log
How to do conformance checking?
Compare the event log and model. Log and model moves (i.e. moves that don’t happen in both) have cost 1. We want the lowest cost possible
What is the synchronous product?
- Model on top
- Events from log on bottom, put places in between (and before/ after them). Labels for log moves should have arrows in front, model moves arrow at end, sync no arrows
- Put event between places before model move and log move with same name that goes to places after them (like a cross)- if multiple then do both
- Model and log moves = cost 1, synchronous moves = cost 0, tau moves = cost 0
- Add initial marking
DOUBLE CHECK HAVE ALL THE RIGHT INPUT PLACES
How can you find the shortest path in a synchronous product marking graph?
Using Dijkstra’s algorithm
Build the graph breadth-first search from the initial state until the final state is found (and hence all other states at a distance less than the final state)
However, this only works if there is no x such that there is an infinite number of states reachable with distance < x (so that it is computable)
So we cannot have a tau transition which keeps adding tokens into the net as this will give us infinite increasing markings with cost 0
To avoid this we can check if the model is sound (can also add a small amount of cost to tau but often not needed)
Using A star algorithm
Same as Dijkstra but for every state we underestimate the cost of reaching the final marking (so expected cost = cost so far + underestimate)- stops us from having to visit paths we know are going to end up being longer
Use marking equation, minimise c^Tx (cost function) such that m + Cx = mf (final marking) to estimate cost
x can be integer vector but also a non integer vector- (note solution is not an executable sequence- indication of cost)
O(log(size q) + alpha), polynomial in model size (for linear programming)
What is the highest possible cost?
All log moves + the shortest sequence of model moves
How to calculate trace/ alignment-based fitness?
1 - (cost of trace)/ (cost of best of worst cost- all log moves + shortest model moves)
Log fitness- then sum all costs (not average trace fitness)
HAVE TO MENTION COST
How to calculate token based fitness?
1- 0.5(missing/consumed)-0.5(remaining/produced)