1. Extracting and Pre-Processing Event Logs Flashcards

1
Q

Why do we have to extract event logs?

A
  • Event data is recorded as events occur, so it is not already grouped into traces or event logs
  • An event records multiple attributes (not just the name)
2
Q

What is an event table/stream?

A

A raw logging format for events

3
Q

How do you denote that attribute a of an event e is undefined?

A

π_a(e) = ⊥

4
Q

What do we require for each event?

What do we require for each event in an event table?

What do we need (in addition) for an event log?

A
  • Each event: the attribute time is defined
  • Each event: e has a value for at least one attribute other than time (so it carries at least one meaningful observation)
  • Event table: a finite sequence of events that all have the same attribute a defined (this could be an activity name or another measurement)
  • Event log: additionally a case identifier (usually an entity-type attribute, but technically any attribute except time; the choice depends on the question we want to answer)
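
The requirements above can be sketched in Python. This is a minimal illustration, assuming an event is stored as a dictionary from attribute names to values and that a missing key or None stands in for ⊥; the names pi, is_valid_event and is_event_table are hypothetical helpers, not part of any library.

```python
from typing import Any, Dict, List, Optional

# Assumed representation: attribute name -> value; missing key or None means undefined (⊥).
Event = Dict[str, Any]


def pi(event: Event, attribute: str) -> Optional[Any]:
    """Return the value of `attribute` for `event`, or None to stand in for ⊥."""
    return event.get(attribute)


def is_valid_event(event: Event) -> bool:
    """An event needs a defined time and at least one other defined attribute."""
    has_time = pi(event, "time") is not None
    has_other = any(v is not None for k, v in event.items() if k != "time")
    return has_time and has_other


def is_event_table(events: List[Event], attribute: str) -> bool:
    """All events are valid and all have the same `attribute` defined."""
    return all(is_valid_event(e) and pi(e, attribute) is not None for e in events)
```

For an event log we would additionally pick one attribute other than time as the case identifier.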
5
Q

What is the difference between a general attribute and an entity type?

A

Entity types (e.g. users, orders, customers, deliveries) refer to specific, unique entities or objects

To figure out if something is an entity type we need domain knowledge and additional context

6
Q

If we use id as the case identifier and π_id(e) = c, how do we describe this?

A

event e is correlated to case c

7
Q

If all events correlated to a case c carry the same value v for a certain attribute x then what do we call x?

A

A case attribute of c

If that applies to every case c then x is a global case attribute
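
A small sketch of this check, assuming events are dictionaries and a missing key means the attribute is undefined. The function names and the example attribute names (order, customer) are hypothetical.

```python
from typing import Any, Dict, List

Event = Dict[str, Any]


def is_case_attribute(events: List[Event], case_key: str, case_id: Any, attribute: str) -> bool:
    """True if all events correlated to `case_id` carry the same defined value for `attribute`."""
    values = {e.get(attribute) for e in events if e.get(case_key) == case_id}
    return len(values) == 1 and None not in values


def is_global_case_attribute(events: List[Event], case_key: str, attribute: str) -> bool:
    """True if `attribute` is a case attribute of every case."""
    case_ids = {e[case_key] for e in events if e.get(case_key) is not None}
    return all(is_case_attribute(events, case_key, c, attribute) for c in case_ids)


events = [
    {"order": "o1", "customer": "Ann", "activity": "Create"},
    {"order": "o1", "customer": "Ann", "activity": "Pay"},
    {"order": "o2", "customer": "Bob", "activity": "Create"},
]
```

Here customer is a global case attribute when order is the case identifier, while activity is not, because the two events of o1 carry different activity values.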

8
Q

What is a trace?

A

A sequence of events correlated to a case and ordered by time

9
Q

What is a structured event log?

What structure does it have?

A

A set of cases, where each case is associated with exactly one trace; the trace is stored as one of that case's attributes

Hierarchical structure
1. Cases
2. Case attributes as children of each case (one of them being the trace)
3. Events as children of the trace, each with its attributes (e.g. timestamp, observed activity, case identifier)

10
Q

Can cases share events?

A

No

11
Q

What is the difference between an activity and an event?

A

An event e ∈ E records that a specific discrete observation has been made (by a sensor, a system, a human observer, etc.). For example, a row in an event table with time 19/12/2018 15:46, activity "payment received" and user System is an event: the attribute time is defined and at least one other attribute carries a value.

An activity is a specific action that can be executed or observed. It is more generic than an event, e.g. "making a payment".

12
Q

What is an event attribute?

A

An attribute that is only specified for certain types of events (e.g. events with a certain kind of activity) or whose value is specific to the event (i.e. not all events in the same case share the same value)

13
Q

Is an event atomic?

A

Yes, an event should have a single timestamp (not a start and an end timestamp)

To handle longer-running activities we introduce lifecycle transitions such as start and complete. For example, two events with the same activity name but different lifecycle transitions (one start, one complete) describe the beginning and end of that activity

14
Q

What is an event classifier?

A

A function that maps each event to a value. The value of an event classifier for a specific event is its event class. If two events have the same value, they belong to the same event class.

For example, the standard event classifier is the activity name.

From any event classifier we can build a simple trace: the sequence of classifier values of a trace's events, skipping events with a missing value.
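
As a rough sketch, a classifier can be modelled as a Python function from an event to a value, with None standing in for a missing value. The dictionary representation of events and the names activity_classifier and simple_trace are assumptions made for this illustration.

```python
from typing import Any, Callable, Dict, List, Optional

Event = Dict[str, Any]
Classifier = Callable[[Event], Optional[Any]]


def activity_classifier(event: Event) -> Optional[Any]:
    """Classify an event by its activity name (None if undefined)."""
    return event.get("activity")


def simple_trace(trace: List[Event], classifier: Classifier) -> List[Any]:
    """Map each event to its event class, skipping events without a value."""
    classes = (classifier(e) for e in trace)
    return [c for c in classes if c is not None]


trace = [{"activity": "Create"}, {"sensor": 3}, {"activity": "Pay"}]
print(simple_trace(trace, activity_classifier))  # ['Create', 'Pay']
```

The second event has no activity, so it is left out of the simple trace.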

15
Q

If we want to use a particular event classifier for an analysis but an event has no value defined for that particular classifier, what should we do?

A

Omit the event from our analysis.

16
Q

What is the simple event log?

What is another way to call the simple traces in the simple event log?

A

A list of all the simple traces for the selected event classifier, each annotated with the number of times that particular sequence occurs.

Simple traces in the simple event log = trace variants
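
One way to sketch this is to count identical simple traces with collections.Counter from the Python standard library. The function name simple_event_log and the example activity names are assumptions for illustration.

```python
from collections import Counter
from typing import Any, List, Tuple


def simple_event_log(simple_traces: List[List[Any]]) -> "Counter[Tuple[Any, ...]]":
    """Count how often each trace variant (sequence of event classes) occurs."""
    # Lists are not hashable, so each simple trace is turned into a tuple first.
    return Counter(tuple(t) for t in simple_traces)


variants = simple_event_log([
    ["Create", "Pay"],
    ["Create", "Pay"],
    ["Create", "Cancel"],
])
for variant, count in variants.most_common():
    print(count, variant)
```

Each key of the resulting counter is a trace variant and each value is its frequency.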

17
Q

What are the three basic pre-processing operations we might want to do on an event log?

A
  1. Selection keeps the cases that satisfy a particular property
  2. Projection removes events that do not satisfy a particular property
  3. Aggregation groups multiple subsequent events with the same property together (for example, several consecutive Eat Food events might be reduced to one Eat Food)
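
The three operations can be sketched as follows, assuming a log is a dictionary from case identifier to a time-ordered list of event dictionaries. The function names are hypothetical, and keeping the first event of each run is just one possible way to aggregate.

```python
from itertools import groupby
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]
Log = Dict[str, List[Event]]  # assumed: case id -> time-ordered trace


def select(log: Log, case_predicate: Callable[[List[Event]], bool]) -> Log:
    """Selection: keep only the cases whose trace satisfies the predicate."""
    return {c: t for c, t in log.items() if case_predicate(t)}


def project(log: Log, event_predicate: Callable[[Event], bool]) -> Log:
    """Projection: within every case, drop events that fail the predicate."""
    return {c: [e for e in t if event_predicate(e)] for c, t in log.items()}


def aggregate(trace: List[Any], key: Callable[[Any], Any]) -> List[Any]:
    """Aggregation: collapse each run of consecutive events with equal key into its first event."""
    return [next(group) for _, group in groupby(trace, key=key)]


log = {
    "c1": [{"activity": "Eat Food"}, {"activity": "Eat Food"}, {"activity": "Sleep"}],
    "c2": [{"activity": "Sleep"}],
}
```

Note that aggregate only merges events that are directly next to each other, so Eat Food, Sleep, Eat Food stays three events.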
18
Q

What are some potential limitations of event logs?

A
  • The timestamp information is often not reliable (for example, if timestamps are recorded only at day level and three activities happen on the same day, how do we order them? A potential solution is partial ordering)
  • They can only have a single case identifier, although many more relations are present in the data; graph-based data structures allow us to represent these
19
Q

Can you use timestamp as a case identifier?

A

Technically yes, but then we have nothing left to order the events with (so basically no)

20
Q

Why do we always study finite traces?

A
  • We have only observed a finite amount of time so far, so every recorded trace is finite
  • We are interested in the start-to-end dynamics
21
Q

How to build a structured event log?

A
  • Group by case ID (ignoring events that have an empty value in the case ID column)
  • Order the events within each case ID by timestamp
  • Then identify the trace for each case ID and any global case attributes
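
The steps above can be sketched in plain Python. This is an illustration under stated assumptions: events are dictionaries, a missing key means an empty value, timestamps are comparable, and attribute values are hashable. The function name build_structured_log and the output layout are hypothetical. The sketch records case attributes per case; an attribute is global if it shows up for every case.

```python
from collections import defaultdict
from typing import Any, Dict, List

Event = Dict[str, Any]


def build_structured_log(events: List[Event], case_key: str, time_key: str = "time") -> Dict[Any, Dict[str, Any]]:
    """Group events by case, order them by time, and collect per-case attributes."""
    # Step 1: group by case ID, ignoring events without a case ID.
    grouped: Dict[Any, List[Event]] = defaultdict(list)
    for event in events:
        case_id = event.get(case_key)
        if case_id is not None:
            grouped[case_id].append(event)

    log: Dict[Any, Dict[str, Any]] = {}
    for case_id, case_events in grouped.items():
        # Step 2: order the events of the case by timestamp.
        trace = sorted(case_events, key=lambda e: e[time_key])
        # Step 3: an attribute is a case attribute if all events share one defined value.
        names = set().union(*(e.keys() for e in trace)) - {case_key, time_key}
        attributes = {}
        for name in names:
            values = {e.get(name) for e in trace}
            if len(values) == 1 and None not in values:
                attributes[name] = values.pop()
        log[case_id] = {"trace": trace, "attributes": attributes}
    return log


events = [
    {"order": "o1", "time": 2, "activity": "Pay", "customer": "Ann"},
    {"order": "o2", "time": 1, "activity": "Create", "customer": "Bob"},
    {"order": "o1", "time": 1, "activity": "Create", "customer": "Ann"},
    {"time": 3, "activity": "Audit"},  # no case ID, so it is ignored
]
log = build_structured_log(events, "order")
```

In a case with a single event, every defined attribute trivially counts as a case attribute, which is why o2 also lists activity.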
22
Q

Why do we not record event logs directly in organisations?

A

Grouping events into event logs in production takes extra processing power that is often not worth it, so we need to extract the event logs ourselves afterwards