Hard cards Flashcards
The constraint that all primary keys must have non-null values is referred to as the
Entity integrity rule
What are 4 data requirements for relational databases?
- Every column must be single valued
- Entity integrity rule
- Referential integrity
- All non-key attributes must describe a characteristic of the entity identified by the primary key.
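As an illustration of the entity integrity and referential integrity rules, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (customer, orders) and the sample rows are made up for this example. Note that SQLite only enforces foreign keys once the foreign_keys pragma is switched on, and that the key columns are declared NOT NULL explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Hypothetical tables for illustration only.
conn.execute("CREATE TABLE customer (id TEXT NOT NULL PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " id TEXT NOT NULL PRIMARY KEY,"
    " customer_id TEXT NOT NULL REFERENCES customer(id))"
)
conn.execute("INSERT INTO customer VALUES ('C1', 'Alice')")

def violates(sql):
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

print(violates("INSERT INTO customer VALUES (NULL, 'Bob')"))  # NULL primary key: entity integrity
print(violates("INSERT INTO orders VALUES ('O1', 'C999')"))   # unknown customer: referential integrity
print(violates("INSERT INTO orders VALUES ('O2', 'C1')"))     # valid row
```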
What is Entity Relationship Modelling (ERM)?
ERM is a modelling notation used to model the characteristics and relationships of data. It is useful for formalizing and visualizing the structure of data before a database is implemented.
In text mining, stemming is the process of:
Reducing inflected or derived word forms to a common base or root form.
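A toy sketch of the idea, assuming a short, made-up suffix list. Real stemmers such as the Porter stemmer apply many more rules, so this is only meant to show what "reducing to a root" looks like in practice.

```python
# Illustrative suffix list (an assumption, not a real stemming algorithm).
SUFFIXES = ("ing", "ed", "es", "s")

def stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

print([stem(w) for w in ["connected", "connecting", "connects"]])
# -> ['connect', 'connect', 'connect']
```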
What does pivot do?
Swap the attributes in the horizontal and vertical fields.
What does a slice do?
Filter on one item from a dimension, such as one product, one life cycle, or one client name.
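Pivot and slice can be sketched on a tiny two-dimensional cube. The sales figures and the helper names crosstab and slice_cube are hypothetical and only serve the illustration.

```python
# Hypothetical sales figures: (product, year) -> sales
sales = {
    ("Laptop", 2022): 10, ("Laptop", 2023): 12,
    ("Phone", 2022): 7, ("Phone", 2023): 9,
}

def crosstab(data, row_axis, col_axis):
    """Build a nested dict table; the axis arguments are key positions (0 or 1)."""
    table = {}
    for key, value in data.items():
        table.setdefault(key[row_axis], {})[key[col_axis]] = value
    return table

def slice_cube(data, axis, item):
    """Keep only the cells that match one item on the given dimension."""
    return {key: value for key, value in data.items() if key[axis] == item}

by_product = crosstab(sales, 0, 1)        # products as rows, years as columns
pivoted = crosstab(sales, 1, 0)           # pivot: years as rows, products as columns
laptops = slice_cube(sales, 0, "Laptop")  # slice: one product only
```

The pivot does not change any value; it only swaps which dimension is shown on which axis. The slice reduces the data to the cells of a single item.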
What can be done with a roll-up?
Aggregate data to a higher level of a dimension hierarchy, for example to show the total sales in a KPI diagram.
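In OLAP terms, a roll-up aggregates detailed values to a coarser level, which is how a single total sales figure is obtained. A minimal sketch with made-up region and city names:

```python
from collections import defaultdict

# Hypothetical sales per (region, city)
sales = {("North", "City A"): 5, ("North", "City B"): 3, ("South", "City C"): 4}

def roll_up(data):
    """Aggregate city-level sales to region level."""
    totals = defaultdict(int)
    for (region, _city), value in data.items():
        totals[region] += value
    return dict(totals)

per_region = roll_up(sales)        # one level up: city -> region
total_sales = sum(sales.values())  # top level: the single KPI value
```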
How is the accuracy calculated?
(TP + TN) / (TP + TN + FP + FN)
How is the precision calculated?
TP / (TP + FP)
How is the true positive rate calculated?
TP / (TP + FN)
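The three measures can be computed from the four confusion-matrix counts. This is a small sketch with made-up counts; it assumes none of the denominators is zero.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision and true positive rate from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "true_positive_rate": tp / (tp + fn),
    }

# Hypothetical counts: 40 TP, 45 TN, 5 FP, 10 FN
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
print(metrics)
```

With these counts the accuracy is 85 / 100, the precision is 40 / 45 and the true positive rate is 40 / 50.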
What does the information about coverage in Celonis mean?
The share of cases that are represented by the displayed model.
What does it mean when the start node has multiple arrows going to different places?
The process has different start activities depending on the different cases that are represented.
What does a misfitting model result in for auditors?
Misfitting model –> false negative audit results (compliance violations are not detected)
What does an imprecise model result in for auditors?
Imprecise model –> false positive audit results (compliance violations are indicated that did not occur in reality).
What does the tokenize operator do?
Splits text into tokens, often removing non-letter characters.
Breaks the text up into individual parts that can then be analyzed.
When is the convergence criterion met?
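A minimal sketch of tokenizing on non-letters with a regular expression. It is not the operator's actual implementation, only an illustration of the behaviour described above.

```python
import re

def tokenize(text):
    """Return runs of letters; every other character acts as a separator."""
    return re.findall(r"[A-Za-z]+", text)

tokens = tokenize("Process mining, since 2011: it's useful!")
print(tokens)
# -> ['Process', 'mining', 'since', 'it', 's', 'useful']
```

Note that digits and punctuation disappear, and that the apostrophe splits "it's" into two tokens.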
The convergence criterion is usually met when the assignment of points to clusters becomes stable.
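A sketch of this stopping rule for a one-dimensional k-means, assuming hypothetical points and starting centroids. The loop ends as soon as an iteration assigns every point to the same cluster as the previous iteration did.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """1-D k-means that stops once no point changes cluster."""
    centroids = list(centroids)  # work on a copy of the starting centroids
    assignment = None
    for iteration in range(1, max_iter + 1):
        # Assign every point to its nearest centroid.
        new_assignment = [
            min(range(len(centroids)), key=lambda c: abs(p - centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:  # convergence: assignments are stable
            return centroids, assignment, iteration
        assignment = new_assignment
        # Move each centroid to the mean of its points (unchanged if the cluster is empty).
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return centroids, assignment, max_iter

centroids, assignment, iterations = kmeans_1d([1.0, 2.0, 9.0, 10.0], [0.0, 5.0])
print(centroids, assignment, iterations)
# -> [1.5, 9.5] [0, 0, 1, 1] 2
```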
What is escalation of commitment?
A pattern of behavior in which an individual or group will continue to rationalize their decisions, actions and investments when faced with increasingly negative outcomes rather than alter their course.
Many things come together here: sunk cost fallacy, status quo bias, omission bias, confirmation bias, etc.
What is the formula for the fitness?
Fitness = Re-playable cases / total cases
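A simplified sketch of this ratio. The model here is a hypothetical set of allowed start activities, end activities and directly-follows pairs; process mining tools typically use more refined replay techniques (for example token replay or alignments), so treat the replayable check as an assumption made for illustration.

```python
# Hypothetical model: allowed start/end activities and directly-follows pairs.
START, END = {"register"}, {"close"}
ALLOWED = {("register", "check"), ("check", "approve"), ("approve", "close")}

def replayable(trace):
    """True if the trace starts, moves and ends only as the model allows."""
    if not trace or trace[0] not in START or trace[-1] not in END:
        return False
    return all(pair in ALLOWED for pair in zip(trace, trace[1:]))

def fitness(log):
    """Replayable cases divided by total cases."""
    return sum(replayable(trace) for trace in log) / len(log)

log = [
    ["register", "check", "approve", "close"],
    ["register", "check", "approve", "close"],
    ["register", "approve", "close"],  # skips the check: not replayable
    ["register", "check", "close"],    # skips the approval: not replayable
]
print(fitness(log))
# -> 0.5
```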
What does process mining encompass?
Techniques, tools and methods to discover, monitor and improve real processes by extracting knowledge from event logs.
What are some of the use cases of clustering?
- Identify natural groupings of customers;
- Identify rules for assigning new cases to classes for targeting/diagnostic purposes;
- Provide characterization, definition, labeling of populations;
- Decrease the size and complexity of problems for other data mining methods;
- Identify outliers in a specific domain.