ETL Data Pipelines Flashcards

1
Q

What is Process Intelligence ETL?

A

The data ingestion component. It automates data extraction and transformation from external source systems and loads the data directly into SAP Signavio Process Intelligence.

2
Q

How is this set up in Process Intelligence?

A

There is no need to configure a staging environment. However, if you are extracting from an on-premises system, additional setup is required on the system side.

3
Q

[creation of ETL Data Pipelines] What happens during the Extract phase?

A
  1. Configure data sources
  2. Configure integrations
  3. Click Extract
4
Q

[creation of ETL Data Pipelines] What happens during the Transform phase?

A
  1. Configure business objects (SQL scripts)
  2. Preview the SQL scripts
5
Q

[creation of ETL Data Pipelines] What happens during the Load phase?

A
  1. Select or create a process
  2. Click Run
6
Q

What does SAP Signavio ETL use to carry out ETL?

A

It uses standard connectors and provides an interface to extract, transform, and load data. All interaction stays within the system.

7
Q

What are the 3 main components of Process Intelligence ETL?

A

1. Data source management
2. Integration management
3. Data model management

[image 6]

8
Q

What do the 3 main components do?

A

They are the integrated features of ETL that together set up data pipelines.

9
Q

What is data source management?

A

The framework for managing online data sources. It includes credential management and scheduling.

10
Q

What does a data source establish?

A

A connection to the source system.

11
Q

What is integration management?

A

The framework for defining what, how, and when to extract data. It includes pseudonymisation (techniques that replace, remove, or transform information that identifies individuals, keeping that information separate) and partitioning schemas.
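The pseudonymisation idea can be sketched in a few lines: replace a direct identifier with a salted hash and keep the original values in a separate lookup. This is an illustrative sketch only; in Process Intelligence, pseudonymisation is configured in the integration rather than hand-coded, and the field name `created_by` is a made-up example.

```python
import hashlib

def pseudonymise(rows, field, salt="demo-salt"):
    """Replace a direct identifier with a salted hash and keep the
    original values in a separate lookup table."""
    lookup = {}
    out = []
    for row in rows:
        original = row[field]
        pseudonym = hashlib.sha256((salt + original).encode()).hexdigest()[:12]
        lookup[pseudonym] = original  # kept separate from the event data
        out.append({**row, field: pseudonym})
    return out, lookup

# Hypothetical extracted rows with a personal identifier
orders = [{"order_id": "4711", "created_by": "jane.doe"},
          {"order_id": "4712", "created_by": "john.smith"}]
pseudo, mapping = pseudonymise(orders, "created_by")
```

The analysis data (`pseudo`) no longer contains names, while `mapping` can be stored separately under stricter access control.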

12
Q

How do you extract data when the source system is on-premises?

A

On-premises extractors are needed. They can then be set up under Integrations, where the specific tables and schedules for continuous loads are defined.

13
Q

What are the two options for integration?

A
  1. Simple method
  2. Intricate method
14
Q

What is the simple method?

A

Select the tables and fields through a graphical interface.
You can add a partition strategy for large tables and define a field as the delta criterion.
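A delta criterion boils down to a date or timestamp column that filters each run down to rows changed since the previous extraction. A minimal sketch using SQLite for illustration; the SAP-style table and field names (`vbak`, `erdat`) are assumed examples, and the real ones depend on your source system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sales-order header table with a creation-date column
conn.execute("CREATE TABLE vbak (vbeln TEXT, erdat TEXT)")
conn.executemany("INSERT INTO vbak VALUES (?, ?)",
                 [("0001", "2024-01-01"),
                  ("0002", "2024-02-15"),
                  ("0003", "2024-03-01")])

# Delta criterion: only fetch rows changed since the last extraction run
last_run = "2024-02-01"
delta = conn.execute(
    "SELECT vbeln FROM vbak WHERE erdat > ? ORDER BY erdat", (last_run,)
).fetchall()
# delta -> [('0002',), ('0003',)]
```

Each run then advances `last_run`, so the source system only ever serves the new or changed rows instead of a full table scan.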

15
Q

What is the intricate method?

A

Write your own extraction scripts in SQL.
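A hand-written extraction script is essentially a SELECT that names exactly the tables, fields, and filters you want. A minimal sketch, run against SQLite for illustration; the table `ekko` and its fields are assumed SAP-style examples, and the real script syntax depends on the connector.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical purchase-order header table
conn.execute("CREATE TABLE ekko (ebeln TEXT, bukrs TEXT, aedat TEXT)")
conn.executemany("INSERT INTO ekko VALUES (?, ?, ?)",
                 [("PO1", "1000", "2024-01-10"),
                  ("PO2", "2000", "2024-01-11"),
                  ("PO3", "1000", "2024-01-12")])

# Intricate method: select only the needed fields and restrict
# the extraction to one partition (here: a single company code)
extraction_sql = """
    SELECT ebeln, aedat
    FROM ekko
    WHERE bukrs = :company_code
"""
rows = conn.execute(extraction_sql, {"company_code": "1000"}).fetchall()
# rows -> [('PO1', '2024-01-10'), ('PO3', '2024-01-12')]
```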

16
Q

When integrating, what is the default extraction time and why?

A

Midnight (although continuous loads can be customised),
because data extraction should run when it has the least impact on the source system.

17
Q

What is data model management?

A

The framework to transform your data into an event log, starting from connected integrations and extractions.
This is also where you connect your data to an investigation to start the process analysis.
It includes process-oriented data modeling, SQL editors for the transformations, and live previews of transformed data.
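The target of the transformation is an event log: one row per event with a case ID, an event name, and a timestamp. A minimal sketch of that pattern using SQLite and made-up source tables (`orders`, `invoices`); in practice these scripts are written in the data model's SQL editor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders   (order_id TEXT, created_at TEXT);
    CREATE TABLE invoices (order_id TEXT, invoiced_at TEXT);
    INSERT INTO orders   VALUES ('C1', '2024-01-01'), ('C2', '2024-01-02');
    INSERT INTO invoices VALUES ('C1', '2024-01-05');
""")

# Each SELECT contributes one event type; the UNION ALL stitches
# them into a single event log keyed by case ID
event_log = conn.execute("""
    SELECT order_id AS case_id,
           'Order Created' AS event_name,
           created_at AS event_time
    FROM orders
    UNION ALL
    SELECT order_id, 'Invoice Sent', invoiced_at
    FROM invoices
    ORDER BY case_id, event_time
""").fetchall()
```

Case C1 then shows two ordered events (Order Created, Invoice Sent), which is exactly the shape a process-mining investigation consumes.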

18
Q

What do you do in a data model?

A

Define how the ETL data pipeline extracts and transforms data, and where to load the data.

19
Q

5 steps to creating and using a data model

A
  1. Create a new data model
  2. Select the source system
  3. Select the data model template
  4. Select the integration
  5. Select the configured data source
20
Q

A data model in SAP Signavio ETL has different sections - what is shown in the extraction section?

A
  1. The connector: how the data on the source system was accessed, e.g. SAP ECC
  2. The integration: what data was extracted from the source system; you can add new data or use a preselected model template
21
Q

A data model in SAP Signavio ETL has different sections - what is shown in the transformation section?

A

[It looks like a BPMN model]
1. Process events, representing the main events
2. Business objects, containing the transformation rules for the case attributes and events

22
Q

What is an event collector?

A

The SQL scripts for an event.
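In other words, one event collector is one script that yields the rows for a single event type. A minimal sketch; the table `goods_receipts` and the event name are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical source table for one event type
conn.execute("CREATE TABLE goods_receipts (po_id TEXT, received_at TEXT)")
conn.execute("INSERT INTO goods_receipts VALUES ('PO1', '2024-02-01')")

# One event collector = one script producing rows for a single event
collector_sql = """
    SELECT po_id AS case_id,
           'Goods Receipt Posted' AS event_name,
           received_at AS event_time
    FROM goods_receipts
"""
events = conn.execute(collector_sql).fetchall()
```

The data model combines the output of all event collectors into the full event log.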

23
Q

How can you adjust transformation rules?

A

Using SQL.