Data Ingestion & Transformation Flashcards

1
Q

Which is a feature of Azure Synapse pipelines?

A

Monitoring of Spark Jobs for data flow

This is a feature of Azure Synapse pipelines.

2
Q

What is Kusto?

A

A query language that allows you to interact with data

Kusto is a query language that allows you to interact with data.

3
Q

What is the definition of a “wrangling data flow”?

A

Using Power Query for code-free data preparation

This is a great definition for wrangling data flow.

4
Q

What does “shredding JSON” mean?

A

Parsing data into columns

Yes! This is the definition of Shredding JSON.
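
As a rough illustration of shredding, the sketch below flattens nested JSON objects into rows with a fixed set of columns. It uses only the Python standard library; the helper names (flatten, shred), the dotted column names, and the sample records are hypothetical and not tied to any Azure service.

```python
import json


def flatten(obj, prefix=""):
    """Turn nested objects into a flat dict keyed by dotted paths."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested objects, extending the dotted path.
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat


def shred(records, columns):
    """Produce one row per JSON object; missing fields become None."""
    return [[flatten(record).get(col) for col in columns] for record in records]


# Hypothetical sample data: the second record has no customer.city field.
raw = (
    '[{"id": 1, "customer": {"name": "Ada", "city": "London"}},'
    ' {"id": 2, "customer": {"name": "Lin"}}]'
)
columns = ["id", "customer.name", "customer.city"]
rows = shred(json.loads(raw), columns)
```

Each nested field ends up in its own column, and a field that is absent from a record is left as None rather than dropping the row.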

5
Q

You want to create a Spark linked service in Data Factory. What do you need to do to create a Spark cluster?

A

Nothing; it is automatically created for you just-in-time by Data Factory

To create a Spark cluster in Data Factory, you don’t need to do anything. It is automatically created for you just-in-time by Data Factory.

6
Q

What are some uses for T-SQL?
A) Perform code-free transformations on data types, or create aggregates.
B) Perform orchestration services, such as creating alerts or monitoring data pipeline activities.
C) Filter or alter data and return the query results as a data table
D) Create tables for results or save datasets.

A

C) Filter or alter data and return the query results as a data table

D) Create tables for results or save datasets.

Both of these are uses of T-SQL.

7
Q

Which activity is NOT possible with Azure Data Factory?
A) Data streaming
B) Data movement
C) Control
D) Data transformation

A

A) Data streaming.

This is NOT a valid activity with Azure Data Factory.

8
Q

What is a conditional split?

A

Routes data rows to particular streams based on specified conditions.

This is the definition of a conditional split.
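
The routing idea can be sketched in plain Python. This is a minimal illustration, not the Data Factory implementation: the function name conditional_split, the stream names, and the sample orders are hypothetical, and the sketch assumes each row goes to the first condition it matches, with unmatched rows falling through to a default stream.

```python
def conditional_split(rows, conditions, default="default"):
    """Route each row to the first stream whose predicate it satisfies."""
    streams = {name: [] for name in conditions}
    streams[default] = []
    for row in rows:
        for name, predicate in conditions.items():
            if predicate(row):
                streams[name].append(row)
                break
        else:
            # No condition matched, so the row goes to the default stream.
            streams[default].append(row)
    return streams


# Hypothetical sample data and conditions.
orders = [
    {"id": 1, "amount": 250},
    {"id": 2, "amount": 40},
    {"id": 3, "amount": -5},
]
streams = conditional_split(
    orders,
    {
        "invalid": lambda r: r["amount"] < 0,
        "large": lambda r: r["amount"] >= 100,
    },
    default="small",
)
```

Conditions are checked in the order they are declared, so a row that could satisfy several conditions lands only in the first matching stream.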

9
Q

Choose two ways to cleanse data.

A) Use the Clean Missing Data module
B) Spark Cleaner Hive
C) Data wrangling services
D) Mapping data flows

A

A) Use the Clean Missing Data module
D) Mapping data flows

Both of these are ways to cleanse data.
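
To make the idea of cleaning missing data concrete, here is a small standard-library sketch of two common strategies: replacing a missing value with the column mean, or dropping the row. It only illustrates the concept; the function clean_missing, its mode parameter, and the sample readings are hypothetical and do not reproduce the behavior of any specific module.

```python
from statistics import mean


def clean_missing(rows, column, mode="mean"):
    """Handle None values in one column by mean substitution or row removal."""
    present = [r[column] for r in rows if r.get(column) is not None]
    if mode == "drop":
        # Keep only rows that have a value in the column.
        return [r for r in rows if r.get(column) is not None]
    # Assumes at least one non-missing value; mean() raises on an empty list.
    fill = mean(present)
    return [
        {**r, column: fill if r.get(column) is None else r[column]}
        for r in rows
    ]


# Hypothetical sample data with one missing temperature.
readings = [
    {"sensor": "a", "temp": 20.0},
    {"sensor": "b", "temp": None},
    {"sensor": "c", "temp": 24.0},
]
filled = clean_missing(readings, "temp")
dropped = clean_missing(readings, "temp", mode="drop")
```

Mean substitution keeps every row at the cost of inventing a value, while dropping rows keeps only observed data but shrinks the dataset.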

10
Q

Put the following activities in order:
Write Tests
Check Row Counts
Count Activities
Publish Pipeline

A

A) Write tests, publish pipeline, count activities, and check row counts.

This is the correct order.
