Data Ingestion & Transformation Flashcards
Which is a feature of Azure Synapse pipelines?
Monitoring of Spark jobs for data flows
This is a feature of Azure Synapse pipelines.
What is Kusto?
A query language that allows you to interact with data
Kusto is a query language that allows you to interact with data.
What is the definition of “wrangling data flow?”
Utilizing Power Query for code-free data preparation
This is the definition of a wrangling data flow: it uses Power Query to provide code-free data preparation at scale.
What does “shredding JSON” mean?
Parsing data into columns
Yes! Shredding JSON means parsing JSON data into relational columns.
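In T-SQL, one way to shred JSON is the OPENJSON function, which parses a JSON document and returns its properties as columns. A minimal sketch, assuming a hypothetical @orders document:

    -- Shred a JSON array into relational columns with OPENJSON.
    DECLARE @orders NVARCHAR(MAX) = N'[
      { "orderId": 1, "customer": "Contoso",  "total": 120.50 },
      { "orderId": 2, "customer": "Fabrikam", "total": 75.00 }
    ]';

    SELECT orderId, customer, total
    FROM OPENJSON(@orders)
    WITH (
        orderId  INT            '$.orderId',
        customer NVARCHAR(100)  '$.customer',
        total    DECIMAL(10, 2) '$.total'
    );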
You want to create a Spark linked service in Data Factory. What do you need to do to create a Spark cluster?
Nothing; it is automatically created for you just-in-time by Data Factory
To create a Spark cluster in Data Factory, you don’t need to do anything. It is automatically created for you just-in-time by Data Factory.
What are some uses for T-SQL?
A) Perform code-free transformations on data types, or create aggregates.
B) Perform orchestration services, such as creating alerts or monitoring data pipeline activities.
C) Filter or alter data and return the query results as a data table
D) Create tables for results or save datasets.
C) Filter or alter data and return the query results as a data table
D) Create tables for results or save datasets.
Both of these are uses for T-SQL.
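A minimal T-SQL sketch of both uses, assuming a hypothetical Sales.Orders table:

    -- C) Filter or alter data and return the query results as a data table.
    SELECT orderId, customer, total
    FROM Sales.Orders
    WHERE total > 100;

    -- D) Create a table for the results / save the dataset.
    SELECT orderId, customer, total
    INTO Sales.LargeOrders   -- new table created from the query results
    FROM Sales.Orders
    WHERE total > 100;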
Which activity is NOT possible with Azure Data Factory?
A) Data streaming
B) Data movement
C) Control
D) Data transformation
A) Data streaming.
Data streaming is NOT possible with Azure Data Factory; data movement, control, and data transformation activities are.
What is a conditional split?
Routes data rows to particular streams based on specified conditions.
This is the definition of a conditional split, a transformation used in mapping data flows.
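A conditional split is configured visually in the mapping data flow designer rather than written by hand, but the routing idea can be sketched in T-SQL, assuming hypothetical Sales.Orders source and target tables:

    -- Rows that match the condition are routed to one stream/target...
    INSERT INTO Sales.DomesticOrders (orderId, customer, total)
    SELECT orderId, customer, total
    FROM Sales.Orders
    WHERE country = 'US';

    -- ...and all remaining rows are routed to the default stream/target.
    INSERT INTO Sales.InternationalOrders (orderId, customer, total)
    SELECT orderId, customer, total
    FROM Sales.Orders
    WHERE country <> 'US' OR country IS NULL;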
Choose two ways to cleanse data.
A) Use the Clean Missing Data module
B) Spark Cleaner Hive
C) Data wrangling services
D) Mapping data flows
A) Use the Clean Missing Data module
D) Mapping data flows
Both of these are ways to cleanse data.
Put the following activities in order:
Write Tests
Check Row Counts
Count Activities
Publish Pipeline
A) Write tests, publish pipeline, count activities, and check row counts.
This is the correct order.