3. DP-203 Data Integration with Data Factory Flashcards
Much of the functionality of Azure Data Factory appears in Azure Synapse Analytics as a feature called pipelines. You can use this feature to integrate data pipelines between which of the following?
Select all options that apply.
-Spark Pools
-Apache Hive
-SQL Pools
-SQL Serverless
Answer: Spark Pools, SQL Pools, SQL Serverless
Pipelines enable you to integrate data pipelines between SQL Pools, Spark Pools, and SQL Serverless.
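As a rough sketch of what that integration looks like, the following Python dict mirrors the JSON shape Synapse uses for pipeline authoring: a notebook step on a Spark pool chained into a stored-procedure step on a dedicated SQL pool. The pool, notebook, and procedure names are placeholders, and the exact field layout is an approximation of the authoring format, not a verified definition.

    import json

    # A minimal sketch (placeholder names, approximate fields) of a Synapse
    # pipeline that chains a Spark pool notebook into a dedicated SQL pool step.
    pipeline = {
        "name": "IntegrateSparkAndSql",
        "properties": {
            "activities": [
                {
                    "name": "PrepWithSpark",
                    "type": "SynapseNotebook",  # runs a notebook on a Spark pool
                    "typeProperties": {
                        "notebook": {"referenceName": "PrepData", "type": "NotebookReference"},
                        "sparkPool": {"referenceName": "BigDataPool01", "type": "BigDataPoolReference"},
                    },
                },
                {
                    "name": "LoadWithSqlPool",
                    "type": "SqlPoolStoredProcedure",  # runs a proc on a dedicated SQL pool
                    "dependsOn": [{"activity": "PrepWithSpark", "dependencyConditions": ["Succeeded"]}],
                    "typeProperties": {"storedProcedureName": "usp_LoadFact"},  # hypothetical proc
                },
            ]
        },
    }

    print(json.dumps(pipeline, indent=2))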
Which of the following provides a cloud-based data integration service that orchestrates the movement and transformation of data between various data stores and compute resources?
Answer: Azure Data Factory
Although ADF has native functionality to ingest and transform data, it will sometimes instruct another service, such as Azure Databricks, to perform the actual work required on its behalf. Which of the following terms best describes this process?
Answer: Orchestration
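For example, orchestration can look like the following sketch: an ADF Databricks Notebook activity, expressed here as a Python dict in ADF's JSON authoring shape, where ADF merely instructs Databricks to run a notebook. The linked service name and notebook path are hypothetical.

    import json

    # Sketch of a Databricks Notebook activity: ADF orchestrates,
    # Databricks performs the actual compute work.
    databricks_activity = {
        "name": "TransformOnDatabricks",
        "type": "DatabricksNotebook",
        "linkedServiceName": {
            "referenceName": "AzureDatabricksLinkedService",  # placeholder name
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            "notebookPath": "/Shared/transform-sales",  # hypothetical notebook
            "baseParameters": {"run_date": "@pipeline().parameters.runDate"},
        },
    }

    print(json.dumps(databricks_activity, indent=2))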
Which of the following terms describes analyzing past data patterns and trends by looking at historical data and customer insights?
-Descriptive Analytics
-Prescriptive Analytics
-Predictive Analytics
Answer: Descriptive Analytics
Microsoft Azure provides a variety of data platform services that enable you to perform different types of analytics. Predictive analytics can be implemented through which of the following features?
Select all options that apply.
-HDInsight
-Azure Data Lake Storage Gen2
-Machine Learning Services
-Azure Databricks
Answer: HDInsight, Machine Learning Services, Azure Databricks
Data integration includes extraction, transformation, and loading of data. It is commonly referred to as Extract-Transform-Load or ETL.
At which stage in the ETL process is the splitting, combining, deriving, adding, and removing data carried out?
Answer: Transform
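To make the transform stage concrete, here is a small illustrative Python/pandas sketch (the data and column names are invented) showing splitting, combining, deriving, and removing data:

    import pandas as pd

    # Illustrative transform-stage operations on made-up data.
    df = pd.DataFrame({
        "full_name": ["Ada Lovelace", "Alan Turing"],
        "net": [100.0, 80.0],
        "tax": [20.0, 16.0],
    })

    # Split one column into two.
    df[["first_name", "last_name"]] = df["full_name"].str.split(" ", n=1, expand=True)
    # Derive a new column by combining existing ones.
    df["gross"] = df["net"] + df["tax"]
    # Remove columns no longer needed downstream.
    df = df.drop(columns=["full_name", "tax"])

    print(df)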
You are creating a new Azure Data Factory instance. The instance name must be unique within which of the following?
Answer: Globally within Azure
How would you define an Azure Data Factory dataset?
Answer: A dataset is a named view that points to, or references, the data you want to use in your activities as inputs and outputs.
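As a sketch, a dataset definition (shown here as a Python dict in ADF's JSON authoring shape, with placeholder names) points at data through a linked service rather than containing the data itself:

    import json

    # Sketch of a dataset: a named view over a CSV file in Blob storage.
    dataset = {
        "name": "SalesCsvDataset",  # placeholder
        "properties": {
            "type": "DelimitedText",
            "linkedServiceName": {
                "referenceName": "AzureBlobStorageLinkedService",  # placeholder
                "type": "LinkedServiceReference",
            },
            "typeProperties": {
                "location": {
                    "type": "AzureBlobStorageLocation",
                    "container": "raw",
                    "fileName": "sales.csv",
                },
                "firstRowAsHeader": True,
            },
        },
    }

    print(json.dumps(dataset, indent=2))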
How would you define an Azure Data Factory pipeline?
Answer: A pipeline is a logical grouping of activities that together perform a task.
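A minimal sketch of that grouping, as a Python dict in ADF's JSON authoring shape (pipeline name is a placeholder, activity bodies are elided, and the Mapping Data Flow step's type name is an assumption):

    import json

    # Sketch of a pipeline: a named, logical grouping of activities
    # plus optional pipeline-level parameters.
    pipeline = {
        "name": "DailyIngestPipeline",  # placeholder
        "properties": {
            "parameters": {"runDate": {"type": "String"}},
            "activities": [
                {"name": "CopyRawData", "type": "Copy", "typeProperties": {}},  # details elided
                {"name": "CleanRawData", "type": "ExecuteDataFlow", "typeProperties": {},  # details elided
                 "dependsOn": [{"activity": "CopyRawData", "dependencyConditions": ["Succeeded"]}]},
            ],
        },
    }

    print(json.dumps(pipeline, indent=2))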
How would you define an Azure Data Factory ‘activity’?
Answer: An activity represents a single processing step in a pipeline; activities typically contain the transformation logic or the analysis commands of Azure Data Factory’s work.
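For instance, a Copy activity (sketched below as a Python dict in ADF's JSON authoring shape; the dataset names reuse the placeholders from the earlier sketches) is one such step, and its body carries the actual work to perform:

    import json

    # Sketch of a single activity: copy from a Blob dataset to an Azure SQL dataset.
    copy_activity = {
        "name": "CopyBlobToSql",
        "type": "Copy",
        "inputs": [{"referenceName": "SalesCsvDataset", "type": "DatasetReference"}],
        "outputs": [{"referenceName": "SalesSqlDataset", "type": "DatasetReference"}],
        "typeProperties": {
            "source": {"type": "DelimitedTextSource"},
            "sink": {"type": "AzureSqlSink"},
        },
    }

    print(json.dumps(copy_activity, indent=2))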
What are the three categories of activities within Azure Data Factory that define the actions to be performed on the data?
Select all options that apply.
-Data movement
-Linked Service
-Data transformation
-Control
Answer: Data movement, Data transformation, Control
-Data movement
Data movement activities copy data from a source data store to a sink data store; the Copy Activity is the primary example.
-Data transformation
Data transformation activities can be performed natively within the authoring tool of Azure Data Factory using the Mapping Data Flow. Alternatively, you can call a compute resource to change or enhance data through transformation or perform analysis of the data.
-Control
You can use the control flow to orchestrate pipeline activities, including chaining activities in a sequence, branching, defining parameters at the pipeline level, and passing arguments while invoking the pipeline on-demand or from a trigger.
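As a sketch of chaining and branching (a Python dict in ADF's JSON authoring shape; the names are invented and some activity bodies are elided), control activities sequence other activities via dependsOn and can branch with an If Condition:

    import json

    # Sketch of control flow: chain a copy step into an If Condition branch.
    activities = [
        {"name": "CopyStep", "type": "Copy", "typeProperties": {}},  # details elided
        {
            "name": "BranchOnRowCount",
            "type": "IfCondition",
            # Runs only after CopyStep succeeds (chaining via dependsOn).
            "dependsOn": [{"activity": "CopyStep", "dependencyConditions": ["Succeeded"]}],
            "typeProperties": {
                "expression": {
                    "value": "@greater(activity('CopyStep').output.rowsCopied, 0)",
                    "type": "Expression",
                },
                "ifTrueActivities": [
                    {"name": "NotifySuccess", "type": "WebActivity", "typeProperties": {}}  # details elided
                ],
                "ifFalseActivities": [],
            },
        },
    ]

    print(json.dumps(activities, indent=2))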
When graphically authoring ADF solutions, you can use the control flow within the design to orchestrate which of the following pipeline activities?
Select all options that apply.
-Execute Pipeline Activity
-WebActivity
-ForEach Activity
-Parameters Activity
Answer: Execute Pipeline Activity, WebActivity, ForEach Activity
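A sketch combining all three (a Python dict in ADF's JSON authoring shape; the child pipeline name and URL are placeholders): a ForEach that invokes a child pipeline per item, followed by a WebActivity call:

    import json

    # Sketch: ForEach iterates a parameter list, invoking a child pipeline
    # per item; a WebActivity then posts a completion notification.
    activities = [
        {
            "name": "ForEachFile",
            "type": "ForEach",
            "typeProperties": {
                "items": {"value": "@pipeline().parameters.fileList", "type": "Expression"},
                "activities": [
                    {
                        "name": "RunChildPipeline",
                        "type": "ExecutePipeline",
                        "typeProperties": {
                            "pipeline": {"referenceName": "ProcessOneFile", "type": "PipelineReference"},  # placeholder
                            "parameters": {"fileName": "@item()"},
                        },
                    }
                ],
            },
        },
        {
            "name": "NotifyDone",
            "type": "WebActivity",
            "dependsOn": [{"activity": "ForEachFile", "dependencyConditions": ["Succeeded"]}],
            "typeProperties": {"url": "https://example.com/notify", "method": "POST", "body": {"status": "done"}},
        },
    ]

    print(json.dumps(activities, indent=2))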
Which of the following processes will allow data to be extracted and loaded in its native format?
Select all options that apply.
-ELTL
-ETL
-ELT
-ETLL
Answer: ELTL, ELT
In both ELT and ELTL, data is extracted and loaded into the destination store in its native format before any transformation takes place.
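As a sketch of loading in native format (a Python dict in ADF's JSON authoring shape; the dataset names are placeholders), a binary copy moves files as-is into the data lake, leaving transformation for a later step:

    import json

    # Sketch of the extract-and-load part of ELT: a binary copy that
    # preserves the files' native format; transformation happens later,
    # inside the destination store.
    copy_native = {
        "name": "LoadRawFiles",
        "type": "Copy",
        "inputs": [{"referenceName": "SourceBinaryDataset", "type": "DatasetReference"}],    # placeholder
        "outputs": [{"referenceName": "DataLakeBinaryDataset", "type": "DatasetReference"}], # placeholder
        "typeProperties": {
            "source": {"type": "BinarySource"},  # read bytes as-is
            "sink": {"type": "BinarySink"},      # write bytes as-is, no format conversion
        },
    }

    print(json.dumps(copy_native, indent=2))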
Pipelines in Azure Data Factory typically perform four distinct steps. Identify these steps.
Select all options that apply.
-Connect and Collect
-Publish
-Transform and Enrich
-Monitor
-Data Analysis
Answer: Connect and Collect, Transform and Enrich, Publish, Monitor
To create and manage child resources in the Azure portal, including publishing, which Data Factory role must you belong to at the resource group level or above?
-Data Factory User
-Data Factory Contributor
-Data Factory Writer
-Data Factory Reader
Answer: Data Factory Contributor