Azure Data Factory Flashcards

1
Q

What does Azure Data Factory do?

A

It is a cloud data integration service that moves data into stores such as Azure Data Lake Storage; from there you can analyze, transform, and publish the organized data. You can also run Apache Spark or Hadoop against that data.

2
Q

What does a typical orchestration look like for Azure Data Factory?

A

Input dataset -> Pipeline -> Output dataset -> Linked service -> (Azure Data Lake, Blob Storage, SQL)

3
Q

What is an Azure Data Factory linked service?

A

It contains the information needed to connect to an external data source (much like a SQL connection string).
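
As a rough illustration of the card above, a linked service can be pictured as a small JSON document that names a connector type and carries its connection details. The sketch below builds one as a plain Python dictionary. The helper `make_linked_service`, the name `ExampleAzureSqlLinkedService`, and the placeholder connection string are assumptions made for illustration; only the general `name` / `properties` / `type` / `typeProperties` layout is meant to mirror the JSON shape Azure Data Factory uses.

```python
import json


def make_linked_service(name: str, service_type: str, type_properties: dict) -> dict:
    """Build a linked-service definition as a plain dictionary.

    Hypothetical helper: it only assembles the JSON shape and does not
    talk to Azure.
    """
    return {
        "name": name,
        "properties": {
            "type": service_type,
            "typeProperties": type_properties,
        },
    }


# Placeholder values; a real connection string would normally come from
# a secret store rather than being written inline.
sql_linked_service = make_linked_service(
    name="ExampleAzureSqlLinkedService",
    service_type="AzureSqlDatabase",
    type_properties={"connectionString": "<your-connection-string>"},
)

print(json.dumps(sql_linked_service, indent=2))
```

Swapping `service_type` and `type_properties` is all that changes between connectors in this sketch, which is why the later cards can answer "use a linked service" for SQL, SFTP, Cosmos DB, and REST alike.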

4
Q

What is an Azure Data Factory Gateway?

A
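  1. It connects your on-prem environment to the Azure cloud.
  2. It consists of a client agent that is installed on-prem and then connects to Azure Data Factory. In current versions of Azure Data Factory this role is filled by the self-hosted integration runtime.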
5
Q

What does Azure Data Factory help us perform?

A

Orchestrating the moving, transforming, and loading of data.

6
Q

What methods can we use to build Azure Data Factory pipelines?

A

CLI
API
PowerShell
CI/CD
Portal

7
Q

What is an Azure Data Factory Pipeline?

A

It is a series of tasks like copying, transforming, and storing the data.
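
To make "a series of tasks" concrete, here is a minimal sketch of a pipeline holding a single Copy activity, again written as a Python dictionary in the JSON shape Azure Data Factory uses for pipeline definitions. The function `make_copy_pipeline`, the activity name, and the dataset names are hypothetical, and the `BlobSource` / `SqlSink` types are just one example pairing; the right source and sink types depend on the datasets involved.

```python
import json


def make_copy_pipeline(name: str, input_dataset: str, output_dataset: str) -> dict:
    """Build a pipeline definition with one Copy activity (illustrative only)."""
    copy_activity = {
        "name": "CopyInputToOutput",  # hypothetical activity name
        "type": "Copy",
        # Activities refer to datasets by name; the datasets in turn
        # point at linked services.
        "inputs": [{"referenceName": input_dataset, "type": "DatasetReference"}],
        "outputs": [{"referenceName": output_dataset, "type": "DatasetReference"}],
        "typeProperties": {
            "source": {"type": "BlobSource"},  # assumed example source type
            "sink": {"type": "SqlSink"},       # assumed example sink type
        },
    }
    return {"name": name, "properties": {"activities": [copy_activity]}}


pipeline = make_copy_pipeline(
    "ExampleCopyPipeline", "ExampleInputDataset", "ExampleOutputDataset"
)
print(json.dumps(pipeline, indent=2))
```

Further tasks (for example a transformation step) would be additional entries in the `activities` list.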

8
Q

I want a code-based method to create Azure Data Factory pipelines. What are my options? Explain.

A

Keep the pipeline definitions as code in a Git repository such as GitHub or Azure DevOps, and use CI/CD to create and deploy your Azure Data Factory pipelines from it.
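
One way to picture "holding the code" in a repository: each pipeline definition is saved as a JSON file that can be reviewed and versioned like any other source file. The sketch below writes a definition under a `pipeline/` folder and reads it back. The helper `save_pipeline`, the example pipeline, and the `pipeline/` folder name are assumptions for illustration; check the layout your own repository uses before relying on it.

```python
import json
import tempfile
from pathlib import Path


def save_pipeline(repo_root: Path, pipeline: dict) -> Path:
    """Write a pipeline definition to <repo_root>/pipeline/<name>.json."""
    folder = repo_root / "pipeline"  # assumed folder name
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{pipeline['name']}.json"
    # Sorted keys and a trailing newline keep diffs small and stable.
    text = json.dumps(pipeline, indent=2, sort_keys=True) + "\n"
    path.write_text(text, encoding="utf-8")
    return path


# Hypothetical, empty pipeline used only to show the round trip.
example_pipeline = {"name": "ExampleCopyPipeline", "properties": {"activities": []}}

# A temporary directory stands in for a local clone of the repository.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_pipeline(Path(tmp), example_pipeline)
    round_tripped = json.loads(saved.read_text(encoding="utf-8"))
    relative_path = saved.relative_to(tmp).as_posix()

print(relative_path)
```

A CI/CD job in GitHub or Azure DevOps would then pick up the committed files and deploy them; that deployment step is outside this sketch.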

9
Q

How would you connect with an on-prem SQL in Azure Data Factory?

A

Use a linked service; SQL Server is a supported type. For an on-prem server, the linked service connects through a self-hosted integration runtime installed on-prem.
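
For an on-prem source, the linked service definition usually also names the integration runtime that can reach the server. A minimal sketch, assuming a self-hosted integration runtime has already been registered under the hypothetical name `ExampleSelfHostedIR`; the helper `with_integration_runtime` and all other names and values are placeholders.

```python
import copy
import json


def with_integration_runtime(linked_service: dict, runtime_name: str) -> dict:
    """Return a copy of the linked service routed through the named runtime."""
    result = copy.deepcopy(linked_service)  # leave the input untouched
    result["properties"]["connectVia"] = {
        "referenceName": runtime_name,
        "type": "IntegrationRuntimeReference",
    }
    return result


# Placeholder definition for an on-prem SQL Server.
on_prem_sql = {
    "name": "ExampleOnPremSqlLinkedService",
    "properties": {
        "type": "SqlServer",
        "typeProperties": {"connectionString": "<your-connection-string>"},
    },
}

routed = with_integration_runtime(on_prem_sql, "ExampleSelfHostedIR")
print(json.dumps(routed, indent=2))
```

The same `connectVia` idea applies to other on-prem sources, such as an SFTP server or a file share.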

10
Q

How would you connect with an on-prem SFTP using Azure Data Factory?

A

Use a linked service in Azure Data Factory; it supports SFTP.

11
Q

How would you connect with a Cosmos DB database using Azure Data Factory?

A

Use a linked service in Azure Data Factory; it supports Cosmos DB.

12
Q

How would you connect with a REST API in Azure Data Factory?

A

Use a linked service in Azure Data Factory; it supports REST APIs.

13
Q

List the supported linked service types for Azure Data Factory.

A

Azure Blob Storage
Azure Data Lake Storage Gen1 and Gen2
Azure SQL Database
Azure Synapse Analytics (formerly SQL Data Warehouse)
Azure Cosmos DB
Amazon S3
Amazon Redshift
Google BigQuery
Oracle Database
SQL Server
MySQL
PostgreSQL
SAP HANA
Salesforce
REST
SFTP
File System

14
Q

Describe a linked service in Azure Data Factory.

A

It connects to external data like file systems, SQL, and SAP and is used to pull datasets into Azure Data Factory.

15
Q

What can we use to trigger a pipeline in Azure Data Factory?

A

Schedule Trigger: This allows you to run pipelines on a recurring schedule.

Tumbling Window Trigger: Useful for time-based workflows, executing pipelines at periodic time intervals.

Event-based Trigger: Responds to events, such as file creation or deletion in Azure Blob Storage.

Manual Trigger: Allows you to start a pipeline run on-demand.

Custom Events Trigger: Reacts to custom events published to an Azure Event Grid topic.

Storage Event Trigger: Responds to specific Azure Blob Storage or Azure Data Lake Storage Gen2 events.

REST API: Programmatically trigger pipelines using the ADF REST API.

PowerShell: Use Azure PowerShell cmdlets to trigger pipeline runs.

Azure CLI: Trigger pipelines using Azure Command-Line Interface commands.
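
Two of the options above can be sketched in a few lines: a schedule trigger definition, and the URL that the REST API's "create run" operation is posted to. Both helpers (`make_schedule_trigger`, `create_run_url`) and every name and ID passed to them are hypothetical. The API version shown is the one commonly used for Azure Data Factory at the time of writing, so treat it as an assumption. No request is sent; a real call would also need an Azure AD bearer token.

```python
import json
from urllib.parse import quote

API_VERSION = "2018-06-01"  # assumed ADF management API version


def make_schedule_trigger(name, pipeline_name, frequency, interval, start_time):
    """Build a schedule-trigger definition that runs one pipeline."""
    return {
        "name": name,
        "properties": {
            "type": "ScheduleTrigger",
            "typeProperties": {
                "recurrence": {
                    "frequency": frequency,  # e.g. "Hour" or "Day"
                    "interval": interval,
                    "startTime": start_time,
                    "timeZone": "UTC",
                }
            },
            "pipelines": [
                {
                    "pipelineReference": {
                        "referenceName": pipeline_name,
                        "type": "PipelineReference",
                    }
                }
            ],
        },
    }


def create_run_url(subscription_id, resource_group, factory_name, pipeline_name):
    """Build the management URL that a pipeline run is POSTed to."""
    return (
        "https://management.azure.com"
        f"/subscriptions/{quote(subscription_id, safe='')}"
        f"/resourceGroups/{quote(resource_group, safe='')}"
        "/providers/Microsoft.DataFactory"
        f"/factories/{quote(factory_name, safe='')}"
        f"/pipelines/{quote(pipeline_name, safe='')}"
        f"/createRun?api-version={API_VERSION}"
    )


trigger = make_schedule_trigger(
    "ExampleHourlyTrigger", "ExampleCopyPipeline", "Hour", 1, "2024-01-01T00:00:00Z"
)
run_url = create_run_url("sub-id", "rg", "factory", "ExampleCopyPipeline")

print(json.dumps(trigger, indent=2))
print(run_url)
```

The trigger covers the recurring case, while the URL is what a manual, PowerShell, or CLI run ultimately resolves to behind the scenes.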

16
Q

I require the ability to self host my

A
17
Q

What is an integration

A
18
Q
A
19
Q

What is an Integration Runtime for Azure Data Factory?

A

It is the compute infrastructure Azure Data Factory uses to perform pipeline actions such as data flow, transformation, and data movement.

20
Q

List the types of Integration Runtime for Azure Data Factory.

A

Azure (Azure-managed)
Self-hosted
Azure-SSIS

21
Q

What must we have from an auth perspective when using Azure Data Factory to access other Azure services? This should be the least-managed type.

A

Managed Identity

The Azure Data Factory managed identity must be granted authorization to access the services needed.
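
With a managed identity, the linked service definition typically carries only the service endpoint and no key or password; access is granted separately by assigning the factory's identity a role on the target service. The sketch below shows that shape for Blob Storage and adds a small check that no secret-like field slipped in. The linked service name, the endpoint placeholder, the helper `has_embedded_secret`, and its list of field names are all assumptions for illustration.

```python
import json

# Field names that would indicate an inline credential (assumed list).
SECRET_FIELDS = {"connectionString", "accountKey", "sasUri", "password"}

# Placeholder definition: only the endpoint is given, so the factory's
# managed identity is what authenticates the connection.
blob_via_managed_identity = {
    "name": "ExampleBlobLinkedService",
    "properties": {
        "type": "AzureBlobStorage",
        "typeProperties": {
            "serviceEndpoint": "https://<account>.blob.core.windows.net/"
        },
    },
}


def has_embedded_secret(linked_service: dict) -> bool:
    """Return True if the definition contains a secret-like field."""
    type_properties = linked_service["properties"]["typeProperties"]
    return any(field in type_properties for field in SECRET_FIELDS)


print(json.dumps(blob_via_managed_identity, indent=2))
print(has_embedded_secret(blob_via_managed_identity))
```

A check like this could run in CI to flag definitions that fall back to stored credentials.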

22
Q

What types of destinations can I send Azure Data Factory datasets to?

A

Azure Data Storage:

Azure Blob Storage
Azure Data Lake Storage Gen1 and Gen2
Azure Files

Azure Databases:

Azure SQL Database
Azure Synapse Analytics (formerly SQL Data Warehouse)
Azure Database for MySQL
Azure Database for PostgreSQL
Azure Database for MariaDB
Azure Cosmos DB

Other Microsoft Services:

Microsoft Dynamics 365
Power BI

Relational Databases:

SQL Server (on-premises or on Azure VMs)
Oracle Database
IBM DB2

NoSQL Databases:

MongoDB (on-premises or Azure Cosmos DB’s API for MongoDB)

File Systems:

HDFS (Hadoop Distributed File System)

Generic Protocols:

OData
ODBC

Analytics Platforms:

Azure Databricks
Azure HDInsight (Hadoop, Spark, etc.)

SaaS Applications:

Salesforce
SAP HANA