Data Factory Flashcards
What are the three types of activities available in Data Factory?
data movement activities, data transformation activities, and control activities
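As a rough illustration of the three categories, the sketch below sorts a handful of activity `type` strings, as they appear in pipeline JSON, into movement, transformation, and control. The mapping is a small hand-picked sample rather than a complete list, and the pipeline and activity names are hypothetical.

```python
# Small, hand-picked sample of activity "type" values; not an exhaustive list.
ACTIVITY_CATEGORIES = {
    "Copy": "data movement",
    "ExecuteDataFlow": "data transformation",
    "DatabricksNotebook": "data transformation",
    "HDInsightHive": "data transformation",
    "ForEach": "control",
    "IfCondition": "control",
    "Wait": "control",
    "ExecutePipeline": "control",
}


def categorize(activity_type: str) -> str:
    """Return the category for an activity type, or 'unknown' if not in the sample."""
    return ACTIVITY_CATEGORIES.get(activity_type, "unknown")


# Hypothetical pipeline skeleton: only the fields needed for the illustration.
pipeline = {
    "name": "ExamplePipeline",
    "properties": {
        "activities": [
            {"name": "CopyFiles", "type": "Copy"},
            {"name": "CleanData", "type": "ExecuteDataFlow"},
            {"name": "LoopOverFolders", "type": "ForEach"},
        ]
    },
}

# Map each activity name to its category.
summary = {
    activity["name"]: categorize(activity["type"])
    for activity in pipeline["properties"]["activities"]
}

for name, category in summary.items():
    print(f"{name}: {category}")
```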
What is the name of the runtime environment that runs SSIS packages on Azure?
Azure-SSIS IR (integration runtime)
SSIS = SQL Server Integration Services
What are the main tasks done by Azure Data Factory pipelines (data-driven workflows)?
Connect and collect
Transform and enrich - Data Flow
Publish
Monitor
What is the name of a fast and highly scalable data exploration service?
Azure Data Explorer
You have an Azure virtual machine named VM1 that runs Windows Server 2019 and contains 500 GB of data files.
You are designing a solution that will use Azure Data Factory to transform the data files, and then load the files to Azure Data Lake Storage.
What should you deploy on VM1 to support the design?
The self-hosted integration runtime, installed on VM1
The integration runtime (IR) is the compute infrastructure that Azure Data Factory uses to provide data-integration capabilities across different network environments. For details about IR, see Integration runtime overview.
A self-hosted integration runtime can run copy activities between a cloud data store and a data store in a private network. It can also dispatch transform activities against compute resources in an on-premises network or an Azure virtual network. Installing a self-hosted integration runtime requires an on-premises machine or a virtual machine inside a private network.
Reference:
https://docs.microsoft.com/en-us/azure/data-factory/create-self-hosted-integration-runtime
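To make the VM1 scenario more concrete, the following sketch builds the JSON for a file-system linked service that is routed through a self-hosted integration runtime by its `connectVia` reference. The linked service name, the runtime name `SelfHostedIR-VM1`, the folder `D:\data`, and the helper function are all hypothetical, and the credential values are placeholders; treat it as an illustration of the shape of the definition, not a deployment-ready configuration.

```python
import json


def build_file_server_linked_service(name: str, host: str, user_id: str, ir_name: str) -> dict:
    """Build a linked service definition that connects through a named integration runtime.

    All argument values are supplied by the caller; nothing here contacts Azure.
    """
    return {
        "name": name,
        "properties": {
            "type": "FileServer",
            "typeProperties": {
                "host": host,
                "userId": user_id,
                # Placeholder only; real secrets should not be stored inline.
                "password": {"type": "SecureString", "value": "<password>"},
            },
            # Routes connections for this data store through the self-hosted IR.
            "connectVia": {
                "referenceName": ir_name,
                "type": "IntegrationRuntimeReference",
            },
        },
    }


# Hypothetical names for the VM1 scenario.
linked_service = build_file_server_linked_service(
    name="Vm1FileSystem",
    host="D:\\data",
    user_id="<domain>\\<user>",
    ir_name="SelfHostedIR-VM1",
)

print(json.dumps(linked_service, indent=2))
```

A copy activity whose source dataset uses this linked service would then read the files through the runtime installed on VM1, while the sink (Azure Data Lake Storage) can use a cloud-hosted runtime.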