Ch. 1 - Get started with data engineering on Azure Flashcards

Question

What is a SQL Dedicated Pool?

Answer 1

Enterprise-scale relational database instances used to host data warehouses in which data is stored in relational tables.

Answer 2

Enables querying of various serverless SQL pool formats (CSVs, JSON, and Parquet files) SELECT TOP 100 * FROM OPENROWSET( BULK 'https://mydatalake.blob.core.windows.net/data/files/*.csv', FORMAT = 'csv', PARSER_VERSION = '2.0', FIRSTROW = 2) AS rows The PARSER_VERSION is used to determine how the query interprets the text encoding used in the files. Version 1.0 is the default and supports a wide range of file encodings, while version 2.0 supports fewer encodings but offers better performance. The FIRSTROW parameter is used to skip rows in the text file, to eliminate any unstructured preamble text or to ignore a row containing column headings. https://learn.microsoft.com/en-us/training/modules/query-data-lake-using-azure-synapse-serverless-sql-pools/3-query-files

Answer 3

Used to store unstructured data such as videos, audio, metadata, log files, text, and binary. It can be accessed via Representation State Transfer (REST) - one way of implementing web service endpoints.

Answer 4

Means that all directories within a storage account will contain metadata describing their structure and content which improves operations such as moving an entire directory structure to a different parent directory, as only the metadata will need to be updated.

Answer 5

This makes operations such as renaming and deleting directories atomic and quick. For example, if you have 100 files under a directory in Blob Storage, renaming that directory would require 100 metadata operations.

Answer 6

Azure Files offers fully managed file shares in the cloud that are accessible via the industry standard Server Message Block (SMB) protocol, Network File System (NFS) protocol, and Azure Files REST API. Azure file shares can be mounted concurrently by cloud or on-premises deployments. SMB Azure file shares are accessible from Windows, Linux, and macOS clients. File shares are provisioned via Azure Storage Account V2 or premium File Share Tier.

Answer 7

Used to store a large number of messages that can be accessed asynchronously between the source and the destination. Asynchronously means a sender can add a message to the queue and the receiver can retrieve it later when ready

Answer 8

Storage queues can be used for simple asynchronous message processing. They can store up to 500 TB of data (per storage account) and each message can be up to 64 KB in size You access messages from anywhere in the world via authenticated calls using HTTP or HTTPS. A queue may contain millions of messages, up to the total capacity limit of a storage account. Queues are commonly used to create a backlog of work to process asynchronously.

Answer 9

A fully managed enterprise messaging service used to reliably transfer business data (sales, purchases, inventory movements, journals) among decoupled systems through Azure Queues.

Answer 10

Azure tables refer to key-value stores for structured, non-relational data,. Azure Table Storage = cost-effective basic option Azure Cosmos DB (the premium service) offering advanced features like global distribution, flexible consistency models, and serverless compute

Answer 11

Virtual hard disks that are mounted to an Azure VM, available standard HDD, standard SSD, premium SSD, and ultra disks.

Answer 12

Ties all resources such as VMs, storage accounts, and databases together securely in a private network. Provides 4 main services: Security: provides secure connectivity within azure, using basic VNet, VNet Peering, and Service endpoints. Networking: Provides networking beyond the Azure Cloud and into the internet and hybrid clouds using express routes, private endpoints, and point-to-site and site-to-site VPNs. Filtering: Provides networking filtering/firewalls that can implemented either via network or app security groups. Routing: provides network routing abilities that allow configuration network routes using route tables and Border Gateway Protocol.

Answer 13

Azure Synapse is Microsoft’s unified analytics platform that integrates data ingestion, data warehousing, and big data analytics in one environment, enabling teams to ingest, transform, and analyze data at scale

Ch. 1 - Get started with data engineering on Azure Flashcards

(38 cards)