csv Flashcards
What is an analytics workload on Azure?
It refers to processing and analyzing large volumes of data using Azure’s cloud-based tools and services to gain insights and drive business decisions.
What are the common elements of a modern data warehouse?
They include data ingestion, data storage, data processing, and data visualization.
What should be considered for data ingestion and processing?
Factors like data volume, velocity, variety, and veracity, along with the choice of tools and services for efficient and reliable data flow.
What are some options for analytical data stores?
Options include data lakes, data warehouses, and operational databases optimized for analytical queries.
What is Azure Synapse Analytics?
A comprehensive analytics service that brings together big data and data warehousing, allowing seamless integration and analysis of large data volumes.
What is Azure Databricks?
An Apache Spark-based analytics platform optimized for Azure, providing collaborative data engineering and data science workflows.
What is Azure HDInsight?
A fully managed Azure cloud service for processing massive amounts of data with popular open-source frameworks such as Hadoop, Spark, and Kafka.
What is Azure Data Factory?
A cloud-based data integration service that allows you to create data-driven workflows for orchestrating data movement and transforming data at scale.
What is the difference between batch and streaming data?
Batch data is collected and processed in large volumes at scheduled intervals, while streaming data is processed continuously, in near real time, as each event arrives.
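A minimal Python sketch of the distinction, using made-up readings and hypothetical function names: the batch function needs the full dataset before it runs, while the streaming function emits an updated result as each record arrives.

```python
from typing import Iterable, Iterator, List


def batch_average(readings: List[float]) -> float:
    """Batch: the whole dataset is available before processing starts."""
    return sum(readings) / len(readings)


def streaming_average(readings: Iterable[float]) -> Iterator[float]:
    """Streaming: emit an updated running average as each record arrives."""
    total = 0.0
    count = 0
    for value in readings:
        total += value
        count += 1
        yield total / count


readings = [10.0, 20.0, 30.0]  # illustrative values only
print(batch_average(readings))            # one result after all data is collected
print(list(streaming_average(readings)))  # one result per arriving record
```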
What is Azure Stream Analytics?
A real-time analytics service that processes and analyzes streaming data from various sources as it arrives.
What is Azure Synapse Data Explorer?
A fast and scalable data exploration service for interactive data analysis and real-time analytics.
What is Spark structured streaming?
A scalable and fault-tolerant stream processing engine built on Apache Spark, allowing for real-time data processing.
What are the capabilities of Power BI?
Power BI provides data visualization, business intelligence, and interactive reports and dashboards.
What are the features of data models in Power BI?
Features include relationships between tables, calculated columns and measures, hierarchies, and data refresh.
What are appropriate visualizations for data in Power BI?
Visualizations include bar charts, line charts, pie charts, scatter plots, maps, and more, depending on the data and the insights needed.
What is the role of Azure Data Lake Storage in a modern data warehouse?
It provides scalable, cost-effective storage for large volumes of data in its native format.
What are data ingestion methods in Azure Data Factory?
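Data Factory ingests data in batches, typically with the Copy activity in a pipeline run on a schedule or event trigger; mapping data flows transform data at scale rather than ingest it in real time.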
What are some use cases for Azure Synapse Analytics?
Use cases include data warehousing, big data processing, data integration, and machine learning.
How does Azure Databricks integrate with other Azure services?
It integrates with Azure Data Lake Storage, Azure Synapse Analytics, and Azure Machine Learning to provide a comprehensive data solution.
What is the benefit of using Azure HDInsight for big data processing?
It offers a fully managed, open-source analytics service that supports Hadoop, Spark, and other frameworks.
What are the main components of Azure Stream Analytics?
The main components are input (data sources), query (SQL-like language for stream processing), and output (data sinks).
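The input, query, output flow can be sketched in plain Python. This is not the Stream Analytics query language or an Azure SDK; the event shape, the 10-second tumbling window, and every name here are assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Event = Tuple[int, str, float]  # (timestamp in seconds, device id, temperature)


def tumbling_window_avg(
    events: Iterable[Event], window_seconds: int
) -> Dict[Tuple[int, str], float]:
    """Query step: average temperature per device per fixed, non-overlapping window."""
    buckets: Dict[Tuple[int, str], List[float]] = defaultdict(list)
    for timestamp, device, temperature in events:
        # Align each event to the start of the window it falls in.
        window_start = (timestamp // window_seconds) * window_seconds
        buckets[(window_start, device)].append(temperature)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


# Input: a stand-in for events read from a data source (hypothetical values).
events = [(1, "a", 20.0), (4, "a", 22.0), (12, "a", 30.0), (3, "b", 18.0)]

# Output: a stand-in for writing results to a data sink.
for (window_start, device), avg in sorted(tumbling_window_avg(events, 10).items()):
    print(window_start, device, avg)
```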
What is the difference between Azure Data Lake Storage Gen1 and Gen2?
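Gen2 is built on Azure Blob Storage and adds a hierarchical namespace, combining file-system semantics and directory-level security with Blob Storage's scale, cost, and integration with other Azure services; Gen1 is a separate, older service.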
What is a Power BI dashboard?
A Power BI dashboard is a single page, often called a canvas, that uses visualizations to tell a story.
What are the key differences between Azure SQL Database and Azure Synapse Analytics?
Azure SQL Database is optimized for transactional workloads, while Azure Synapse Analytics is optimized for analytical workloads.
What is a calculated column in Power BI?
A calculated column is a column that you add to an existing table in Power BI using a DAX formula.
What is the purpose of a data gateway in Power BI?
A data gateway acts as a bridge to provide quick and secure data transfer between on-premises data and Power BI.
What is the role of Azure Data Factory in ETL processes?
Azure Data Factory orchestrates and automates the movement and transformation of data across various data stores and services.
What is the function of the PolyBase feature in Azure Synapse Analytics?
PolyBase allows you to query data from external data sources like Azure Blob Storage and Azure Data Lake Storage using T-SQL.
What is the difference between hot and cold paths in a Lambda architecture?
Hot paths process data as it arrives to give low-latency results, while cold paths batch-process large volumes of stored data at higher latency to give more complete results.
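A rough Python sketch of the two paths, assuming a page-view counting workload. The function names, the class, and the sample events are hypothetical, and a real Lambda architecture would use separate batch and stream processing services rather than in-memory counters.

```python
from collections import Counter
from typing import Iterable


def cold_path(master_dataset: Iterable[str]) -> Counter:
    """Cold path: recompute a view from all stored events (higher latency)."""
    return Counter(master_dataset)


class HotPath:
    """Hot path: update an incremental view per event (low latency)."""

    def __init__(self) -> None:
        self.view: Counter = Counter()

    def on_event(self, page: str) -> None:
        self.view[page] += 1


def serve(batch_view: Counter, realtime_view: Counter) -> Counter:
    """Serving step: combine the batch view with events not yet in a batch run."""
    return batch_view + realtime_view


stored = ["home", "home", "pricing"]  # already covered by the last batch run
recent = ["home", "docs"]             # arrived after the last batch run

hot = HotPath()
for page in recent:
    hot.on_event(page)

print(serve(cold_path(stored), hot.view))
```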
What are some benefits of using Power BI Service?
Benefits include collaboration and sharing, automatic data refresh, and integration with other Microsoft services.
What is an example of a real-time data analytics scenario?
Monitoring IoT sensor data in real time to detect anomalies and trigger alerts.
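One simple way to sketch this scenario in Python is a rolling-window check that flags a reading when it deviates from the recent mean by more than a few standard deviations. The window size, the threshold, the function name, and the sample readings are all assumptions; a production system would more likely rely on a streaming service's built-in anomaly detection.

```python
from collections import deque
from statistics import mean, stdev
from typing import Deque, Iterable, Iterator, Tuple


def detect_anomalies(
    readings: Iterable[float], window: int = 5, threshold: float = 3.0
) -> Iterator[Tuple[int, float]]:
    """Yield (index, value) for readings far from the mean of the previous `window` values."""
    recent: Deque[float] = deque(maxlen=window)
    for index, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = stdev(recent)
            # Skip the check when the window has no variation.
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield index, value  # a real pipeline would raise an alert here
        recent.append(value)


sensor_values = [20.0, 20.5, 19.5, 20.2, 19.8, 35.0, 20.1]  # illustrative values only
for index, value in detect_anomalies(sensor_values):
    print(f"anomaly at reading {index}: {value}")
```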