Instructor's Method - 6/14/2021 Flashcards

1
Q

Relational Storage

A

Options are

  • SQL (Azure SQL, Managed Instance)
  • MPP (Dedicated SQL Pool)
2
Q

Platform as a Service

A

I need a database, but I don’t want to manage the underlying infrastructure.

3
Q

Azure SQL

A

Platform-as-a-Service

For SQL Server, MySQL, PostgreSQL

There are differences between Azure SQL and on-prem SQL Server

For a new project, the instructor recommends Azure SQL: cheaper and much more scalable

4
Q

Database on VM

A

IaaS

SQL Server, Oracle

5
Q

Azure SQL Managed Instance

A

Platform-as-a-service

Dedicated infrastructure in Azure datacenter

~100% compatible with on-prem SQL Server

Use it to lift and shift apps that use SQL Server into the cloud

6
Q

Symmetric Multiprocessing (SMP) Architecture

A

DB2, Oracle, SQL Server

7
Q

MPP architecture

A

Massively Parallel Processing Architecture

Dedicated SQL Pool (formerly known as Azure SQL Data Warehouse)
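
To make the MPP idea concrete, here is a minimal Python sketch of the general pattern: rows are hash-distributed across nodes, each node aggregates its own slice independently, and a control step merges the partial results. It is an illustration of the concept only, not how Dedicated SQL Pool is implemented. The node count, the `sales` rows, and every function name are hypothetical.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

NUM_NODES = 4  # hypothetical node count, chosen only for illustration


def distribute(rows, key, num_nodes=NUM_NODES):
    """Hash-distribute rows so each node owns a disjoint slice of the data."""
    shards = [[] for _ in range(num_nodes)]
    for row in rows:
        # crc32 is deterministic across runs, unlike Python's built-in hash()
        node = zlib.crc32(str(row[key]).encode()) % num_nodes
        shards[node].append(row)
    return shards


def local_sum(shard):
    """Work done independently on one node: partial totals per region."""
    totals = defaultdict(int)
    for row in shard:
        totals[row["region"]] += row["amount"]
    return dict(totals)


def mpp_sum(rows):
    """Control step: fan the query out to all nodes, then merge the partials."""
    shards = distribute(rows, key="order_id")
    with ThreadPoolExecutor(max_workers=NUM_NODES) as pool:
        partials = list(pool.map(local_sum, shards))
    merged = defaultdict(int)
    for partial in partials:
        for region, total in partial.items():
            merged[region] += total
    return dict(merged)


# Hypothetical sample data
sales = [
    {"order_id": 1, "region": "east", "amount": 10},
    {"order_id": 2, "region": "west", "amount": 20},
    {"order_id": 3, "region": "east", "amount": 5},
    {"order_id": 4, "region": "west", "amount": 7},
]
result = mpp_sum(sales)
```

In an SMP system the same query would run on one machine sharing CPU, memory, and disk; the sketch differs only in that each shard could live on separate hardware.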

8
Q

Non-relational Storage

A

Azure Cosmos DB

Azure Storage

Azure Data Lake Store

9
Q

Azure Storage

A

Object storage

Cheaper service

Not WebHDFS-compatible (so not a native fit for Hadoop workloads)

Lots of features

10
Q

Azure Data Lake Gen1

A

Object storage

WebHDFS-compatible (works with Hadoop workloads)

Far fewer features, but faster

11
Q

Azure Data Lake Gen2

A

Combination of Data Lake Gen1 + Storage

Faster, cheaper, WebHDFS-compatible

12
Q

Batch Processing

A

Azure Databricks

Azure HDInsight

Azure Data Lake Analytics

Azure Synapse Analytics

13
Q

HDInsight

A

Originally the Hortonworks distribution of Hadoop (HDP, Hortonworks Data Platform)

Microsoft took HDP and put it on Azure (you get Spark, Hadoop, Storm…)

HDP on Azure is called HDInsight

14
Q

4 options to use Apache Spark

A
  1. Download Open source Spark
  2. HDInsight
  3. Azure Databricks
  4. Azure Synapse => also has Spark
15
Q

Stream Processing

A

Azure Stream Analytics

Azure Databricks

Azure HDInsight

16
Q

Orchestration

A

Azure Data Factory

17
Q

Modern Data Warehouse

A

Bring together all your data at scale, and get insights through analytical dashboards, operational reports, or advanced analytics for all the users

18
Q

4 layers of Modern Data Warehouse

A
  1. Ingestion (Extract)
  2. Storage (Load)
  3. Data Preparation (Transform)
  4. Model & Serve (Serve)
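
As a study aid, the four layers can be written as a small lookup table in Python. The product lists are copied from the other cards in this deck; the dictionary name and the `layers_for` helper are hypothetical names introduced here for illustration only.

```python
# Layer -> products, as listed on the cards in this deck (not an exhaustive list).
MDW_LAYERS = {
    "Ingestion (Extract)": ["Azure Data Factory"],
    "Storage (Load)": ["Azure Storage", "Data Lake Gen2"],
    "Data Preparation (Transform)": [
        "Azure HDInsight",
        "Azure Databricks",
        "Data Lake Analytics",
    ],
    "Model & Serve (Serve)": ["Dedicated SQL Pool", "Azure Analysis Services"],
}


def layers_for(product):
    """Return every layer in which the given product appears."""
    return [layer for layer, products in MDW_LAYERS.items() if product in products]


# Walk the layers in pipeline order (dicts preserve insertion order).
for step, (layer, products) in enumerate(MDW_LAYERS.items(), start=1):
    print(f"{step}. {layer}: {', '.join(products)}")
```

Calling `layers_for("Azure Databricks")` returns the preparation layer, which is a quick way to quiz yourself on where a product sits.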
19
Q

What products can be used to Ingest data

A

Azure Data Factory

20
Q

What products can be used to Store data

A

Azure Storage

Data Lake Gen2

21
Q

What products can be used to Prepare Data

A

Azure HDInsight

Azure Databricks

Data Lake Analytics

22
Q

What products can be used to Model and Serve Data

A

Dedicated SQL Pool

Azure Analysis Services

23
Q

What products can be used to Visualize

A

Power BI

24
Q

Azure Synapse Analytics

A

Covers all of the following:

  1. Ingestion (Extract)
  2. Storage (Load)
  3. Data Preparation (Transform)
  4. Model & Serve (Serve)
25
Q

Azure Synapse Analytics features

A

A set of integrated Azure data services

Brings multiple data sources together in one place

Brings all your code together in one place

Communication between different compute options

Centralized management, privacy, data, security

26
Q

Synapse Workspace / Studio

A

Storage: Data Lake Gen2

Compute: Dedicated SQL Pools, Apache Spark Pools, Serverless SQL

Ingestion: Synapse Pipelines (Azure Data Factory integrated into Synapse), Mapping Data Flows (ETL, SSIS)

Platform: Monitoring, Management, Security

Connected Services: Azure Cosmos DB, Power BI, Azure ML

27
Q

Dedicated SQL Pool

A

SQL-based, fully-managed, petabyte-scale cloud data warehouse