DP-900 Flashcards

1
Q

What are the types of Data?

A

Structured Data (Rows and Columns)
Semi-Structured Data (Tags, value tags, No-SQL data)
Unstructured Data (Pictures, Videos, etc.)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What is Relational Data?

A

Structured Data
Defined Schema
Clear relationship between fields and tables

(SQL Data)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What is Non-relational data?

A

Semi-Structured Data
No defined schema
Hierarchy is defined by tags and keys

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is OLTP?

A

Online Transactional Processing

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What does ACID stand for?

A

Atomicity - means that a transaction is “all or nothing.”

Consistency - ensures data integrity and prevents violations of database constraints or rules.

Isolation - allows multiple transactions to occur concurrently without interference.

Durability - ensures that committed transactions persist, even in the face of errors or failures.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is Azure SQL Database?

A

A fully managed, highly scalable PaaS database service that is designed for the cloud. This service includes the core database-level capabilities of on-premises SQL Server, and is a good option when you need to create a new application in the cloud.

Use this option for new cloud solutions, or to migrate applications that have minimal instance-level dependencies.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is Azure SQL Managed Instance?

A

A platform-as-a-service (PaaS) option that provides near-100% compatibility with on-premises SQL Server instances while abstracting the underlying hardware and operating system. The service includes automated software update management, backups, and other maintenance tasks, reducing the administrative burden of supporting a database server instance.

Use this option for most cloud migration scenarios, particularly when you need minimal changes to existing applications.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What is Azure SQL VM?

A

A virtual machine running in Azure with an installation of SQL Server. The use of a VM makes this option an infrastructure-as-a-service (IaaS) solution that virtualizes hardware infrastructure for compute, storage, and networking in Azure; making it a great option for “lift and shift” migration of existing on-premises SQL Server installations to the cloud.

Use this option when you need to migrate or extend an on-premises SQL Server solution and retain full control over all aspects of server and database configuration.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What is Azure Database for MySQL?

A

a simple-to-use open-source database management system that is commonly used in Linux, Apache, MySQL, and PHP (LAMP) stack apps.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What is Azure Database for MariaDB?

A

a newer database management system, created by the original developers of MySQL. The database engine has since been rewritten and optimized to improve performance. MariaDB offers compatibility with Oracle Database (another popular commercial database management system).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What is Azure Database for PostgreSQL?

A

a hybrid relational-object database. You can store data in relational tables, but a PostgreSQL database also enables you to store custom data types, with their own non-relational properties.

Most popular database for modern apps

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What is Azure Cosmos DB?

A

Azure Cosmos DB is a global-scale non-relational (NoSQL) database system that supports multiple application programming interfaces (APIs), enabling you to store and manage data as JSON documents, key-value pairs, column-families, and graphs.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What are the 3 Core Storage types within Azure Storage

A

Blob containers - scalable, cost-effective storage for binary files.

File shares - network file shares such as you typically find in corporate networks.

Tables - key-value storage for applications that need to read and write data values quickly. Semi Structured

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What is Azure Data Factory?

A

Azure Data Factory is an Azure service that enables you to define and schedule data pipelines to transfer and transform data.

You can integrate your pipelines with other Azure services, enabling you to ingest data from cloud data stores, process the data using cloud-based compute, and persist the results in another data store.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is Azure Synapse Analytics?

A

Azure Synapse Analytics is a comprehensive, unified data analytics solution that provides a single service interface for multiple analytical capabilities, including:

Pipelines - based on the same technology as Azure Data Factory.
SQL - a highly scalable SQL database engine, optimized for data warehouse workloads.
Apache Spark - an open-source distributed data processing system that supports multiple programming languages and APIs, including Java, Scala, Python, and SQL.
Azure Synapse Data Explorer - a high-performance data analytics solution that is optimized for real-time querying of log and telemetry data using Kusto Query Language (KQL).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What is Azure Databricks?

A

Azure Databricks is an Azure-integrated version of the popular Databricks platform, which combines the Apache Spark data processing platform with SQL database semantics and an integrated management interface to enable large-scale data analytics.

17
Q

What is Azure HDInsight?

A

Azure HDInsight is an Azure service that provides Azure-hosted clusters for popular Apache open-source big data processing technologies, including:

Apache Spark - a distributed data processing system that supports multiple programming languages and APIs, including Java, Scala, Python, and SQL.
Apache Hadoop - a distributed system that uses MapReduce jobs to process large volumes of data efficiently across multiple cluster nodes. MapReduce jobs can be written in Java or abstracted by interfaces such as Apache Hive - a SQL-based API that runs on Hadoop.
Apache HBase - an open-source system for large-scale NoSQL data storage and querying.
Apache Kafka - a message broker for data stream processing.

18
Q

What is Azure Stream Analytics?

A

Azure Stream Analytics is a real-time stream processing engine that captures a stream of data from an input, applies a query to extract and manipulate data from the input stream, and writes the results to an output for analysis or further processing.

19
Q

What is Azure Data Explorer?

A

Azure Data Explorer is a standalone service that offers the same high-performance querying of log and telemetry data as the Azure Synapse Data Explorer runtime in Azure Synapse Analytics.

20
Q

What is Azure SQL Edge?

A

A SQL engine that is optimized for Internet-of-things (IoT) scenarios that need to work with streaming time-series data.

21
Q

What is Normalization

A

Is the process of organizing data in a database. Includes creating tables, and establishing relationships between those tables to protect the data and make the database more flexible by eliminating redundancy.

22
Q

What is Extract, Transform and Load (ETL)

A

Extract, transform, and load (ETL) is a data integration methodology that extracts raw data from sources, transforms the data on a secondary processing server, and then loads the data into a target database.

ETL is used when data must be transformed to conform to the data regime of a target database.

23
Q

What is Cognitive analytics?

A

Uses AI and machine learning to analyze complex data sets and simulate human thought processes.

24
Q

What is Descriptive analytics?

A

“What’s Happening”

Analyzes historical data to gain insights into past events and trends.

Used to understand what has happened in the past and identify patterns and trends that can be used to inform future decision-making.

25
Q

What is Predictive analytics?

A

“What will happen”

Makes predictions about future events based on historical data.

Used to identify patterns and trends that can be used to make more informed decisions and improve business outcomes.

26
Q

What is Diagnostic analytics?

A

“Why is this happening”

Seeks to understand why something happened by analyzing historical data and identifying root causes of specific outcomes.

Used to uncover the underlying factors that led to a particular event or trend.

27
Q

What is Prescriptive analytics

A

“What should we do”

Suggests a course of action to optimize outcomes based on insights gained from descriptive, diagnostic, and predictive analytics.

used to help organizations make more informed decisions by providing actionable insights and recommendations.

28
Q

What is the difference between ETL and ELT?

A

ETL transforms data before loading it on the server, while ELT transforms it afterward.

ETL is an older method ideal for complex transformations of smaller data sets. It’s also great for those prioritizing data security. On the other hand, ELT is a newer technology that provides more flexibility to analysts and is perfect for processing both structured and unstructured data.

29
Q

What is Extract, Load and Transform (ELT)

A

ELT loads raw data directly into a target data warehouse, instead of moving it to a processing server for transformation.

With ELT data pipeline, data cleansing, enrichment, and data transformation all occur inside the data warehouse itself. Raw data is stored indefinitely in the data warehouse, allowing for multiple transformations.

30
Q

What are SQL Statement Types?

A

DML (Data Manipulation Language) Used to query and manipulate data in a database. Examples: SELECT, INSERT, UPDATE, DELTE

DDL (Data definition Language) Used to define or modify the structure of database objects. Examples: CREATE, ALTER, DROP

TCL (Transaction Control Language) Used to control transactions in a database. Examples: COMMIT, ROLLBACK, SAVEPOINT

31
Q

What is DML (Data Manipulation Language)

A

Used to query and manipulate data in a database. Examples: SELECT, INSERT, UPDATE, DELTE

32
Q

What is DDL (Data definition Language)

A

Used to define or modify the structure of database objects. Examples: CREATE, ALTER, DROP

33
Q

What is DCL (Data Control Language)

A

Used to control access to data in a database. Examples: GRANT, REVOKE

34
Q

What is TCL (Transaction Control Language)

A

Used to control transactions in a database. Examples: COMMIT, ROLLBACK, SAVEPOINT

35
Q

What are the 3 types of Blob Storage

A

Block- are used for storing large amounts of data, and are optimized for efficient uploading and updating of large files.

Page- are designed for random read-write operations, and are commonly used for storing virtual hard drive files and other frequently accessed data.

Append- are optimized for frequent appending to the end of a file, and are useful for scenarios like logging and data collection.

36
Q

What is a Treemap

A

a chart that divides the chart area into rectangles that represent the different levels and relative size of the hierarchy

37
Q

What are SQL aggregate functions?

A

functions used to perform calculations on a set of values and return a single value as a result.

Examples: Count, Sum, Avg, Max, Min