Azure Data Fundamentals Flashcards
Data Classification Types
structured, semi-structured, or unstructured
Structured data
1) typically tabular data that is represented by rows and columns in a database.
2) Databases that hold tables in this form are called relational databases
Semi-structured data
1) information that doesn’t reside in a relational database but still has some structure to it.
2) Examples include documents held in JavaScript Object Notation (JSON) format
3) Other forms: key-value stores, which resemble a relational table except that each row can have any number of columns, and graph databases, which store and query information about complex relationships. A graph contains nodes (information about objects) and edges (information about the relationships between objects).
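Illustration (a minimal sketch, not tied to any Azure service): the three semi-structured forms above, modeled with plain Python structures. The customer records, field names, and the `related` helper are invented for the example.

```python
import json

# Two JSON documents describing customers; the second has a field the first lacks.
documents = [
    '{"id": 1, "name": "Ana", "email": "ana@example.com"}',
    '{"id": 2, "name": "Ben", "phones": ["555-0100", "555-0101"]}',
]
customers = [json.loads(doc) for doc in documents]

# Key-value store: each key maps to a value that can carry its own set of fields.
key_value_store = {customer["id"]: customer for customer in customers}

# Graph: nodes hold information about objects, edges hold relationships between them.
nodes = {
    "ana": {"type": "person"},
    "ben": {"type": "person"},
    "sales": {"type": "department"},
}
edges = [
    ("ana", "works_in", "sales"),
    ("ben", "works_in", "sales"),
    ("ben", "reports_to", "ana"),
]


def related(node, relationship):
    """Return the targets reachable from `node` over edges named `relationship`."""
    return [target for source, rel, target in edges
            if source == node and rel == relationship]


# The documents share some structure (id, name) but not a fixed set of columns.
field_sets = [sorted(customer) for customer in customers]
```

Each record still has recognizable fields, but no schema forces every record to have the same ones, which is what separates this from a relational table.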
Unstructured data
Data that might not have a specific structure, such as audio files, video files, and binary data files
Storage of structured data
typically stored in a relational database such as SQL Server or Azure SQL Database.
Storage of unstructured data
use Azure Blob storage (Blob is an acronym for Binary Large Object)
Storage of semi-structured data
Azure Cosmos DB
Data processing solutions categories
analytical systems and transaction processing systems
Analytical system
1) designed to support business users who need to query data and gain a big picture view of the information held in a database
2) capturing raw data, and using it to generate insights
3) typical stages: data ingestion, data transformation, data querying, and data visualization
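Illustration (a toy sketch, assuming an in-memory CSV as the raw source): the four analytical stages in miniature. The sales figures and column names are made up, and a real analytical system would use dedicated services for each stage.

```python
import csv
import io
from collections import defaultdict

# Ingestion: capture raw data (an in-memory CSV stands in for a real source).
raw = "region,amount\nnorth,120.50\nsouth,80\nnorth,45.25\nsouth,\n"
rows = list(csv.DictReader(io.StringIO(raw)))

# Transformation: drop incomplete rows and convert text to numbers.
clean = [
    {"region": row["region"], "amount": float(row["amount"])}
    for row in rows
    if row["amount"]
]

# Querying: total sales per region.
totals = defaultdict(float)
for row in clean:
    totals[row["region"]] += row["amount"]

# Visualization: a crude text bar chart, one '#' per 20 units.
chart = [
    f"{region:<6} {'#' * int(total // 20)} {total:.2f}"
    for region, total in sorted(totals.items())
]
print("\n".join(chart))
```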
Transactional system
1) records transactions
2) often high-volume, sometimes handling many millions of transactions in a single day. The data being processed has to be accessible very quickly. The work performed by transactional systems is often referred to as Online Transactional Processing (OLTP)
Normalization
1) process of organizing data in a database to eliminate redundancy and inconsistent dependencies
2) end result of the normalization process is that your data is split into a large number of narrow, well-defined tables
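Illustration (a minimal sketch with invented order data): splitting one wide, repetitive record set into two narrow tables linked by a key. The table and column names are hypothetical.

```python
# A flat, denormalized record set: customer details repeat on every order.
flat_orders = [
    {"order_id": 1, "customer_name": "Ana", "customer_city": "Lisbon", "item": "desk"},
    {"order_id": 2, "customer_name": "Ana", "customer_city": "Lisbon", "item": "lamp"},
    {"order_id": 3, "customer_name": "Ben", "customer_city": "Oslo", "item": "chair"},
]

customers = {}  # (name, city) -> customer_id
orders = []     # narrow table that refers to customers by key only

for row in flat_orders:
    key = (row["customer_name"], row["customer_city"])
    # Reuse the existing id if this customer was already seen, else assign the next one.
    customer_id = customers.setdefault(key, len(customers) + 1)
    orders.append(
        {"order_id": row["order_id"], "customer_id": customer_id, "item": row["item"]}
    )

# Each customer now appears exactly once.
customer_table = [
    {"customer_id": customer_id, "name": name, "city": city}
    for (name, city), customer_id in customers.items()
]
```

After the split, changing a customer's city means updating one row in `customer_table` instead of every matching order.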
Non-relational database
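Stores data in a form other than related tables, for example as documents that keep one-to-many data together in a single record; the same values may be duplicated across records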
Relational database
Model for holding data as tables of rows and columns, with relationships between tables expressed through keys. A primary use of relational databases is to handle transaction processing
Transactional database
A transactional database must adhere to the ACID (Atomicity, Consistency, Isolation, Durability) properties to ensure that the database remains consistent while processing transactions.
ACID
1) Atomicity guarantees that each transaction is treated as a single unit, which either succeeds completely, or fails completely.
2) Consistency ensures that a transaction can only take the data in the database from one valid state to another.
3) Isolation ensures that concurrent execution of transactions leaves the database in the same state that would have been obtained if the transactions were executed sequentially.
4) Durability guarantees that once a transaction has been committed, it will remain committed even if there’s a system failure such as a power outage or crash.
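Illustration (a minimal sketch using Python's built-in sqlite3 module with an in-memory database, chosen only for convenience): atomicity and consistency during a funds transfer. The `accounts` table, the `transfer` helper, and the balances are invented. Durability is not shown, because an in-memory database does not survive the process.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"  # consistency rule
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("ana", 100), ("ben", 50)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move `amount` between accounts as one transaction; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "ana", "ben", 30)       # both updates are committed together
failed = transfer(conn, "ana", "ben", 500)  # debit violates the CHECK, so the credit is undone too
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

The second transfer credits `ben` before the debit fails, yet the final balances show no trace of that credit: the transaction either applies completely or not at all.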