DP-900 core data concepts Flashcards
DP-900 Azure Data Fundamentals
What are the three ways you can categorize data?
Structured
Semi-structured
Unstructured
What is tabular data?
Data that is stored as rows and columns, in one or more tables.
A row represents an entity and a column represents an attribute of that entity.
What makes data ‘structured’?
It is tabular and adheres to a fixed schema.
What makes data ‘semi-structured’?
It contains entities that have some regularly occurring attributes, but with variation: sometimes an attribute is missing, or a given attribute has multiple values.
What is an example of a format that is useful for ‘semi-structured’ data?
JSON, because it lets you define fields for an entity without having to adhere to a predefined schema.
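As a rough illustration of the card above, the sketch below parses two made-up customer records with Python's standard `json` module. The records and field names are hypothetical; the point is only that both entities share a `name` field while the remaining attributes vary or repeat.

```python
import json

# Hypothetical customer records: both have "name", but the other fields differ.
raw = """
[
  {"name": "Ana", "email": "ana@example.com", "phones": ["555-0100", "555-0101"]},
  {"name": "Ben", "address": {"city": "Leeds"}}
]
"""

customers = json.loads(raw)

for customer in customers:
    # .get() tolerates a missing attribute instead of raising KeyError.
    phones = customer.get("phones", [])
    print(customer["name"], "has", len(phones), "phone number(s)")
```

No schema was declared anywhere, which is why code reading such data usually has to handle missing or multi-valued attributes itself.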
What are some examples of ‘unstructured’ data?
audio, video, and images
What are the two broad categories of data stores?
File stores and Databases
What are some common ways to store files?
BLOB, CSV, XML, JSON, and optimized file formats such as Avro, ORC, and Parquet
What is XML?
A human-readable, semi-structured format that stores data in tags.
What is replacing XML?
JSON
What is the best format for storing large objects like videos, audio, and images?
BLOB
Binary Large Object
How is file storage different from a database?
A database stores and manages data as records, whereas file storage deals with whole files.
What is NoSQL?
Databases that are not relational
What are the 4 common types of non-relational databases?
Key-value
Document
Column Family
Graph
In a key-value database, what format does the value have to be in?
In this type of database, the format of the value doesn’t matter. It can be numeric, text, and so on.
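A minimal sketch of that idea, using a plain Python `dict` as a stand-in for a key-value store (an illustration only, not how any particular product works). The key names are hypothetical.

```python
# A plain dict standing in for a key-value store (illustrative only).
store = {}

store["user:1:name"] = "Ana"                # text value
store["user:1:visits"] = 42                 # numeric value
store["user:1:avatar"] = b"\x89PNG"         # raw bytes value
store["user:1:prefs"] = {"theme": "dark"}   # nested structure

# Lookups are by key only; the store never interprets the value.
value_types = {key: type(value).__name__ for key, value in store.items()}
print(value_types)
```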
In a document database, what format does the value have to be in?
JSON
What are the two types of data processing?
Transactional and Analytical
What is OLTP?
Online Transaction Processing
What does OLTP track?
Transactions, which are often CRUD operations
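CRUD stands for Create, Read, Update, Delete. The sketch below walks through the four operations against an in-memory SQLite database using Python's standard `sqlite3` module; the `accounts` table and its columns are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, owner TEXT, balance INTEGER)"
)

# Create: insert a new record.
conn.execute("INSERT INTO accounts (id, owner, balance) VALUES (1, 'Ana', 100)")

# Read: fetch the record back.
balance_after_create = conn.execute(
    "SELECT balance FROM accounts WHERE id = 1"
).fetchone()[0]

# Update: change the existing record.
conn.execute("UPDATE accounts SET balance = balance + 50 WHERE id = 1")
balance_after_update = conn.execute(
    "SELECT balance FROM accounts WHERE id = 1"
).fetchone()[0]

# Delete: remove the record.
conn.execute("DELETE FROM accounts WHERE id = 1")
remaining_rows = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]

conn.commit()
```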
What does a transaction ensure?
ACIDity
What does ACID stand for?
Atomicity
Consistency
Isolation
Durability
What is atomicity?
All sub-components of a transaction must succeed for the transaction to take effect. It is all-or-nothing: either all of it completes or none of it does.
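One way to see atomicity in action is a transfer whose second step fails. This sketch uses Python's `sqlite3` module with a hypothetical `accounts` table; the `CHECK (balance >= 0)` constraint is an assumption added so that the withdrawal has a reason to fail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

try:
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        # Step 1 succeeds: credit account 2.
        conn.execute("UPDATE accounts SET balance = balance + 500 WHERE id = 2")
        # Step 2 fails: account 1 would go negative, violating the CHECK.
        conn.execute("UPDATE accounts SET balance = balance - 500 WHERE id = 1")
except sqlite3.IntegrityError:
    pass  # the whole transaction, including step 1, has been rolled back

balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```

Because the second update failed, the credit from the first update is undone as well, so both balances keep their starting values.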
How do you know a transaction is consistent?
A transaction is consistent when it takes the database from one valid state to another valid state. For example, if you transfer funds from one account to another, the total amount of funds remains the same, because the amount subtracted from one account is added to the other.
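The funds-transfer example can be checked directly: sum the balances before and after the transfer and compare. This is a sketch using Python's `sqlite3` module; the table, the starting balances, and the helper `total_funds` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def total_funds(connection):
    """Return the sum of all account balances (hypothetical helper)."""
    return connection.execute("SELECT SUM(balance) FROM accounts").fetchone()[0]


total_before = total_funds(conn)

# Both updates are committed together as one transaction.
with conn:
    conn.execute("UPDATE accounts SET balance = balance - 30 WHERE id = 1")
    conn.execute("UPDATE accounts SET balance = balance + 30 WHERE id = 2")

total_after = total_funds(conn)
```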
How do you know your transactions are ‘isolated’?
A transaction is isolated when it does not interfere with other transactions. For example, if one transaction transfers funds between two accounts while another reads the balances of all accounts, the second transaction should not see one account’s balance from before the transfer and the other’s from after it.
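The scenario above can be approximated with two SQLite connections to the same file: one performs the transfer, the other reads balances partway through. This is a sketch under assumptions (SQLite's default journal mode, Python's default `sqlite3` transaction handling, and a made-up `accounts` table); other databases implement isolation differently.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "bank.db")
    writer = sqlite3.connect(path)
    reader = sqlite3.connect(path)

    writer.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    writer.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
    writer.commit()

    # Start the transfer, but do not commit yet.
    writer.execute("UPDATE accounts SET balance = balance - 30 WHERE id = 1")

    # The reader sees only committed data: both balances are pre-transfer.
    mid_transfer = reader.execute(
        "SELECT balance FROM accounts ORDER BY id"
    ).fetchall()

    # Finish and commit the transfer.
    writer.execute("UPDATE accounts SET balance = balance + 30 WHERE id = 2")
    writer.commit()

    # Now the reader sees both balances post-transfer.
    after_commit = reader.execute(
        "SELECT balance FROM accounts ORDER BY id"
    ).fetchall()

    writer.close()
    reader.close()
```

In neither read does the second connection observe a mix of one pre-transfer and one post-transfer balance.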