1.7 Databases (ACID and More) Flashcards
What is a transaction?
In the context of databases and data storage systems, a transaction is any operation that is treated as a single unit of work, which either completes fully or does not complete at all, and leaves the storage system in a consistent state.
The classic example of a transaction is what occurs when you withdraw money from your bank account. Either the money has left your bank account, or it has not — there cannot be an in-between state.
What are A.C.I.D. properties?
ACID is an acronym that refers to the set of 4 key properties that define a transaction: Atomicity, Consistency, Isolation, and Durability. If a database operation has these ACID properties, it can be called an ACID transaction, and data storage systems that apply these operations are called transactional systems. ACID transactions guarantee that each read, write, or modification of a table has the following properties:
Atomicity - each statement in a transaction (to read, write, update or delete data) is treated as a single unit. Either the entire statement is executed, or none of it is executed. This property prevents data loss and corruption from occurring if, for example, if your streaming data source fails mid-stream.
Consistency - ensures that transactions only make changes to tables in predefined, predictable ways. Transactional consistency ensures that corruption or errors in your data do not create unintended consequences for the integrity of your table.
Isolation - when multiple users are reading and writing from the same table all at once, isolation of their transactions ensures that the concurrent transactions don’t interfere with or affect one another. Each request can occur as though they were occurring one by one, even though they’re actually occurring simultaneously.
Durability - ensures that changes to your data made by successfully executed transactions will be saved, even in the event of system failure.
Why are ACID transactions a good thing to have?
ACID transactions ensure the highest possible data reliability and integrity. They ensure that your data never falls into an inconsistent state because of an operation that only partially completes.
For example, without ACID transactions, if you were writing some data to a database table, but the power went out unexpectedly, it’s possible that only some of your data would have been saved, while some of it would not.
Now your database is in an inconsistent state that is very difficult and time-consuming to recover from.
What is an RDBMS?
A relational database management system is a collection of data items organized in tables.
What does ACID stand for in the context of relational databases?
Atomicity, Consistency, Isolation, Durability.
What is Atomicity in ACID properties?
Each transaction is all or nothing.
What is Consistency in ACID properties?
Any transaction will bring the database from one valid state to another.
What is Isolation in ACID properties?
Executing transactions concurrently has the same results as if the transactions were executed serially.
What is Durability in ACID properties?
Once a transaction has been committed, it will remain so.
Name a technique to scale a relational database.
Master-slave replication, master-master replication, federation, sharding, denormalization, SQL tuning.
What is master-slave replication?
The master serves reads and writes, replicating writes to one or more slaves, which serve only reads.
What happens if the master goes offline in master-slave replication?
The system can continue to operate in read-only mode until a slave is promoted to a master.
What is a disadvantage of master-slave replication?
Additional logic is needed to promote a slave to a master.
What is master-master replication?
Both masters serve reads and writes and coordinate with each other on writes.
What is a disadvantage of master-master replication?
You’ll need a load balancer or application logic changes to determine where to write.
What is federation in the context of databases?
Federation splits up databases by function to reduce read and write traffic.
What is sharding?
Sharding distributes data across different databases such that each database manages a subset of the data.
What is denormalization?
Denormalization attempts to improve read performance at the expense of some write performance.
What is SQL tuning?
SQL tuning involves optimizing queries to improve performance.
What does BASE stand for in NoSQL databases?
Basically available, Soft state, Eventual consistency.
True or False: NoSQL databases generally support true ACID transactions.
False.
What is a key-value store?
A key-value store is an abstraction similar to a hash table that allows O(1) reads and writes.
What is a document store?
A document store is centered around documents (XML, JSON, binary) where a document stores all information for a given object.
What is a wide column store?
A wide column store’s basic unit of data is a column (name/value pair), grouped in column families.
What is a graph database?
In a graph database, each node is a record and each arc is a relationship between two nodes.
List some advantages of using NoSQL databases.
- Semi-structured data
- Dynamic or flexible schema
- Non-relational data
- Very high throughput for IOPS.
Fill in the blank: SQL databases are better for _______ data.
[structured]
Fill in the blank: NoSQL databases are better for _______ data.
[semi-structured]
What are some examples of data well-suited for NoSQL?
- Rapid ingest of clickstream and log data
- Leaderboard or scoring data
- Temporary data, such as a shopping cart.