Databases Flashcards
What are the two primary types of consistency model?
Strong (immediate) consistency and eventual consistency.
What is volatile storage?
Volatile storage is temporary storage that keeps its contents only while it is powered, such as RAM; in-memory caches typically rely on it. It is not persistent, so its data is lost when the machine or the cache is restarted.
What is another name for horizontal database scaling?
Sharding
How is a sharded database implemented?
Typically with a hash function that maps each record's sharding key to the shard that stores it. Each shard holds a different subset of the data, unlike a replicated database, where every replica holds a copy of the same data.
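A minimal sketch of hash-based shard allocation, assuming four in-memory dictionaries stand in for the shards. The names `NUM_SHARDS`, `shard_for`, `put`, and `get` are hypothetical and chosen only for illustration; a real system would route to separate database servers.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count

def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a sharding key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Each dictionary stands in for one shard; every shard holds different keys.
shards = {i: {} for i in range(NUM_SHARDS)}

def put(key: str, value) -> None:
    """Store the value on the shard that the key hashes to."""
    shards[shard_for(key)][key] = value

def get(key: str):
    """Look up the value on the one shard that can contain the key."""
    return shards[shard_for(key)].get(key)

put("user:1", "Ada")
put("user:2", "Grace")
```

The same key always hashes to the same shard, so a read only has to consult one shard. A plain modulo like this reshuffles most keys when the shard count changes, which is one reason some systems use consistent hashing instead.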
What determines how data is distributed in a sharded database?
The sharding key.
What is the CAP theorem?
The CAP theorem states that it is impossible for a distributed system to simultaneously provide more than two of three guarantees: consistency, availability, and partition tolerance.
What are the two types of database scaling?
Vertical and horizontal scaling.
How do you vertically scale a database?
You ‘scale up’ a database by adding more power (CPU, RAM, disk) to the existing server.
How do you horizontally scale a database?
You horizontally scale (‘scale out’) a database by adding more servers and distributing the data across them, a technique called sharding.
What function is used to find a database shard?
A hash function.
What is the most important factor to consider when implementing sharding?
The selection of the sharding key, also known as the partition key. It consists of one or more columns that determine how data is distributed.
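A rough sketch of why the key choice matters, assuming hash-based allocation across four shards. The rows, the column names `user_id` and `country`, and the helper `shard_for` are all made up for illustration.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # hypothetical shard count

def shard_for(*key_columns, num_shards=NUM_SHARDS):
    """Hash one or more key columns into a shard index."""
    raw = "|".join(str(c) for c in key_columns).encode("utf-8")
    return int.from_bytes(hashlib.sha256(raw).digest()[:8], "big") % num_shards

# Made-up rows: 1000 users spread over two countries.
rows = [{"user_id": i, "country": "GB" if i % 2 else "US"} for i in range(1000)]

# Rows per shard for three candidate sharding keys.
by_country = Counter(shard_for(r["country"]) for r in rows)
by_user = Counter(shard_for(r["user_id"]) for r in rows)
by_both = Counter(shard_for(r["country"], r["user_id"]) for r in rows)

print("country only:", dict(by_country))
print("user_id only:", dict(by_user))
print("country + user_id:", dict(by_both))
```

With only two distinct values, `country` can occupy at most two shards and leaves the rest idle, while the high-cardinality `user_id` (alone or combined with `country`) spreads rows across all of them.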
What is the difference between UNION and INTERSECT?
UNION and INTERSECT are the two fundamental operations for combining sets, and in SQL they combine the results of two queries. In set-theory terms, the union is the set of all distinct elements that are in either set (or both), whereas the intersection is the set of all distinct elements that belong to both sets.
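A small illustration using Python's built-in `sqlite3` module with an in-memory database. The tables `a` and `b` and their contents are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (n INTEGER);
CREATE TABLE b (n INTEGER);
INSERT INTO a VALUES (1), (2), (2), (3);
INSERT INTO b VALUES (2), (3), (4);
""")

# UNION: distinct values found in either table.
union_rows = [row[0] for row in conn.execute(
    "SELECT n FROM a UNION SELECT n FROM b ORDER BY n")]

# INTERSECT: distinct values found in both tables.
intersect_rows = [row[0] for row in conn.execute(
    "SELECT n FROM a INTERSECT SELECT n FROM b ORDER BY n")]

print(union_rows)      # [1, 2, 3, 4]
print(intersect_rows)  # [2, 3]
```

Both operators remove duplicates, so the repeated 2 in table `a` appears only once in each result.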
What is the difference between OLTP and OLAP?
An OLTP database is an online transaction processing database designed for persisting transactions, and it typically uses a normalized (3NF) schema. An OLAP database is an online analytical processing database used to provide analytics, such as in a data warehouse.
What is a data lake?
A data lake is a system or repository of data stored in its natural/raw format, usually object blobs or files.