Databases Flashcards
Vertical Scaling
adding compute (CPU), memory (RAM), and storage (disk, SSD) resources to a single computer
Horizontal Scaling
adding more computers to a cluster
Relational Databases
a database that uses a relational data model, which organizes data in tables with rows of data entries and columns of predetermined data types. Relationships between tables are represented with foreign key columns that reference the primary key columns of other tables.
also called SQL databases because SQL (Structured Query Language) is the standard query language for relational models
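A minimal sketch of the definition above, using Python's built-in sqlite3 module. The authors and books tables and their columns are made up for illustration; the point is only to show a foreign key column referencing another table's primary key, and the database refusing a row that breaks the relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on
conn.execute("PRAGMA foreign_keys = ON")

# hypothetical schema: each book row points at one author row
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE books (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES authors(id)
    )
    """
)
conn.execute("INSERT INTO authors (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO books (title, author_id) VALUES ('Notes', 1)")

# follow the relationship with a join on the foreign key
rows = conn.execute(
    "SELECT books.title, authors.name "
    "FROM books JOIN authors ON books.author_id = authors.id"
).fetchall()

# a book that references a missing author is rejected
try:
    conn.execute("INSERT INTO books (title, author_id) VALUES ('Orphan', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```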
Non-relational Database
a database optimized for specific use cases that need scalability, schema flexibility, or specialized query support. Often called a NoSQL database
Index
an auxiliary data structure (conceptually, a small table) that holds a sorted copy of the column of interest and a reference back to each row in the original table. Used when data is frequently looked up by the same column
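A rough illustration of an index in practice, again with sqlite3. The users table and the idx_users_email name are hypothetical. The sketch asks SQLite for its query plan before and after creating the index; the exact plan wording varies between SQLite versions, so the code only checks whether the index name appears in it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

query = "EXPLAIN QUERY PLAN SELECT name FROM users WHERE email = 'user500@example.com'"

# the last column of each plan row is a human-readable description
plan_before = conn.execute(query).fetchall()[0][3]

# index the column that lookups filter on
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = conn.execute(query).fetchall()[0][3]

uses_index_before = "idx_users_email" in plan_before
uses_index_after = "idx_users_email" in plan_after
```

Without the index the plan describes a scan of the whole table; with it, the plan names the index as the way rows are found.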
When to use relational database
- there are many-to-many relationships between entries
- data needs to follow the predetermined schema
- relationships between data always need to be accurate
examples: Oracle, MySQL, PostgreSQL
Relational Database drawbacks
hard to scale over distributed clusters (horizontal scaling). Relational databases are most useful when there are many relationships in the data, so no matter how you split up the data, there will be relationships between data entries on different nodes.
When these cross-node relationships get updated, the nodes have to communicate with each other to keep the data in sync. As a result, database operations get slower, because network communication is slower than local access
also aren’t particularly advantageous if the data doesn’t have a lot of references, doesn’t easily conform to a single schema, or changes shape frequently
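A small sketch of why splitting related data across nodes is awkward. Everything here is an assumption for illustration: a three-node cluster, hash-based placement with zlib.crc32, and a handful of made-up "follows" relationships. It simply sorts the relationships into those whose two ends live on the same node and those that span nodes.

```python
import zlib

NUM_NODES = 3  # hypothetical cluster size


def node_for(key: str) -> int:
    """Pick a node by hashing the key (crc32 is stable across runs)."""
    return zlib.crc32(key.encode()) % NUM_NODES


# hypothetical relationships between user records
follows = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("carol", "alice"),
    ("dave", "alice"),
]

# relationships whose two ends land on different nodes need network traffic
cross_node = [(a, b) for a, b in follows if node_for(a) != node_for(b)]
same_node = [(a, b) for a, b in follows if node_for(a) == node_for(b)]
```

Any relationship that ends up in cross_node would need a network round trip between nodes to read or update consistently.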
Graph Database
many-to-many relationships (graph structure)
fast at following graph edges
suited to complex network analytics
less mature technology than relational databases
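To make "following graph edges" concrete, here is a minimal sketch with a plain adjacency list and a breadth-first search. The graph contents and the hops function are hypothetical; a real graph database stores and traverses edges with its own engine and query language.

```python
from collections import deque

# hypothetical "knows" graph: node -> nodes it has an edge to
graph = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": ["erin"],
    "erin": [],
}


def hops(graph, start, goal):
    """Return the fewest edges from start to goal, or None if unreachable."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


alice_to_erin = hops(graph, "alice", "erin")  # alice -> bob -> dave -> erin
```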
Document Store
isolated documents
retrieve by a key
documents with different schemas that are easy to update
easy to scale
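A toy sketch of the document store idea, using a Python dict as the store. The put and get helpers and the sample documents are made up; the point is that each document is self-contained, is fetched by its key, and can have a different shape from its neighbors.

```python
import copy

documents = {}  # hypothetical store: document key -> document


def put(doc_id, doc):
    """Store a whole document under its key."""
    documents[doc_id] = copy.deepcopy(doc)


def get(doc_id):
    """Fetch a whole document by its key."""
    return copy.deepcopy(documents[doc_id])


# two documents with different fields; no shared schema is required
put("user:1", {"name": "Ada", "emails": ["ada@example.com"]})
put("user:2", {"name": "Lin", "address": {"city": "Oslo"}, "tags": ["admin"]})

# updating one document's shape does not touch any other document
doc = get("user:1")
doc["nickname"] = "ada99"
put("user:1", doc)
```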
Key-value store/object store
opaque values
no schema or relationships known to the database
very simple operations
easy to scale
particularly suitable for caching implementations
When the values are large, this kind of database is referred to as an Object Store or Blob Store. In this case, the data might be serialized and the store optimized for large file sizes. Use cases include videos, images, audio, disk images, and log binaries.
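A minimal sketch of a key-value store used as a cache. The KeyValueStore class, the load_profile function, and the profile data are all hypothetical. The store only sees opaque bytes, and the cache-aside pattern checks it before doing the (simulated) expensive lookup.

```python
class KeyValueStore:
    """Hypothetical in-memory key-value store; values are opaque bytes."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)  # None when the key is missing

    def delete(self, key):
        self._data.pop(key, None)


cache = KeyValueStore()
slow_lookups = 0


def load_profile(user_id):
    """Cache-aside: try the cache first, fall back to the slow source."""
    global slow_lookups
    key = f"profile:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    slow_lookups += 1  # stands in for an expensive database query
    value = f'{{"id": "{user_id}"}}'.encode()
    cache.put(key, value)
    return value


first = load_profile("42")   # miss: goes to the slow source
second = load_profile("42")  # hit: served from the cache
```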
Column-family database
groups related columns for storage (easy to scale)
memory efficient for sparse data
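A rough sketch of why grouping columns helps with sparse data. The layout (row key, then column family, then column) and the sample rows are assumptions for illustration, not any particular product's storage format. Only the cells that actually exist are stored, and the sketch compares that count with what a dense table would need.

```python
from collections import defaultdict

# hypothetical layout: row key -> column family -> {column name: value}
table = defaultdict(lambda: defaultdict(dict))


def put(row_key, family, column, value):
    table[row_key][family][column] = value


def get_family(row_key, family):
    """Read one column family of one row without touching the others."""
    return dict(table.get(row_key, {}).get(family, {}))


put("user:1", "profile", "name", "Ada")
put("user:1", "profile", "city", "London")
put("user:1", "activity", "last_login", "2024-01-01")
put("user:2", "profile", "name", "Lin")  # this row has no other columns

# cells actually stored
stored_cells = sum(
    len(columns) for families in table.values() for columns in families.values()
)

# cells a dense table would need: every row times every known column
all_columns = {
    (family, column)
    for families in table.values()
    for family, columns in families.items()
    for column in columns
}
dense_cells = len(table) * len(all_columns)
```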
Search Engine Database
large amounts of unstructured data
provides full-text search or fuzzy search (where results may not exactly match the search string)
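A toy sketch of both search styles. The documents and function names are made up. Full-text search is approximated with an inverted index (word to document ids), and fuzzy search with difflib.get_close_matches from the standard library, which stands in for the more sophisticated matching a real search engine would use.

```python
import difflib
from collections import defaultdict

# hypothetical documents: id -> text
docs = {
    1: "relational databases use tables",
    2: "document stores hold isolated documents",
    3: "graph databases follow edges",
}

# inverted index: word -> ids of the documents that contain it
index = defaultdict(set)
for doc_id, text in docs.items():
    for word in text.split():
        index[word].add(doc_id)


def search(word):
    """Exact full-text lookup of a single word."""
    return set(index.get(word, set()))


def fuzzy_search(word):
    """Tolerate typos by matching the closest indexed word first."""
    matches = difflib.get_close_matches(word, list(index), n=1)
    return search(matches[0]) if matches else set()


exact = search("databases")
typo = fuzzy_search("databses")  # misspelled on purpose
```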
Time Series Database
data is ordered by time
many data streams
orders entries in real time as they arrive
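A minimal sketch of time-ordered storage, assuming a single in-memory series and integer timestamps. The Series class is hypothetical. It keeps points sorted by time as they arrive, even when they arrive out of order, so a time-range query becomes two binary searches.

```python
import bisect


class Series:
    """Hypothetical single data stream kept sorted by timestamp."""

    def __init__(self):
        self.times = []
        self.values = []

    def add(self, timestamp, value):
        # find the slot that keeps the series ordered by time
        i = bisect.bisect_right(self.times, timestamp)
        self.times.insert(i, timestamp)
        self.values.insert(i, value)

    def between(self, start, end):
        """Return all points with start <= timestamp <= end."""
        lo = bisect.bisect_left(self.times, start)
        hi = bisect.bisect_right(self.times, end)
        return list(zip(self.times[lo:hi], self.values[lo:hi]))


cpu = Series()
cpu.add(30, 1.5)  # arrives first but is the latest reading
cpu.add(10, 1.0)
cpu.add(20, 1.2)

window = cpu.between(10, 20)
```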