16. NoSQL Flashcards
2 main classes of DB workloads
Online Transaction Processing (OLTP) is a class of workloads characterized by high numbers of transactions executed by large numbers of users. This kind of workload is common to “frontend” applications such as social networks and online stores. Queries are typically simple lookups (e.g., “find user by ID”, “get items in shopping cart”) and rarely include joins. OLTP workloads also involve high numbers of updates (e.g., “post tweet”, “add item to shopping cart”). Because of the need for consistency in these business-critical workloads, the queries, inserts and updates are performed as transactions.
Online Analytical Processing (OLAP) is a class of read-only workloads characterized by queries that typically touch a large amount of data. OLAP queries typically involve large numbers of joins and aggregations in order to support decision making (e.g., “sum revenues by store, region, clerk, product, date”).
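To make the contrast concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, columns, and data (cart_items, sales) are made up for illustration, and a single in-memory SQLite database stands in for what would normally be two separate systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cart_items (user_id INTEGER, item TEXT, price REAL);
    CREATE TABLE sales (store TEXT, region TEXT, revenue REAL);
    INSERT INTO sales VALUES
        ('s1', 'west', 100.0), ('s2', 'west', 50.0), ('s3', 'east', 70.0);
""")

# OLTP-style: a small write wrapped in a transaction, then a simple lookup.
with conn:  # commits on success, rolls back if an exception is raised
    conn.execute("INSERT INTO cart_items VALUES (?, ?, ?)", (42, "book", 12.5))
cart = conn.execute(
    "SELECT item, price FROM cart_items WHERE user_id = ?", (42,)
).fetchall()

# OLAP-style: a read-only aggregation that scans the whole table.
revenue_by_region = dict(
    conn.execute("SELECT region, SUM(revenue) FROM sales GROUP BY region").fetchall()
)
```

The OLTP part touches one user's rows; the OLAP part reads every row in sales to produce a summary.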
What is ETL?
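Extract-transform-load (ETL) - the process of migrating data between databases: extract the data from the source, transform it into the format the target expects, and load it into the target. E.g. from an OLTP system to an OLAP system.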
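A rough sketch of the three ETL steps, assuming two in-memory SQLite databases stand in for the OLTP source and the OLAP target. The schema (orders, revenue_by_store) and the function names are hypothetical.

```python
import sqlite3

# Hypothetical OLTP source with a few order rows.
oltp = sqlite3.connect(":memory:")
oltp.execute("CREATE TABLE orders (id INTEGER, store TEXT, amount_cents INTEGER)")
oltp.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "s1", 1250), (2, "s1", 750), (3, "s2", 300)],
)

def extract(conn):
    """Extract: pull the raw rows out of the source database."""
    return conn.execute("SELECT store, amount_cents FROM orders").fetchall()

def transform(rows):
    """Transform: aggregate per store and convert cents to currency units."""
    totals = {}
    for store, cents in rows:
        totals[store] = totals.get(store, 0) + cents
    return [(store, cents / 100) for store, cents in sorted(totals.items())]

def load(conn, rows):
    """Load: write the transformed rows into the target database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS revenue_by_store (store TEXT, revenue REAL)"
    )
    conn.executemany("INSERT INTO revenue_by_store VALUES (?, ?)", rows)
    conn.commit()

olap = sqlite3.connect(":memory:")
load(olap, transform(extract(oltp)))
result = olap.execute(
    "SELECT store, revenue FROM revenue_by_store ORDER BY store"
).fetchall()
```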
The main goal of NoSQL databases (vs. SQL)
relax the guarantees and reduce the functionality of SQL databases to handle a higher volume of simple updates and queries
CAP Theorem
The CAP theorem states that it is impossible for a distributed system to simultaneously provide more than two of the three CAP properties:
• Consistency: In the distributed systems context, consistency means that two clients making simultaneous requests to the database get the same view of the data. In other words, even if each client connects to a different replica, the data that they receive should be the same.
• Availability: For a system to be available, every request must receive a response that is not an error, unless the user input is inherently erroneous.
• Partition Tolerance: A partition tolerant system must continue to operate despite messages between machines being dropped or delayed by the network, or machines being disconnected from the network altogether.
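The trade-off can be illustrated with a toy two-replica store. This is a simplified sketch, not a real replication protocol: the class names, the "CP"/"AP" mode flag, and the partitioned switch are all assumptions made for the example.

```python
class Replica:
    def __init__(self):
        self.data = {}

class TwoNodeStore:
    """Toy two-replica store; mode is 'CP' or 'AP' (hypothetical flag)."""

    def __init__(self, mode):
        self.mode = mode
        self.replicas = [Replica(), Replica()]
        self.partitioned = False  # True simulates a network partition

    def put(self, node, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Give up availability: refuse the write rather than diverge.
                raise RuntimeError("unavailable during partition")
            # AP: stay available, but only the local replica sees the write.
            self.replicas[node].data[key] = value
            return
        for replica in self.replicas:
            replica.data[key] = value

    def get(self, node, key):
        return self.replicas[node].data.get(key)

ap = TwoNodeStore("AP")
ap.put(0, "x", 1)            # replicated to both nodes
ap.partitioned = True
ap.put(0, "x", 2)            # accepted, but only node 0 sees it
ap_views = (ap.get(0, "x"), ap.get(1, "x"))  # replicas now disagree

cp = TwoNodeStore("CP")
cp.put(0, "x", 1)
cp.partitioned = True
try:
    cp.put(0, "x", 2)
    cp_write_ok = True
except RuntimeError:
    cp_write_ok = False      # write refused, replicas still agree
cp_views = (cp.get(0, "x"), cp.get(1, "x"))
```

During the partition, the AP store answers every request but its replicas return different values, while the CP store keeps the replicas in agreement by returning an error.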
What is eventual consistency?
Eventual consistency guarantees that eventually all replicas will become consistent once updates have propagated throughout the system.
One of the weakest forms of consistency.
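A minimal simulation of the idea, assuming writes land on one replica first and reach the others only when a hypothetical propagate() step runs. Real systems propagate in the background and must also resolve conflicting writes, which this sketch ignores.

```python
from collections import deque

class EventuallyConsistentStore:
    def __init__(self, n_replicas=3):
        self.replicas = [{} for _ in range(n_replicas)]
        self.pending = deque()  # (replica_index, key, value) not yet delivered

    def put(self, key, value):
        # The write is applied to replica 0 immediately...
        self.replicas[0][key] = value
        # ...and queued for delivery to every other replica.
        for i in range(1, len(self.replicas)):
            self.pending.append((i, key, value))

    def get(self, replica, key):
        return self.replicas[replica].get(key)

    def propagate(self):
        # Deliver all queued updates; afterwards the replicas agree.
        while self.pending:
            i, key, value = self.pending.popleft()
            self.replicas[i][key] = value

store = EventuallyConsistentStore()
store.put("x", 1)
stale = store.get(2, "x")   # read before propagation: replica 2 has no value yet
store.propagate()
fresh = store.get(2, "x")   # read after propagation: replica 2 has caught up
```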
3 examples (most widely-used classes) of NoSQL data models
1. Key-Value Stores - store key-value pairs, provide just 2 operations: get(key) and put(key, value)
2. Wide-Column Stores - compromise between the structured relational model and the complete lack of structure in a key-value store. Wide-column stores store data in tables, rows, and dynamic columns. Unlike a relational database, wide-column stores do not require each row in a table to have the same columns.
3. Document Stores - key-value stores whose values adhere to a semi-structured data format such as JSON, XML, or Protocol Buffers. The values are called documents.
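The three models can be mimicked with plain Python dictionaries. This is only a sketch of the data shapes; the class, keys, and sample values are invented for illustration and do not correspond to any particular database's API.

```python
import json

# 1. Key-value store: opaque values, only get and put.
class KeyValueStore:
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

kv = KeyValueStore()
kv.put("user:1", b"opaque bytes")  # the store does not interpret the value

# 2. Wide-column store: rows in the same table may have different columns.
users_table = {
    "row1": {"name": "Ada", "email": "ada@example.com"},
    "row2": {"name": "Bob", "last_login": "2024-01-01"},
}

# 3. Document store: a key-value store whose values are semi-structured
#    documents (JSON here), so fields inside the value can be inspected.
docs = KeyValueStore()
docs.put("user:1", json.dumps({"name": "Ada", "tags": ["admin"]}))
doc = json.loads(docs.get("user:1"))
```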
JSON: 3 types of values
• Object: collection of key-value pairs. Keys must be strings. Values can be any JSON type (i.e., atomic, object, or array). Objects should not contain duplicate keys. Objects are denoted using “{” and “}” in a JSON document.
• Array: an ordered list of values. Values can be any JSON type (i.e. atomic, object, or array). Arrays are denoted using “[” and “]” in a JSON document.
• Atomic: one of a number (typically treated as a 64-bit float), string, boolean, or null.
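As a quick illustration, Python's standard json module parses each kind of JSON value into a corresponding Python type. The sample document is made up. Note that Python's parser maps integer literals to int rather than to a float, which is a choice of this library.

```python
import json

text = """
{
  "name": "Ada",
  "age": 36,
  "score": 9.5,
  "active": true,
  "manager": null,
  "tags": ["a", "b"],
  "address": {"city": "London"}
}
"""

value = json.loads(text)

# Map each key to the name of the Python type its value was parsed into.
kinds = {key: type(v).__name__ for key, v in value.items()}

# Duplicate keys are discouraged; Python's parser keeps the last occurrence.
duplicate = json.loads('{"a": 1, "a": 2}')
```

Here the object becomes a dict, the array becomes a list, and the atomic values become str, int or float, bool, and None.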