NoSQL Flashcards

1
Q

Strengths of RDBMS (4)

A
  1. Consistency (ACID)
  2. Integration of data with schema normalization
  3. SQL language well known
  4. Robust (40+ years in organizations)
2
Q

Weaknesses of RDBMS (4)

A
  1. Poor horizontal scaling
  2. Prioritizes consistency over latency
  3. Schema rigidity (hard to evolve)
3
Q

NoSQL common features (4)

A
  1. Not just rows and tables
  2. Freedom from joins
  3. Schemaless or soft-schema
  4. Distributed architecture (cluster)
4
Q

Can NoSQL systems be used for OLAP?

A

Possibly, but usually through analytical tools such as Spark

5
Q

4 main data models seen in NoSQL

A
  1. Key-value
  2. Document
  3. Wide column
  4. Graph
6
Q

Graph data model

A

Focuses on the relationships between data elements: vertices represent entities, arcs represent the relationships between entities, and properties describe the vertices

7
Q

2 examples of graph database queries

A
  1. Find friends of friends
    (user)-[:KNOWS]-(friend)-[:KNOWS]-(foaf)
  2. Find shortest path between A and B
    shortestPath((userA)-[:KNOWS*..5]-(userB))
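
To make the two graph queries concrete, here is a minimal Python sketch of what a graph engine might do for each one. The `KNOWS` adjacency map, the user names, and both function names are hypothetical and only for illustration. This version of friends of friends also excludes direct friends, which is a design choice and not something the pattern above requires.

```python
from collections import deque

# Hypothetical undirected KNOWS graph stored as an adjacency map.
KNOWS = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "dan"},
    "cat": {"ann", "dan"},
    "dan": {"bob", "cat", "eve"},
    "eve": {"dan"},
}


def friends_of_friends(graph, user):
    """People exactly two KNOWS hops away, excluding the user and direct friends."""
    direct = graph[user]
    two_hops = set()
    for friend in direct:
        two_hops |= graph[friend]
    return two_hops - direct - {user}


def shortest_path(graph, start, goal, max_hops=5):
    """Breadth-first search; returns the nodes of a shortest path, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        if len(path) - 1 >= max_hops:
            continue  # do not extend paths beyond the hop limit
        for neighbor in sorted(graph[path[-1]]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None
```

Both functions follow arcs from vertex to vertex, which is the access pattern that graph databases are built around.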
8
Q

What is the opposite of Graph Modeling?

A

Aggregate modeling
(key-value, document, wide-column)

9
Q

What do we call tables in the document data model?

A

Collections, each of which holds a list of documents (often in JSON format)

10
Q

What does each document need to contain in the document data model? (2)

A

A set of fields (key-value pairs) and a mandatory ID

11
Q

{
  "_id": 1,
  "name": "Martin",
  "adrs": [
    {"street": "Adam", "city": "Chicago", "state": "illinois", "code": 60007},
    {"street": "9th", "city": "NewYork", "state": "NewYork", "code": 10001}
  ],
  "orders": [ {
    "orderpayments": [
      {"card": 477, "billadrs": {"street": "Adam", "city": "Chicago", "state": "illinois", "code": 60007}},
      {"card": 457, "billadrs": {"street": "9th", "city": "NewYork", "state": "NewYork", "code": 10001}}
    ],
    "products": [
      {"id": 1, "name": "Cola", "price": 12.4},
      {"id": 2, "name": "Fanta", "price": 14.4}
    ],
    "shipAdrs": {"street": "9th", "city": "NewYork", "state": "NewYork", "code": 10001}
  } ]
}

A

We can query parts of this document, for example its products or its orders, and get back a document containing only the relevant fields
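
As a rough illustration of returning only the relevant fields, the following Python sketch projects fields out of a trimmed copy of the document. The helper names `project` and `product_names` are hypothetical. A real document store would do this through its own query language.

```python
# Trimmed copy of the customer document from the card above.
customer = {
    "_id": 1,
    "name": "Martin",
    "orders": [
        {
            "products": [
                {"id": 1, "name": "Cola", "price": 12.4},
                {"id": 2, "name": "Fanta", "price": 14.4},
            ],
            "shipAdrs": {"street": "9th", "city": "NewYork"},
        }
    ],
}


def project(document, fields):
    """Return a copy of the document limited to the requested top-level fields."""
    return {key: document[key] for key in fields if key in document}


def product_names(document):
    """Collect the product names nested inside every order of the document."""
    return [
        product["name"]
        for order in document.get("orders", [])
        for product in order.get("products", [])
    ]
```

Because the values are visible to the store, queries can reach into nested fields, which is not possible in the key-value model.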

12
Q

What does a key contain and what does a value contain in the key-value data model?

A

Key: a unique string (path, query, REST call, ID)
Value: a BLOB (binary large object), e.g. HTML, PDF

13
Q

Why is the value considered a black box when querying key-values in NoSQL?

A

There are no indexes on the values, no “where” clauses allowed. Schema information is often indicated in the key

Ex:
Key               Value
user:1234:name    Enrico
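
The following Python sketch shows, under simplifying assumptions, why the value behaves like a black box. A plain dictionary stands in for the store. All keys other than `user:1234:name` and all helper names are hypothetical.

```python
# A dict stands in for the key-value store; values are opaque bytes.
store = {
    "user:1234:name": b"Enrico",
    "user:1234:city": b"Milan",   # hypothetical extra entry
    "user:5678:name": b"Anna",    # hypothetical extra entry
}


def get(key):
    """The only cheap operation: fetch a value by its exact key."""
    return store.get(key)


def keys_with_prefix(prefix):
    """Schema information lives in the key, so related entries share a prefix."""
    return sorted(key for key in store if key.startswith(prefix))


def find_by_value(value):
    """No index on values: the only option is to scan every entry."""
    return sorted(key for key, stored in store.items() if stored == value)
```

Here `get` is a direct lookup, while `find_by_value` has to visit every entry, which is the cost of having no "where" clause.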

14
Q

Wide-column data model

A

In an RDBMS, data is stored in tables whose rows all span the same fixed set of columns. If a particular record/row needs another column, you have to add it to the entire table. In a wide-column store you don't: each row holds only the columns it needs

15
Q

Key-value vs wide-column

A

Key-Value databases are the simplest model and can be thought of as a configuration file or a two-column table of keys with an associated value. Wide-column databases expand that key-value store concept across multiple columns, but only the columns that are needed for that record.
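
One way to picture the difference is a map of row keys to per-row column maps. The Python sketch below is an illustration only, not how any particular wide-column database is implemented. The row keys, column names, and the `put` helper are hypothetical.

```python
# Each row key maps to its own set of columns; rows need not share columns.
table = {}


def put(row_key, column, value):
    """Store one column value for one row, creating the row on first use."""
    table.setdefault(row_key, {})[column] = value


put("u1", "name", "Enrico")
put("u1", "city", "Milan")
put("u2", "name", "Anna")
put("u2", "phone", "555-0100")  # only this row has a phone column
```

Adding the `phone` column to row `u2` leaves row `u1` untouched, whereas a relational table would need the column added for every row.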

16
Q

How to query wide-column data

A

A SQL-like language works, since the model is similar to the relational model

17
Q

Is it easier to scale aggregate data or graph?

A

Aggregate, because splitting graph data across a cluster often "cuts" arcs, so traversals have to follow many cross-machine links

18
Q

sharding

A

distributing data across different nodes

19
Q

replication

A

creating copies of the data on several nodes

20
Q

3 good practices for sharding

A
  1. Data locality (Italian customer data in a European data center)
  2. Balance (same amount of data on each node)
  3. Related data accessed together (orders for each client stored on the same node)
21
Q

Hash data partition strategy

A

keys are hashed to get a roughly equal distribution across nodes, but range queries become inefficient because adjacent keys land on different nodes
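
A minimal sketch of hash partitioning in Python, assuming a fixed number of nodes and string keys. The function name `hash_partition` and the choice of SHA-256 are illustrative assumptions. Real systems often use consistent hashing to limit data movement when nodes are added.

```python
import hashlib


def hash_partition(key, num_nodes):
    """Map a string key to a node index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


# Consecutive keys are scattered, so a range scan must contact many nodes.
placement = {f"user:{i}": hash_partition(f"user:{i}", 4) for i in range(100)}
```

The same key always maps to the same node, but neighbouring keys such as `user:1` and `user:2` have no reason to land together.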

22
Q

range data partition strategy

A

data is distributed by ranges of key values; range queries stay efficient, but unbalanced ranges can lead to heavy data redistribution
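
Range partitioning can be sketched in Python with a sorted list of split points. The split points `"h"` and `"p"` and the function names are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right

# Hypothetical split points: node 0 holds keys below "h",
# node 1 holds keys from "h" up to (not including) "p", node 2 holds the rest.
SPLIT_POINTS = ["h", "p"]


def range_partition(key, split_points=SPLIT_POINTS):
    """Return the index of the node whose key range contains the key."""
    return bisect_right(split_points, key)


def nodes_for_range(low, high, split_points=SPLIT_POINTS):
    """Nodes that a range query over [low, high] has to contact."""
    first = range_partition(low, split_points)
    last = range_partition(high, split_points)
    return list(range(first, last + 1))
```

A narrow range query touches a single node, but if most keys start with the same few letters one node fills up and the split points have to be moved, which means relocating data.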

23
Q

Master job in NoSQL

A

manage data and handle write operations

24
Q

Slaves' job in NoSQL

A

Serve read operations; one of them becomes the new master if the master fails

25
Q

peer-to-peer replication

A

Unlike the master-slave model, each node has the same importance and can handle write operations, but two users may update the same value from different replicas…

26
Q

Write conflict mitigation methods (3)

A

Last write wins
conflict prevention - verify that value hasn’t changed since last read
conflict detection - preserve history, merge results and let user decide
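
Two of these strategies can be sketched in a few lines of Python. The class `VersionedStore`, its method names, and the `(timestamp, value)` representation are assumptions made for illustration and do not come from any specific database.

```python
class VersionedStore:
    """Conflict prevention: a write succeeds only if the value is unchanged since it was read."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def read(self, key):
        """Return (version, value); unknown keys start at version 0."""
        return self._data.get(key, (0, None))

    def write_if_unchanged(self, key, expected_version, value):
        """Apply the write only if nobody else has written since our read."""
        current_version, _ = self.read(key)
        if current_version != expected_version:
            return False  # stale read: the caller must re-read and retry
        self._data[key] = (current_version + 1, value)
        return True


def last_write_wins(write_a, write_b):
    """Each write is a (timestamp, value) pair; keep the later one."""
    return write_a if write_a[0] >= write_b[0] else write_b
```

Last write wins is simple but silently discards the losing update, while the version check rejects the stale write so the client can decide what to do.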

27
Q

Consistency in RDBMS (ACID)

A

Atomicity (no partial transactions)
Consistency
Isolation
Durability

28
Q

Consistency in NoSQL (CAP)

A

A distributed system can only guarantee 2 out of the following 3 properties:

  1. Consistency (C)
  2. Availability (A)
  3. Partition Tolerance (P)
29
Q

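PACELC (PAC/ELC) model

A

Refinement of the CAP theorem that adds another dimension: if there is a Partition (P), choose between Availability (A) and Consistency (C); Else (E), choose between Latency (L) and Consistency (C)

PA/EL = "Prioritize availability"
PA/EC = "Sacrifice consistency only for partitioning"
PC/EL = "Enforce consistency during partitioning"
PC/EC = "Strong consistency at all times"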