NoSQL Flashcards

1
Q

4 points addressed by NoSQL databases

A

non-relational
distributed
open-source
horizontally scalable (computing power is increased by adding more nodes to the system rather than by upgrading an individual node)

2
Q

6 defining characteristics of NoSQL databases

A
  • schema-free
  • easy replication support
  • simple API
  • eventually consistent
  • BASE principles (not ACID principles)
  • huge data amounts
3
Q

6 examples of NoSQL DBs

A
Hadoop/HBase
Cassandra
Amazon SimpleDB
MongoDB
Apache Flink (strictly a stream-processing framework rather than a database)
Google BigTable
4
Q

What does BASE stand for?

A

Basically Available
Soft state
Eventually Consistent

5
Q

6 characteristics of a BASE database

A
  • weak consistency (stale data ok)
  • availability first
  • best effort
  • approximate answers ok
  • aggressive (optimistic)
  • simpler and faster
6
Q

What is the CAP theorem and how does it apply to NoSQL databases?

A

the idea that a distributed system cannot guarantee all 3 of consistency, availability, and partition tolerance at the same time

You can have at most 2 of these. The NoSQL database you choose will mostly depend on which two of these characteristics you need the most
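
The trade-off can be illustrated with a toy replica that is cut off from its peer. This is only a sketch under simplifying assumptions: the `Replica` class, its `mode` flag, and the `partitioned` attribute are hypothetical names, not part of any real database.

```python
class Replica:
    """Toy replica (hypothetical) that prefers either consistency or availability."""

    def __init__(self, mode):
        self.mode = mode          # "CP" (consistency first) or "AP" (availability first)
        self.value = "v1"         # local copy of the data
        self.partitioned = False  # True when this replica cannot reach its peer

    def read(self):
        if self.partitioned and self.mode == "CP":
            # Consistency first: refuse to answer rather than risk stale data.
            raise RuntimeError("unavailable during partition")
        # Availability first: answer with the local copy, which may be stale.
        return self.value


cp, ap = Replica("CP"), Replica("AP")
# Assume a newer value was written to the peer that is now unreachable.
cp.partitioned = True
ap.partitioned = True

try:
    cp_result = cp.read()
except RuntimeError as error:
    cp_result = f"error: {error}"
ap_result = ap.read()

print(cp_result)
print(ap_result)
```

During the partition the CP-style replica gives up availability, while the AP-style replica keeps answering but may return stale data.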

7
Q

CAP Theorem: Consistency

A

all servers in the system will have the same data so anyone using the system will get the same copy regardless of which server answers their request

8
Q

CAP Theorem: Availability

A

the system will always respond to a request, even if the response does not contain the latest or a consistent copy of the data (strictly, the theorem requires a non-error response from every server that has not failed)

9
Q

CAP Theorem: Partition Tolerance

A

the system continues to operate as a whole even if individual servers fail or network failures leave some servers unable to reach each other

10
Q

4 NoSQL database types

A

column store
document store
key-value store
graph database

11
Q

4 basic key-value function calls

A
  • Get(key) - returns the value associated with the key
  • Put(key, value) - adds a key-value pair
  • Multi-get(key1, ..., keyN) - returns the list of values associated with the list of keys
  • Delete(key) - removes the key-value pair from the data store
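
The four calls above can be sketched with an in-memory dictionary. This is a minimal illustration, not the API of any particular product; the class name `KeyValueStore` and its method names are hypothetical.

```python
from typing import Any, Dict, List, Optional


class KeyValueStore:
    """Toy in-memory key-value store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def get(self, key: str) -> Optional[Any]:
        # Return the value for the key, or None if the key is absent.
        return self._data.get(key)

    def put(self, key: str, value: Any) -> None:
        # Add a key-value pair, overwriting any existing value for the key.
        self._data[key] = value

    def multi_get(self, *keys: str) -> List[Optional[Any]]:
        # Return the values for the given keys, in the same order.
        return [self._data.get(key) for key in keys]

    def delete(self, key: str) -> None:
        # Remove the key-value pair; do nothing if the key is absent.
        self._data.pop(key, None)


store = KeyValueStore()
store.put("user:1", "Ada")
store.put("user:2", "Grace")
print(store.get("user:1"))
print(store.multi_get("user:1", "user:2", "user:3"))
store.delete("user:1")
print(store.get("user:1"))
```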
12
Q

2 main issues with Key-value stores

A
  • this model does not provide any traditional database capabilities such as atomicity
  • maintaining unique values for keys may become more difficult as the volume of data increases
13
Q

What is a document DB?

A

expands on key-value store idea but keys refer to “documents” which can contain more complex data

documents hold “semi-structured” data
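
As a rough sketch of the idea, the documents below are plain Python dictionaries keyed by ID. The keys, field names, and the helper `find_by_field` are made up for illustration and do not correspond to any real document database API.

```python
import json

# A toy document store: each key maps to a semi-structured document.
documents = {
    "user:1": {
        "name": "Ada",
        "emails": ["ada@example.com"],
        "address": {"city": "London"},
    },
    # Documents need not share the same fields (no fixed schema).
    "user:2": {"name": "Grace", "languages": ["COBOL"]},
}


def find_by_field(docs, field, value):
    # Unlike a plain key-value store, a document store can look inside the values.
    return [key for key, doc in docs.items() if doc.get(field) == value]


print(find_by_field(documents, "name", "Grace"))
print(json.dumps(documents["user:1"], indent=2))
```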

14
Q

Example of Key-value store

A

Amazon DynamoDB

15
Q

2 examples of document DBs

A

CouchDB

MongoDB

16
Q

What is a column store DB?

A

they store the cells of each column together as a contiguous disk entry

relational databases typically store individual rows as contiguous disk entries
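
The difference between the two layouts can be sketched in memory with Python lists. The sample records and variable names are invented for illustration; a real column store works on disk blocks, not Python objects.

```python
rows = [
    {"id": 1, "name": "Ada", "age": 30},
    {"id": 2, "name": "Grace", "age": 40},
    {"id": 3, "name": "Edsger", "age": 50},
]

# Row-oriented layout: the values of each record sit next to each other.
row_layout = [value for row in rows for value in (row["id"], row["name"], row["age"])]

# Column-oriented layout: all values of one attribute sit next to each other.
column_layout = {field: [row[field] for row in rows] for field in ("id", "name", "age")}

# Aggregating one attribute only touches that attribute's column.
average_age = sum(column_layout["age"]) / len(column_layout["age"])

print(row_layout)
print(column_layout["age"])
print(average_age)
```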

17
Q

What is the benefit of a column store DB?

A

accessing, searching through, and aggregating a single attribute are fast because all of that attribute's values are stored together, so they can be read without touching the other columns

18
Q

What is a column family?

A

a logical grouping of columns. The column entries have IDs that allow columns in the column family to be joined to produce a full picture of the data

19
Q

What is a graph DB?

A

a database based on a graph where data is represented by vertices and the relationships between the data are represented by edges
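
A minimal in-memory sketch of this model follows. The vertex names, edge labels, and the helpers `neighbors` and `colleagues` are hypothetical and do not reflect the query language of any real graph database.

```python
from collections import defaultdict

# Vertices hold the data; edges hold the labeled relationships between vertices.
vertices = {
    "alice": {"type": "person"},
    "bob": {"type": "person"},
    "acme": {"type": "company"},
}
edges = [
    ("alice", "KNOWS", "bob"),
    ("alice", "WORKS_AT", "acme"),
    ("bob", "WORKS_AT", "acme"),
]

# Index the outgoing edges of each vertex for quick traversal.
outgoing = defaultdict(list)
for source, label, target in edges:
    outgoing[source].append((label, target))


def neighbors(vertex, label):
    # Follow all outgoing edges with the given label.
    return [target for edge_label, target in outgoing[vertex] if edge_label == label]


def colleagues(person):
    # Other vertices with a WORKS_AT edge to one of the same companies.
    companies = neighbors(person, "WORKS_AT")
    return sorted(
        source
        for source, label, target in edges
        if label == "WORKS_AT" and target in companies and source != person
    )


print(neighbors("alice", "KNOWS"))
print(colleagues("alice"))
```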

20
Q

What is the benefit of a graph DB?

A

it’s ideal for representing complex relationships

21
Q

What is MapReduce?

A

a programming paradigm for processing large data volumes in parallel, for example for indexing and searching

22
Q

The two phases of the MapReduce paradigm

A

the Map phase:
-extracting sets of key-value pairs from the underlying data, potentially in parallel on different machines

the Reduce phase:
-merging and sorting the sets of key-value pairs
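
The two phases can be sketched with a single-process word count. This is only an illustration of the paradigm, not how a real framework distributes work; the function names `map_phase`, `group_by_key`, and `reduce_phase` and the sample documents are assumptions made for this example.

```python
from collections import defaultdict


def map_phase(document):
    # Map: extract (key, value) pairs from the underlying data.
    # In a real system, each document could be mapped on a different machine.
    return [(word.lower(), 1) for word in document.split()]


def group_by_key(pairs):
    # Merge and sort: collect all values that share a key, ordered by key.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items()))


def reduce_phase(grouped):
    # Reduce: combine the values of each key into a single result.
    return {key: sum(values) for key, values in grouped.items()}


documents = [
    "NoSQL stores scale out",
    "graph stores model relationships",
    "NoSQL is not only SQL",
]

mapped = [pair for document in documents for pair in map_phase(document)]
counts = reduce_phase(group_by_key(mapped))
print(counts)
```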