NoSQL - Week 9 Flashcards

1
Q

Can a NoSQL database support SQL?

A

Yes. Some NoSQL databases support query languages that resemble SQL.

2
Q

What does NoSQL revisit about relational databases?

A
  • The provision of a declared schema
  • Strict transactions
3
Q

What is Data Integrity?

A

The validity and consistency of stored data, as enforced by a schema and relied upon by applications

4
Q

What are strict transactional semantics?

A

Guarantees that concurrently running programs do not lead to inconsistencies in the data

5
Q

What is important in enterprise applications, but may not be prioritised in all cases over scale and flexibility?

A

Data integrity and strict transactional semantics

6
Q

What needs are NoSQL solutions typically associated with?

A

Elastic scaling, particularly the ability to grow rapidly for web-scale applications

Simple Operations, in particular accommodating data that tends to be accessed / updated in isolation

Examples: shopping carts, user profiles, blog posts, calendar data, product stock data, customer data, hotel availability, …

The queries don’t refer to all shopping carts or all product data, just to individual chunks

7
Q

What kind of database is useful if you’re expecting 100,000 users but may get 100 million?

A

NoSQL

8
Q

What are the 6 abilities that distinguish NoSQL databases?

A
  • To horizontally scale the throughput of simple-operation workloads over many servers.
  • To replicate and distribute data (through partitioning) over (thousands of) servers
  • To expose a simple call-level interface or protocol
  • To offer less strict transactional guarantees
  • To use distributed indexes efficiently for replication-rich, elastic provision of data storage
  • To cope with variations in the structure of objects stored.
9
Q

What three types are NoSQL databases classified into?

A

Key-Value
Document
Wide Column

10
Q

What is a key-value database?

A

Access or update a value given a key; the database doesn’t necessarily provide much functionality over the value (e.g. queries)

Examples: Redis, Oracle NoSQL, Amazon DynamoDB

11
Q

What is a document database?

A

Access or update a document using a key; the database will provide functionality for accessing the value (e.g. queries)

Examples: Couchbase, MongoDB

12
Q

What is a wide column database?

A

Access or update a collection of column families associated with a key (e.g. CustomerID links to contact details, sales details, marketing details)

Examples: Apache Cassandra, HBase

13
Q

What is a Key-Value store?

A

Associate a key with some data.

It is the responsibility of the application to create and operate on the data.

14
Q

What is the API for a key-value store?

A

Put(key, value)
Get(key)
Delete(key)
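
As a rough illustration of this three-call API, here is a minimal in-memory sketch in Python. The class name `KeyValueStore` and the choice of `bytes` values are assumptions made for the example; real key-value databases expose their own client APIs.

```python
from typing import Dict, Optional


class KeyValueStore:
    """Toy in-memory key-value store (hypothetical, for illustration only).

    Values are opaque bytes: the store never looks inside them, so the
    application is responsible for encoding and decoding.
    """

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        # Overwrites any value already stored under this key.
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # Returns None when the key is absent.
        return self._data.get(key)

    def delete(self, key: str) -> None:
        # Deleting a missing key is treated as a no-op.
        self._data.pop(key, None)
```

An application might store a shopping cart with `store.put("cart:42", encoded_cart)` and read it back with `store.get("cart:42")`, decoding the bytes itself.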

15
Q

What is a BLOB?

A

Binary Large OBject

16
Q

What is a document store?

A

Associate a key with some data.

Format the data with some recognized format (e.g. JSON)

17
Q

What is the API for a document store?

A

Put(key, document)
Get(key)
Find(key, filter)
Delete(key)
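
A minimal Python sketch of this API follows. The card does not say what `filter` means, so the example assumes one possible reading: a set of field/value pairs that the document must match. The class name `DocumentStore` and the parameter name `criteria` are hypothetical.

```python
import json
from typing import Any, Dict, Optional

Document = Dict[str, Any]


class DocumentStore:
    """Toy in-memory document store (hypothetical, for illustration only).

    Documents are kept as JSON text, so the store can look inside them.
    """

    def __init__(self) -> None:
        self._docs: Dict[str, str] = {}

    def put(self, key: str, document: Document) -> None:
        # Serialize to JSON, the recognized format in this sketch.
        self._docs[key] = json.dumps(document)

    def get(self, key: str) -> Optional[Document]:
        raw = self._docs.get(key)
        return None if raw is None else json.loads(raw)

    def find(self, key: str, criteria: Document) -> Optional[Document]:
        # Assumed semantics: return the document stored under `key` only
        # if every field in `criteria` is present with an equal value.
        doc = self.get(key)
        if doc is None:
            return None
        matches = all(f in doc and doc[f] == v for f, v in criteria.items())
        return doc if matches else None

    def delete(self, key: str) -> None:
        # Deleting a missing key is treated as a no-op.
        self._docs.pop(key, None)
```

The difference from a plain key-value store is that `find` relies on the store understanding the document's structure rather than treating it as an opaque value.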

18
Q

What is a wide column?

A

Associate a key with some data structured using column families.

API:
- Put(key, {column family})
- Get(key)
- Delete(key)
- Find(key, filter)
- Update(key, expression)
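
The API above can be sketched in Python as follows. Several details are assumptions made for the example, not taken from any particular product: a row is modelled as a mapping from column family name to columns, `find` treats the filter as a list of column family names to return, and `update` takes the expression in the simplified form of a (family, column, value) assignment.

```python
from typing import Any, Dict, Iterable, Optional

# A row maps column family name -> {column name: value}.
Row = Dict[str, Dict[str, Any]]


class WideColumnStore:
    """Toy in-memory wide column store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._rows: Dict[str, Row] = {}

    def put(self, key: str, families: Row) -> None:
        # Adds or replaces the given column families for this key;
        # families not mentioned are left untouched.
        row = self._rows.setdefault(key, {})
        for name, columns in families.items():
            row[name] = dict(columns)

    def get(self, key: str) -> Optional[Row]:
        # Returns the whole row, or None when the key is absent.
        return self._rows.get(key)

    def find(self, key: str, family_names: Iterable[str]) -> Row:
        # Assumed filter semantics: return only the named column families.
        row = self._rows.get(key, {})
        return {name: row[name] for name in family_names if name in row}

    def update(self, key: str, family: str, column: str, value: Any) -> None:
        # Assumed expression form: assign one column within one family.
        self._rows.setdefault(key, {}).setdefault(family, {})[column] = value

    def delete(self, key: str) -> None:
        # Deleting a missing key is treated as a no-op.
        self._rows.pop(key, None)
```

For a customer key, `find(key, ["sales"])` would return only the sales column family without touching contact or marketing details.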

19
Q

How do you partition data in NoSQL databases vs Relational databases?

A

Partitioning involves allocating data to different nodes.

In relational databases, partitioning can be vertical (put the data of selected columns together) or horizontal (put collections of rows together)

In NoSQL databases, all the data associated with a key is stored on a single node, so data is horizontally partitioned. This is also known as sharding.
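
One simple way to allocate keys to nodes is to hash the key and take the result modulo the number of nodes. The Python sketch below assumes that scheme purely for illustration; the function name `node_for_key` is hypothetical, and real systems often use more elaborate schemes so that adding a node does not move most keys.

```python
import hashlib


def node_for_key(key: str, num_nodes: int) -> int:
    """Map a key to a node index in the range [0, num_nodes).

    A cryptographic hash is used so that the mapping is stable across
    processes; Python's built-in hash() is randomized per run for strings.
    """
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then reduce it.
    return int.from_bytes(digest[:8], "big") % num_nodes
```

Because the node is a function of the key alone, every read or write for a given key goes to the same node, which is what keeps single-key operations on a single partition.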

20
Q

Is there a standard language or model for NoSQL?

A

No

21
Q

Is there a standard language or model for relational databases?

A

Yes, the SQL standard

22
Q

What is Amazon Dynamo?

A

A key-value NoSQL database developed to support Amazon services and Amazon-hosted applications

It has a different internal model from DynamoDB

Designed for web-scale applications: handling user profiles, shopping carts, game states, leader boards, …

Each service that uses Dynamo has its own instances

23
Q

What are the Stonebraker-Cattell rules?

A

R1 Look for shared-nothing scalability

R2 High-level languages are good and need not hurt performance

R3 Plan to carefully leverage main memory databases

R4 High availability and automatic recovery are essential for simple-operation scalability

R5 Online everything

R6 Avoid multi-node operations

R7 Don’t try to build ACID consistency yourself

R8 Look for administrative simplicity

R9 Pay attention to node performance

R10 Open source gives you more control over your future

24
Q

What is a shared-memory design, and what is its problem?

A

A design in which all cores share primary and secondary memory (e.g. multicore, single-node DBMSs).

Such designs suffer from contention that starves the cores, forcing designers into sharding.

Can only scale to tens of nodes.

25
Q

What is a shared-disk design, and what is its problem?

A

Shared-disk designs (e.g. a multi-core, single-node DBMS with private memory per CPU but shared secondary memory) suffer from complex buffer and lock management needs, which limit scalability.

Can only scale to tens of nodes.

26
Q

What is a shared-nothing design, and what makes it scalable?

A

A design in which nothing is shared between nodes (e.g. each node has its own private primary and secondary memory).

Such designs are scalable if partitioning balances the load (does not lead to hotspots) and if operations touch as few partitions as possible (ideally one)

Can scale to hundreds or thousands of nodes.
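
The load-balancing condition can be checked roughly by counting how many accesses land on each node. The Python sketch below assumes hash-modulo partitioning and a made-up metric, `imbalance`, defined as the busiest node's share divided by the ideal even share; both are illustrative assumptions rather than a standard measure.

```python
import hashlib
from collections import Counter
from typing import Iterable


def node_for_key(key: str, num_nodes: int) -> int:
    # Assumed partitioning scheme: stable hash of the key, modulo node count.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def imbalance(accessed_keys: Iterable[str], num_nodes: int) -> float:
    """Busiest node's access count divided by the ideal even share.

    1.0 means perfectly even load; num_nodes means every access hit
    one node (a hotspot). Returns 0.0 when there are no accesses.
    """
    counts = Counter(node_for_key(k, num_nodes) for k in accessed_keys)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return max(counts.values()) / (total / num_nodes)
```

Accesses spread over many distinct keys should give a value close to 1.0, while repeated access to a single hot key on 4 nodes gives 4.0, since all the traffic lands on one node however the keys are hashed.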

27
Q

Does Amazon Dynamo follow R1?

A

Yes. It runs on normal data center nodes and is shared-nothing.

It supports get(key), put(key, value), and delete(key) operations that act on a single partition.

It scales to hundreds or thousands of nodes.