CAP theorem and NoSQL databases Flashcards

1
Q

Which ACID property corresponds to the following statement: “ensures that concurrent execution of transactions leaves the database in the same state that would have been obtained if the transactions were executed sequentially”

a) atomicity
b) consistency
c) isolation
d) durability

A

c)

2
Q

SQL vs NoSQL databases. Which statement is true?

a) MongoDB is a well-known relational database
b) NoSQL databases are easier to scale horizontally than relational databases
c) Relational databases are easier to scale horizontally than NoSQL
d) NoSQL databases are the best choice if we want to ensure data consistency

A

b)

3
Q

What are transactions?

A

A transaction is a collection of actions that transforms the system from one consistent state to another.

4
Q

Which ACID property corresponds to the following statement: “when we do something to change a database, the change should work or fail as a whole”

a) atomicity
b) consistency
c) isolation
d) durability

A

a)

5
Q

Which ACID property corresponds to the following statement: “any given database transaction must change affected data only in allowed ways”

a) atomicity
b) consistency
c) isolation
d) durability

A

b)

6
Q

Which ACID property corresponds to the following statement: “how/when the changes made by one operation become visible to other operations”

a) atomicity
b) consistency
c) isolation
d) durability

A

c)

7
Q

Which ACID property corresponds to the following statement: “guarantees that transactions that have committed will survive permanently”

a) atomicity
b) consistency
c) isolation
d) durability

A

d)

8
Q

What is atomicity?

A

Atomicity: transactions are often composed of multiple statements. Atomicity guarantees that each transaction is treated as a single “unit”, which either succeeds completely, or fails completely: if any of the statements constituting a transaction fails to complete, the entire transaction fails and the database is left unchanged. An atomic system must guarantee atomicity in each and every situation, including power failures, errors and crashes.

When we do something to change a database, the change should work or fail as a whole.
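
A minimal sketch of this all-or-nothing behaviour, using Python's built-in sqlite3 module. The accounts table, the names and the transfer function are hypothetical and only serve as illustration; the failure here is a constraint violation rather than a crash.

```python
import sqlite3

# Hypothetical schema: two accounts, and a rule that balances never go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between two accounts as a single unit of work."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


# The credit to bob succeeds, the debit from alice violates the CHECK,
# so the whole transfer is rolled back, including the credit.
ok = transfer(conn, "alice", "bob", 500)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After the failed transfer both balances are unchanged: the first statement's effect was undone together with the second.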

9
Q

What is consistency?

A

Consistency: ensures that a transaction can only bring the database from one valid state to another, maintaining database invariants: any data written to the database must be valid according to all defined rules, including constraints, cascades, triggers, and any combination thereof. This prevents database corruption by an illegal transaction, but does not guarantee that a transaction is correct.

Any given database transaction must change affected data only in allowed ways.
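
A small sketch of rule enforcement, again with sqlite3. The customers and orders tables and the try_insert helper are assumptions made up for this example; note that SQLite only enforces foreign keys after the PRAGMA shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE orders (
           id INTEGER PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customers(id),
           quantity INTEGER NOT NULL CHECK (quantity > 0)
       )"""
)
conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
conn.commit()


def try_insert(customer_id, quantity):
    """Attempt to store an order; the database rejects rows that break a rule."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO orders (customer_id, quantity) VALUES (?, ?)",
                (customer_id, quantity),
            )
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


results = [
    try_insert(1, 3),   # valid row
    try_insert(1, 0),   # violates CHECK (quantity > 0)
    try_insert(99, 1),  # violates the foreign key: customer 99 does not exist
]
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The accepted order is valid but not necessarily correct: if the customer actually wanted 2 items instead of 3, no constraint would notice.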

10
Q

What is isolation?

A

Isolation: transactions are often executed concurrently (e.g. reading and writing to multiple tables at the same time). Isolation ensures that the concurrent execution of transactions leaves the database in the same state that would have been obtained if the transactions were executed sequentially. Isolation is the main goal of concurrency control; depending on the method used, the effects of an incomplete transaction might not even be visible to other transactions.

Isolation defines how/when the changes made by one operation become visible to other operations.
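
A sketch of visibility between two connections, assuming SQLite with its default settings as one possible concurrency-control method (other systems and isolation levels behave differently). The stock table and the read_qty helper are hypothetical.

```python
import os
import shutil
import sqlite3
import tempfile

# Two connections to the same database file stand in for two concurrent clients.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "demo.db")
writer = sqlite3.connect(path)
reader = sqlite3.connect(path)

writer.execute("CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER)")
writer.execute("INSERT INTO stock VALUES ('widget', 10)")
writer.commit()


def read_qty(conn):
    # fetchall() finishes the SELECT so the read does not linger
    return conn.execute("SELECT qty FROM stock WHERE item = 'widget'").fetchall()[0][0]


writer.execute("UPDATE stock SET qty = 7 WHERE item = 'widget'")  # transaction still open
seen_by_writer = read_qty(writer)      # the writer sees its own pending change
seen_before_commit = read_qty(reader)  # the reader still sees the committed value

writer.commit()
seen_after_commit = read_qty(reader)   # the change is visible once committed

writer.close()
reader.close()
shutil.rmtree(workdir, ignore_errors=True)
```

In this setup the incomplete transaction is invisible to the other connection, which is the strict end of the "how/when" spectrum described above.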

11
Q

What is durability?

A

Durability guarantees that once a transaction has been committed, it will remain committed even in the case of a system failure (e.g. power outage or crash). This usually means that completed transactions (or their effects) are recorded in non-volatile memory.

Durability guarantees that transactions that have committed will survive permanently.
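
A rough sketch with sqlite3 and a file on disk. Closing and reopening the connection only stands in for a restart; it is not a real crash test, and the events table is a made-up example.

```python
import os
import shutil
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "ledger.db")

conn = sqlite3.connect(path)
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, note TEXT)")
conn.execute("INSERT INTO events (note) VALUES ('committed')")
conn.commit()  # from here on the row is recorded in the database file

conn.execute("INSERT INTO events (note) VALUES ('never committed')")
conn.close()   # closing without commit discards the pending insert

# Reopen the file, as a restarted process would.
reopened = sqlite3.connect(path)
notes = [row[0] for row in reopened.execute("SELECT note FROM events ORDER BY id")]
reopened.close()
shutil.rmtree(workdir, ignore_errors=True)
```

Only the committed row survives the reopen; the uncommitted one is gone.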

12
Q

What are the characteristics of relational database management systems (RDBMSs)?

A
  • Are ACID compliant
  • RDBMSs put a lot of emphasis on keeping data consistent.
  • They require a formal database schema.
  • New data or modifications are not accepted unless they comply with this schema in terms of data type, referential integrity, etc.
13
Q

What are the disadvantages of RDBMSs?

A
  • Enforcing consistency and a formal schema may induce overhead and hamper scalability and flexibility.
  • RDBMSs struggle with ‘data variety’: all types of data must fit under a unified schema of tables.
  • Adding new functionality requires all the elements to support the new structure, yet change is inevitable.
14
Q

What is the difference between vertical and horizontal scaling?

A
  • Vertical scaling: extending storage capacity and/or CPU power of the database server (RDBMSs/SQL).
  • Horizontal scaling: multiple DBMS servers being arranged in a cluster (NoSQL). The cluster needs to tolerate network partitions.
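
One common way to spread data over a cluster is hash-based sharding. The sketch below is an illustration under simplifying assumptions: nodes are plain in-memory dictionaries, and shard_for and Cluster are hypothetical names, not the API of any particular database.

```python
import hashlib


def shard_for(key: str, num_nodes: int) -> int:
    """Map a key to a node index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


class Cluster:
    """A toy cluster: each node is a dict that holds only its share of the keys."""

    def __init__(self, num_nodes: int):
        self.nodes = [dict() for _ in range(num_nodes)]

    def put(self, key, value):
        self.nodes[shard_for(key, len(self.nodes))][key] = value

    def get(self, key):
        return self.nodes[shard_for(key, len(self.nodes))].get(key)


cluster = Cluster(num_nodes=3)
for i in range(1000):
    cluster.put(f"user:{i}", {"id": i})

sizes = [len(node) for node in cluster.nodes]  # roughly a third of the keys each
```

Adding capacity means adding nodes rather than buying a bigger server. A plain modulo scheme like this one would remap most keys when the node count changes, which is why real systems often use techniques such as consistent hashing instead.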
15
Q

What are the characteristics of NoSQL databases?

A
  • They store and manipulate data in formats other than tabular relations, i.e. non-relational databases.
  • NoSQL databases aim at near-linear horizontal scalability by distributing data over a cluster of database nodes for the sake of performance and availability.
  • Eventual consistency: the data (and its replicas) will become consistent at some point in time after each transaction.
16
Q

NoSQL databases are suitable for critical transactions as in a bank system. True or false?

A

False, as they are typically not ACID compliant and offer only eventual consistency.

17
Q

What is a BASE model?

A
  • Basically available
  • Soft state
  • Eventual consistency
18
Q

What does it mean for a database to be basically available?

A

This property guarantees the availability of the data. There will be a response to any request (which may be a failure).

19
Q

What does it mean for a database to be in a soft state?

A

The system can change over time, even without receiving input (since nodes continue to update each other).

20
Q

What does it mean for a database to be eventually consistent?

A

The system will become consistent over time but might not be consistent at a particular moment.
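
A toy simulation of this behaviour. It assumes a very simple conflict rule (the highest version number wins) and a synchronisation round in which every replica pushes its state to every other replica; Replica and anti_entropy are hypothetical names used only here.

```python
class Replica:
    """One copy of the data; each key maps to a (version, value) pair."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def write(self, key, value, version):
        current = self.data.get(key)
        if current is None or version > current[0]:  # keep the newest version only
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def anti_entropy(replicas):
    """Background sync: every replica pushes its entries to every other replica."""
    for source in replicas:
        for target in replicas:
            if source is not target:
                for key, (version, value) in source.data.items():
                    target.write(key, value, version)


replicas = [Replica("a"), Replica("b"), Replica("c")]
replicas[0].write("cart", ["book"], version=1)  # the write reaches one replica first

stale_read = replicas[2].read("cart")           # another replica has not seen it yet
anti_entropy(replicas)                          # state changes without any new client input
reads_after_sync = [replica.read("cart") for replica in replicas]
```

Before the sync a read can return stale data; after it all replicas agree. The sync step is also an instance of soft state, since the replicas change without receiving a client request.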

21
Q

What is the CAP theorem?

A

A distributed computer system cannot guarantee all three of the following properties simultaneously:
- consistency (all nodes see the same data at the same time)
- availability (every request received by a working node gets a response, though not necessarily the most recent data)
- partition tolerance (the system continues to work even when network failures prevent some nodes from communicating with each other).
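
The trade-off can be illustrated with a toy two-node store. This is a simplified model, not how any real database is implemented: TwoNodeStore, its mode flag and the partitioned switch are assumptions made for the sketch.

```python
class Node:
    def __init__(self):
        self.value = "v0"


class TwoNodeStore:
    """Two replicas of a single value, with a switch that simulates a network partition."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.nodes = [Node(), Node()]
        self.partitioned = False

    def write(self, node_index, value):
        if not self.partitioned:
            for node in self.nodes:  # healthy network: replicate to both nodes
                node.value = value
            return "ok"
        if self.mode == "CP":
            # Favour consistency: refuse a write that cannot be replicated.
            return "error: cannot reach peer"
        # Favour availability: accept the write locally and let the replicas diverge.
        self.nodes[node_index].value = value
        return "ok"

    def read(self, node_index):
        return self.nodes[node_index].value


def run(mode):
    store = TwoNodeStore(mode)
    store.write(0, "v1")           # replicated while the network is healthy
    store.partitioned = True       # the two nodes can no longer communicate
    outcome = store.write(0, "v2")
    return outcome, store.read(0), store.read(1)


cp_result = run("CP")  # write rejected, both nodes still agree on "v1"
ap_result = run("AP")  # write accepted, node 0 has "v2" while node 1 has "v1"
```

During the partition the CP-style store keeps both nodes in agreement by rejecting the write, while the AP-style store keeps answering but the two nodes return different values.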

22
Q

Can we have databases with the three properties of the CAP theorem?

A

No

23
Q

What properties from the CAP theorem do the traditional relational databases have?

A

Availability and consistency.

24
Q

What properties from the CAP theorem are related to horizontal scalability?

A

Partition tolerance

25
Q

What properties from the CAP theorem are related to vertical scalability?

A

Availability and consistency.

26
Q

What are the types of databases associated with each possible combination of the properties of CAP theorem?

A
  • CA: relational databases
  • CP: MongoDB, HBase, Hive
  • AP: Dynamo, Cassandra, CouchDB
27
Q

What are the types of NoSQL databases?

A
  • Key–value stores
  • Document stores
  • Column-oriented databases
  • Graph-based databases
  • Other NoSQL categories
28
Q

What are key-value stores?

A

They store data as (key, value) pairs and, for simple lookups by key, provide much higher performance than RDBMSs.

29
Q

What are applications for key-value stores?

A

Storing user session details and preferences for web applications; real-time recommendations and advertising; in-memory data caching, which speeds up applications by minimizing reads and writes to slower disk-based systems.
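
The caching use case can be sketched with a tiny in-memory key-value store placed in front of a slow lookup. TTLCache, load_profile_from_disk and get_profile are hypothetical names; a production system would use a dedicated key-value store rather than a Python dictionary.

```python
import time


class TTLCache:
    """A tiny in-memory key-value store whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:  # expired entries are dropped on access
            del self.store[key]
            return None
        return value

    def put(self, key, value):
        self.store[key] = (self.clock() + self.ttl, value)


backend_calls = []


def load_profile_from_disk(user_id):
    # Stand-in for a slow disk-based lookup (hypothetical).
    backend_calls.append(user_id)
    return {"user_id": user_id, "theme": "dark"}


cache = TTLCache(ttl_seconds=60)


def get_profile(user_id):
    """Check the cache first and fall back to the slow store only on a miss."""
    key = f"profile:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    profile = load_profile_from_disk(user_id)
    cache.put(key, profile)
    return profile


first = get_profile(7)   # miss: goes to the slow store
second = get_profile(7)  # hit: served from memory
```

The second call never touches the slow store, which is the whole point of the cache.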

30
Q

What are document stores?

A

They store a collection of attributes that are labelled and unordered, representing items that are semi-structured.
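
A sketch of what "semi-structured" means in practice, using plain Python dictionaries as documents. This is not the API of MongoDB or any other document store; the profiles data and the find and with_field helpers are made up for illustration.

```python
# Each document carries its own labelled attributes; no two need the same fields.
profiles = [
    {"_id": 1, "name": "Ana", "languages": ["pt", "en"]},
    {"_id": 2, "name": "Ben", "employer": {"name": "Acme", "city": "Oslo"}},
    {"_id": 3, "name": "Chi", "languages": ["zh"], "website": "https://example.org"},
]


def find(collection, **criteria):
    """Return the documents whose top-level fields equal every given criterion."""
    return [
        doc
        for doc in collection
        if all(doc.get(field) == wanted for field, wanted in criteria.items())
    ]


def with_field(collection, field):
    """Return the ids of the documents that define the given field at all."""
    return [doc["_id"] for doc in collection if field in doc]


ben = find(profiles, name="Ben")
have_languages = with_field(profiles, "languages")
```

A relational table would need a column (mostly NULL) for every attribute any profile might have; here each document simply omits what it does not use.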

31
Q

What are applications for document stores?

A

Content management systems; online profiles in which different users provide different types of information.

32
Q

What are the equivalents for tables, rows and columns in MongoDB?

A

Collections, documents and fields, respectively.

33
Q

What are column-oriented databases?

A

They store data tables as sections of columns of data. They are useful if: aggregates are regularly computed over large numbers of similar data items; data are sparse, i.e., columns have many NULL values.

34
Q

What is the main advantage of using column-oriented databases over traditional relational databases?

A

Relational databases are not efficient at performing operations that apply to the entire dataset, as they need indexes, which add overhead. In column-oriented databases, all values of a column are placed together on disk, so operations such as finding all records that satisfy a certain condition can be executed directly on that column. Null values no longer take up storage space.
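
The difference in layout can be sketched in memory. The example assumes a simplified column format (one mapping from row id to value per column); the sales rows and the to_columns and fetch_row helpers are hypothetical.

```python
# Row-oriented layout: each record keeps all of its attributes together.
rows = [
    {"id": 1, "city": "Lisbon", "amount": 120.0, "coupon": None},
    {"id": 2, "city": "Porto", "amount": 80.0, "coupon": None},
    {"id": 3, "city": "Lisbon", "amount": 45.5, "coupon": "SPRING"},
]


def to_columns(rows):
    """Pivot rows into one (row id -> value) mapping per column, skipping NULLs."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            if name != "id" and value is not None:
                columns.setdefault(name, {})[row["id"]] = value
    return columns


def fetch_row(columns, row_id):
    """Rebuild one entity; this has to visit every column."""
    return {name: col[row_id] for name, col in columns.items() if row_id in col}


columns = to_columns(rows)

# An aggregate or a filter touches only the single column it needs.
total_amount = sum(columns["amount"].values())
lisbon_ids = [row_id for row_id, city in columns["city"].items() if city == "Lisbon"]

# The sparse column stores only its one non-NULL value.
coupon_entries = len(columns["coupon"])

row_3 = fetch_row(columns, 3)
```

Aggregating and filtering read a single column, while fetch_row shows the cost on the other side: reassembling one entity means looking into every column.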

35
Q

What are the disadvantages in using column-store databases?

A

Retrieving all attributes pertaining to a single entity becomes less efficient; join operations will be slowed down.

36
Q

What are graph-based databases?

A

They apply graph theory to the storage of information about records. Graphs consist of nodes and edges. A graph database can be seen as a hyper-relational database in which JOIN tables are replaced by semantically meaningful relationships. These relationships can be navigated and/or queried using graph traversal based on graph pattern matching.
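
A minimal sketch of traversal over labelled relationships, using an adjacency list in plain Python rather than a real graph database or query language. The graph data, the FOLLOWS label and the within_hops function are assumptions for illustration.

```python
from collections import deque

# Each node maps to its outgoing edges as (relationship, target) pairs.
graph = {
    "alice": [("FOLLOWS", "bob"), ("FOLLOWS", "carol")],
    "bob": [("FOLLOWS", "dave")],
    "carol": [("FOLLOWS", "dave"), ("FOLLOWS", "erin")],
    "dave": [],
    "erin": [("FOLLOWS", "alice")],
}


def within_hops(graph, start, relationship, max_hops):
    """Breadth-first traversal: nodes reachable from start in 1..max_hops steps."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for rel, neighbour in graph.get(node, []):
            if rel == relationship and neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return {node: hops for node, hops in distance.items() if hops > 0}


reachable = within_hops(graph, "alice", "FOLLOWS", 2)
# People followed by the people alice follows: candidates for a recommendation.
suggestions = sorted(node for node, hops in reachable.items() if hops == 2)
```

In a relational schema the same question would need a self-join on a follows table for every extra hop; here each hop is just one more step along the edges.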

37
Q

What are application examples for graph-based databases?

A

Location-based services; recommender systems; social media (e.g. Twitter with FlockDB); knowledge-based systems.

38
Q

List other NoSQL categories.

A
  • XML databases.
  • OO databases.
  • Database systems to deal with time series and streaming events.
  • Database systems to store and query geospatial data.
  • Database systems such as BayesDB which let users query the probable implication of their data.
39
Q

What are the problems with NoSQL databases?

A

Most NoSQL implementations have yet to prove their true worth in the field. Some queries or aggregations are particularly difficult; map-reduce interfaces are harder to learn and use.
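
To give a feel for the map-reduce style, here is a single-process sketch of a "total per customer" aggregation, which in SQL would be one GROUP BY query. The orders data and the map_phase, reduce_phase and map_reduce functions are hypothetical and do not follow the interface of any specific NoSQL system.

```python
from collections import defaultdict

orders = [
    {"customer": "ana", "total": 30},
    {"customer": "ben", "total": 12},
    {"customer": "ana", "total": 8},
]


def map_phase(record):
    """Emit (key, value) pairs for one input record."""
    yield record["customer"], record["total"]


def reduce_phase(key, values):
    """Combine all values emitted for one key into a single result."""
    return key, sum(values)


def map_reduce(records, mapper, reducer):
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # the "shuffle" step: group values by key
    return dict(reducer(key, values) for key, values in groups.items())


# Rough SQL equivalent: SELECT customer, SUM(total) FROM orders GROUP BY customer
totals = map_reduce(orders, map_phase, reduce_phase)
```

Even for this small aggregation the developer has to write and wire up two functions, which hints at why such interfaces are considered harder to learn than a declarative query.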

40
Q

NoSQL vendors are focusing again on robustness and durability, whereas RDBMS vendors are implementing features for building schema-free data stores. True or false?

A

True

41
Q

What is NewSQL?

A

NewSQL blends the scalable performance and flexibility of NoSQL systems with the robustness guarantees of a traditional RDBMS.

42
Q

About atomicity in the ACID properties:

a) a transaction can only bring the database from one valid state to another
b) a transaction either succeeds completely, or fails completely.
c) a transaction is executed sequentially
d) relational databases do not guarantee atomicity

A

b)