CassandraDB Flashcards

1
Q

Explain the main concepts of CassandraDB.

A

1 - Cassandra stores data in tables;

2 - The data is stored on multiple nodes (distributed architecture, communication/consistency protocols, and redundancy, i.e. multiple copies of the same data, which provides fault tolerance);

3 - The rows of a single table are spread across different nodes. A group of rows of a table stored together on a node is a partition, and the model should minimize the number of partitions read to build the complete view (ideally 1 partition).
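
As a rough illustration of points 2 and 3, the sketch below maps a partition key to the nodes that would hold its copies. The node names, the replication factor and the modulo placement are assumptions made for the example; Cassandra itself places partitions on a token ring, which this does not reproduce.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical cluster
REPLICATION_FACTOR = 3  # assumed number of copies of each partition


def replicas_for(partition_key):
    """Return the nodes that would hold the copies of one partition."""
    # Hash the key to a stable integer. The modulo below is only a
    # stand-in for Cassandra's token ring.
    token = int(hashlib.md5(partition_key.encode()).hexdigest(), 16)
    first = token % len(NODES)
    # Place the remaining copies on the next nodes, wrapping around.
    return [NODES[(first + i) % len(NODES)] for i in range(REPLICATION_FACTOR)]


owners = replicas_for("user-42")
print(owners)
```

All rows sharing the key "user-42" land on the same set of nodes, so a view that filters on that key only has to read one partition.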

2
Q

Explain the relation between CassandraDB and ACID transactions.

A

Cassandra does not support full ACID transactions (they carry a significant performance penalty and are not required for many use cases).

However, Cassandra write operations exhibit some ACID properties: INSERTs, UPDATEs and DELETEs are atomic, isolated and durable at the partition level, and consistency is tunable for data replicated across nodes. Cassandra does not enforce application integrity constraints.
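
To make "tunable consistency" more concrete, here is a minimal sketch of the usual overlap rule: a read is guaranteed to reach at least one replica that acknowledged the latest write when the number of replicas read plus the number written exceeds the replication factor. The function names are hypothetical and chosen only for this example.

```python
def quorum(replication_factor):
    """Size of a majority of the replicas."""
    return replication_factor // 2 + 1


def reads_see_latest_write(read_replicas, write_replicas, replication_factor):
    """True when every read set overlaps every write set."""
    return read_replicas + write_replicas > replication_factor


rf = 3
print(quorum(rf))                                           # majority of 3 replicas
print(reads_see_latest_write(quorum(rf), quorum(rf), rf))   # quorum reads and writes
print(reads_see_latest_write(1, 1, rf))                     # one replica for each
```

With a replication factor of 3, quorum reads and quorum writes overlap, while reading and writing a single replica each does not.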

3
Q

What’s the Cassandra modelling approach? (Topics)

A

1 - Application (What queries will be performed?)
2 - Model (Logical model shape)
3 - Data (Load data)

4
Q

Regarding the architecture of CassandraDB, what are the functions of the memtable and the commit log?

A

The ‘memtable’ temporarily holds the writes for each column family (table) in memory. Because of this mechanism, writes in Cassandra are ‘cheap’.
The commit log, kept on each node, records all write activity to guarantee the durability of the data.

5
Q

What happens when a ‘memtable’ is flushed, and how does the ‘Flush’ relate to ‘SSTables’?

A

When the ‘memtable’ is full, a ‘Flush’ happens, releasing all of its data.
The data released by the ‘Flush’ is written to disk, generating files called ‘SSTables’.
These ‘SSTables’ are immutable (no changes can be applied to them after the ‘Flush’), so data from the same partition can end up in different SSTables. This characteristic explains why reads are more expensive than writes: a read may need to access several ‘SSTables’.
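
The write path described here can be sketched as a toy in-memory model. The class name `Node`, the `memtable_limit` parameter and the use of plain dictionaries are assumptions made for illustration; nothing is written to disk and real SSTable files look nothing like this.

```python
from types import MappingProxyType


class Node:
    """Toy model of one node: commit log, memtable and SSTables."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only record of every write
        self.memtable = {}     # recent writes, held in memory
        self.sstables = []     # flushed, read-only tables, oldest first
        self.memtable_limit = memtable_limit  # hypothetical size limit

    def write(self, key, value):
        self.commit_log.append((key, value))  # durability first
        self.memtable[key] = value            # cheap in-memory update
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Freeze the memtable contents as a read-only "SSTable".
        self.sstables.append(MappingProxyType(dict(self.memtable)))
        self.memtable = {}

    def read(self, key):
        # A read checks the memtable, then every SSTable, newest first.
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            if key in sstable:
                return sstable[key]
        return None


node = Node(memtable_limit=2)
for key, value in [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("d", 5)]:
    node.write(key, value)
print(len(node.sstables), node.read("a"), node.read("b"))
```

In this run the key "a" ends up in two different SSTables, and reading it means scanning from the newest table backwards, which is the extra work that makes reads more expensive than writes.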

6
Q

Explain the process behind deleting data in CassandraDB.

A

SSTables are immutable, so deleted data cannot be removed from disk right away. Instead, the data is marked with a ‘tombstone’ (deleted).
Once marked, ‘tombstoned’ rows are removed during compaction.

‘gc_grace_seconds’ - the period (10 days by default) that a tombstone is kept before compaction may purge it. If a node holding the data to be deleted is down, it can still receive the delete order if it is restored within this period. If the node does not recover in time, the tombstone is purged and the deleted data can reappear (inconsistencies between the replicas).

TTL - some data can have an expiration date. In this case, once the time to live has elapsed, the data is marked as deleted.
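
A minimal sketch of tombstones and compaction follows, assuming each SSTable is a dictionary of key to (timestamp, value) and that a value of `None` stands for a tombstone. The function names and this representation are hypothetical; only the 864000-second default for gc_grace_seconds comes from Cassandra.

```python
GC_GRACE_SECONDS = 864000  # Cassandra's default, i.e. 10 days
TOMBSTONE = None           # assumed marker for deleted data


def delete(sstables, key, now):
    """A delete is just another write: it records a tombstone."""
    sstables.append({key: (now, TOMBSTONE)})


def compact(sstables, now):
    """Merge SSTables, keeping the newest entry for each key."""
    merged = {}
    for table in sstables:
        for key, (timestamp, value) in table.items():
            if key not in merged or timestamp > merged[key][0]:
                merged[key] = (timestamp, value)
    # Purge tombstones only once they are older than the grace period.
    kept = {
        key: (timestamp, value)
        for key, (timestamp, value) in merged.items()
        if not (value is TOMBSTONE and now - timestamp > GC_GRACE_SECONDS)
    }
    return [kept]


sstables = [{"row1": (100, "hello"), "row2": (100, "world")}]
delete(sstables, "row1", now=200)
early = compact(sstables, now=300)
late = compact(sstables, now=200 + GC_GRACE_SECONDS + 1)
print(early)
print(late)
```

Compacting early keeps the tombstone so that replicas which missed the delete can still learn about it; compacting after the grace period drops both the tombstone and the data it shadowed.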

7
Q

Explain the relation between Direct read requests and Digest requests in the process of Reading Data in Cassandra.

A

There are 3 types of data requests: Direct read requests, Digest requests and Background read repair requests.

In a Direct read request, the coordinator node contacts one node that holds a replica of the requested data. In contrast, in a Digest request, the coordinator node contacts as many additional replica nodes as the consistency level requires.

The two request types are related because the coordinator compares the responses to the Digest requests with the data returned by the Direct read request to determine whether the replicas are consistent.
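
The interplay of the two request types can be sketched as below. Replicas are modelled as plain dictionaries, the consistency level is simply the number of replicas contacted, and the function names are hypothetical; the sketch only reports a mismatch and does not perform the read repair that would follow.

```python
import hashlib


def digest(value):
    """Short fingerprint of a value, cheaper to send than the value."""
    return hashlib.md5(value.encode()).hexdigest()


def coordinator_read(replicas, key, consistency_level):
    contacted = replicas[:consistency_level]
    # Direct read request: full data from one replica.
    data = contacted[0][key]
    # Digest requests: only a hash from the other contacted replicas.
    digests = [digest(replica[key]) for replica in contacted[1:]]
    consistent = all(d == digest(data) for d in digests)
    return data, consistent


replicas = [{"k": "v2"}, {"k": "v2"}, {"k": "v1"}]  # third replica is stale
print(coordinator_read(replicas, "k", consistency_level=2))
print(coordinator_read(replicas, "k", consistency_level=3))
```

Contacting two replicas finds them in agreement, while contacting all three exposes the stale copy on the last one.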

8
Q

What are the 3 recommended supervisor tasks?

A

1 - Security
2 - Backup and Recovery
3 - Ensure the consistency of the data

9
Q

Explain the relation between the Cluster and the Keyspaces.

A

A Cluster can be defined as a collection of Cassandra instances (nodes) that hosts multiple Keyspaces.
Keyspaces are groups of column families, and normally there is one keyspace per application.

10
Q

Enumerate the Column structure.

A

The Column is the basic storage unit.

1 - name (binary)
2 - value (binary)
3 - timestamp (i64)
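
The three fields can be written down as a small data class. The `reconcile` helper is an addition for illustration: it shows the usual purpose of the timestamp, which is to let the most recent write win when two versions of the same column meet. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: bytes      # column name, stored as binary
    value: bytes     # column value, stored as binary
    timestamp: int   # 64-bit integer, e.g. microseconds since the epoch


def reconcile(a, b):
    """Pick the version with the newer timestamp (last write wins)."""
    return a if a.timestamp >= b.timestamp else b


old = Column(b"email", b"old@example.com", 1000)
new = Column(b"email", b"new@example.com", 2000)
print(reconcile(old, new).value)
```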

11
Q

What is the first step in designing a Cassandra data model?

A

Identify what data the view receives and what it should return.
- Received data can be identified from the information given in the description of the view.
- Returned data is the data that the user should receive as a response.

(Example) If the given information says ‘the titles of the video-games bought by a specific user’ => received data == specific user & returned data == the titles of the video-games bought

12
Q

What is the second step in designing a Cassandra data model?

A
  • The primary key is composed of two parts (partition key and clustering key).
  • CQL views must specify the partition key in the ‘WHERE’ condition, and this condition must be an equality (so that the request can be routed to the nodes that hold the partition).
  • Because of the previous point, the equality conditions on the received data will probably be part of the table partition key.
13
Q

What is the third step in designing a Cassandra data model?

A
  • The clustering key (second component of the primary key) is used to physically order the rows which form a partition.
  • The columns forming the clustering key can be determined by:
  • The returned data must come in a specific order (the data can be ordered physically in storage, so there is no need to sort it again when the view is processed).
  • The view performs range checks (‘data between 20-01-2022 and 02-12-2022’ - the date should be added to the clustering key).
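
The roles of the two key components can be sketched with a toy table for the video-game example: the user is the partition key (matched by equality) and the purchase date is the clustering key (kept sorted, queried by range). The table layout, function names and sample rows are all assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right, insort
from datetime import date

# partition key (user) -> rows kept sorted by clustering key (purchase date)
purchases = {}


def insert(user, bought_on, title):
    rows = purchases.setdefault(user, [])
    insort(rows, (bought_on, title))  # rows stay ordered on every write


def titles_between(user, start, end):
    rows = purchases.get(user, [])    # equality on the partition key
    dates = [bought_on for bought_on, _ in rows]
    lo = bisect_left(dates, start)    # range on the clustering key
    hi = bisect_right(dates, end)
    return [title for _, title in rows[lo:hi]]


insert("alice", date(2022, 3, 5), "Game B")
insert("alice", date(2021, 12, 31), "Game A")
insert("alice", date(2022, 11, 30), "Game C")
insert("bob", date(2022, 6, 1), "Game D")

print(titles_between("alice", date(2022, 1, 20), date(2022, 12, 2)))
```

The result comes back already ordered by date because the rows were ordered when they were written, so the read only slices one partition and never sorts.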