Cassandra Flashcards

1
Q

What are the peculiarities of Cassandra which differentiate it from other databases?

A
  1. No JOINs (do joins in application code if required)
  2. Data is partitioned by the partition key and sorted on disk by the clustering columns (together they form the primary key)
  3. Leaderless (no ZooKeeper, no leader election)
  4. Built-in sharding
  5. No downtime while adding/removing nodes from the cluster
  6. Tunable consistency per query (R + W > RF gives strong consistency)
  7. Schema-based (unlike Mongo)
2
Q

What is “Hinted Handoff” in Cassandra?

A

When a replica node is offline and a write is destined for it, the coordinator node acts as a proxy: it stores the write as a hint, by default for up to 3 hours. If the node comes back online within that window, the saved write is handed off to it.
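
The flow above can be sketched in a few lines of Python. This is a toy model, not Cassandra's implementation: the `Coordinator` class, the dict-based node, and the explicit `now` argument are all hypothetical, and the expiry rule is simplified to "drop hints older than the 3-hour window".

```python
HINT_WINDOW_SECONDS = 3 * 60 * 60  # the 3-hour window mentioned in the card


class Coordinator:
    """Toy coordinator that keeps hints for replicas that are down."""

    def __init__(self):
        self.hints = {}  # node name -> list of (stored_at, key, value)

    def write(self, node, key, value, now):
        """Apply the write if the replica is up, otherwise store a hint."""
        if node["up"]:
            node["data"][key] = value
        else:
            self.hints.setdefault(node["name"], []).append((now, key, value))

    def replay_hints(self, node, now):
        """Hand off hints still inside the window; older ones are dropped."""
        for stored_at, key, value in self.hints.pop(node["name"], []):
            if now - stored_at <= HINT_WINDOW_SECONDS:
                node["data"][key] = value
```

A hint that misses the window is simply lost in this sketch; in a real cluster the replica would then need read repair or a manual repair to catch up.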

3
Q

What is “Read Repair” in Cassandra?

A

Data on replica nodes goes out of sync from time to time, e.g. when a write was done with CL < RF.
When a read request is made, the coordinator node is responsible for syncing data across replicas. It requests the data from multiple replicas, picks the version with the latest timestamp, and writes that version back to the replicas that are stale.
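
A minimal sketch of that read path, assuming each replica is a plain dict mapping a key to a `(timestamp, value)` pair. The function name `read_with_repair` and the data layout are illustrative only; they are not part of any Cassandra API.

```python
def read_with_repair(replicas, key):
    """Return the newest value for key and repair stale replicas.

    replicas: list of dicts mapping key -> (timestamp, value).
    """
    responses = [r[key] for r in replicas if key in r]
    if not responses:
        return None
    # Last write wins: the highest timestamp is treated as the truth.
    latest = max(responses, key=lambda tv: tv[0])
    for r in replicas:
        if r.get(key) != latest:
            r[key] = latest  # push the newest version to a stale replica
    return latest[1]
```

After one call, every replica in the list holds the newest version, which is the effect read repair is after.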

4
Q

How to configure Cassandra for Strong Consistency?

A

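With replication factor N (the number of replicas of each row, not the number of nodes in the cluster), R + W > N ensures strong consistency, because every read then overlaps the latest write in at least one replica.
R = 1, W = N: slow writes
R = N, W = 1: slow reads
The R and W ack counts can be specified per query through the consistency level (CL).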
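
The rule above is simple arithmetic, sketched here with N read as the replication factor. Both helper names are hypothetical; `quorum` uses the usual majority formula, floor(RF / 2) + 1.

```python
def is_strongly_consistent(r, w, rf):
    """True when every read set must overlap every write set."""
    return r + w > rf


def quorum(rf):
    """Majority of replicas for a given replication factor."""
    return rf // 2 + 1
```

For example, with a replication factor of 3, reading and writing at quorum (2 acks each) satisfies the rule, while reading and writing with a single ack each does not.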

5
Q

Which node acts as the coordinator node in Cassandra?

A

Any node can act as the coordinator. Every node knows the partitioning scheme, so it can route a read or write to the nodes that own the data.
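
Why any node can route a request is easier to see with a toy token ring. This is a sketch under assumptions: one token per node, MD5 as a stand-in hash (Cassandra's default partitioner is Murmur3), and a hypothetical `Ring` class. Because the ring is derived only from the node list, every node computes the same answer.

```python
import bisect
import hashlib


def token(value):
    # Stand-in hash for illustration; not Cassandra's default partitioner.
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class Ring:
    """Toy token ring with a single token per node."""

    def __init__(self, nodes):
        self.entries = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, partition_key, rf):
        """Walk clockwise from the key's token and take rf distinct nodes."""
        start = bisect.bisect_left(self.tokens, token(partition_key))
        count = min(rf, len(self.entries))
        size = len(self.entries)
        return [self.entries[(start + i) % size][1] for i in range(count)]
```

A coordinator would call something like `ring.replicas("user-1", 2)` and forward the request to the nodes it returns.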

6
Q

How do we define the replication factor in Cassandra? What does RF = 2 mean?

A

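While creating the keyspace (the replication settings apply to every table in it), not the individual table.
RF = 2 means every row is stored on 2 replica nodes. It is often described as 1 original + 1 copy, but all replicas are equal; there is no primary copy.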

7
Q

What will be the max values of Read and Write Acks or Consistency Level per query?

A

Equal to the replication factor (consistency level ALL). Typically we choose lower values, such as ONE or QUORUM, to improve latency and availability.

8
Q

What is the difference between the Primary Key and the Partition Key in Cassandra?

A

Primary Key = ((Partition Key), Clustering Columns)
The partition key is equivalent to a sharding key: it decides which partition, and therefore which nodes, a row lives on.
The clustering columns define the sort order of rows within a partition.
The full primary key must be unique within a table.
In SQL databases, by contrast, the primary key only has to identify a row uniquely; it does not determine where the data is placed.
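
As a rough illustration of the two roles, the sketch below groups rows by a partition key and sorts each group by a clustering column. The `store` helper and the dict-based rows are hypothetical, and the sketch does not model Cassandra's upsert behaviour for rows that share a full primary key.

```python
from collections import defaultdict


def store(rows, partition_key, clustering_column):
    """Group rows into partitions, then sort each partition."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_key]].append(row)  # decides the partition
    for part in partitions.values():
        part.sort(key=lambda row: row[clustering_column])  # order inside it
    return dict(partitions)
```

Reading one partition in clustering order is then a single sequential scan, which is the access pattern this layout is meant to make cheap.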

9
Q

Cassandra does not provide referential integrity across tables (videos, video_by_user). How to achieve that and what are the limitations?

A

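Using logged batches (BEGIN BATCH ... APPLY BATCH), which offer atomicity: once the batch is accepted, all of its writes are eventually applied. This is done by replaying a batch log after a failure, not by rolling back. However, batches that span partitions are not isolated, so other clients can observe a partially applied batch. Hence, not ACID.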
