Distributed Systems Flashcards
Data Model
In memory data can be accessed using pointers/references.
Over the wire /from file data encoding/decoding has to happen.
Text - JSON, XML, CSV ( http)
Binary - Thrift, ProtoBuf, Avro
Schema Evolution
Avro - Backward/Forward
Why Replication ?
- To improve high availability
- Faster access
- can move the data closer to the user for faster reads
- To improve read/write throughput
- Multiple followers can serve read requests which improves throughput
Replication Algorithms: Leader based: ( Master-slave or Active-Passive replication) Single-leader Multi leader Leaderless replication
Leader based replication
Leader -> followers
- Sync writes
- client response sends back only when data is written into followers also.
- Async writes
- once leader writes data into storage immediately responds back to the client and asynchronously sends data to its followers
Replication lag
- Eventual consistency
- Read-after-write consistency
- Monotonic reads
Append only Log structured format
SS tables
LSM
Update format
B+ tree
Partition Types
Partition by Key -value
Partition by Key Range
Partition by Hash
Skewed or hot spot ?
When data is not partitioned uniformly, single partition
can have more data that other partitions (Skewed). Because of it single partition will receive high load (hot spot).
Disadvantages of Hashed Parition
It naturally supports primary key. But if multiple/columns are to be searched (secondary indexes) , then it might be difficult..
Secondary Indexes - local store (document)
Store secondary index pertaining to that partition also in same partition. (local indexes)
Disadvantages:
Since each partition can have secondary index value, All partitions have to be queried to construct end result.
Scatter/gather approach - send requests to all paritions and merge the results.
there could be tail latency problems with scatter gather.
Secondary index - term (global)
store secondary indexes in global storage. But partition the global storage also by secondary index key.
for example : if color of the car is secondary key… partition by starting letter of the color or alphabetically…
Advantages : read efficient.. no scatte gather..
disadvantages : slow write, as data might have to be written into different partitions ( primary can into 1 partition, secondary index might go into other partitions
Partition rebalancing
moving data to another node ( rebalance)
- to upgrade the node
- node removed…
Rebalancing
-
Service discovery or request routing
3 different ways to achieve is
- Client connects to any server , the server routes the request from right node and replies back
- Routing tier / partition aware load balancer can be used to route the client requests to the right node . This layer is just a routing layer
- Client aware of the server to connect to.
How to Handle partition changes in request routing ??
- can use distributed coordinator systems like zookeeper to keep the upto date information
Node can connect to zk and update the latest Information.
systems like Kafka , Hbase uses zk as metadata coordinator system
- other than external systems , can use discovery protocol like gossip to update the metadata . Cassandra uses gossip protocol
- disadvantage is more work on the node. Advantage is avoiding external system
Service discovery frameworks
Consul
Eureka
Map reduce - batch processing
Mapreduce jobs Hdfs
Mapper - collects all the data local to the machine
Reducer - brings in all sorted keys together based on partition key
SS table
It is append only log structure
No random writes
Key , value will be maintained in the memory called memtable.
Once memtable reaches its size , flushed into disk sstable
Reads will be scanned in memory first , then disk . So always updated value will retrieved from memory
Ss table compaction
As updates are not happening , if same record has multiple updates over multiple ss tables , then compaction process will create a new sstable by discarding old values
Bloomfilters
To check whether given element in the set or not
- bloomfilter guarantees there will not any false negatives
- but there could be false positives. I e element may or may not be present
- implemented using bit vector and set of hash functions
- as bit vector gets filled larger, more false positives will happen
LSM
Log structured merge trees
Distributed System Characteristics
- Scalability
- Reliability
- Availability
- Manageability
- Efficiency
Scalability
- As system grows with data volume or transactions, it has to be scaled to handle the volume without impacting performance.
Horizontal Scalability:
Adding more nodes to the cluster to handle additional traffic.
MongoDB/Cassandra DB supports horizontal scaling.
Vertical Scalability:
Scale single machine by adding more CPU cores, Memory or Disk.
- This kind of scaling needs downtime and also high cost for single resource.
Reliability
Systems should continue to work even when 1 or few components/systems fail.
To make the system reliable, we can either replicate the data or system.
It will cost more to make the system reliable.