Big Data Lecture 06 Wide Column Stores Flashcards
What are the issues with RDBMS (relational database management systems) and how can they be solved?
<ul><li>We hit the storage limit of one machine --> scale up!</li><li>It is too slow --> cluster, replicate: scale out! (Difficult, very high maintenance costs.)</li><li>Solution: HBase - a distributed database system!</li></ul>
How do wide-column stores compare to other methods? What is the only downside?
It marks all the columns green in this comparison!<br></br><img></img><br></br>The only downside: small size per item, at around 10 MB per cell for optimal performance.<br></br><br></br>*Random access means being able to access any piece of data directly, without reading everything sequentially.
Who was the pioneer of Wide-Column Stores?
Google with its ‘Bigtable’.
What is the motivation for making huge tables? How does it improve on RDBMS? Why don’t we replace RDBMS with it?
What is related stays together (the data is denormalized), because join operations at query time are very expensive.<br></br><br></br>RDBMS remains better at fine-grained updates, which WCS does not handle well.
How is each row identified?
It has a unique row ID, by which the records are sorted.
What are column families?
Residue of the tables that were joined in the original database: columns that are somehow related are stored together.<br></br><br></br>Column families must be pre-specified, but we can later add columns within a family.
What are the possible types of values in WCS?
The values are not typed, they are just byte objects. Some utility functions for reading the data are implemented.
What is the optimal size of an object in WCS?
<= 10 MB per cell, but it can be anything (text, image, webpage, JSON…)
What queries can be executed?
<ul><li>get (per rowID),</li><li>put (inserting per rowID, can also overwrite, can be partial, e.g. per column family),</li><li>scan (linearly scan certain rows and columns),</li><li>delete (per rowID).</li></ul>
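The four operations can be illustrated with a toy in-memory store. This is a conceptual sketch, not the HBase client API (the class and method signatures are mine); it only shows the semantics: cells addressed by row ID plus (family, qualifier), and scans over a sorted row-ID range.

```python
class ToyWideColumnStore:
    def __init__(self):
        # row_id -> {(family, qualifier): value}
        self.rows = {}

    def put(self, row_id, family, qualifier, value):
        # insert or overwrite a single cell (a partial row update)
        self.rows.setdefault(row_id, {})[(family, qualifier)] = value

    def get(self, row_id):
        # fetch one row by its row ID
        return self.rows.get(row_id)

    def scan(self, start_row, stop_row):
        # linear scan over the row-ID range [start_row, stop_row)
        for row_id in sorted(self.rows):
            if start_row <= row_id < stop_row:
                yield row_id, self.rows[row_id]

    def delete(self, row_id):
        # remove a whole row by its row ID
        self.rows.pop(row_id, None)
```

Note that `scan` exploits the fact that rows are kept sorted by row ID, which is exactly why range scans are cheap in a wide-column store.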
What do we refer to as key-value model in general?
Any sort of map where a key points to a value (e.g. implemented with a hash table).<br></br>
What is column oriented storage?
Columns of a table are physically stored together on disk.
What are examples of Wide Column Store databases?
Google’s Bigtable,<br></br>Apache HBase,<br></br>Cassandra.
How are tables stored in WCS?
In regions: a cut of both rows (a row-ID range, min inclusive and max exclusive) and columns (by column family).
What manages files underneath the WCS?
HDFS!
What is the architecture of HBase?
We have the HMaster node (handles DDL operations), which rules over processes on RegionServers (they handle DML operations); these can all run on the same machine. Each RegionServer is an HDFS client.<br></br><br></br><img></img>
What is DDL and DML?
Data Definition Language (e.g. create a table with specified column families),<br></br>Data Manipulation Language (e.g. add this value to this column of this row).
How are regions saved?
HMaster assigns them to RegionServers.<br></br><img></img><br></br>If a region grows too big, it is split; nodes are rebalanced if they are too full; and if a RegionServer fails, HMaster reassigns its regions to another one.
How are the data persistently stored?
In an HFile, which is a sorted list of key-value pairs, ordered by:<br></br><ul><li>row,</li><li>column,</li><li>version (in reverse).</li></ul><div><img></img><br></br></div><div><img></img><br></br></div>
How are different users of WCS allowed to modify the same data? Relation to CAP theorem?
They are not: the data is locked until the user stops modifying it.<br></br><br></br>The data has linear versioning; there are no DAGs of versions.<br></br><br></br>In CAP terms: the system is not always fully available to everyone, but it is consistent.
How are nodes managed in WCS?
Using ZooKeeper, a distributed coordination system that manages all the heartbeats and locks on files (avoiding race conditions).
What makes WCS data access so efficient?
Each RegionServer is an HDFS client and usually has a local replica of its own data, so it does not have to request it over the network; it can read it fast from the local file system.<br></br><br></br>If a RegionServer is assigned responsibility for a region but does not have its data locally, that is fine: over time, the data will be replicated onto this node, which speeds the system up again.
How is each value of the table stored in WCS, what are its parts?
It is yet another key-value model!<br></br><img></img><br></br>Each record has its own key and value; these are preceded (for efficient parsing) by the key length (32 bits) and the value length (32 bits).
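The length-prefix layout above can be sketched in a few lines. This is an illustrative simplification, not HBase's exact on-disk format (the function names `encode_cell`/`decode_cell` are mine); it shows why the two 32-bit lengths are enough to walk a buffer of concatenated records.

```python
import struct

def encode_cell(key: bytes, value: bytes) -> bytes:
    # 32-bit big-endian key length + 32-bit value length, then the raw bytes
    return struct.pack(">II", len(key), len(value)) + key + value

def decode_cell(buf: bytes, offset: int = 0):
    # read the two length prefixes, then slice out key and value
    key_len, val_len = struct.unpack_from(">II", buf, offset)
    start = offset + 8
    key = buf[start:start + key_len]
    value = buf[start + key_len:start + key_len + val_len]
    return key, value, start + key_len + val_len  # also return the next offset
```

Because `decode_cell` returns the offset of the next record, a reader can skip records it does not need without parsing their contents.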
How does Gamma code work?
A run of 1s terminated by a 0 tells us how many bits to read next; we then prepend a 1 to those bits (all non-zero numbers start with 1 in binary).<br></br><img></img>
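A small sketch of the scheme as described above (strings of '0'/'1' stand in for real bit streams; this is a toy, not a production bit-level codec):

```python
def gamma_encode(n: int) -> str:
    # n >= 1; drop the implicit leading 1 bit, then emit:
    # one '1' per remaining bit, a terminating '0', and the remaining bits
    assert n >= 1
    rest = bin(n)[3:]  # binary string without '0b' and without the leading 1
    return "1" * len(rest) + "0" + rest

def gamma_decode(bits: str) -> int:
    k = bits.index("0")            # count of leading 1s = payload length
    payload = bits[k + 1:k + 1 + k]
    return int("1" + payload, 2)   # re-prepend the implicit leading 1
```

For example, 5 is `101` in binary; dropping the leading 1 leaves `01` (2 bits), so the code is `11` + `0` + `01` = `11001`, and 1 encodes as just `0`.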
How are data accessed in an HFile?
Using this beautiful long key, which stores all the information about the record (row, column family, column qualifier, versioning timestamp and a delete marker).<br></br><img></img><br></br>Note that there is no column-qualifier length field: since the total key length is known and the lengths of all other fields are stored, the qualifier length can be inferred.<br></br><br></br>This key refers to the records.<br></br><img></img>
What are HBlocks?
A block of KeyValues read at once; it has a default size of 64 kB.
What is a KeyValue?
Smallest unit of physical storage, indexed by row, column, version and sharded by regions and column families for storage.
How do we look up a key in HFile?
We look at the index, which tells us approximately where the key will be (which block), and then we search within that block.<br></br><img></img>
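The index lookup described above is essentially a binary search over the first key of each block. A minimal sketch (the function name `find_block` is illustrative, not HBase API):

```python
import bisect

def find_block(index_keys, search_key):
    """Given the sorted first-keys of each HBlock, return the index of
    the only block that could contain search_key."""
    # rightmost block whose first key is <= search_key
    i = bisect.bisect_right(index_keys, search_key) - 1
    return max(i, 0)
```

Only that one block then needs to be read and scanned, instead of the whole HFile.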
What are the levels of physical storage in WCS?
<img></img>
How is new data added to WCS?
New data is not written to disk immediately, but buffered in the MemStore. There it is stored in a tree, nicely sorted (insertion O(log N)), so it can later be written out to disk in linear time.
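A toy sketch of the idea: keep entries sorted in memory, flush them in one linear pass. Python's standard library has no balanced tree, so `bisect.insort` stands in for the O(log N) tree insert; the class name `MemStore` here is illustrative only.

```python
import bisect

class MemStore:
    def __init__(self):
        self._entries = []  # kept sorted by key at all times

    def put(self, key, value):
        # a real store uses a balanced tree (O(log N) insert);
        # bisect.insort keeps the list sorted for this sketch
        bisect.insort(self._entries, (key, value))

    def flush(self):
        # entries are already sorted, so writing the HFile
        # is a single linear pass; then start fresh
        dumped, self._entries = self._entries, []
        return dumped
```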
How do we make the MemStore robust to failure?
There is a write-ahead log, called the HLog (we log everything before we process it), so we can replay it in case of failure.<br></br><br></br>These are stored in HDFS, one HLog per RegionServer.
When is MemStore flushed?
<ul><li>Maximum size for one store is reached,</li><li>overall memstore size is reached,</li><li>the write-ahead log is full.</li></ul>
What structures are used in RBDMS and WCS for optimization of search?
<ul><li>RDBMS: B+-trees - they minimize seek time by indexing the data.</li><li>WCS: Log-Structured Merge (LSM) trees - optimized for write throughput.</li></ul>
How is data compacted using LSM tree?
Like the 2048 game!<br></br><br></br>The MemStore dumps a file; then, whenever two files of the same size exist, they are merged, so file sizes keep doubling up and up.<br></br><br></br>These files live at different levels: in memory, on disk, or on further nodes.
How does client communicate with the WCS system?
<ul><li>To create/update/delete a table: talk to HMaster,</li><li>to know which server to talk to: talk to Meta table (cache that locally),</li><li>to get the data, talk to RegionServer.</li></ul>
What APIs are supported?
REST and Java.
How is WCS optimized? (3)
<ol><li>Cache - keep recently used HBlocks using standard caching criteria (do not use the cache when reading all the data from a block, use batch processing instead; also skip it for purely random access),</li><li>key ranges - know for certain whether a key can be inside a file or not,</li><li>Bloom filters - tell you with certainty that a key is not in a file, but only ‘maybe’ if it is.</li></ol>
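The Bloom filter guarantee ("definitely not here" vs. "maybe here") can be shown with a minimal sketch. This toy version uses SHA-256 to derive hash positions; real implementations use faster hashes and tuned sizes, and the class here is illustrative, not the HBase one.

```python
import hashlib

class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.k = num_hashes
        self.bits = 0  # the bit array, kept as one big integer

    def _positions(self, item: str):
        # derive k independent bit positions from the item
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.size

    def add(self, item: str):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: str) -> bool:
        # False => definitely absent; True => only *maybe* present
        return all(self.bits >> pos & 1 for pos in self._positions(item))
```

Before opening an HFile, checking its Bloom filter lets a RegionServer skip files that definitely do not contain the requested row.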
How does WCS know what is where?
It has a meta table, called hbase:meta, separate from HMaster, that stores in HDFS which region holds which rows. If HMaster is lost, a new one is simply re-elected.
What does lazy splitting mean in HBase?
We update the meta table immediately, but we physically split the underlying files only later; until then we keep them together on disk, yet read them as if they were separate regions.
How does HFile compaction synergize with locality?
Small blocks might get spread across HDFS, but whenever we flush and compact, we bring the data back together on the parent node, so we can happily keep short-circuiting local reads.
What is Spanner?
WCS but with a bigger span:<br></br><ul><li>uses tablets instead of regions (possibly discontinuous row ranges),</li><li>a universe master runs the other, smaller HBase-like subsystems.</li></ul>
HBase runs on top of HDFS; how is it still fast?
<ol><li>It uses the MemStore and cache to serve frequently used values fast,</li><li>it short-circuits DataNode reads in HDFS.</li></ol>