HBase Concepts Flashcards

1
Q

What is a node?

A

A single computer

2
Q

What is a cluster?

A

A group of connected nodes, coordinated by designated nodes, that work together to perform tasks

3
Q

What is a Master Node?

A

A node performing coordination tasks

4
Q

What is a Slave Node?

A

A worker node performing tasks assigned to it

5
Q

What is a Daemon?

A

A process or program that runs in the background

6
Q

Where is table data stored?

A

In HDFS

7
Q

How is HBase data stored in HDFS?

A

The data is split into HDFS blocks and stored on multiple nodes in the cluster

8
Q

What is an HBase table split into?

A

Regions

9
Q

What serves Regions to clients?

A

Region Servers

10
Q

Can a RegionServer have regions for more than one table?

A

Yes

11
Q

What is the HBase Master responsible for?

A

1 - Coordinates which regions are managed by each Region Server
2 - Handles new table creation and other housekeeping operations

12
Q

Can an HBase cluster have multiple Masters?

A

Yes, for high availability. But only one can be active at a time.

13
Q

What service handles the coordination of the Masters?

A

ZooKeeper

14
Q

When a cluster has multiple Masters, how is the active Master determined?

A

On startup, all Masters connect to ZooKeeper. The first Master to connect becomes the active Master.

15
Q

What happens if the controlling Master fails?

A

If additional Masters are running, they compete to become the new active Master and take over the cluster.
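
The election and failover described in the last two cards can be sketched with a toy model. This is not the ZooKeeper or HBase API: `ToyCoordinator`, `register`, `fail_active`, and the master names are hypothetical, and the model promotes the longest-waiting standby instead of simulating a real race.

```python
class ToyCoordinator:
    """Hypothetical stand-in for ZooKeeper: tracks at most one active master."""

    def __init__(self):
        self.active = None
        self.standbys = []

    def register(self, master):
        # The first master to register becomes active; later ones wait as standbys.
        if self.active is None:
            self.active = master
        else:
            self.standbys.append(master)

    def fail_active(self):
        # Real standby masters compete for the role; this toy model
        # simply promotes the one that has been waiting longest.
        self.active = self.standbys.pop(0) if self.standbys else None


zk = ToyCoordinator()
for name in ["master-a", "master-b", "master-c"]:
    zk.register(name)

first_active = zk.active      # the first master to connect
zk.fail_active()              # simulate a failure of the active master
second_active = zk.active     # a standby has taken over
print(first_active, "->", second_active)
```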

16
Q

Which two daemons typically run together on the slave nodes?

A

The DataNode and the RegionServer

17
Q

List four daemons that run on master nodes.

A

NameNode, Secondary NameNode, HBase Master, ZooKeeper

18
Q

What are tables comprised of?

A

Rows, columns, and column families

19
Q

How are rows sorted?

A

They are sorted in rowkey order
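
Row keys are byte arrays, so rowkey order is a byte-wise lexicographic order, not a numeric one. The short sketch below uses Python's `bytes` comparison, which is also byte-wise, to show the effect. The key names are made up for illustration.

```python
# Unpadded numeric suffixes sort in a way that often surprises people.
row_keys = [b"row10", b"row2", b"row1", b"Row3"]
sorted_keys = sorted(row_keys)
print(sorted_keys)

# Zero-padding the numeric part makes byte order match numeric order.
padded_keys = sorted(f"row{n:03d}".encode() for n in (10, 2, 1))
print(padded_keys)
```

Uppercase `R` has a smaller byte value than lowercase `r`, so `Row3` sorts first, and `row10` sorts before `row2` because `1` is smaller than `2`.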

20
Q

Can columns in HBase be created on the fly?

A

Yes

21
Q

If a row has no value for a column, is that column stored for the row?

A

No

22
Q

What is a column family?

A

A collection of columns

23
Q

What is the minimum number of column families that a table must have?

A

One

24
Q

What delimits the column family from the qualifier?

A

A colon (:)

25
Do all column family members have the same prefix?
Yes. For example, contactinfo:fname and contactinfo:lname
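
As a small illustration of the family:qualifier naming, the sketch below splits full column names at the first colon and groups the qualifiers by family. The helper `split_column` and the column `profile:photo` are hypothetical and not part of any HBase client.

```python
from collections import defaultdict


def split_column(name):
    """Split 'family:qualifier' at the first colon."""
    family, _, qualifier = name.partition(":")
    return family, qualifier


columns = ["contactinfo:fname", "contactinfo:lname", "profile:photo"]

# Group qualifiers under their column family prefix.
by_family = defaultdict(list)
for column in columns:
    family, qualifier = split_column(column)
    by_family[family].append(qualifier)

print(dict(by_family))
```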
26
Can you specify the tuning and storage settings at the column family level?
Yes
27
Can you specify the tuning and storage settings at the column level?
No
28
Is there a limit on the number of columns that a column family can have?
No
29
How are columns stored within a column family?
Columns within a family are sorted and stored together.
30
Give two examples of when having separate column families is useful.
1 - Data that is not frequently accessed together
2 - Data that uses different column family options (such as compression)
31
What data type is data in HBase tables stored as?
A byte array
32
Are empty cells stored?
No
33
How are tables physically stored?
They are stored on a per-column family basis
34
Is there a limit on the kind of data that can be stored in HBase?
No. It can store anything that can be serialized into a byte array.
35
In HBase, what is the equivalent of a primary key?
Rowkey
36
What are the five HBase operations?
1 - Get
2 - Scan
3 - Put
4 - Delete
5 - Increment
37
What does a Get do?
Retrieves a single row using the row key
38
What does a scan do?
Retrieves multiple rows in row key order. By default it returns every row in the table.
39
How can a scan be constrained?
By specifying a start row key and an end (stop) row key
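
A range-constrained scan can be pictured as a slice of the sorted row keys. The sketch below is a toy model, not the HBase client API: `toy_scan` and the `user...` keys are hypothetical. It assumes the usual convention that the start key is inclusive and the stop key is exclusive.

```python
from bisect import bisect_left


def toy_scan(sorted_keys, start=None, stop=None):
    """Return the keys in [start, stop); a missing bound means 'unbounded'."""
    lo = 0 if start is None else bisect_left(sorted_keys, start)
    hi = len(sorted_keys) if stop is None else bisect_left(sorted_keys, stop)
    return sorted_keys[lo:hi]


keys = sorted([b"user010", b"user001", b"user003", b"user002"])

full_scan = toy_scan(keys)                                  # no bounds: every row
range_scan = toy_scan(keys, start=b"user002", stop=b"user010")
print(range_scan)
```

Because the keys are kept sorted, the bounded scan only needs two binary searches and never touches rows outside the range.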
40
What does a Put do?
Adds a new row, or updates an existing row, identified by a row key
41
Can you do multiple puts at one time?
Yes
42
What does a delete do?
Marks the data in the row identified by the row key as deleted by writing a tombstone marker.
43
When you do a delete is the data removed from HDFS immediately?
No. It is physically removed later, during a major compaction.
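
The difference between marking data as deleted and physically removing it can be sketched with a toy store. `ToyStore` and its methods are hypothetical, and the model ignores timestamps and versions. It assumes that the physical cleanup happens in a separate compaction step.

```python
class ToyStore:
    """Toy model: a delete writes a marker, a later compaction drops the data."""

    def __init__(self):
        self.cells = {}          # row key -> value, standing in for data on disk
        self.tombstones = set()  # row keys that have been marked as deleted

    def put(self, row, value):
        self.cells[row] = value
        self.tombstones.discard(row)

    def delete(self, row):
        # Only a marker is written; the stored value is left in place.
        self.tombstones.add(row)

    def get(self, row):
        # Reads hide rows that carry a delete marker.
        if row in self.tombstones:
            return None
        return self.cells.get(row)

    def major_compact(self):
        # The physical removal happens here, not at delete time.
        for row in self.tombstones:
            self.cells.pop(row, None)
        self.tombstones.clear()


store = ToyStore()
store.put(b"row1", b"value")
store.delete(b"row1")
hidden_but_present = store.get(b"row1") is None and b"row1" in store.cells
store.major_compact()
gone_after_compaction = b"row1" not in store.cells
print(hidden_but_present, gone_after_compaction)
```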
44
What does an Increment do?
Provides atomic counters. The value can be set initially or incremented, and it can be decremented by incrementing with a negative amount.
45
What server is responsible for the counters' consistency?
RegionServer
46
How is an increment cell stored?
As a 64-bit integer (a long)
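
Since all cell values are byte arrays, a 64-bit counter occupies exactly 8 bytes. The sketch below shows one way to model that in Python. It assumes a big-endian signed layout (the layout Java uses for a `long`), and `encode_long`, `decode_long`, and `toy_increment` are hypothetical helpers. The atomicity that the RegionServer guarantees is not modeled.

```python
import struct


def encode_long(n):
    """Pack an integer as 8 big-endian signed bytes."""
    return struct.pack(">q", n)


def decode_long(raw):
    """Unpack 8 big-endian signed bytes back into an integer."""
    return struct.unpack(">q", raw)[0]


def toy_increment(cell, amount):
    # A missing cell starts at 0; a negative amount decrements the counter.
    current = decode_long(cell) if cell is not None else 0
    return encode_long(current + amount)


cell = toy_increment(None, 5)   # counter is created with value 5
cell = toy_increment(cell, -2)  # negative increment
print(len(cell), decode_long(cell))
```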
47
What is the default number of versions kept by HBase?
3 before HBase 0.96; 1 in HBase 0.96 and later
48
How are versions stored?
They are sorted by timestamp in descending order, so the newest version comes first
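
Version ordering can be sketched as a list of (timestamp, value) pairs kept newest first and capped at a maximum length. `add_version` is a hypothetical helper, and the limit of 3 is just the number from the previous card. This toy model trims old versions as soon as they are written, which is a simplification.

```python
def add_version(versions, timestamp, value, max_versions=3):
    """Insert a version, keep the list newest-first, and drop the oldest extras."""
    versions.append((timestamp, value))
    versions.sort(key=lambda v: v[0], reverse=True)
    del versions[max_versions:]
    return versions


cell_versions = []
for ts, val in [(100, b"a"), (300, b"c"), (200, b"b"), (400, b"d")]:
    add_version(cell_versions, ts, val)

latest = cell_versions[0]  # a plain read would return the newest version
print(cell_versions)
```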
49
What is responsible for serving table data?
A region server