Revature Hadoop Flashcards

1
Q

What is a name node?

A

The name node is the HDFS master server: it manages the file system namespace and regulates access to files. It also manages the data nodes in an HDFS cluster. There is only one name node per cluster, though backup (standby) name nodes can be configured.

2
Q

What is a data node?

A

A data node manages the storage attached to the machine it runs on. Data is stored in blocks (128 MB by default), and each block is replicated on other data nodes in the cluster.
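
As a rough illustration of the block size, the sketch below shows how a file would be cut into 128 MB blocks. The function name `split_into_blocks` and the 300 MB file are made up for this example; this is plain arithmetic, not an HDFS API call.

```python
BLOCK_SIZE_MB = 128  # default block size from the card above


def split_into_blocks(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Return the sizes (in MB) of the blocks a file would occupy.

    Hypothetical helper for illustration only.
    """
    full_blocks, remainder = divmod(file_size_mb, block_size_mb)
    sizes = [block_size_mb] * full_blocks
    if remainder:
        # The last block only holds the leftover data.
        sizes.append(remainder)
    return sizes


blocks = split_into_blocks(300)
print(blocks)  # [128, 128, 44]
```

A 300 MB file therefore needs three blocks, and the last one is only partly filled.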

3
Q

What is YARN?

A

YARN stands for Yet Another Resource Negotiator. It is the resource management and job scheduling technology used with Hadoop's distributed processing framework.

4
Q

Explain the YARN component Node Manager

A

The Node Manager manages the application containers assigned to its node by the Resource Manager. It monitors their resource usage and reports it back to the Resource Manager.

5
Q

Explain containers in the context of the YARN Node Manager

A

A container is a set of resources, such as RAM, CPU, and storage, on a single node. Containers are allocated by the Resource Manager and monitored by the Node Manager.

6
Q

Explain the YARN component Resource Manager

A

The Resource Manager is the master component of YARN. It manages resource allocation and scheduling across all nodes in the Hadoop cluster.
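
To make the idea of cluster-wide allocation concrete, here is a toy sketch that grants containers on whichever node still has enough free memory and CPU. Everything in it (`Node`, `allocate_container`, the node names and sizes) is hypothetical, and the first-fit rule is an assumption chosen for simplicity; it is not how YARN's real schedulers decide.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Free resources on one worker node (hypothetical model)."""
    name: str
    free_memory_mb: int
    free_vcores: int


def allocate_container(nodes, memory_mb, vcores):
    """Grant a container on the first node that can fit the request.

    Returns the node name, or None when no node has enough resources.
    """
    for node in nodes:
        if node.free_memory_mb >= memory_mb and node.free_vcores >= vcores:
            # Reserve the resources so later requests see the reduced capacity.
            node.free_memory_mb -= memory_mb
            node.free_vcores -= vcores
            return node.name
    return None


cluster = [Node("node-1", 4096, 2), Node("node-2", 8192, 4)]
placements = [allocate_container(cluster, 3072, 2) for _ in range(4)]
print(placements)  # ['node-1', 'node-2', 'node-2', None]
```

The fourth request is refused because no node has two free cores left, which is the kind of bookkeeping a central allocator has to do for the whole cluster.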

7
Q

Explain MapReduce

A

MapReduce is a programming model and processing framework for distributed computing. It is used to process large data sets in parallel across clusters of machines. It is the core processing mechanism of Hadoop.

8
Q

Explain what map does in Hadoop

A

The input data is split into multiple smaller chunks, and map generates key-value pairs for each chunk. The pairs are then grouped together by key and passed to the reducer.
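
The map and grouping steps can be sketched in plain Python with a word count. This is a single-process illustration, not Hadoop code; `map_words`, `group_by_key`, and the sample chunks are invented for the example.

```python
from collections import defaultdict


def map_words(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def group_by_key(pairs):
    """Grouping step: collect all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


# Two chunks stand in for the pieces of a larger input.
chunks = ["big data big", "data cluster"]
pairs = [pair for chunk in chunks for pair in map_words(chunk)]
grouped = group_by_key(pairs)
print(grouped)  # {'big': [1, 1], 'data': [1, 1], 'cluster': [1]}
```

In a real cluster each chunk would be mapped on a different machine, and the grouping would move data between machines.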

9
Q

Explain what reduce does in Hadoop

A

The reducer processes each group of values that share a key and produces an output for that key.
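
A minimal sketch of the reduce step, assuming a word count where each key arrives with a list of ones. The function `reduce_counts` and the sample `grouped` input are hypothetical; a different job would use a different reduce function.

```python
def reduce_counts(key, values):
    """Reduce step: collapse all values for one key into a single result."""
    return key, sum(values)


# Example input: values already grouped by key.
grouped = {"big": [1, 1], "data": [1, 1], "cluster": [1]}
results = dict(reduce_counts(key, values) for key, values in grouped.items())
print(results)  # {'big': 2, 'data': 2, 'cluster': 1}
```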

10
Q

How are data nodes fault tolerant?

A

They are fault tolerant through data replication: multiple copies of the data are stored on different nodes, so the data remains available if a node fails.
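
The effect of replication can be shown with a small simulation. The round-robin placement below is an assumption made for simplicity and is not HDFS's actual placement policy; `place_replicas`, `readable_blocks`, and the node and block names are all hypothetical.

```python
def place_replicas(block_ids, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement = {}
    for i, block in enumerate(block_ids):
        placement[block] = [nodes[(i + r) % len(nodes)] for r in range(replication)]
    return placement


def readable_blocks(placement, failed_nodes):
    """Return the blocks that still have a replica on a live node."""
    return [
        block
        for block, holders in placement.items()
        if any(node not in failed_nodes for node in holders)
    ]


nodes = ["dn1", "dn2", "dn3", "dn4"]
placement = place_replicas(["blk_0", "blk_1"], nodes)
print(placement)
# {'blk_0': ['dn1', 'dn2', 'dn3'], 'blk_1': ['dn2', 'dn3', 'dn4']}
print(readable_blocks(placement, failed_nodes={"dn1", "dn2"}))
# ['blk_0', 'blk_1']
```

Even with two of the four nodes down, both blocks can still be read because a third copy lives elsewhere.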

11
Q

How many name nodes exist in a cluster?

A

There is one name node per cluster.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What is the default number of replications for each block?

A

3 by default
