Revature Hadoop Flashcards
What is a name node?
The name node is the HDFS component that acts as the master server: it manages the file system namespace and regulates access to files. It also manages the data nodes in an HDFS cluster. There is only one active name node per cluster, but there can be multiple backup name nodes.
What is a data node?
A data node manages the storage attached to the machine it runs on. File data is stored in blocks (128 MB by default), and each block is replicated across other data nodes in the cluster.
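The block size above lends itself to a quick calculation. The following is a minimal sketch, not an HDFS API: the function names and the 1000 MB file are hypothetical, and the replication factor of 3 is the default covered in a later card.

```python
import math

BLOCK_SIZE_MB = 128   # HDFS default block size
REPLICATION = 3       # HDFS default replication factor

def block_count(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Number of blocks needed to hold a file of the given size."""
    return math.ceil(file_size_mb / block_size_mb)

def raw_storage_mb(file_size_mb, replication=REPLICATION):
    """Total disk space used across the cluster, counting every replica."""
    return file_size_mb * replication

file_size_mb = 1000  # hypothetical file size
print(block_count(file_size_mb))     # 8 blocks: 7 full ones plus a partial one
print(raw_storage_mb(file_size_mb))  # 3000 MB once every block is stored 3 times
```

Note that the last block only occupies as much space as the data it holds, which is why the raw storage is computed from the file size rather than from the block count.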
What is YARN?
YARN stands for Yet Another Resource Negotiator. It is the resource management and job scheduling technology used with Hadoop's distributed processing framework.
Explain the YARN component Node Manager
The Node Manager manages the application containers assigned to its node by the Resource Manager. It monitors their resource usage and reports it back to the Resource Manager.
Explain containers in the context of the YARN Node Manager
A container is a set of resources, such as RAM, CPU, and storage, on a single node. The resources are allocated by the Resource Manager and monitored by the Node Manager.
Explain the YARN component Resource Manager
The Resource Manager is the master component of YARN. It manages resource allocation and scheduling across all nodes in the Hadoop cluster.
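The relationship between the Resource Manager, nodes, and containers can be sketched with a toy model. This is an illustration only: the `Node` class, the `allocate` function, and the first-fit rule are assumptions made for the example, whereas real YARN uses pluggable schedulers such as the Capacity Scheduler and the Fair Scheduler.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A worker node's free resources (hypothetical model, not a YARN class)."""
    name: str
    free_memory_mb: int
    free_vcores: int
    containers: list = field(default_factory=list)

def allocate(nodes, memory_mb, vcores):
    """Toy Resource Manager: place a container on the first node with room.

    Returns the chosen node's name, or None if no node can fit the request.
    """
    for node in nodes:
        if node.free_memory_mb >= memory_mb and node.free_vcores >= vcores:
            node.free_memory_mb -= memory_mb
            node.free_vcores -= vcores
            node.containers.append((memory_mb, vcores))
            return node.name
    return None

cluster = [Node("node-1", 4096, 2), Node("node-2", 8192, 4)]
first = allocate(cluster, 3072, 2)   # fits on node-1
second = allocate(cluster, 3072, 2)  # node-1 has no vcores left, so node-2
third = allocate(cluster, 8192, 4)   # no node has this much free any more
print(first, second, third)          # node-1 node-2 None
```

The point of the sketch is the division of labor: one central allocator decides where containers go, while each node only tracks the resources of the containers it hosts.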
Explain MapReduce
MapReduce is a programming model and processing framework for distributed computing. It is used to process large data sets in parallel across clusters of machines. It is the core processing mechanism of Hadoop.
Explain what map does in Hadoop
The input data is split into multiple smaller chunks, and the map step generates key-value pairs for each chunk. The pairs are then grouped together by key to be passed to the reducer.
Explain what reduce does in Hadoop
The reducer processes each group of values that share a key and produces an output for that key.
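The map, group-by-key, and reduce steps can be illustrated with a word count. This is a single-process sketch in plain Python, not Hadoop code: the function names and sample lines are made up for the example, and a real job would run many mappers and reducers in parallel across the cluster.

```python
from collections import defaultdict

def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group values by key, as happens between the map and reduce steps."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reducer: combine all values for one key into a single result."""
    return key, sum(values)

lines = ["the quick brown fox", "the lazy dog", "the fox"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts["the"], counts["fox"])  # 3 2
```

Each mapper only ever sees its own chunk and each reducer only sees one key's values, which is what lets the work be spread across machines.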
How are data nodes fault tolerant?
They are fault tolerant through data replication: multiple copies of each block are kept on different data nodes, so the data remains available if a node fails.
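A small simulation shows why replication keeps data available. It is a sketch under simplifying assumptions: the round-robin placement, the node names, and both function names are hypothetical, and real HDFS placement is also rack-aware.

```python
def place_replicas(block_ids, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin.

    Assumes replication <= len(nodes) so the chosen nodes are distinct.
    """
    placement = {}
    for i, block in enumerate(block_ids):
        placement[block] = {nodes[(i + r) % len(nodes)] for r in range(replication)}
    return placement

def available_blocks(placement, failed_nodes):
    """Blocks that still have at least one replica on a live node."""
    return {b for b, holders in placement.items() if holders - failed_nodes}

nodes = ["dn1", "dn2", "dn3", "dn4"]
placement = place_replicas(["blk_0", "blk_1", "blk_2"], nodes)

# With 3 replicas per block, losing two nodes still leaves every block readable.
survivors = available_blocks(placement, failed_nodes={"dn1", "dn2"})
print(sorted(survivors))  # ['blk_0', 'blk_1', 'blk_2']
```

A block only becomes unavailable when every node holding one of its replicas has failed, for example `blk_0` if `dn1`, `dn2`, and `dn3` all go down.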
How many name nodes exist in a cluster?
There is one active name node per cluster; any additional name nodes act as backups.
What is the default number of replications for each block?
The default is 3 replicas of each block.