MapReduce & YARN Flashcards
JobTracker Description and responsibilities
Single master process coordinating all jobs on the cluster
- Assigns map and reduce tasks to TaskTrackers.
- Monitors job progress.
TaskTrackers Description and responsibilities
Subordinate processes executing assigned tasks
Each one runs tasks in a fixed number of map and reduce slots on a data node.
- Execute map and reduce tasks.
- Report progress to JobTracker.
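The division of work between the JobTracker and the TaskTrackers can be sketched as a small simulation. This is only an illustration, not Hadoop code: the class names mirror the flashcard terms, while the methods (`submit`, `schedule_round`, `execute`, `report`) and the assumption that every task finishes within one scheduling round are hypothetical.

```python
from collections import deque


class TaskTracker:
    """Hypothetical worker with a fixed number of slots."""

    def __init__(self, name, slots):
        self.name = name
        self.slots = slots
        self.running = []

    def free_slots(self):
        return self.slots - len(self.running)

    def execute(self, task):
        # Occupy one slot with the assigned task.
        self.running.append(task)

    def report(self):
        # Assumption: every running task finishes within one round.
        done, self.running = self.running, []
        return done


class JobTracker:
    """Hypothetical single master that assigns tasks and tracks progress."""

    def __init__(self, trackers):
        self.trackers = trackers
        self.pending = deque()
        self.completed = []

    def submit(self, tasks):
        self.pending.extend(tasks)

    def schedule_round(self):
        # Assign pending tasks to any tracker that still has a free slot.
        for tracker in self.trackers:
            while self.pending and tracker.free_slots() > 0:
                tracker.execute(self.pending.popleft())
        # Collect progress reports from every tracker.
        for tracker in self.trackers:
            self.completed.extend(tracker.report())


job_tracker = JobTracker([TaskTracker("tt1", 2), TaskTracker("tt2", 2)])
job_tracker.submit([f"map-{i}" for i in range(5)])

rounds = 0
while job_tracker.pending:
    job_tracker.schedule_round()
    rounds += 1

print(rounds, len(job_tracker.completed))
```

With four slots in total and five tasks, the fifth task has to wait for a second round, which is the kind of bottleneck a fixed slot count creates.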
Slot
A slot represents the capacity to run one task (map or reduce) at a given point in time.
Dynamic approach:
A job can request the resources it needs rather than an individual slot.
Slot flexibility:
Jobs never leave their original slots and cannot be moved to free slots.
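A minimal sketch of why fixed slots can waste capacity compared with the dynamic approach. The function names, the node sizes, and the choice to model resources as memory only are assumptions made for illustration.

```python
def run_with_slots(map_slots, reduce_slots, map_tasks, reduce_tasks):
    """Fixed slots: a map task may only use a map slot, a reduce task only a reduce slot."""
    running_maps = min(map_slots, map_tasks)
    running_reduces = min(reduce_slots, reduce_tasks)
    return running_maps + running_reduces


def run_with_pool(total_memory_mb, requests_mb):
    """Dynamic approach: each task requests what it needs from one shared pool."""
    free = total_memory_mb
    running = 0
    for need in requests_mb:
        if need <= free:
            free -= need
            running += 1
    return running


# Hypothetical node: 4 map slots and 4 reduce slots, 8 pending map tasks, no reduce tasks.
slot_running = run_with_slots(4, 4, map_tasks=8, reduce_tasks=0)

# The same node viewed as an 8192 MB pool, each task asking for 1024 MB.
pool_running = run_with_pool(8192, [1024] * 8)

print(slot_running, pool_running)
```

In the slot model the four reduce slots sit idle while four map tasks wait. In the pool model all eight tasks run at once.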
Limitations of classical MapReduce
Scalability
Resource utilization
Support of workloads different from MapReduce.
JobTracker responsibilities
- Management of computational resources in the cluster
- Coordination of all tasks running on a cluster
What does the acronym YARN stand for?
Yet Another Resource Negotiator
Analytics Architecture
Edge node
Network switches
Data nodes
Properties: Not only SQL-based
High scalability, availability, and flexibility
Compute and storage in the same box for reducing network latency
Right design for semi-structured and unstructured data
Data and Application are in the same machine (Data nodes)
Computer cluster
A group of linked computers working together so closely that in many respects they form a single computer
Computer cluster advantages
High availability
Load balancing: work is redistributed to another computer in the cluster
Scalability under increasing load
Flexibility
Clustered file systems (CFS)
Comprise nodes connected via a network
Store data with redundancy
Store new data with replication
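Storing data with replication can be sketched as follows. The round-robin placement, the node and block names, and the helper functions are hypothetical and do not describe the placement policy of any particular file system.

```python
def place_replicas(block_ids, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin (illustrative policy)."""
    if replication > len(nodes):
        raise ValueError("replication factor exceeds the number of nodes")
    placement = {}
    for i, block in enumerate(block_ids):
        placement[block] = [nodes[(i + r) % len(nodes)] for r in range(replication)]
    return placement


def readable_after_failure(placement, failed_node):
    """True if every block still has at least one replica on a surviving node."""
    return all(
        any(node != failed_node for node in replicas)
        for replicas in placement.values()
    )


nodes = ["node1", "node2", "node3", "node4"]
placement = place_replicas(["b0", "b1", "b2"], nodes, replication=2)

print(placement)
print(readable_after_failure(placement, "node2"))
```

With two replicas per block, losing any single node still leaves one readable copy of every block, which is the redundancy the card refers to.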