Big data Flashcards
(9 cards)
1
Q
What are the 3 Vs of big data?
A
- Volume
- Variety
- Velocity
2
Q
What are the phases of a MapReduce job?
A
- Input
- Map
- Shuffle & Sort
- Reduce
- Output
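The phases above can be sketched in a single process with a word count. This is a minimal illustration, not Hadoop code: the function names (map_phase, shuffle_and_sort, reduce_phase, run_job) are hypothetical, and a real job distributes each phase across many nodes.

```python
from itertools import groupby
from operator import itemgetter


def map_phase(line):
    # Map: emit a (word, 1) pair for every word in one input record.
    return [(word.lower(), 1) for word in line.split()]


def shuffle_and_sort(pairs):
    # Shuffle & Sort: order the pairs by key, then collect all values per key.
    ordered = sorted(pairs, key=itemgetter(0))
    return [
        (key, [value for _, value in group])
        for key, group in groupby(ordered, key=itemgetter(0))
    ]


def reduce_phase(key, values):
    # Reduce: combine all values of one key into a single result.
    return key, sum(values)


def run_job(lines):
    # Input -> Map -> Shuffle & Sort -> Reduce -> Output
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_and_sort(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped)


if __name__ == "__main__":
    print(run_job(["big data big", "data velocity"]))
```

The reducer only ever sees one key together with all of its values, which is why the shuffle and sort step has to sit between map and reduce.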
3
Q
What does the NameNode do?
A
Metadata management: maintains the file system namespace and the mapping of file blocks to DataNodes
4
Q
What do DataNodes do?
A
Store the actual file data, which is split into blocks (chunks)
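Splitting a file into chunks can be sketched as follows. This is an illustration only: split_into_blocks is a hypothetical helper, the block size in the example is tiny so the result is easy to read, and replication and placement of blocks across nodes are left out.

```python
from typing import List


def split_into_blocks(data: bytes, block_size: int) -> List[bytes]:
    # Cut the file contents into consecutive blocks of at most block_size bytes.
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


if __name__ == "__main__":
    blocks = split_into_blocks(b"abcdefghij", 4)
    # The last block may be shorter than block_size.
    print(blocks)
```

Joining the blocks in order gives back the original file contents.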
5
Q
What does the JobTracker do?
A
Job scheduling and resource management
6
Q
What do TaskTrackers do?
A
Execution of map/reduce tasks
7
Q
How to handle TaskTracker failure?
A
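- detect the failure via missing heartbeats
- re-execute all in-progress tasks of the failed node on other nodes
- re-execute finished map tasks of still-running MapReduce jobs, because their output was stored on the failed node's local disk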
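The recovery steps above can be sketched as follows. This is a simplified illustration under stated assumptions: the Task record, the 10-second HEARTBEAT_TIMEOUT, and both function names are hypothetical and are not taken from Hadoop's code.

```python
from dataclasses import dataclass

HEARTBEAT_TIMEOUT = 10.0  # seconds; assumed value for illustration


@dataclass
class Task:
    task_id: str
    kind: str            # "map" or "reduce"
    state: str           # "running" or "finished"
    node: str            # TaskTracker the task was assigned to
    job_running: bool = True  # False once the owning job has completed


def failed_nodes(last_heartbeat, now, timeout=HEARTBEAT_TIMEOUT):
    # A node counts as failed when its last heartbeat is older than the timeout.
    return {node for node, seen in last_heartbeat.items() if now - seen > timeout}


def tasks_to_reexecute(tasks, dead):
    rerun = []
    for task in tasks:
        if task.node not in dead:
            continue
        if task.state == "running":
            # In-progress tasks on the failed node are lost, whatever their kind.
            rerun.append(task.task_id)
        elif task.kind == "map" and task.state == "finished" and task.job_running:
            # Finished map output is needed again while the job is still running.
            rerun.append(task.task_id)
    return rerun
```

In this sketch, finished reduce tasks are not re-executed, matching the card, which lists only in-progress tasks and finished map tasks.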
8
Q
How to handle JobTracker failure?
A
- the JobTracker is a single point of failure; after a restart, jobs are resumed from the execution log
9
Q
What is a straggler?
A
A task that runs unusually slowly on a TaskTracker and so delays completion of the whole job
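One way to picture a straggler is a task whose running time is far above that of its peers. The sketch below uses an assumed heuristic (elapsed time greater than 1.5 times the median); find_stragglers and the factor are hypothetical and not a description of how any particular scheduler detects stragglers.

```python
from statistics import median


def find_stragglers(elapsed, factor=1.5):
    # elapsed maps task id -> seconds the task has been running so far.
    if not elapsed:
        return []
    typical = median(elapsed.values())
    # Flag tasks that have run much longer than the typical task.
    return sorted(task for task, secs in elapsed.items() if secs > factor * typical)


if __name__ == "__main__":
    print(find_stragglers({"t1": 10, "t2": 12, "t3": 11, "t4": 40}))
```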