MapReduce Algorithm Flashcards
1
Q
Explain data-level parallelism
A
The same computation is applied simultaneously to different chunks of the data.
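As a rough illustration of the idea, the sketch below splits a list into chunks and applies one function to every chunk through a pool of workers. The names `chunked`, `sum_of_squares` and `parallel_sum_of_squares` are hypothetical, chosen only for this example. A thread pool is used to keep the sketch short; for CPU-bound work in CPython a process pool would be the more usual choice.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(data, size):
    """Split data into consecutive chunks of at most `size` items."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def sum_of_squares(chunk):
    """The single operation that is applied to every chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, chunk_size=3, workers=4):
    """Apply the same operation to each chunk, then combine the partial results."""
    chunks = chunked(data, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)
```

Each chunk is processed independently of the others, which is what lets the work be spread across workers.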
2
Q
What is MapReduce?
A
A systematic approach to algorithm design that is inherently parallelisable: the computation is expressed as a map step followed by a reduce step.
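A word count is the usual way to picture this. The sketch below runs in a single process and only shows the shape of the model (map, group by key, reduce). The function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are assumptions made for this example, not part of any library.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group all emitted values by key so each key can be reduced on its own."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map over every document, group by word, then reduce each group."""
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

Each call to `map_phase` depends on one document only, and each call to `reduce_phase` depends on one key only, so both steps could be distributed across machines.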
3
Q
What are the characteristics of MapReduce?
A
1. Data aware: data location is considered when scheduling jobs
2. Simple: the programmer is freed from parallelization and concurrency control
3. Manageable: the programmer is freed from data management
4. Scalable: capacity grows by simply increasing the number of nodes
5. Fault tolerant: built-in redundancy
6. Efficient and automatic distribution of data and workload across machines
4
Q
What is Hadoop?
A
A framework that implements MapReduce with a single master node and many worker nodes. A client submits a job to the master node, which splits the job into map and reduce tasks and assigns them to worker nodes.
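The master/worker flow can be pictured with a toy, single-process sketch. This is not Hadoop's API: the `Master` and `Worker` classes, the `submit` method and the round-robin assignment are all assumptions made for illustration.

```python
from collections import defaultdict


class Worker:
    """Hypothetical worker node: runs whatever task the master hands it."""

    def __init__(self, name):
        self.name = name
        self.completed = []  # names of the tasks this worker has run

    def run(self, task_name, func, *args):
        self.completed.append(task_name)
        return func(*args)


class Master:
    """Hypothetical master node: splits a job into tasks and assigns them."""

    def __init__(self, workers):
        self.workers = workers
        self._next = 0

    def _assign(self):
        # Round-robin assignment, an assumption made to keep the sketch simple.
        worker = self.workers[self._next % len(self.workers)]
        self._next += 1
        return worker

    def submit(self, splits, map_func, reduce_func):
        # One map task per input split.
        grouped = defaultdict(list)
        for i, split in enumerate(splits):
            for key, value in self._assign().run(f"map-{i}", map_func, split):
                grouped[key].append(value)
        # One reduce task per key.
        results = {}
        for key in sorted(grouped):
            worker = self._assign()
            results[key] = worker.run(f"reduce-{key}", reduce_func, key, grouped[key])
        return results
```

A client would call something like `Master([Worker("w1"), Worker("w2")]).submit(splits, map_func, reduce_func)`, with the master deciding which worker runs each map or reduce task.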