MapReduce Algorithm Flashcards

Question 1

Q

Explain data-level parallelism

Answer

A

The same computation is applied independently, and at the same time, to different chunks of the data.
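A minimal Python sketch of this idea, for illustration only. The function names (chunked, sum_of_squares, parallel_sum_of_squares) and the chunk size are assumptions made for this example, and a thread pool stands in for separate processors or machines.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(data, size):
    """Split data into consecutive chunks of at most `size` items."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def sum_of_squares(chunk):
    """The one computation that every chunk receives."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, chunk_size=3, workers=4):
    """Apply the same function to each chunk, then combine the partial results."""
    chunks = chunked(data, chunk_size)
    # Illustrative assumption: threads play the role of parallel workers here.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)
```

Each chunk is processed without reference to any other chunk, which is what allows the work to be spread across workers.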

Question 2

Q

What is MapReduce?

Answer

A

A systematic approach to algorithm design in which a computation is expressed as a map step followed by a reduce step, which makes it inherently parallelisable.
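A word count is a common way to illustrate the approach. The sketch below runs in a single process and only shows the shape of the computation; the names map_phase, shuffle, reduce_phase, and word_count are assumptions chosen for this example, not part of any library.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group all emitted values by key so each key can be reduced on its own."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map over every document, group by word, then reduce each group."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Every call to map_phase depends on one document only, and every call to reduce_phase depends on one key only, so both steps could be distributed across workers.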

Question 3

Q

What are the characteristics of MapReduce?

Answer

A

1. Data aware: data location is considered when scheduling jobs
2. Simple: the programmer is freed from parallelization and concurrency control
3. Manageable: the programmer is freed from data management
4. Scalable: simply increase the number of nodes
5. Fault tolerant: built-in redundancy
6. Efficient: automatic distribution of data and workload across machines

Question 4

Q

What is Hadoop?

Answer

A

An open-source framework that implements MapReduce with a single master node and many worker nodes. A client submits a job to the master node, and the master splits the job into map and reduce tasks and assigns them to the worker nodes.
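The master/worker flow can be pictured with a toy simulation. This is not Hadoop's API: the Master and Worker classes, the submit method, and the round-robin assignment are all hypothetical, and the tasks run one after another in a single process rather than on separate machines.

```python
from collections import defaultdict


class Worker:
    """Hypothetical worker node that runs whatever task the master hands it."""

    def __init__(self, name):
        self.name = name
        self.completed = []  # kinds of tasks this worker has run, in order

    def run(self, task_kind, func, payload):
        self.completed.append(task_kind)
        return func(payload)


class Master:
    """Hypothetical master node that splits a job into map and reduce tasks."""

    def __init__(self, workers):
        self.workers = workers
        self._next = 0

    def _assign(self):
        # Assumption: plain round-robin. A real scheduler would also weigh
        # where the data lives and which workers are healthy.
        worker = self.workers[self._next % len(self.workers)]
        self._next += 1
        return worker

    def submit(self, splits, map_func, reduce_func):
        # One map task per input split.
        groups = defaultdict(list)
        for split in splits:
            for key, value in self._assign().run("map", map_func, split):
                groups[key].append(value)
        # One reduce task per key.
        result = {}
        for key, values in groups.items():
            result[key] = self._assign().run("reduce", reduce_func, values)
        return result
```

A client would call submit with the input splits, a map function that returns (key, value) pairs, and a reduce function that combines a list of values, for example `Master([Worker("w1"), Worker("w2")]).submit(["a b a", "b c"], lambda s: [(w, 1) for w in s.split()], sum)`.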
