MapReduce Algorithm Flashcards

1
Q

Explain data-level parallelism

A

The same computation is applied independently to separate chunks of the data, so the chunks can be processed in parallel.
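A minimal sketch of the idea, assuming a sum of squares as the computation. The names `chunked`, `sum_of_squares`, and `parallel_sum_of_squares` are hypothetical, and a thread pool is used only to keep the example self-contained; for CPU-bound work in CPython a process pool would be the more usual choice.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def sum_of_squares(chunk):
    """The single computation that is applied to every chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, chunk_size=3, workers=4):
    """Apply the same function to each chunk, then combine the partial results."""
    chunks = chunked(data, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)
```

Each chunk is handled without reference to any other chunk, which is what makes the work safe to run side by side.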

2
Q

What is MapReduce?

A

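A systematic approach to algorithm design that is inherently parallelisable: the computation is expressed as a map step applied to each input record and a reduce step that combines the mapped results for each key.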
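As an illustration, word counting can be written in this style. This is a single-process sketch under assumed names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`), not the API of any particular framework; it only shows the shape of the map, group-by-key, and reduce steps.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group all emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values emitted for one key."""
    return key, sum(values)


def word_count(documents):
    """Run map over every document, group by word, then reduce each group."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

Because `map_phase` looks at one document at a time and `reduce_phase` at one key at a time, both steps could be spread over many machines without changing the code of either function.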

3
Q

What are the characteristics of MapReduce?

A

1. Data aware: data location is considered when scheduling jobs.
2. Simple: the programmer is freed from parallelization and concurrency control.
3. Manageable: the programmer is freed from data management.
4. Scalable: capacity grows by simply increasing the number of nodes.
5. Fault tolerant: redundancy is built in.
6. Efficient: data and workload are distributed across machines automatically.

4
Q

What is Hadoop?

A

It implements the MapReduce algorithm with a single master node and many worker nodes. A client submits a job to the master node, and the master splits the job into map and reduce tasks and assigns them to worker nodes.
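The master/worker split can be imitated in one process. The sketch below is not Hadoop's API: `run_job`, `map_task`, and `reduce_task` are hypothetical names, a thread pool stands in for the worker nodes, and word counting is assumed as the job.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_task(split):
    """Worker side: count the words within one input split."""
    counts = defaultdict(int)
    for line in split:
        for word in line.split():
            counts[word] += 1
    return dict(counts)


def reduce_task(partition):
    """Worker side: sum the partial counts for each word in one partition."""
    return {word: sum(values) for word, values in partition.items()}


def run_job(lines, num_map_tasks=2, num_reduce_tasks=2, num_workers=3):
    """Master side: split the job into tasks and hand them to workers."""
    # One input split per map task.
    splits = [lines[i::num_map_tasks] for i in range(num_map_tasks)]
    with ThreadPoolExecutor(max_workers=num_workers) as workers:
        map_outputs = list(workers.map(map_task, splits))
        # Route each word to one reduce task so all its counts meet in one place.
        partitions = [defaultdict(list) for _ in range(num_reduce_tasks)]
        for output in map_outputs:
            for word, count in output.items():
                partitions[hash(word) % num_reduce_tasks][word].append(count)
        reduce_outputs = list(workers.map(reduce_task, partitions))
    # Partitions hold disjoint words, so merging cannot overwrite a count.
    result = {}
    for output in reduce_outputs:
        result.update(output)
    return result
```

A call such as `run_job(["a b", "b c", "a a"])` plays the role of the client submitting a job; the pool size, task counts, and partitioning rule are illustrative choices only.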
