MapReduce Flashcards

1
Q

What are MapReduce and Hadoop MapReduce?

A

MapReduce is a programming model for processing and generating large datasets.

Hadoop MapReduce is an implementation of this model.

2
Q

How does MapReduce work?

A

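1) Map: iterate over a large number of input records
2) Extract the data of interest from each record as key-value pairs
3) Shuffle and reduce: group the pairs by key and aggregate the values for each key
4) Save the results in HDFS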
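
The four steps above can be sketched in plain Python as a word count. This is only an illustration of the model, not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for this card, and a returned dict stands in for the output that a real job would write to HDFS.

```python
from collections import defaultdict


def map_phase(records):
    """Steps 1 and 2: iterate over the records and emit (word, 1) pairs."""
    for record in records:
        for word in record.split():
            yield (word.lower(), 1)


def shuffle(pairs):
    """Group the values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Step 3: aggregate the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(records):
    """Step 4 stand-in: return the result instead of writing it to HDFS."""
    return reduce_phase(shuffle(map_phase(records)))


if __name__ == "__main__":
    print(word_count(["a b a", "b c"]))
```

In a real cluster the map and reduce functions run in parallel on many machines, and the shuffle moves data across the network; here everything runs in one process.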

3
Q

What is a combiner and how does it work with MapReduce?

A

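A combiner acts as a mini-reducer that runs on the map side to pre-aggregate each map task's output. It is safe to use only when the reduce function is associative and commutative (for example, sum or max).

It runs before the shuffle, so the data is already partially aggregated when it reaches the reducers.

It shrinks the intermediate data and therefore reduces network traffic.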
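
A small Python sketch of the effect, assuming a word count where the aggregation is a sum. The names `map_task`, `combine`, and `reduce_all` are hypothetical helpers for this card, not Hadoop classes; each inner list stands for the output of one map task.

```python
from collections import Counter


def map_task(records):
    """Emit a (word, 1) pair for every word seen by one map task."""
    return [(word, 1) for record in records for word in record.split()]


def combine(pairs):
    """Pre-aggregate one map task's output locally, before the shuffle."""
    local = Counter()
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def reduce_all(map_outputs):
    """Sum the values per key across the outputs of all map tasks."""
    totals = Counter()
    for pairs in map_outputs:
        for key, value in pairs:
            totals[key] += value
    return dict(totals)


if __name__ == "__main__":
    splits = [["a a a b"], ["a b b"]]
    raw = [map_task(split) for split in splits]
    combined = [combine(pairs) for pairs in raw]
    # Fewer pairs cross the network, but the final result is unchanged.
    print(sum(len(p) for p in raw), sum(len(p) for p in combined))
    print(reduce_all(raw) == reduce_all(combined))
```

Because addition is associative and commutative, summing partial sums gives the same totals as summing the raw pairs. An average, by contrast, could not be combined this way without also carrying the counts.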

4
Q

What is partitioning in MapReduce?

A

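The partitioner directs each map output to the appropriate reducer by applying a function to the record's key. In Hadoop the default is a hash of the key modulo the number of reducers, so all records with the same key reach the same reducer.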
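
A minimal Python sketch of hash partitioning. The functions `partition` and `route` are hypothetical names for this card, and `zlib.crc32` is used only as a deterministic stand-in for the key hash; Hadoop itself hashes the key object in Java.

```python
import zlib
from collections import defaultdict


def partition(key: str, num_reducers: int) -> int:
    """Map a key to a reducer index in the range [0, num_reducers)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def route(pairs, num_reducers: int):
    """Send every (key, value) pair to the reducer chosen by its key."""
    buckets = defaultdict(list)
    for key, value in pairs:
        buckets[partition(key, num_reducers)].append((key, value))
    return dict(buckets)


if __name__ == "__main__":
    pairs = [("apple", 1), ("pear", 1), ("apple", 1), ("plum", 1)]
    for reducer, assigned in sorted(route(pairs, 3).items()):
        print(reducer, assigned)
```

Since the reducer index depends only on the key, both `("apple", 1)` pairs always land in the same bucket, which is what lets a single reducer see every value for a given key.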
