Map-Reduce Flashcards

1
Q

Objective

A

Sequentially read a lot of data

2
Q

Phases

A

Map phase
Group by key
Reduce phase
Write the result
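
The four phases above can be sketched in a single process with the usual word-count example. This is only an illustration, not a distributed implementation; the names `map_fn`, `group_by_key`, `reduce_fn`, and `run_word_count` are made up for this sketch.

```python
from collections import defaultdict


def map_fn(document):
    """Map phase: emit a (word, 1) pair for every word in the input."""
    for word in document.split():
        yield word.lower(), 1


def group_by_key(pairs):
    """Group-by-key phase: collect all values emitted for the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_fn(key, values):
    """Reduce phase: combine the values of one key into a single result."""
    return key, sum(values)


def run_word_count(documents):
    pairs = [pair for doc in documents for pair in map_fn(doc)]
    groups = group_by_key(pairs)
    # "Write the result": here we simply return a dict, ordered by key.
    return dict(reduce_fn(key, values) for key, values in sorted(groups.items()))


result = run_word_count(["the cat sat", "the dog sat down"])
print(result)  # {'cat': 1, 'dog': 1, 'down': 1, 'sat': 2, 'the': 2}
```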

3
Q

Map

A

Read the input and produce (key, value) pairs
Word-count example: for each word, output the pair (word, 1)

4
Q

Sort & Shuffle

A

performed by the system
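
A rough sketch of what the system does in this step, assuming keys are strings and each key is assigned to a reducer by a hash of the key modulo the number of reducers (a common choice, not something this card states). `partition` and `shuffle_and_sort` are hypothetical names.

```python
import zlib
from itertools import groupby
from operator import itemgetter


def partition(key, num_reducers):
    # Deterministic hash, so the same key always lands on the same reducer.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle_and_sort(map_output, num_reducers):
    """Route each (key, value) pair to a reducer, then sort and group by key."""
    buckets = [[] for _ in range(num_reducers)]
    for key, value in map_output:
        buckets[partition(key, num_reducers)].append((key, value))

    reducer_inputs = []
    for bucket in buckets:
        bucket.sort(key=itemgetter(0))  # sort by key within one reducer
        grouped = [
            (key, [value for _, value in group])
            for key, group in groupby(bucket, key=itemgetter(0))
        ]
        reducer_inputs.append(grouped)
    return reducer_inputs


map_output = [("the", 1), ("cat", 1), ("the", 1), ("sat", 1)]
reducer_inputs = shuffle_and_sort(map_output, num_reducers=2)
```

Each element of `reducer_inputs` is the sorted list of (key, values) groups that one reducer would receive.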

5
Q

Reduce

A

Collect the values that share a key and produce the result for that key

6
Q

What is the programmer responsible for?

A
  • Map function
  • Reduce function
7
Q

What is the MapReduce system responsible for?

A
  • Partitioning the input data
  • Scheduling the program’s execution across a set of machines
  • Performing the sort by key & shuffle step
  • Handling machine failures
  • Managing required inter-machine communication
8
Q

Master node

A

The master node coordinates the execution:
  • Tracks task status: idle, in-progress, or completed
  • Idle tasks get scheduled as workers become available
  • When a map task completes, it sends the master the locations and sizes of its intermediate files, one for each reducer
  • The master pushes this information to the reducers
  • The master pings workers periodically to detect failures
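
The bookkeeping described above might look roughly like this. It is a simplified, single-process sketch: the class names, the `schedule` and `map_completed` methods, and the file locations are all invented for illustration, and pushing the information to reducers is reduced to storing it in a dict.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Status(Enum):
    IDLE = "idle"
    IN_PROGRESS = "in-progress"
    COMPLETED = "completed"


@dataclass
class Task:
    task_id: str
    status: Status = Status.IDLE
    worker: Optional[str] = None


@dataclass
class Master:
    tasks: dict = field(default_factory=dict)
    # reducer index -> list of (location, size) of intermediate files
    intermediate: dict = field(default_factory=dict)

    def add_task(self, task_id):
        self.tasks[task_id] = Task(task_id)

    def schedule(self, worker):
        """Give the first idle task to a worker that became available."""
        for task in self.tasks.values():
            if task.status is Status.IDLE:
                task.status = Status.IN_PROGRESS
                task.worker = worker
                return task
        return None

    def map_completed(self, task_id, files):
        """files: {reducer_index: (location, size)} reported by the map task."""
        self.tasks[task_id].status = Status.COMPLETED
        for reducer, info in files.items():
            self.intermediate.setdefault(reducer, []).append(info)


master = Master()
master.add_task("map-0")
master.add_task("map-1")
task = master.schedule("worker-A")
master.map_completed(
    task.task_id,
    {0: ("worker-A:/tmp/m0-r0", 120), 1: ("worker-A:/tmp/m0-r1", 80)},
)
```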

9
Q

Worker node

A

A worker node performs map or reduce tasks, as requested by the master.

10
Q

Map worker failure

A

When the failure of a worker is detected, its map tasks are restarted on a different worker.
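
A minimal sketch of this recovery step, assuming failures are detected when a worker has not answered a ping within some timeout. The function names and the task dictionary layout are hypothetical. The sketch resets completed map tasks as well as in-progress ones, which follows the original MapReduce design, where map output is stored on the worker's local disk and is lost with the machine.

```python
def detect_failed_workers(last_reply, now, timeout):
    """Workers whose last ping reply is older than `timeout` seconds."""
    return {worker for worker, seen in last_reply.items() if now - seen > timeout}


def reschedule_map_tasks(tasks, failed_workers):
    """Reset map tasks of failed workers to idle so they can be rescheduled."""
    reset = []
    for task_id, task in tasks.items():
        if task["kind"] == "map" and task["worker"] in failed_workers:
            task["status"] = "idle"  # eligible for scheduling again
            task["worker"] = None
            reset.append(task_id)
    return reset


tasks = {
    "map-0": {"kind": "map", "status": "completed", "worker": "w1"},
    "map-1": {"kind": "map", "status": "in-progress", "worker": "w2"},
    "map-2": {"kind": "map", "status": "in-progress", "worker": "w1"},
}
failed = detect_failed_workers({"w1": 0.0, "w2": 9.0}, now=10.0, timeout=5.0)
reset = reschedule_map_tasks(tasks, failed)
```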

11
Q

Reduce worker failure

A

The reduce task is restarted on another worker.

12
Q

Stragglers (slow workers)

A

If a task is taking too long to complete, a second copy is launched on another worker. The first result to arrive is used.
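
The idea can be imitated with threads standing in for workers. This is a toy sketch: `run_task`, `run_with_backup`, the delays, and the `patience` threshold are all assumptions made for the example.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_task(worker, delay):
    """Stand-in for a real task; `delay` simulates how slow the worker is."""
    time.sleep(delay)
    return worker, "result"


def run_with_backup(pool, primary_delay, backup_delay, patience):
    primary = pool.submit(run_task, "primary", primary_delay)
    done, _ = wait([primary], timeout=patience)
    if done:
        return primary.result()  # finished in time, no backup needed
    # The task is taking too long: launch a copy on another worker.
    backup = pool.submit(run_task, "backup", backup_delay)
    done, _ = wait([primary, backup], return_when=FIRST_COMPLETED)
    return next(iter(done)).result()  # first result wins


with ThreadPoolExecutor(max_workers=2) as pool:
    winner, output = run_with_backup(
        pool, primary_delay=0.6, backup_delay=0.05, patience=0.1
    )
```

With these delays the backup copy finishes first, so `winner` is `"backup"`; the slower copy's result is simply ignored.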

13
Q

Master failure

A

The MapReduce job is aborted and the client is notified.
