Week 11 (IO patterns) Flashcards

1
Q

Why might we not want to load or store data the way Hadoop does out of the box?

A

ingest data directly from the original source without first storing it in HDFS

feed MapReduce (MR) output straight to the next process

2
Q

What are the 2 general ways to modify how data is loaded from disk?

A

input format: configure how contiguous chunks of input are generated from blocks in HDFS

record reader: configure how records appear in the map phase

3
Q

What are the 2 general ways to modify how data is stored on disk?

A

output format: configure how the job's output is written out (for example, as files in HDFS)

record writer: configure how each output key/value pair is serialized

4
Q

What are the roles of the input format in Hadoop?

A

validate the job's input configuration (e.g., make sure the data is there)

split input blocks and files into logical chunks, each assigned to a map task

create the record reader that turns a raw input split into key/value pairs
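
The split-then-read flow above can be sketched in plain Python. This is not Hadoop's API: `compute_splits` and `read_records` are hypothetical names, and the boundary rule (a split skips its partial first line and finishes its last line past the split end) is a simplified assumption about how a line-oriented record reader behaves.

```python
import io

def compute_splits(total_bytes, split_size):
    """Return (start, length) byte ranges that cover the whole input."""
    return [(start, min(split_size, total_bytes - start))
            for start in range(0, total_bytes, split_size)]

def read_records(data, start, length):
    """Yield (byte_offset, line) pairs for one byte-oriented split.

    A split that does not begin at byte 0 backs up one byte and discards
    everything up to the next newline; every split reads past its end to
    finish its last line. Each line is therefore emitted exactly once.
    """
    stream = io.BytesIO(data)
    pos = start
    if start != 0:
        stream.seek(start - 1)
        pos = start - 1 + len(stream.readline())
    end = start + length
    while pos < end:
        line = stream.readline()
        if not line:
            break
        yield pos, line.rstrip(b"\n").decode("utf-8")
        pos += len(line)

data = b"alpha\nbeta\ngamma\ndelta\n"
for start, length in compute_splits(len(data), 8):
    print((start, length), list(read_records(data, start, length)))
```

The splits know only byte ranges; it is the record reader that decides where a record starts and ends.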

5
Q

What type of view of the split does an InputSplit represent?

A

byte-oriented

6
Q

What is partition pruning?

A

configuring which files are loaded into MapReduce based on the name of each file
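
A minimal sketch of the idea, assuming files are named after the partition they hold (here a hypothetical `YYYY-MM-DD.log` naming scheme). `prune_by_name` is an illustrative helper, not a Hadoop function.

```python
from pathlib import PurePosixPath

def prune_by_name(paths, wanted_prefix):
    """Keep only the paths whose file name starts with wanted_prefix.

    Files that cannot contain relevant records are dropped before any
    of their bytes are read.
    """
    return [p for p in paths
            if PurePosixPath(p).name.startswith(wanted_prefix)]

# Hypothetical input directory, one file per day.
paths = [
    "/logs/2024-01-30.log",
    "/logs/2024-01-31.log",
    "/logs/2024-02-01.log",
]
january = prune_by_name(paths, "2024-01")
print(january)
```

Only the January files would be handed on as input; the February file is never opened.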

7
Q

What is the goal of a recommendation system?

A

predict the rating or preference that a user would give to an item

8
Q

What is collaborative filtering?

A

the process of identifying similar users and recommending what those similar users like

9
Q

In collaborative filtering, when are two users similar?

A

if their vectors are close according to some distance measure (e.g., Jaccard or cosine distance)
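
The two distance measures named above can be sketched as follows, assuming a user is represented either as a set of items (Jaccard) or as a numeric rating vector (cosine). The function names and the handling of empty inputs are choices made for this sketch.

```python
import math

def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| for two item sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumption: two empty sets count as identical
    return 1.0 - len(a & b) / len(union)

def cosine_distance(u, v):
    """1 - cos(angle between u and v) for equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    if norm == 0:
        return 1.0  # assumption: an all-zero vector is similar to nothing
    return 1.0 - dot / norm

print(jaccard_distance({"a", "b", "c"}, {"b", "c", "d"}))  # 0.5
print(cosine_distance([1, 0], [0, 1]))                     # 1.0
```

A distance near 0 means the two users are similar; a distance near 1 means they have little in common.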

10
Q

What is the Big-O cost of collaborative filtering, and what does it end up being in practice?

M = number of customers

N = number of product/catalog items

A

O(MN) in the worst case

ends up being closer to O(M+N), because most customers' vectors are very sparse

11
Q

What does item-to-item collaborative filtering do?

A

matches each of the user's purchased items to similar items

combines those similar items into a recommendation list
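
A small sketch of the two steps above, under stated assumptions: item similarity is taken to be the cosine similarity between the sets of users who bought each item, and scores for a candidate item are summed across the user's purchases. `build_similar_items`, `recommend`, and the sample purchases are all hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations

def build_similar_items(purchases):
    """Map each item to {other_item: similarity} from co-purchase counts.

    purchases: dict of user -> set of purchased items.
    """
    buyers = defaultdict(set)
    for user, items in purchases.items():
        for item in items:
            buyers[item].add(user)
    together = defaultdict(int)
    for items in purchases.values():
        for a, b in combinations(sorted(items), 2):
            together[(a, b)] += 1
    similar = defaultdict(dict)
    for (a, b), count in together.items():
        score = count / math.sqrt(len(buyers[a]) * len(buyers[b]))
        similar[a][b] = score
        similar[b][a] = score
    return similar

def recommend(user_items, similar, top_n=3):
    """Match each purchased item to similar items and merge into one list."""
    scores = defaultdict(float)
    for item in user_items:
        for other, score in similar.get(item, {}).items():
            if other not in user_items:
                scores[other] += score
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]

purchases = {
    "u1": {"book", "lamp"},
    "u2": {"book", "lamp", "pen"},
    "u3": {"book", "pen"},
    "u4": {"mug"},
}
similar = build_similar_items(purchases)
print(recommend({"lamp"}, similar))  # ['book', 'pen']
```

The similar-items table can be built offline, so answering a request only needs lookups for the items the user already has.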
