Week 4 Flashcards

1
Q

Collaborative filtering

A

Recommending complex items without understanding their nature, by finding similarities between users (Amy liked it and has similar preferences to Bob, so recommend it to Bob).
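A minimal sketch of the Amy/Bob idea, assuming a tiny made-up ratings table and cosine similarity as the measure of "similar preferences" (the card does not name a specific measure). The names `ratings`, `cosine_similarity` and `recommend` are hypothetical.

```python
from math import sqrt

# Hypothetical ratings (user -> {item: rating}); all names and numbers are made up.
ratings = {
    "amy": {"film_a": 5, "film_b": 4, "film_c": 5},
    "bob": {"film_a": 5, "film_b": 4},
    "cat": {"film_a": 1, "film_b": 5, "film_d": 4},
}

def cosine_similarity(u, v):
    """Cosine similarity computed over the items both users rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)

def recommend(target, ratings):
    """Suggest items the most similar user rated that the target has not."""
    others = [name for name in ratings if name != target]
    nearest = max(
        others,
        key=lambda name: cosine_similarity(ratings[target], ratings[name]),
    )
    return sorted(set(ratings[nearest]) - set(ratings[target]))

print(recommend("bob", ratings))
```

Note that nothing about the films themselves is used, only who rated what.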

2
Q

Content filtering

A

Find similarities between items and recommend based on those similarities.
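A minimal sketch of item-to-item similarity, assuming each item is described by a set of tags and using Jaccard overlap as the similarity measure (an assumption; the card does not specify one). The `features` table and function names are hypothetical.

```python
# Hypothetical item features (item -> set of tags); all names are made up.
features = {
    "film_a": {"sci-fi", "space", "action"},
    "film_b": {"sci-fi", "space", "drama"},
    "film_c": {"romance", "comedy"},
}

def jaccard(a, b):
    """Share of tags two items have in common (0 = none, 1 = identical)."""
    return len(a & b) / len(a | b)

def similar_items(liked, features):
    """Rank the other items by similarity to one the user already likes."""
    others = [item for item in features if item != liked]
    return sorted(
        others,
        key=lambda item: jaccard(features[liked], features[item]),
        reverse=True,
    )

print(similar_items("film_a", features))
```

Only item descriptions are needed here, not other users' ratings.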

3
Q

Collaborative filtering issues

A

Needs lots of data, as much as possible
With millions of items, needs lots of computing power
Requires at least one user who has already seen the movie (or done the activity in question)

4
Q

Content filtering requires little data to start. What are some issues?

A

Can be limited in scope
Does not present anything different, just items similar to what the user already likes

5
Q

Clustering helps come up with better what?

A

predictive models

6
Q

2 types of learning algorithm

A

Supervised: there is some outcome and you are trying to predict it
Unsupervised: doesn't predict anything, just groups data into similar groups, e.g. to build better predictive models or for market segmentation

7
Q

What normally represents a cluster?

A

Centroid

8
Q

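How to find the distance between clusters using centroids

A

Take the mean of the coordinates of all the points in a cluster; this gives the centroid for that cluster. The distance between two clusters is then the distance between their centroids.

Alternatively, use any of the other distance metrics we discussed.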
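A short worked example of the centroid calculation, assuming 2-D points and Euclidean distance between centroids. The two clusters and the helper name `centroid` are made up for illustration.

```python
from math import dist

def centroid(points):
    """Mean of each coordinate across all points in the cluster."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

# Hypothetical clusters of 2-D points.
cluster_a = [(1.0, 1.0), (3.0, 1.0), (2.0, 4.0)]
cluster_b = [(8.0, 6.0), (10.0, 6.0)]

centre_a = centroid(cluster_a)  # (2.0, 2.0)
centre_b = centroid(cluster_b)  # (9.0, 6.0)

# Euclidean distance between the two centroids.
print(dist(centre_a, centre_b))
```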

9
Q

Why normalize data?

A

Distance is highly influenced by the scale of the variables, so normalize so that every variable contributes meaningfully to the distance.

Clusters can change greatly depending on how the data is normalized.
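A small illustration of the scale effect, assuming min-max scaling to [0, 1] (one of several possible normalizations) and made-up customer data. The helper `min_max_scale` is hypothetical and assumes every column has at least two distinct values.

```python
from math import dist

def min_max_scale(rows):
    """Rescale every column to [0, 1]; assumes no column is constant."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple((x - lo) / (hi - lo) for x, lo, hi in zip(row, lows, highs))
        for row in rows
    ]

# Hypothetical customers: (age in years, income in dollars).
rows = [(25, 50_000), (60, 52_000), (27, 58_000), (65, 90_000)]
scaled = min_max_scale(rows)

# Raw data: income dominates, so customer 0 looks closest to customer 1.
print(dist(rows[0], rows[1]), dist(rows[0], rows[2]))

# Scaled data: age matters too, so customer 0 is now closest to customer 2.
print(dist(scaled[0], scaled[1]), dist(scaled[0], scaled[2]))
```

The nearest neighbour of customer 0 flips after scaling, which is why the resulting clusters can differ.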

10
Q
Hierarchical clustering starts as

A

Each point is assumed to be its own cluster; clusters are then merged until eventually there is one.

11
Q

Vertical lines in a dendrogram are smallest at

A

The bottom; as you move up, the height increases.

12
Q

How to decide how many clusters we want

A

Draw a horizontal line across the dendrogram; the number of vertical lines it crosses is the number of clusters.
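A rough sketch of the horizontal-line rule, assuming the merge heights have already been read off a dendrogram and increase from one merge to the next. The heights and the helper `clusters_at` are made up for illustration.

```python
# Hypothetical merge heights for 5 points (4 merges), lowest merge first.
merge_heights = [1.0, 1.5, 4.0, 9.0]

def clusters_at(cut_height, merge_heights):
    """Number of clusters left when the tree is cut at cut_height.

    Every merge at or below the cut has already joined two clusters,
    so each one reduces the count by one.
    """
    n_points = len(merge_heights) + 1
    merges_below = sum(1 for h in merge_heights if h <= cut_height)
    return n_points - merges_below

print(clusters_at(3.0, merge_heights))  # 3
print(clusters_at(5.0, merge_heights))  # 2
```

Any cut between 4.0 and 9.0 gives the same 2 clusters here, so a line drawn in a wide gap like that is less sensitive to small changes in its position.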

13
Q

Clustering process

A

o First, calculate the distance between all points
o Then find the two points with the minimum distance between them and merge them. Continue until there is 1 cluster
o After merging 2 points, calculate their centroid and use it to measure the distance to the next point on the dendrogram
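The steps above can be sketched as follows, assuming Euclidean distance and centroid-based merging as the card describes. This is a plain illustration on made-up points, not an efficient implementation; `agglomerate` and `centroid` are hypothetical names.

```python
from math import dist

def centroid(points):
    """Mean of each coordinate across all points in the cluster."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def agglomerate(points):
    """Merge the two closest clusters (by centroid distance) until one remains.

    Returns the merge heights in the order the merges happened.
    """
    clusters = [[p] for p in points]  # every point starts as its own cluster
    heights = []
    while len(clusters) > 1:
        # Compare every pair of clusters and pick the closest centroids.
        pairs = [
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        i, j = min(
            pairs,
            key=lambda ij: dist(centroid(clusters[ij[0]]), centroid(clusters[ij[1]])),
        )
        heights.append(dist(centroid(clusters[i]), centroid(clusters[j])))
        # Replace the two clusters with their union.
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return heights

# Hypothetical 2-D points: two tight pairs far apart from each other.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 2.0)]
print(agglomerate(points))
```

The returned heights are what a dendrogram would plot on its vertical axis: two low merges for the tight pairs, then one much higher merge joining them.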

14
Q

Where should we draw the line to decide clusters?

A

1- Depends on the problem itself. If moving the line slightly changes the clusters, the result is very sensitive to the data.
2- If there is no specific requirement, use the dendrogram and choose a cut where the clusters are more robust to inaccuracy in the data. It is good to have some robustness.
