Unsupervised Learning Flashcards

1
Q

Unsupervised methods

A

There is NO specific target variable.
1. Affinity grouping (associations, market-basket analysis): which items are commonly purchased together?
2. Similarity matching: which other companies are similar to ours?
3. Clustering: do my customers form natural groups, and do certain groups behave in a certain way?
4. Sentiment analysis (when done without labeled examples, e.g. with a sentiment lexicon): what is the sentiment of my users?
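
As a rough illustration of affinity grouping, the sketch below counts how often each pair of items appears in the same basket. The function name `pair_counts` and the basket data are made up for this example; real market-basket analysis usually goes further (support, confidence, lift).

```python
from collections import Counter
from itertools import combinations

def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) removes duplicates and gives each pair a stable order
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

# Hypothetical purchase data, for illustration only
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["beer", "chips"],
    ["bread", "milk", "eggs"],
]

counts = pair_counts(baskets)
top_pair, top_count = counts.most_common(1)[0]
print(top_pair, top_count)
```

Note that no target variable is involved: the pairs are discovered from the baskets alone.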

2
Q

Supervised methods

A

There is a specific target variable.
1. Predictive modeling
2. Causal modeling

3
Q

Clustering vs classification

A

Clustering: finds natural groups in the data, so that similarity is high within each group and low across groups. The goal is to organize the information; there is no target variable.
Classification: attempts to predict which of a small set of classes an individual belongs to.
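
A minimal sketch of the contrast, using one numeric attribute. The clustering part is a simple k-means and the classification part is a 1-nearest-neighbor rule; both are stand-ins chosen for brevity, and the data, labels, and function names are hypothetical.

```python
def nearest(value, centers):
    """Index of the center closest to value."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))

def kmeans_1d(values, centers, iterations=10):
    """Clustering: no labels are given; groups emerge from the data."""
    centers = list(centers)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            groups[nearest(v, centers)].append(v)
        # Move each center to the mean of its group (keep it if the group is empty)
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, [nearest(v, centers) for v in values]

def classify(value, labeled):
    """Classification: predict a known class from labeled examples (1-nearest neighbor)."""
    return min(labeled, key=lambda pair: abs(value - pair[0]))[1]

# Hypothetical monthly spend per customer
spend = [10, 12, 11, 95, 100, 98]

# Unsupervised: only the values and a starting guess for two centers
centers, assignments = kmeans_1d(spend, centers=[0, 50])
print(assignments)

# Supervised: each training example carries a target label
labeled = [(10, "low"), (12, "low"), (95, "high"), (100, "high")]
print(classify(20, labeled))
```

The clustering output is just group indices with no meaning attached, while the classifier returns one of the predefined classes.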

4
Q

Methods to measure distance

A
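  1. Euclidean distance: for data points with numeric attributes; the straight-line ("physical") distance between two data points.
  2. Manhattan distance: the sum of absolute differences along each attribute, e.g. walking distance on a street grid.
  3. Jaccard distance: treats two objects as sets of characteristics. Useful for problems that involve large sets of characteristics that may not be "symmetrically" important (sharing a characteristic counts, jointly lacking one does not), e.g. text mining.
  4. Cosine distance: based on the angle between two vectors rather than their magnitude; commonly encountered in text mining and recommendation engines.
  5. Levenshtein distance (edit distance): the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another; used in text mining. Application: autocorrect.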
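
The five measures above can be sketched in plain Python as follows. This is an illustrative version with made-up function names, assuming equal-length numeric vectors for the first, second, and fourth measures and non-zero vectors for cosine distance; a library such as SciPy would normally be used in practice.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of absolute differences along each attribute."""
    return sum(abs(x - y) for x, y in zip(a, b))

def jaccard_distance(a, b):
    """1 minus (size of intersection / size of union) of two sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # two empty sets are treated as identical
    return 1 - len(a & b) / len(union)

def cosine_distance(a, b):
    """1 minus the cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)

def levenshtein(s, t):
    """Minimum number of single-character edits turning s into t."""
    prev = list(range(len(t) + 1))  # distances from "" to each prefix of t
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

print(euclidean((0, 0), (3, 4)))
print(manhattan((0, 0), (3, 4)))
print(jaccard_distance({"a", "b", "c"}, {"b", "c", "d"}))
print(levenshtein("kitten", "sitting"))
```

For the same pair of points, Manhattan distance is never smaller than Euclidean distance, which matches the "walking around the block" intuition.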