Regression and Clustering Flashcards

1
Q

Ridge (L2)

A

Shrinks coefficients towards 0, but does not set them to exactly 0.
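
A minimal sketch of this shrinkage, assuming a single feature, no intercept, and a squared-error loss. The function name `ridge_weight`, the penalty strength `lam`, and the toy data are illustrative assumptions, not part of the original card.

```python
# One-feature ridge regression without an intercept (illustrative assumption).
# Minimises sum((y - w*x)^2) + lam * w^2, whose closed-form solution is
#   w = sum(x*y) / (sum(x*x) + lam)

def ridge_weight(xs, ys, lam):
    """Closed-form ridge coefficient for a single feature."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

# Toy data, roughly y = 2x (made up for illustration).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

lams = (0.0, 1.0, 10.0, 100.0, 1000.0)
ridge_weights = [ridge_weight(xs, ys, lam) for lam in lams]

# A larger penalty pulls the coefficient closer to 0 without reaching it.
for lam, w in zip(lams, ridge_weights):
    print(f"lambda={lam:>6}: w={w:.4f}")
```

In this setting the coefficient keeps decreasing as `lam` grows, but it stays strictly positive for any finite `lam`.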

1
Q

Lasso (L1)

A

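Shrinks coefficients towards 0, and some become exactly 0.

  • Acts like embedded feature selection: features whose coefficient is 0 drop out of the model.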
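
A minimal sketch of how the L1 penalty produces exact zeros, assuming a single feature, no intercept, and a squared-error loss. The names `soft_threshold`, `lasso_weight`, and `lam`, and the toy data, are illustrative assumptions.

```python
# One-feature lasso without an intercept (illustrative assumption).
# Minimises 0.5 * sum((y - w*x)^2) + lam * |w|, whose solution is
#   w = soft_threshold(sum(x*y), lam) / sum(x*x)

def soft_threshold(z, lam):
    """Move z towards 0 by lam, and clamp to exactly 0 if |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_weight(xs, ys, lam):
    """Lasso coefficient for a single feature."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return soft_threshold(sxy, lam) / sxx

# Toy data, roughly y = 2x (made up for illustration).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

lams = (0.0, 20.0, 60.0, 100.0)
lasso_weights = [lasso_weight(xs, ys, lam) for lam in lams]

# Once lam exceeds |sum(x*y)|, the coefficient is exactly 0.
for lam, w in zip(lams, lasso_weights):
    print(f"lambda={lam:>6}: w={w:.4f}")
```

Here `sum(x*y)` is about 59.7, so the two largest penalties give a coefficient of exactly 0, which is the sense in which the feature is "selected out".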
2
Q

Regularisation

A

If the weights $w$ are not regularised, they can grow very large (explode), which leads to overfitting. Regularisation adds a penalty on the size of $w$ to keep the weights under control.
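
As a sketch of what the penalty looks like, assuming a linear model with squared-error loss (the symbols $x_i$, $y_i$, and the penalty strength $\lambda \ge 0$ are assumptions introduced here, not taken from the card):

$$
\text{Ridge (L2):}\quad \min_{w} \; \sum_{i} \left(y_i - w^\top x_i\right)^2 + \lambda \lVert w \rVert_2^2
$$

$$
\text{Lasso (L1):}\quad \min_{w} \; \sum_{i} \left(y_i - w^\top x_i\right)^2 + \lambda \lVert w \rVert_1
$$

With $\lambda = 0$ both reduce to ordinary least squares. A larger $\lambda$ makes large weights more costly.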

3
Q

K-Means steps

A
  1. Start with k random cluster centres (centroids)
  2. Assign each object to its nearest centroid
  3. Compute the new centroid of each cluster as the mean of the objects assigned to it
  4. Repeat steps 2 and 3 until the centroids no longer change
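
The four steps above can be sketched in plain Python. This is a minimal illustration, not a production implementation: the names `k_means` and `squared_distance`, the choice of squared Euclidean distance, picking the initial centroids from the data points, and the toy data are all assumptions.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def k_means(points, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    # Step 1: pick k distinct data points as the initial centroids.
    centroids = rng.sample(points, k)
    for _ in range(max_iter):
        # Step 2: assign each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            clusters[nearest].append(p)
        # Step 3: recompute each centroid as the mean of its cluster.
        new_centroids = []
        for j, cluster in enumerate(clusters):
            if cluster:
                new_centroids.append(tuple(sum(c) / len(cluster) for c in zip(*cluster)))
            else:
                new_centroids.append(centroids[j])  # an empty cluster keeps its centroid
        # Step 4: stop once the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# Two well-separated groups (made-up data).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = k_means(points, k=2)
print(centroids)
print(clusters)
```

On data this well separated, the two groups are recovered whichever points are drawn as starting centroids. On harder data the result can depend on the seed.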
4
Q

K-Means pros

A

Advantages:

  • Simple
  • Flexible
  • Scales well to large datasets (many features and samples)
5
Q

K-Means cons

A

Disadvantages:

  • Have to specify k in advance
  • Does not handle categorical data directly (the mean of categories is undefined)
  • Need to re-run to obtain a clustering for a different k
  • Stochastic (non-deterministic): different starting centroids can produce different results
  • Usually converges to a local optimum, not necessarily the global one
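
A small sketch of the last two points, assuming squared Euclidean distance and fixed, hand-picked starting centroids so the run is reproducible. The names `run_k_means` and `inertia` (the within-cluster sum of squared distances) and the four toy points are illustrative assumptions.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def run_k_means(points, centroids, max_iter=100):
    """Run k-means from the given starting centroids; return (centroids, inertia)."""
    for _ in range(max_iter):
        # Assign each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            j = min(range(len(centroids)), key=lambda i: squared_distance(p, centroids[i]))
            clusters[j].append(p)
        # Recompute each centroid as its cluster mean (empty clusters keep theirs).
        updated = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    # Inertia: sum of squared distances from each point to its nearest centroid.
    inertia = sum(min(squared_distance(p, c) for c in centroids) for p in points)
    return centroids, inertia

# Corners of a wide, flat rectangle (made-up data).
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Start A: one centroid on each short side -> splits left vs right.
good_centroids, good_inertia = run_k_means(points, [(0.0, 0.5), (4.0, 0.5)])
# Start B: one centroid on each long side -> splits bottom vs top and gets stuck.
bad_centroids, bad_inertia = run_k_means(points, [(2.0, 0.0), (2.0, 1.0)])

print(good_inertia)  # 1.0
print(bad_inertia)   # 16.0
```

Both runs stop because the centroids no longer move, yet the second clustering is far worse. A common mitigation is to re-run from several starting points and keep the result with the lowest inertia.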