ML-08 - Anomaly detection Flashcards

1
Q

ML-08 - Anomaly detection

Define anomaly detection.

A

The process of identifying observations (outliers) that deviate significantly from the normal data.

2
Q

Is anomaly detection supervised or unsupervised?

A

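Typically unsupervised or semi-supervised: the model is fit on unlabeled (or mostly normal) data, and any labels are used only to tune and evaluate it.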

3
Q

What is the algorithm for fitting a Gaussian-based anomaly detection model?

A
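  • Fit the parameters mu and sigma (mean and standard deviation of each feature) to the training data.
  • Choose the threshold value/vector epsilon; a point x is flagged as anomalous when p(x) < epsilon.

(If labeled data is available, determine epsilon from that, e.g. on a validation set.)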
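
The two steps above can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: it assumes independent features (p(x) is a product of one-dimensional Gaussians), a single scalar epsilon, and F1 score as the criterion for picking epsilon on labeled validation data. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_gaussian(X):
    """Estimate per-feature mean and variance from (mostly normal) training data."""
    return X.mean(axis=0), X.var(axis=0)

def density(X, mu, var):
    """p(x) as a product of independent univariate Gaussians, one per feature."""
    coef = 1.0 / np.sqrt(2.0 * np.pi * var)
    expo = np.exp(-((X - mu) ** 2) / (2.0 * var))
    return np.prod(coef * expo, axis=1)

def select_epsilon(p_val, y_val, n_steps=1000):
    """Pick the threshold with the best F1 on labeled validation data (1 = anomaly)."""
    best_eps, best_f1 = 0.0, 0.0
    for eps in np.linspace(p_val.min(), p_val.max(), n_steps):
        pred = p_val < eps                      # flagged as anomalous
        tp = np.sum(pred & (y_val == 1))
        fp = np.sum(pred & (y_val == 0))
        fn = np.sum(~pred & (y_val == 1))
        if tp == 0:
            continue                            # F1 undefined or zero, skip
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_eps, best_f1 = eps, f1
    return best_eps, best_f1

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
X_train = rng.normal(0.0, 1.0, size=(500, 2))
X_val = np.vstack([rng.normal(0.0, 1.0, size=(100, 2)), np.full((5, 2), 8.0)])
y_val = np.concatenate([np.zeros(100), np.ones(5)])

mu, var = fit_gaussian(X_train)                 # step 1: fit mu and sigma^2
p_val = density(X_val, mu, var)
eps, f1 = select_epsilon(p_val, y_val)          # step 2: choose epsilon
is_anomaly = p_val < eps
```

Without labeled data, epsilon would have to be chosen by hand, for example as a low percentile of p(x) on the training set.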

4
Q

What is “Gaussianization of features”?

A

A transformation applied to the features so that their distribution looks approximately normal (Gaussian), e.g. log(x) or sqrt(x).
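
As a rough illustration, the sketch below applies a log transform to a synthetic right-skewed feature and compares the sample skewness before and after. The lognormal data and the `skewness` helper are assumptions made for the example. Which transform works best for a real feature is usually found by plotting histograms and trying a few candidates.

```python
import numpy as np

def skewness(x):
    """Sample skewness: close to 0 for a symmetric, bell-shaped feature."""
    centered = x - x.mean()
    return np.mean(centered ** 3) / np.mean(centered ** 2) ** 1.5

rng = np.random.default_rng(0)
raw = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)  # synthetic right-skewed feature

# Candidate transforms include np.log(x), np.log1p(x), np.sqrt(x), x ** (1 / 3).
transformed = np.log(raw)

print(f"skewness before: {skewness(raw):.2f}")
print(f"skewness after:  {skewness(transformed):.2f}")
```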

5
Q

Why would we use Gaussianization of features?

A

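Because the Gaussian model assumes each feature is roughly normally distributed, and the choice (and shape) of the features has a huge effect on how well the anomaly detection algorithm works.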

6
Q

What can you do if p(x), the probability that a data point is normal, is high for both normal and anomalous data? (3)

(See image)

A
  • Add new features
  • Transform/combine existing features
  • Choose features that take unusually large or small values for anomalies

(See image)
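
A small sketch of the "combine existing features" idea, using hypothetical CPU-load and network-traffic features on synthetic data. The anomalous point looks ordinary on each feature alone, but the combined ratio feature takes an unusually large value for it. All names and numbers here are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical features: on healthy machines CPU load tracks network traffic.
traffic = rng.uniform(1.0, 10.0, size=1000)
cpu = traffic * rng.normal(1.0, 0.05, size=1000)

def z_score(value, sample):
    """Number of standard deviations between `value` and the mean of `sample`."""
    return abs(value - sample.mean()) / sample.std()

# Anomaly: high CPU load with little traffic. Each value alone is unremarkable.
anom_cpu, anom_traffic = 8.0, 2.0

ratio = cpu / traffic                            # new combined feature
z_cpu = z_score(anom_cpu, cpu)
z_traffic = z_score(anom_traffic, traffic)
z_ratio = z_score(anom_cpu / anom_traffic, ratio)

print(f"z-score on cpu:     {z_cpu:.1f}")
print(f"z-score on traffic: {z_traffic:.1f}")
print(f"z-score on ratio:   {z_ratio:.1f}")
```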

7
Q

Why might you use anomaly detection with/instead of supervised learning?

A
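  • Supervised learning only works well with reasonably balanced data; anomalies are usually too rare to learn from.
  • Anomalies can differ from each other (and from any seen before), which makes them difficult to learn.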
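
One way to see the imbalance problem is the sketch below, on a made-up label set with 5 anomalies in 1000 points. A trivial "classifier" that always predicts normal scores very high accuracy while catching no anomalies at all, so a supervised model can look good without learning anything about the rare class.

```python
import numpy as np

# Made-up labels: 5 anomalies (1) among 1000 examples.
y_true = np.zeros(1000, dtype=int)
y_true[:5] = 1

# Trivial baseline: always predict "normal" (0).
y_pred = np.zeros_like(y_true)

accuracy = np.mean(y_pred == y_true)
recall = np.sum((y_pred == 1) & (y_true == 1)) / np.sum(y_true == 1)

print(f"accuracy: {accuracy:.3f}")
print(f"recall on anomalies: {recall:.3f}")
```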