Anomaly Detection Flashcards
DTW (dynamic time warping) never results in a distance strictly greater than the Euclidean distance
True
DTW cannot be applied to sequences of different lengths
False
DTW can only be applied to univariate (one-dimensional, single-feature) sequences
False
DTW normalization is useful
True
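The time-warping cards above can be checked with a small dynamic-programming sketch of dynamic time warping (DTW). This is an illustrative version, not a reference implementation: the names `dtw_distance` and `euclidean` are hypothetical, and it assumes a squared-difference local cost with a square root taken at the end, so that the diagonal alignment reproduces the Euclidean distance exactly.

```python
import math

def _local_cost(x, y):
    # Squared Euclidean cost between two samples; scalars are treated as 1-D.
    xs = x if isinstance(x, (tuple, list)) else (x,)
    ys = y if isinstance(y, (tuple, list)) else (y,)
    return sum((p - q) ** 2 for p, q in zip(xs, ys))

def dtw_distance(a, b):
    """DTW distance between two sequences (lengths may differ)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # D[i][j] = cheapest cost of aligning a[:i] with b[:j]
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
            D[i][j] = _local_cost(a[i - 1], b[j - 1]) + step
    return math.sqrt(D[n][m])

def euclidean(a, b):
    """Plain Euclidean distance; only defined for equal-length sequences."""
    return math.sqrt(sum(_local_cost(x, y) for x, y in zip(a, b)))

a = [0, 1, 2, 3, 2, 1, 0]
b = [0, 0, 1, 2, 3, 2, 1]  # same shape as a, shifted by one step
print(dtw_distance(a, b), euclidean(a, b))
```

On the shifted pair, the warped alignment absorbs most of the shift, so the DTW value stays below the Euclidean one. The same function accepts sequences of different lengths and sequences of feature tuples, matching the two "False" cards.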
What does a low local reachability density (lrd) mean?
A large average reachability distance to the point's neighbors, i.e. a sparse neighborhood
LOF(q) < 1 means what?
Inlier: q lies in a region of higher density than its neighbors
LOF(q) > 1 means what?
Outlier: q lies in a region of lower density than its neighbors
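The lrd and LOF cards above can be made concrete with a small sketch. It is a simplified illustration under stated assumptions: the function name `lof_scores` is hypothetical, ties at the k-th neighbor are broken arbitrarily instead of enlarging the neighborhood, and duplicate points (which would give a zero average reachability distance) are not handled.

```python
import math

def lof_scores(points, k):
    """Local outlier factor of every point, using its k nearest neighbors."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbors, k_dist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbors.append(order[:k])
        k_dist.append(dist[i][order[k - 1]])  # distance to the k-th neighbor

    def reach_dist(i, j):
        # Reachability distance of i from j: never smaller than j's k-distance.
        return max(k_dist[j], dist[i][j])

    # lrd = inverse of the average reachability distance to the neighbors,
    # so a large average distance gives a low lrd.
    lrd = [k / sum(reach_dist(i, j) for j in neighbors[i]) for i in range(n)]

    # LOF = average lrd of the neighbors divided by the point's own lrd.
    return [sum(lrd[j] for j in neighbors[i]) / (k * lrd[i]) for i in range(n)]

points = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10)]
scores = lof_scores(points, k=2)
print(scores)
```

In this toy data the four corner points have the same density as their neighbors, so their LOF is 1, while the isolated point at (10, 10) has a much lower lrd than its neighbors and a LOF well above 1.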
Advantages of nearest-neighbor (NN) methods?
- Can be used in an unsupervised setting
- Make no assumptions about the data distribution
- Intuitively appealing, since they rely on distances
Disadvantages of nearest-neighbor (NN) methods?
- Computationally expensive at test time
- Require distances, so all disadvantages of distance measures apply
Advantages of PCA?
- Useful for modeling feature interaction
- Computationally efficient
Disadvantages of PCA?
- Assumes that normal and anomalous points are distinguishable in the reduced space
- Context is not taken into account
- Sensitive to outliers
What are the three types of anomalies?
- Point (point x is strange)
- Contextual (point x is strange given set S)
- Collective (set S is strange)
Outliers have no effect on PCA?
False
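The effect of outliers on PCA can be seen in a tiny two-dimensional sketch. It assumes the closed-form angle of the first principal axis of a 2x2 covariance matrix, theta = 0.5 * atan2(2*s_xy, s_xx - s_yy); the function name `first_pc_angle` and the toy data are hypothetical and chosen only for illustration.

```python
import math

def first_pc_angle(points):
    """Angle (radians) of the first principal axis of a set of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Entries of the 2x2 covariance matrix
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    return 0.5 * math.atan2(2 * sxy, sxx - syy)

# Eleven points lying exactly on the x-axis
clean = [(float(x), 0.0) for x in range(-5, 6)]
# The same points plus a single extreme point far along the y-axis
with_outlier = clean + [(0.0, 50.0)]

print(math.degrees(first_pc_angle(clean)))
print(math.degrees(first_pc_angle(with_outlier)))
```

Without the extreme point the first principal axis follows the x-axis. Adding one far-away point makes the variance along y dominate, so the axis swings to the vertical direction.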
PCA assumes the relationship between variables is linear?
True
LOF uses reachability distance instead of actual distance to lower the effect of outliers?
False