week 3 Flashcards

Question 1

Q

anomalies

Answer

A

point anomaly - a single data point is anomalous compared with the rest of the data
contextual (sequential) anomaly - a point is anomalous only within its context; use sliding windows: learn a model that predicts the next value, compute the expected next value, evaluate the residual against the real next value, and use a decision threshold to decide whether it is an anomaly
collective anomaly - individual points are not anomalous by themselves, but a set of points taken together is; treat sets of points as datasets and compare these sets (similar to a sliding window)
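
The sliding-window recipe for contextual anomalies can be sketched as follows. This is a minimal illustration, not a definitive method: the "model" is assumed to be the median of the previous window (a stand-in for any learned predictor), and the names `sliding_window_anomalies`, `window` and `threshold` are hypothetical.

```python
from statistics import median

def sliding_window_anomalies(series, window=3, threshold=5.0):
    """Return indices whose value is far from the value predicted from the window before it."""
    flagged = []
    for i in range(window, len(series)):
        history = series[i - window:i]          # sliding window of past values
        predicted = median(history)             # assumed predictor: window median
        residual = abs(series[i] - predicted)   # compare with the real next value
        if residual > threshold:                # decision threshold
            flagged.append(i)
    return flagged

series = [10, 11, 10, 11, 10, 30, 11, 10, 11, 10]
print(sliding_window_anomalies(series))  # [5]
```

The median is used here instead of the mean so that one spike does not distort the predictions for the points that follow it.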

Question 2

Q

classification

Answer

A

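OSVM (one-class SVM) - learn a boundary around the normal data: minimize the positive (normal) region, maximize the negative (anomalous) space
Isolation Forest - build N trees: pick a feature f, split it at a random value, and continue until every leaf contains a single point; path length to the leaf = isolation score; average the isolation score over all trees => anomaly score (goal is to isolate anomalies: they need fewer splits, so a short average path means anomalous)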
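
A rough sketch of the isolation idea, under simplifying assumptions: every "tree" uses the whole dataset (no subsampling), the score is the raw average path length (no normalization), and instead of storing trees the code re-draws random splits along each point's own path. The names `path_length` and `average_path_lengths` are hypothetical.

```python
import random

def path_length(point, data, rng, depth=0):
    """Number of random splits needed until `point` is alone in its partition."""
    if len(data) <= 1:
        return depth
    # Only features that still vary can be split.
    splittable = [f for f in range(len(point))
                  if min(r[f] for r in data) < max(r[f] for r in data)]
    if not splittable:
        return depth
    f = rng.choice(splittable)                    # pick feature f
    lo = min(r[f] for r in data)
    hi = max(r[f] for r in data)
    split = rng.uniform(lo, hi)                   # split randomly
    # Keep only the side of the split that contains `point`.
    side = [r for r in data if (r[f] < split) == (point[f] < split)]
    return path_length(point, side, rng, depth + 1)

def average_path_lengths(data, n_trees=200, seed=0):
    """Average isolation path length per point; shorter means more anomalous."""
    rng = random.Random(seed)
    return [sum(path_length(p, data, rng) for _ in range(n_trees)) / n_trees
            for p in data]

data = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.2, 1.0),
        (1.0, 1.2), (0.8, 0.9), (1.1, 1.1), (8.0, 8.0)]
lengths = average_path_lengths(data)
print(lengths.index(min(lengths)))  # index of the most easily isolated point
```

The far-away point (8.0, 8.0) is usually cut off by the first split, so its average path is the shortest.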

Question 3

Q

NN based

Answer

A

distance-based = a point is anomalous if distant from other points
density-based = a point is anomalous if in low density region
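LOF(q) = ratio of the average local reachability density of q's k nearest neighbours (k-NN) to the local reachability density of q
if LOF(q) < 1 => q is in a higher-density region than its neighbours => inlier
if LOF(q) > 1 => q is in a lower-density region than its neighbours => outlier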
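
The LOF ratio can be worked through in a few lines. This is a simplified sketch: ties at the k-th distance are ignored and duplicate points are assumed not to occur; `knn` and `lof_scores` are hypothetical helper names. Reachability distance from q to a neighbour o is taken as max(k-distance(o), d(q, o)).

```python
import math

def knn(points, i, k):
    """Indices of the k nearest neighbours of points[i], excluding itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]

def lof_scores(points, k=3):
    n = len(points)
    neighbours = [knn(points, i, k) for i in range(n)]
    # k-distance: distance to the k-th nearest neighbour
    k_dist = [math.dist(points[i], points[neighbours[i][-1]]) for i in range(n)]

    def lrd(i):
        # local reachability density = 1 / mean reachability distance
        reach = [max(k_dist[j], math.dist(points[i], points[j]))
                 for j in neighbours[i]]
        return len(reach) / sum(reach)

    dens = [lrd(i) for i in range(n)]
    # LOF = average lrd of the neighbours divided by own lrd
    return [sum(dens[j] for j in neighbours[i]) / (k * dens[i])
            for i in range(n)]

# 3x3 grid of normal points plus one distant point (assumed toy data)
points = [(x, y) for x in (0, 1, 2) for y in (0, 1, 2)] + [(5, 5)]
scores = lof_scores(points)
print(scores[-1] > 1)  # the distant point sits in a lower-density region
```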

Question 4

Q

Clustering based

Answer

A

normal data records belong to large and dense clusters
anomalies do not belong to any cluster or form very small clusters
local anomalies are distant from all other points in the same cluster
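
One way to illustrate "no cluster or very small cluster => anomaly", assuming a deliberately simple clustering: points closer than `eps` are linked, and each connected group counts as a cluster. The functions `threshold_clusters` and `small_cluster_anomalies` and the values of `eps` and `min_size` are hypothetical choices for this sketch.

```python
import math
from collections import Counter

def threshold_clusters(points, eps):
    """Label connected components where points within eps of each other are linked."""
    labels = [None] * len(points)
    current = 0
    for start in range(len(points)):
        if labels[start] is not None:
            continue
        labels[start] = current
        stack = [start]
        while stack:                      # flood-fill one cluster
            i = stack.pop()
            for j in range(len(points)):
                if labels[j] is None and math.dist(points[i], points[j]) <= eps:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

def small_cluster_anomalies(points, eps=1.5, min_size=3):
    """Indices of points whose cluster has fewer than min_size members."""
    labels = threshold_clusters(points, eps)
    sizes = Counter(labels)
    return [i for i, lab in enumerate(labels) if sizes[lab] < min_size]

points = [(0, 0), (1, 0), (0, 1), (1, 1),          # large cluster
          (10, 10), (11, 10), (10, 11), (11, 11),  # large cluster
          (20, 0), (20.5, 0),                      # very small cluster
          (5, -8)]                                 # belongs to no cluster
print(small_cluster_anomalies(points))  # [8, 9, 10]
```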

Question 5

Q

Clustering based

Answer

A

normal data records belong to large and dense clusters
anomalies do not belong to any cluster or form very small clusters
local anomalies are distant from all other points in the same cluster

Question 6

Q

Spectral techniques

Answer

A

PCA - outliers show high variability along the smallest principal components (data points that vary in the dimensions the main components do not explain are anomalous)
Autoencoder - encode data into a low dimension, decode it back to the high dimension, and measure the difference between original and reconstructed data (large reconstruction error => anomalous)
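
A small 2-D sketch of the PCA idea, assuming the closed-form angle of the first principal axis for a 2x2 covariance matrix; `minor_component_scores` is a hypothetical name. In 2-D, the distance along the second component is also the reconstruction error left after keeping only the first component, which is the same quantity an autoencoder-style check would threshold.

```python
import math

def minor_component_scores(points):
    """Absolute projection of each 2-D point onto the smallest principal component."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Angle of the first principal axis (direction of largest variance)
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    # The second (smallest) component is perpendicular to it
    ux, uy = -math.sin(theta), math.cos(theta)
    return [abs((x - mx) * ux + (y - my) * uy) for x, y in points]

# Points roughly on the line y = x, plus one point off the line (assumed toy data)
points = [(0, 0), (1, 1.1), (2, 1.9), (3, 3.0), (4, 4.1), (5, 5.0), (2, 4)]
scores = minor_component_scores(points)
print(scores.index(max(scores)))  # index of the point that varies most off the main axis
```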

Question 7

Q

Sequential (discrete) processes

Answer

A

Markov processes - the next state depends only on the current state (the last action)
state transition diagrams - if n states are possible, an n x n matrix describes the probability of going from one state to another (remember Laplace smoothing so unseen transitions do not get probability 0)
n-grams - instead of the future depending only on the last action, it can depend on the last 2/3/4/… actions => process the data in sequences
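
A minimal sketch of estimating the transition matrix with Laplace smoothing and using it to score sequences. The function names and the smoothing constant `alpha=1.0` are assumptions for illustration; a low sequence probability would be the anomaly signal.

```python
import math
from collections import Counter

def transition_probs(sequence, states, alpha=1.0):
    """Estimate P(next | current) from one sequence, with Laplace smoothing."""
    counts = Counter(zip(sequence, sequence[1:]))   # observed transitions
    probs = {}
    for a in states:
        total = sum(counts[(a, b)] for b in states) + alpha * len(states)
        probs[a] = {b: (counts[(a, b)] + alpha) / total for b in states}
    return probs

def sequence_log_prob(sequence, probs):
    """Log-probability of the transitions in a sequence under the model."""
    return sum(math.log(probs[a][b]) for a, b in zip(sequence, sequence[1:]))

states = ["A", "B", "C"]
probs = transition_probs("ABABABABAB", states)
print(probs["A"]["B"])  # 0.75
print(sequence_log_prob("ABAB", probs) > sequence_log_prob("AABB", probs))  # True
```

For n-grams, the same counting would apply with the key being the last n-1 actions instead of only the last one.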

(7 cards)