Anomaly detection Flashcards
Which of the following metrics is used to create a ROC (receiver operating characteristic) curve?
Select one:
a. Recall
b. F1 score
c. Precision
d. Accuracy
a. Recall
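A minimal sketch of how recall feeds a ROC curve: each threshold on the anomaly score yields one (false positive rate, recall) point. The function name `roc_points` and the scores and labels are hypothetical, chosen only for illustration.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs, one per threshold, for anomaly scores.

    labels: 1 = anomaly, 0 = nominal.
    A sample is flagged as an anomaly when score >= threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing is flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))  # (FPR, recall)
    return points


# Hypothetical anomaly scores and ground-truth labels, for illustration only.
scores = [0.9, 0.8, 0.4, 0.3, 0.1]
labels = [1, 0, 1, 0, 0]
points = roc_points(scores, labels)
```

The y-coordinate of every point is the recall (true positive rate) at that threshold, and the x-coordinate is the false positive rate.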
Concerning the isolation forest algorithm, select the wrong statement among the following:
Select one:
a. For each tree, the attribute selection order is random.
b. Classification is performed by computing the average depth for a given input sample.
c. Anomalies usually lie at the lowest average depth in the trees
d. Node splitting is repeated until every input point is located on a leaf.
c. Anomalies usually lie at the lowest average depth in the trees
Which among the following conditions makes the WDAD (Well-Defined Anomaly Distribution) assumption NOT valid?
Select one:
a. Impossibility of using deep learning for anomaly detection
b. Adversarial conditions
c. Unknown anomaly
b. Adversarial conditions
Considering a sudden decrease of temperature in July (Northern hemisphere), we can state that it is a ...
Select one:
a. Collective anomaly
b. Point anomaly
c. Contextual anomaly
c. Contextual anomaly
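A small sketch of why context matters: the same reading is scored against the history of its own context (here, the month). The temperatures, the helper `contextual_z`, and the z-score approach itself are assumptions made for illustration, not something stated in the cards.

```python
from statistics import mean, stdev


def contextual_z(value, context_history):
    """z-score of a value relative to readings from the same context."""
    return (value - mean(context_history)) / stdev(context_history)


# Hypothetical daily temperatures in Celsius, for illustration only.
july = [27.0, 29.0, 28.0, 30.0, 26.0, 28.0]
january = [2.0, 4.0, 3.0, 5.0, 1.0, 3.0]

reading = 5.0
z_in_july = contextual_z(reading, july)        # far from the July history
z_in_january = contextual_z(reading, january)  # close to the January history
```

A reading of 5 degrees is unremarkable in January but lies far outside the July history, which is what makes it a contextual rather than a point anomaly.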
Which of the following classification strategies is suitable for anomaly detection?
Select one:
a. Isolation forest
b. All of the mentioned strategies
c. One-class SVM
d. Few-shot learning networks
e. None of the mentioned strategies
b. All of the mentioned strategies
Which among the following is NOT an anomaly type?
Select one:
a. Systematic anomaly
b. Collective anomaly
c. Contextual anomaly
d. Point anomaly
a. Systematic anomaly
Which among the following conditions makes the WDAD assumption valid?
Select one:
a. Analyst’s attention evolves and focuses on new samples over time
b. Anomalies are created by diverse and unknown generation models
c. Anomalies are created by a generation process different from the nominal process
d. Anomalies are created by an adaptive adversarial process (insider threats, cyber security)
c. Anomalies are created by a generation process different from the nominal process
Considering accuracy metrics for anomaly detection, select the most correct statement among the following ones.
Select one:
a. A low false positive rate implies a false alarm black-hole
b. Accuracy only depends on the true positive and false positive samples
c. F1 score is the ratio between the squared geometric mean and the arithmetic mean of precision and recall
d. Area Under Curve (AUC) depends on the true positive rate only
c. F1 score is the ratio between the squared geometric mean and the arithmetic mean of precision and recall
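A quick numeric check of the F1 identity: F1 is the harmonic mean of precision and recall, which equals their squared geometric mean divided by their arithmetic mean. The function names and the values of `p` and `r` are hypothetical.

```python
from math import sqrt


def f1_harmonic(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def f1_as_ratio(precision, recall):
    """F1 as (geometric mean)^2 / (arithmetic mean)."""
    geometric = sqrt(precision * recall)
    arithmetic = (precision + recall) / 2
    return geometric ** 2 / arithmetic


# Hypothetical precision and recall values, for illustration only.
p, r = 0.75, 0.6
f1_a = f1_harmonic(p, r)
f1_b = f1_as_ratio(p, r)
```

Both functions return the same value up to floating-point rounding.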
Considering the isolation tree anomaly detection algorithm, which of the following statements is NOT correct?
Select one:
a. Final anomaly score depends on the expected depth
b. For a given tree, attribute selection is repeated until every sample in the dataset is a leaf
c. When the score 2^(-d/r) is low, we have an anomaly (d is the average depth)
d. In the creation of a single classification tree, thresholds are chosen randomly
c. When the score 2^(-d/r) is low, we have an anomaly (d is the average depth)
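A rough sketch of the isolation tree idea, assuming the score has the form 2^(-d/r) with d the average isolation depth. The normalizer r is assumed here to be the expected path length c(n) from the original isolation forest paper. The data, the function names, and the depth cap are hypothetical, and sub-sampling is omitted.

```python
import random


def isolation_depth(point, data, rng, depth=0, max_depth=30):
    """Depth at which `point` is isolated in one random isolation tree.

    The tree is grown lazily: only the branch containing `point` is followed.
    """
    if len(data) <= 1 or depth >= max_depth:
        return depth
    attribute = rng.randrange(len(point))      # random attribute
    values = [row[attribute] for row in data]
    low, high = min(values), max(values)
    if low == high:                            # no split possible on this attribute
        return depth
    threshold = rng.uniform(low, high)         # random threshold
    if point[attribute] < threshold:
        branch = [row for row in data if row[attribute] < threshold]
    else:
        branch = [row for row in data if row[attribute] >= threshold]
    return isolation_depth(point, branch, rng, depth + 1, max_depth)


def average_depth(point, data, n_trees=200, seed=0):
    """Average isolation depth d over several random trees."""
    rng = random.Random(seed)
    total = sum(isolation_depth(point, data, rng) for _ in range(n_trees))
    return total / n_trees


def expected_depth(n):
    """Assumed normalizer r: c(n) = 2*H(n-1) - 2*(n-1)/n."""
    harmonic = sum(1.0 / i for i in range(1, n))
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def anomaly_score(point, data, n_trees=200, seed=0):
    """Score 2^(-d/r): close to 1 for anomalies, lower for nominal points."""
    r = expected_depth(len(data))
    return 2.0 ** (-average_depth(point, data, n_trees, seed) / r)


# Hypothetical 2-D data: a Gaussian cluster plus one far-away point.
data_rng = random.Random(42)
data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(60)]
nominal = (0.0, 0.0)
outlier = (10.0, 10.0)
data += [nominal, outlier]

depth_nominal = average_depth(nominal, data)
depth_outlier = average_depth(outlier, data)
score_nominal = anomaly_score(nominal, data)
score_outlier = anomaly_score(outlier, data)
```

The far-away point is isolated after fewer random splits, so its average depth is smaller and its score is higher. A high score, not a low one, signals an anomaly.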
Which among the following ML and DL architectures is not suitable for anomaly detection in a clean-data learning setup?
a. RNN
b. ARMA models
c. Multi-class SVM
d. Deep autoencoder
c. Multi-class SVM
When is WDAD not valid?
- Adversarial conditions (fraud, insider threats, cyber security). There is an attacker and there is an analyst, and the attacker knows that detection is in place. The attacker tries to make anomalous samples look like nominal samples (a TTL attack is one example of an adversarial condition).
- A diverse set of modes (new, unknown failures): some modes are not profiled by an attacker. When something new appears, we don’t know whether it is nominal or anomalous.
- The user’s notion of anomaly changes over time (anomaly = interesting point or new data type).
What is Precision?
Precision = detected true anomalies (TP) / samples classified as anomalies (TP+FP) -> Positive Predictive Value (PPV)
What is recall?
Recall = detected true anomalies (TP) / total anomalies (TP+FN) -> True Positive Rate (TPR) or Sensitivity
What is specificity?
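Specificity = correctly identified nominals (TN) / total nominals (TN+FP) -> True Negative Rate (TNR). The ratio of false anomalies (FP) to total nominals (TN+FP) is the False Positive Rate (FPR) = 1 - Specificity.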
What is the accuracy?
Accuracy = (TP+TN) / total samples (TP+TN+FP+FN) -> fraction of samples classified correctly
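The metrics above can be sketched from the four confusion-matrix counts, using the standard definitions (specificity = TN/(TN+FP), false positive rate = FP/(TN+FP)). The function name `detection_metrics` and the counts are hypothetical.

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard detection metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "precision": tp / (tp + fp),            # PPV
        "recall": tp / (tp + fn),               # TPR, sensitivity
        "specificity": tn / (tn + fp),          # TNR
        "false_positive_rate": fp / (tn + fp),  # FPR = 1 - specificity
        "accuracy": (tp + tn) / total,
    }


# Hypothetical counts: 10 true anomalies among 100 samples.
metrics = detection_metrics(tp=8, fp=2, tn=88, fn=2)
```

With rare anomalies, accuracy stays high even when several anomalies are missed, which is why precision and recall are usually reported alongside it.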