2 Flashcards
Question 1
In fraud detection, the target fraud indicator is usually
a) easy to determine.
b) hard to determine.
b) hard to determine.
Question 2
A spider construction is an example of a fraud scheme often used in
b) Credit card fraud.
c) Insurance claim fraud.
d) Tax evasion.
e) All of the above.
d) Tax evasion.
Question 3
The key novelty of CSLogit is that we now
a) maximize the average expected cost and the complexity.
b) minimize the average expected cost and the complexity.
c) maximize the likelihood and the complexity.
d) minimize the likelihood and the complexity.
b) minimize the average expected cost and the complexity.
Question 4
CSLogit uses a
a) ridge regression complexity term.
b) LASSO complexity term.
c) elastic net complexity term.
b) LASSO complexity term.
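The objective from Questions 3 and 4 (average expected cost plus a LASSO complexity term) can be sketched as follows. This is a minimal illustration, not the original CSLogit implementation. It assumes a hypothetical cost setup in which every flagged transaction costs a fixed fee and every missed fraud costs the transaction amount; the names `average_expected_cost`, `cslogit_objective`, `fixed_cost`, and `lam` are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def average_expected_cost(theta, X, y, amounts, fixed_cost):
    """Average expected cost of a logistic model; theta[0] is the intercept.

    Assumed (hypothetical) costs: flagging any transaction costs fixed_cost,
    missing a fraud costs the transaction amount, a correct pass costs 0.
    """
    total = 0.0
    for x, label, amount in zip(X, y, amounts):
        score = sigmoid(theta[0] + sum(w * v for w, v in zip(theta[1:], x)))
        if label == 1:
            # Fraud: flagged with probability score, missed otherwise.
            total += score * fixed_cost + (1.0 - score) * amount
        else:
            # Legitimate: only a (false) alert incurs a cost.
            total += score * fixed_cost
    return total / len(X)


def cslogit_objective(theta, X, y, amounts, fixed_cost, lam):
    """Quantity to minimize: average expected cost + lam * L1 norm of the weights."""
    l1_penalty = sum(abs(w) for w in theta[1:])  # intercept is not penalized
    return average_expected_cost(theta, X, y, amounts, fixed_cost) + lam * l1_penalty


# Toy data: one feature, one fraud (label 1) and one legitimate case (label 0).
X = [[1.0], [2.0]]
y = [1, 0]
amounts = [100.0, 50.0]
objective_at_zero = cslogit_objective([0.0, 0.0], X, y, amounts, fixed_cost=10.0, lam=1.0)
```

With all parameters at zero every score is 0.5, so the penalty vanishes and the objective equals the plain average expected cost.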
Question 5
To find the optimal parameters CSLogit uses
a) genetic algorithms.
b) gradient descent.
b) gradient descent.
Question 6
When compared to traditional logistic regression,
a) CSLogit performs better in terms of savings.
b) CSLogit performs worse in terms of savings.
a) CSLogit performs better in terms of savings.
Question 7
Which statement is CORRECT?
a) Supervised approaches can detect only previously known fraud patterns as they occurred in the past.
b) Unsupervised approaches look for unusual anomalous behavior deviating from a norm; hence they can detect previously unknown fraud (also referred to as anomaly detection methods).
c) Both statements are correct.
c) Both statements are correct.
Question 8
Which statement is CORRECT?
a) In OLAP, a roll-up operation aggregates across one or more dimensions. An example of this is the distribution of the claim amount and the recency, aggregated across all values of the number of cars.
b) In OLAP, drill-down is the opposite operation of roll-up whereby more detail is asked for by adding another dimension to the analysis.
c) In OLAP, slicing refers to selecting a slice of the OLAP cube along one of its dimensions.
d) In OLAP, a dicing operation fixes values for all the dimensions and creates a sub-cube.
e) All statements are correct.
e) All statements are correct.
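The four OLAP operations from Question 8 can be sketched on a tiny in-memory cube. This is a minimal sketch under stated assumptions: the fact rows, the dimension names (`region`, `year`, `cars`), and the helper names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical claim facts: three dimensions and one measure (amount).
facts = [
    {"region": "North", "year": 2022, "cars": 1, "amount": 100.0},
    {"region": "North", "year": 2022, "cars": 2, "amount": 250.0},
    {"region": "North", "year": 2023, "cars": 1, "amount": 80.0},
    {"region": "South", "year": 2022, "cars": 1, "amount": 300.0},
    {"region": "South", "year": 2023, "cars": 2, "amount": 120.0},
    {"region": "South", "year": 2023, "cars": 1, "amount": 50.0},
]


def aggregate(rows, keep):
    """Sum the amount over every dimension that is not listed in keep."""
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[dim] for dim in keep)] += row["amount"]
    return dict(totals)


def slice_cube(rows, dim, value):
    """Slicing: fix a single dimension to one value."""
    return [row for row in rows if row[dim] == value]


def dice(rows, allowed):
    """Dicing: restrict several dimensions at once to form a sub-cube."""
    return [row for row in rows
            if all(row[dim] in values for dim, values in allowed.items())]


# Roll-up: aggregate across year and cars, keeping only region.
by_region = aggregate(facts, ["region"])
# Drill-down: add the year dimension back for more detail.
by_region_year = aggregate(facts, ["region", "year"])
# Slice: only the 2023 plane of the cube.
slice_2023 = slice_cube(facts, "year", 2023)
# Dice: a sub-cube with values fixed on all three dimensions.
sub_cube = dice(facts, {"region": {"North"}, "year": {2022}, "cars": {1, 2}})
```

Roll-up and drill-down use the same aggregation helper here; they differ only in how many dimensions are kept.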
Question 9
Which statement is NOT CORRECT?
a) Traditional techniques for detecting outliers can be affected by outliers so strongly that the resulting fitted model may not allow the deviating observations to be detected. This is called the masking effect.
b) When using traditional techniques for detecting outliers, some good data points might even appear to be outliers, which is known as swamping.
c) The goal of robust statistics is to find a fit which is different to the fit we would have found without the outliers.
d) The aim is not to replace traditional techniques by a robust alternative, but to illustrate that robust methods can give you extra insights into the data and may improve the reliability and accuracy of your analysis.
c) The goal of robust statistics is to find a fit which is different to the fit we would have found without the outliers.
Question 10
The z-score measures
a) how many standard deviations an observation lies away from the median for a variable.
b) how many standard deviations an observation lies away from the mean for a variable.
c) how many standard deviations an observation lies away from the minimum for a variable.
d) how many standard deviations an observation lies away from the maximum for a variable.
b) how many standard deviations an observation lies away from the mean for a variable.
Question 11
The median and interquartile range (IQR)
a) change when outliers are present.
b) do not change when outliers are present.
b) do not change when outliers are present.
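The robustness of the median and IQR can be checked on a small made-up sample in which one value is replaced by an extreme outlier. This is a sketch only: in this example the median and IQR stay exactly the same, whereas in other data sets they may shift slightly, and they do break down once a large share of the data is contaminated.

```python
import statistics


def iqr(values):
    """Interquartile range using the default method of statistics.quantiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


clean = [10, 11, 11, 12, 12, 12, 13, 13, 14]
contaminated = clean[:-1] + [1400]  # replace the largest value by an outlier

summary = {
    name: {
        "mean": statistics.mean(data),
        "stdev": statistics.stdev(data),
        "median": statistics.median(data),
        "iqr": iqr(data),
    }
    for name, data in (("clean", clean), ("contaminated", contaminated))
}
```

Comparing the two entries of `summary` shows the mean and standard deviation moving substantially while the median and IQR do not.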
Question 12
In a multivariate setting, outliers can
a) not always be detected by simply applying outlier detection rules to each variable separately.
b) always be detected by simply applying outlier detection rules to each variable separately.
a) not always be detected by simply applying outlier detection rules to each variable separately.
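A small two-variable example illustrates why variable-by-variable rules can miss multivariate outliers. The sketch below uses the Mahalanobis distance, which is one common choice and an assumption here rather than something the flashcard prescribes; the data points are made up. The last point has unremarkable values on each variable separately, yet it breaks the strong relationship between the two.

```python
import math
import statistics

# Hypothetical data: y follows x closely, except for the last point (3, 7).
points = [(1, 1.2), (2, 1.9), (3, 3.1), (4, 4.2), (5, 4.8),
          (6, 6.1), (7, 6.9), (8, 8.2), (9, 8.8), (3, 7.0)]
xs = [p[0] for p in points]
ys = [p[1] for p in points]

mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
var_x, var_y = statistics.variance(xs), statistics.variance(ys)
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (len(points) - 1)


def mahalanobis(x, y):
    """Mahalanobis distance in 2D, using the closed-form inverse of the 2x2 covariance."""
    dx, dy = x - mean_x, y - mean_y
    det = var_x * var_y - cov_xy ** 2
    squared = (dx * dx * var_y - 2 * dx * dy * cov_xy + dy * dy * var_x) / det
    return math.sqrt(squared)


distances = [mahalanobis(x, y) for x, y in points]

# Univariate z-scores of the suspicious point look harmless.
z_x = (points[-1][0] - mean_x) / math.sqrt(var_x)
z_y = (points[-1][1] - mean_y) / math.sqrt(var_y)
most_unusual = points[distances.index(max(distances))]
```

Here both univariate z-scores of the last point are below 1 in absolute value, while its Mahalanobis distance is the largest in the sample.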
Question 13
Which statement is NOT CORRECT?
a) The antifraud rationale behind the use of Benford’s law is that producing empirical distributions of digits that conform to the law is difficult for non-experts. Fraudsters may thus be biased toward simpler and more intuitive distributions, such as the uniform.
b) If a data set complies with Benford’s law, it can still be fraudulent.
c) According to Benford’s law, the probability that the first digit equals 1 is about 4.6%, while it’s 30% for digit 9.
d) Most financial data and accounting numbers generally conform to Benford’s law.
c) According to Benford’s law, the probability that the first digit equals 1 is about 4.6%, while it’s 30% for digit 9.
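Explanation: it is the other way around. The first digit equals 1 with a probability of about 30%, and equals 9 with a probability of about 4.6%.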
Question 14
According to Benford’s law, the first digit d appears with a probability of:
a) P(d) = log10(1/d)
b) P(d) = log10(1 + 1/d)
c) P(d) = log10(d)
d) P(d) = log10(1 - 1/d)
b) P(d) = log10(1 + 1/d)
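The formula can be evaluated directly to confirm the first-digit probabilities quoted in Question 13. A minimal sketch; the function name `benford_probability` is illustrative.

```python
import math


def benford_probability(d):
    """Probability that the first significant digit equals d under Benford's law."""
    if not 1 <= d <= 9:
        raise ValueError("first digit must be between 1 and 9")
    return math.log10(1 + 1 / d)


# Expected first-digit distribution for d = 1..9.
benford = {d: benford_probability(d) for d in range(1, 10)}

for digit, probability in benford.items():
    print(f"{digit}: {probability:.1%}")
```

The probabilities decrease monotonically from digit 1 to digit 9 and sum to 1.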
Question 15
Which of the following data sets typically does not comply with Benford’s law?
a) Data where numbers represent sizes of facts or events.
b) Data in which numbers have no relationship to each other.
c) Data sets that arise from additive fluctuations.
d) Some well-known infinite integer sequences.
c) Data sets that arise from additive fluctuations.