General Flashcards
What is Pointwise Mutual Information?
It is a measure of the discrepancy between the joint probability of two outcomes x and y of random variables X and Y and the product of their marginal probabilities, i.e. the joint probability they would have if X and Y were independent.
PMI(x, y) = log[ P(x, y) / (P(x) P(y)) ]
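A minimal sketch in Python of estimating PMI from data; the document-level co-occurrence counts (n_docs, n_x, n_y, n_xy) are made up for illustration:

```python
# Minimal sketch: PMI of two word events estimated from toy co-occurrence
# counts (all counts below are hypothetical).
import math

n_docs = 1000  # total documents (made up)
n_x = 120      # documents containing word x
n_y = 200      # documents containing word y
n_xy = 60      # documents containing both x and y

p_x = n_x / n_docs
p_y = n_y / n_docs
p_xy = n_xy / n_docs

# PMI(x, y) = log[ P(x, y) / (P(x) P(y)) ]
pmi = math.log(p_xy / (p_x * p_y))
print(f"PMI(x, y) = {pmi:.3f}")  # > 0: x and y co-occur more often than chance
```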
Sensitivity
TPR = TP / (TP + FN)
- recall, probability of detection, true positive rate
- avoiding false negatives
Specificity
TNR = TN / (TN + FP)
- true negative rate
- avoiding false positives
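A minimal sketch computing both sensitivity and specificity from raw confusion-matrix counts (the counts are made up for illustration):

```python
# Minimal sketch: sensitivity (TPR) and specificity (TNR) from
# confusion-matrix counts (counts below are made up).
tp, fn, tn, fp = 80, 20, 90, 10

sensitivity = tp / (tp + fn)  # TPR: share of actual positives detected
specificity = tn / (tn + fp)  # TNR: share of actual negatives rejected

print(f"sensitivity (TPR) = {sensitivity:.2f}")  # 0.80
print(f"specificity (TNR) = {specificity:.2f}")  # 0.90
```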
ROC
- curve obtained by plotting TPR (y-axis) against FPR = 1 - TNR (x-axis) as the classifier's decision threshold varies
- 1 - TNR = FPR, the false positive rate (probability of a false alarm)
- TPR, the true positive rate (probability of detection)
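A minimal sketch of an ROC curve with scikit-learn's roc_curve; the labels and scores below are made up for illustration:

```python
# Minimal sketch: plot TPR against FPR = 1 - TNR as the decision
# threshold sweeps over the scores (toy labels/scores).
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, roc_auc_score

y_true  = [0, 0, 1, 1, 0, 1, 1, 0, 1, 0]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.3, 0.9, 0.5]

fpr, tpr, thresholds = roc_curve(y_true, y_score)
print(f"AUC = {roc_auc_score(y_true, y_score):.3f}")

plt.plot(fpr, tpr, marker="o")
plt.plot([0, 1], [0, 1], linestyle="--")  # chance diagonal
plt.xlabel("FPR = 1 - TNR (probability of false alarm)")
plt.ylabel("TPR (probability of detection)")
plt.show()
```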
Skewness
- 3rd standardized moment; indicates that the distribution is asymmetric about the mean
- ‘positive’ - the tail is longer (or fatter) on the ‘positive’ side of the x-axis
Kurtosis
- 4th standardized moment; often described as “peakedness”, but it really measures tailedness: “higher kurtosis is the result of infrequent extreme deviations (or outliers), as opposed to frequent modestly sized deviations”
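A minimal sketch comparing skewness and kurtosis of a symmetric versus a right-skewed synthetic sample with scipy.stats; note scipy's kurtosis() reports excess kurtosis (normal ≈ 0) by default:

```python
# Minimal sketch: sample skewness and (excess) kurtosis via scipy.stats.
import numpy as np
from scipy.stats import skew, kurtosis

rng = np.random.default_rng(0)
sym = rng.normal(size=10_000)         # symmetric: skewness ~ 0
right = rng.exponential(size=10_000)  # right tail: positive skewness

print(f"normal:      skew={skew(sym):+.2f}  excess kurtosis={kurtosis(sym):+.2f}")
print(f"exponential: skew={skew(right):+.2f}  excess kurtosis={kurtosis(right):+.2f}")
```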
‘skewed to normal’ transformation
rule of thumb: if skewness falls outside the range −0.8 to 0.8, or kurtosis outside the range −3.0 to 3.0, use a log or Box-Cox transformation to bring the distribution closer to normal (symmetric)
- make sure to transform the results back, e.g. using exp() in case of a log transformation
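A minimal sketch of both transformations on a synthetic log-normal sample, including the back-transform via exp():

```python
# Minimal sketch: log and Box-Cox transforms of a right-skewed sample.
import numpy as np
from scipy.stats import skew, boxcox

rng = np.random.default_rng(1)
x = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)  # skewed, positive

x_log = np.log(x)      # log transform (requires x > 0)
x_bc, lam = boxcox(x)  # Box-Cox picks lambda by maximum likelihood

print(f"skewness: raw={skew(x):+.2f}  log={skew(x_log):+.2f}  box-cox={skew(x_bc):+.2f}")

# Back-transform from log space to the original scale:
print(np.exp(x_log[:3]), x[:3])  # identical up to floating point
```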
multicollinearity
- a situation in which 2 or more explanatory variables in a multiple regression are highly linearly correlated (nearly linearly dependent)
- this undermines the assumption that the columns of the design matrix X are linearly independent, making X^T X nearly singular and its inverse in the OLS solution numerically unstable
detect multicollinearity
- perturbing the data: multicollinearity can be detected by adding random noise to the data, re-running the regression many times, and seeing how much the coefficients change (see Wikipedia); a sketch follows below
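A minimal sketch of this perturbation check on a synthetic, nearly collinear design (all data below is made up):

```python
# Minimal sketch: fit OLS on many noise-perturbed copies of a nearly
# collinear design and watch the coefficients swing.
import numpy as np

rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)  # x2 is almost a copy of x1
y = 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)

coefs = []
for _ in range(100):
    X = np.column_stack([x1, x2]) + 0.01 * rng.normal(size=(n, 2))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    coefs.append(beta)

# A large spread across reruns of a tiny perturbation signals multicollinearity.
print(np.std(coefs, axis=0))
```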
transform a random var to a uniform distribution
Using the probability integral transform: if X is any random variable and F is the cumulative distribution function of X, then as long as F is continuous (in particular, strictly increasing and hence invertible), the random variable U = F(X) follows a uniform distribution on the unit interval [0, 1].
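A minimal sketch using an exponential sample: pushing it through its own CDF should yield values consistent with Uniform[0, 1]:

```python
# Minimal sketch of the probability integral transform: U = F(X).
import numpy as np
from scipy.stats import expon, kstest

rng = np.random.default_rng(3)
x = expon.rvs(size=10_000, random_state=rng)

u = expon.cdf(x)             # push samples through their own CDF
print(kstest(u, "uniform"))  # large p-value: consistent with uniform
```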
coefficient of determination, R-squared
- indicates the proportion of the variance of the dependent variable that can be explained by the independent variables, i.e. how much variation in the data the model accounts for
- 1: fully explained, 0: not explained at all
- in linear regression fitted with an intercept, it equals the square of the Pearson sample correlation coefficient between the observed and fitted values
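A minimal sketch on synthetic data, checking that R² = 1 - SS_res / SS_tot from an OLS fit with intercept matches the squared Pearson correlation:

```python
# Minimal sketch: R^2 for a simple linear fit with intercept,
# compared against the squared Pearson correlation.
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=500)
y = 2.0 * x + 1.0 + rng.normal(size=500)

slope, intercept = np.polyfit(x, y, deg=1)
y_hat = slope * x + intercept

ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

r = np.corrcoef(x, y)[0, 1]
print(f"R^2 = {r2:.4f}, r^2 = {r**2:.4f}")  # equal for OLS with intercept
```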