Statistics Flashcards
F(x) = f/n
Relative frequency = frequency / total number of data points
CDF plot
F(x) plotted against the data values
What does the gradient of a CDF plot mean?
The steepest part of a CDF plot is where the data points are most densely concentrated. A shallower CDF means fewer data points in that range
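A minimal sketch of the CDF cards above, assuming F(x) is read as the cumulative relative frequency (number of data points less than or equal to x, divided by n). The data set and the helper name `empirical_cdf` are made up for illustration.

```python
from collections import Counter

def empirical_cdf(data):
    """Return the sorted distinct values and the cumulative relative frequency at each."""
    n = len(data)
    counts = Counter(data)
    xs = sorted(counts)
    cdf = []
    running = 0
    for x in xs:
        running += counts[x]      # cumulative frequency up to and including x
        cdf.append(running / n)   # divide by the total number of data points
    return xs, cdf

data = [1, 2, 2, 2, 3, 5]         # hypothetical data
xs, cdf = empirical_cdf(data)
for x, F in zip(xs, cdf):
    print(x, round(F, 3))
```

The largest jump in F(x) happens at x = 2, the value where most of the data points sit, which is the "steepest part" of the plot.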
Skewness?
Which way the distribution leans and how asymmetrical it is
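Positive skew
Long tail to the right (right-skewed); the bulk of the data sits on the left
Negative skew
Long tail to the left (left-skewed); the bulk of the data sits on the right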
Interquartile range?
Difference between the third quartile and the first quartile (Q3 - Q1)
Sample space?
The set Ω of all possible outcomes of an experiment
Event?
A subset A of the sample space: A ⊂ Ω
A ∩ B?
A and B occurring together (intersection)
A ∪ B?
A or B occurring (union)
Ā?
Not A (complement)
Mutually exclusive?
Events that cannot occur at the same time
Independent?
The probability of the event is not affected by whether the other event occurs
Dependent?
The probability of the event is affected by whether the other event occurs
Permutations?
Order of outcomes is important
Combinations?
Order of outcomes is not important
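To make the permutation and combination cards concrete, here is a small sketch that counts both for choosing 3 items out of 5. The item labels are arbitrary; `math.perm` and `math.comb` need Python 3.8 or later.

```python
import math
from itertools import combinations, permutations

items = ["A", "B", "C", "D", "E"]  # arbitrary labels
r = 3

n_perm = math.perm(len(items), r)  # ordered selections: 5!/(5-3)! = 60
n_comb = math.comb(len(items), r)  # unordered selections: 5!/(3! * 2!) = 10

# Enumerating the selections gives the same counts.
listed_perm = list(permutations(items, r))
listed_comb = list(combinations(items, r))

print(n_perm, len(listed_perm))
print(n_comb, len(listed_comb))
```

Each combination of 3 items can be ordered in 3! = 6 ways, which is why there are 6 times as many permutations as combinations here.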
P(A|B)?
Probability of A given that B has already occurred
P(A|B)P(B) = P(B|A)P(A) = ?
P(A ∩ B)
What does it mean if P(A|B) = P(A)?
Events A and B are independent
Partitioned sample space?
The events are non-empty and non-overlapping, and their union forms the whole sample space
Σ P(B|Ai)P(Ai) = ? (total probability law, summed over a partition A1, ..., An)
P(B)
P(B|A)P(A) / P(B) = ? (Bayes' law)
P(A|B)
P(A|B ∩ C) P(B|C) P(C) = ?
P(A ∩ B ∩ C)
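A worked sketch of the total probability law and Bayes' law on a two-event partition. The priors and likelihoods are invented numbers chosen only so the arithmetic is easy to follow; exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Hypothetical partition A1, A2 of the sample space (priors sum to 1).
prior = {"A1": Fraction(3, 10), "A2": Fraction(7, 10)}
# Hypothetical conditional probabilities P(B | Ai).
likelihood = {"A1": Fraction(9, 10), "A2": Fraction(2, 10)}

# Total probability law: P(B) = sum of P(B | Ai) P(Ai).
p_b = sum(likelihood[a] * prior[a] for a in prior)

# Bayes' law: P(A1 | B) = P(B | A1) P(A1) / P(B).
posterior_a1 = likelihood["A1"] * prior["A1"] / p_b

# Product rule: P(A1 and B) = P(A1 | B) P(B) = P(B | A1) P(A1).
joint = posterior_a1 * p_b

print(p_b)           # 41/100
print(posterior_a1)  # 27/41
print(joint)         # 27/100
```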
Sx^2 = 1/(n-1) Σ(xi-x~)^2
Sample variance
Sample standard deviation = Sx
n= number of data points
xi= each data value
x~= mean
Vx= Sx/ x~
Coefficient of variation = standard deviation/ mean
dx=1/n * Σ|xi-x~|
Mean absolute deviation = dx
n= number of data points
xi = data points
x~= mean
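The spread measures on the cards above (sample variance, sample standard deviation, coefficient of variation, mean absolute deviation) can be sketched with the standard library. The data set is made up; note that `statistics.variance` and `statistics.stdev` use the n - 1 divisor, matching the sample variance card.

```python
from statistics import mean, stdev, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data
n = len(data)
x_bar = mean(data)

s2 = variance(data)                # sample variance, divisor n - 1
s = stdev(data)                    # sample standard deviation
cv = s / x_bar                     # coefficient of variation
mad = sum(abs(x - x_bar) for x in data) / n   # mean absolute deviation, divisor n

print(x_bar, s2, s, cv, mad)
```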
Unbiased skewness = sqrt(n-1) / (sqrt(n)*(n-2)*σ^3) * Σ(xi-x~)^3
n= number of data points
σ= standard deviation (computed with divisor n)
xi= data point
x~ = mean
Biased skewness = (1/n) * Σ(xi-x~)^3 / σ^3
n= number of data points
xi= data point
x~ = mean
σ= standard deviation
In what order are the mode, median and mean for positive skewness?
Mode < median < mean
In what order are the mode, median and mean for a symmetric curve?
All equal
In what order are the mode, median and mean for negative skewness?
Mean < median < mode
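A quick check of the skewness cards on a made-up data set with a long right tail. This sketch assumes σ in the biased skewness formula is the standard deviation computed with divisor n (`statistics.pstdev`); that choice is an assumption, not something the cards state.

```python
from statistics import mean, median, mode, pstdev

data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 8, 12]   # hypothetical, long right tail
n = len(data)
x_bar = mean(data)
sigma = pstdev(data)   # assumption: divisor n

# Biased skewness: (1/n) * sum of cubed deviations, divided by sigma cubed.
biased_skew = sum((x - x_bar) ** 3 for x in data) / n / sigma ** 3

print(mode(data), median(data), x_bar)
print(biased_skew)
```

For this data the mode is below the median, which is below the mean, and the skewness comes out positive.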
cov= 1/(n-1) Σ(xi-x~)(yi-y~)
Sample covariance
n= number of data points
xi= each x point
x~ = mean x
yi= y point
y~ = y mean
Cxy = [1/(n-1) * Σ(xi-x~)(yi-y~)] / (Sx*Sy)
Cxy= sample correlation coefficient (has to be between -1 and 1)
n= number of points
xi= x point
x~ = x mean
yi= y point
y~ = y mean
Sx= x standard deviation
Sy= y standard deviation
What does Cxy= -1 mean?
Perfect negative linear correlation
What does Cxy= 0 mean?
No linear correlation
What does Cxy= 1 mean?
Perfect positive linear correlation
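A small sketch of the sample covariance and correlation coefficient cards, using invented data. The helper names `sample_cov` and `sample_corr` are hypothetical. The last case illustrates that a coefficient of 0 rules out a linear relationship only: there y depends exactly on x (y = x^2) yet the coefficient is 0.

```python
import math

def sample_cov(x, y):
    """Sample covariance with divisor n - 1."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)

def sample_corr(x, y):
    """Sample correlation coefficient: covariance divided by Sx * Sy."""
    sx = math.sqrt(sample_cov(x, x))
    sy = math.sqrt(sample_cov(y, y))
    return sample_cov(x, y) / (sx * sy)

x = [1, 2, 3, 4, 5]
up = [2, 4, 6, 8, 10]      # y = 2x, perfect positive linear relation
down = [10, 8, 6, 4, 2]    # perfect negative linear relation
sym = [-2, -1, 0, 1, 2]
squared = [v * v for v in sym]   # y = x^2, dependent but not linearly

print(sample_corr(x, up))
print(sample_corr(x, down))
print(sample_corr(sym, squared))
```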
Variable?
A quantity that can vary
Random?
The result will be the outcome of a random experiment
Discrete?
Takes a countable number of distinct values
Continuous?
Can take any value in an interval (infinitely many values between any two points)
PDF=f(x) = dF(x)/ dx
Probability density function = the gradient of CDF plot
What is the area under a PDF plot?
1
The steepest part of a CDF plot is where on a PDF (probability density function) plot?
The peak
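The relation PDF = dF(x)/dx can be checked numerically. This sketch assumes a standard normal distribution purely for illustration and estimates the gradient of its CDF with a central difference.

```python
from statistics import NormalDist

# Assumption for illustration: standard normal (mean 0, standard deviation 1).
dist = NormalDist(mu=0.0, sigma=1.0)

def cdf_gradient(x, h=1e-5):
    """Central-difference estimate of dF/dx at x."""
    return (dist.cdf(x + h) - dist.cdf(x - h)) / (2 * h)

points = [-2.0, -1.0, 0.0, 1.0, 2.0]
gradients = [cdf_gradient(x) for x in points]
densities = [dist.pdf(x) for x in points]

# Where the CDF is steepest, and where the PDF peaks, among the sampled points.
steepest = points[gradients.index(max(gradients))]
peak = points[densities.index(max(densities))]

for x, g, d in zip(points, gradients, densities):
    print(x, round(g, 6), round(d, 6))
print(steepest, peak)
```

The numerical gradient of the CDF agrees with the PDF at each point, and both are largest at the same x.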
Cov(X,Y)
= E[(X-E[X]) (Y-E[Y])]
= E[XY]-E[X]E[Y]
Cov (X,X)?
Var (X)
Var(X+Y)?
Var (X) + Var(Y) + 2Cov(X,Y)
What is the covariance of random variables X and Y?
A measure of the linear dependence between these variables
Cov(cX,Y) = Cov(X,cY) = ?
c*Cov(X,Y)
Cov(X+Y,Z) ?
Cov(X,Z) + Cov(Y,Z)
Cov(X,Y) = ?
Cov(Y,X) (covariance is always symmetric)
Cov(X,c)?
0
Cov(Σ ai Xi, Σbj Yj) =
Σ Σ ai bj Cov(Xi,Yj)
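The covariance rules on the cards above also hold exactly for the sample covariance, so they can be sanity-checked on small made-up data. The values of x, y, z and c below are arbitrary, and `cov` is a hypothetical helper using the n - 1 divisor.

```python
def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance with divisor n - 1."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

x = [1.0, 2.0, 4.0, 7.0]
y = [3.0, 1.0, 5.0, 2.0]
z = [2.0, 2.5, 1.0, 4.0]
c = 3.0

x_plus_y = [xi + yi for xi, yi in zip(x, y)]
cx = [c * xi for xi in x]
const = [c] * len(x)

# Each entry: (rule, left-hand side, right-hand side).
checks = [
    ("Var(X+Y)", cov(x_plus_y, x_plus_y), cov(x, x) + cov(y, y) + 2 * cov(x, y)),
    ("Cov(cX,Y)", cov(cx, y), c * cov(x, y)),
    ("Cov(X+Y,Z)", cov(x_plus_y, z), cov(x, z) + cov(y, z)),
    ("Cov(X,Y) = Cov(Y,X)", cov(x, y), cov(y, x)),
    ("Cov(X,c)", cov(x, const), 0.0),
]

for name, lhs, rhs in checks:
    print(name, lhs, rhs)
```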