Maths - Stats Flashcards
Skewness
Mean = median = mode (symmetrical)
Mean > median > mode (positive skew)
Data is skewed to the right (long tail on the right)
Mean < median < mode (negative skew)
Data is skewed to the left (long tail on the left)
Skewness formula
3(mean - median) divided by standard deviation
0 = symmetrical
+ = positive skew
- = negative skew
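A minimal Python sketch of this formula on an invented data set (pstdev gives the population SD, matching the sqrt(sum of x^2 / n - mean^2) version used in this deck):

```python
import statistics

# Invented, positively skewed data (long tail on the right)
data = [2, 3, 3, 4, 4, 4, 5, 5, 6, 14]

mean = statistics.mean(data)
median = statistics.median(data)
sd = statistics.pstdev(data)        # population standard deviation

skew = 3 * (mean - median) / sd     # 3(mean - median) / standard deviation
print(skew)                         # positive value -> positive skew
```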
Linear interpolation
Can be used to estimate the median, UQ and LQ from a grouped frequency table
= LB + (position in group / number in group) X class width
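A rough sketch of estimating the median this way; the class boundaries, frequencies and the n/2 position convention are invented or assumed for illustration:

```python
# Hypothetical grouped frequency table: (lower bound, upper bound, frequency)
classes = [(0, 10, 5), (10, 20, 12), (20, 30, 8), (30, 40, 5)]

n = sum(f for _, _, f in classes)   # total frequency = 30
position = n / 2                    # median position (one common convention)

cumulative = 0
for lb, ub, f in classes:
    if cumulative + f >= position:
        # LB + (position in group / number in group) x class width
        estimate = lb + (position - cumulative) / f * (ub - lb)
        break
    cumulative += f

print(estimate)                     # 18.33... for this table
```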
Standard deviation
Square root of: (sum of x^2 divided by n) minus mean^2
For a frequency table it's the square root of: (sum of x^2 multiplied by frequencies, divided by sum of frequencies) minus mean^2
Sensitive to outliers
Measure of spread
Not affected by adding a constant to every value; multiplying every value by a constant multiplies the SD by the size of that constant
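A short sketch of both versions of the SD formula on invented numbers:

```python
from math import sqrt

# Raw data: sqrt(sum of x^2 / n - mean^2)
x = [2, 4, 4, 4, 5, 5, 7, 9]
mean = sum(x) / len(x)
sd = sqrt(sum(v * v for v in x) / len(x) - mean ** 2)
print(sd)                            # 2.0 for this data

# Frequency table: sqrt(sum of f*x^2 / sum of f - mean^2)
values = [1, 2, 3, 4]
freqs = [3, 5, 7, 5]
total = sum(freqs)
mean_f = sum(v * f for v, f in zip(values, freqs)) / total
sd_f = sqrt(sum(f * v * v for v, f in zip(values, freqs)) / total - mean_f ** 2)
print(sd_f)
```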
Regression line
Residual - distance from a data point to the regression line
The regression line will minimise the sum of the square of these residuals
For y on x it's vertical residuals
For x on y it's horizontal residuals
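A small sketch of what minimising the squared residuals looks like for a y-on-x line, assuming numpy is available; the data points are invented:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

b, a = np.polyfit(x, y, 1)           # y on x: y = a + b*x

residuals = y - (a + b * x)          # vertical distances to the line
print(a, b, np.sum(residuals ** 2))  # the fitted line minimises this sum
```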
Extrapolation
Estimating a value outside the data you have
Extrapolated values are unreliable
Outliers
Data points more than 2 SD from the mean
Data points more than 1.5 X IQR above the UQ or below the LQ
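A quick sketch applying both outlier rules to the same invented data set:

```python
import statistics

data = [12, 14, 15, 15, 16, 17, 18, 19, 20, 45]

# Rule 1: more than 2 SD from the mean
mean = statistics.mean(data)
sd = statistics.pstdev(data)
rule1 = [v for v in data if abs(v - mean) > 2 * sd]

# Rule 2: more than 1.5 x IQR above the UQ or below the LQ
lq, _, uq = statistics.quantiles(data, n=4)
iqr = uq - lq
rule2 = [v for v in data if v > uq + 1.5 * iqr or v < lq - 1.5 * iqr]

print(rule1, rule2)                  # 45 is flagged by both rules here
```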
Standardising scores
A means of comparison between data values from different data sets
Standardised scores = (X - mean)/SD
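A tiny sketch comparing two invented exam marks from papers with different means and SDs:

```python
# Paper A: mean 60, SD 8; Paper B: mean 55, SD 12 (invented figures)
z_a = (70 - 60) / 8     # 1.25 SDs above Paper A's mean
z_b = (70 - 55) / 12    # 1.25 SDs above Paper B's mean
print(z_a, z_b)         # equal standardised scores -> equally good relative to each paper
```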
Conditions of binomial distributions
Two possible outcomes
Fixed number of trials - n
Trials are independent
The probability of success in each trial is constant
Conditions of geometric distributions
2 possible outcomes - success and failure
Outcome of each trial is independent of the outcome of all the other trials
The probability of success in each trial is constant
The trials are repeated until a success occurs
Geometric - P(X>x)
= q^x
Geometric - P(X ≤ x)
= 1-q^x
Geometric - P(X ≥ x)
= q^(x-1)
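A brief numerical check of these three results, assuming the usual geometric model where X counts the trials up to and including the first success, so P(X = k) = q^(k-1)p; the values of p and x are invented:

```python
p = 0.3
q = 1 - p
x = 4

def pmf(k):
    return q ** (k - 1) * p          # P(X = k): first success on trial k

p_gt = sum(pmf(k) for k in range(x + 1, 200))   # P(X > x), truncated sum
p_le = sum(pmf(k) for k in range(1, x + 1))     # P(X <= x)
p_ge = sum(pmf(k) for k in range(x, 200))       # P(X >= x), truncated sum

print(round(p_gt, 6), round(q ** x, 6))         # both ~ q^x
print(round(p_le, 6), round(1 - q ** x, 6))     # both ~ 1 - q^x
print(round(p_ge, 6), round(q ** (x - 1), 6))   # both ~ q^(x-1)
```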
Conditions for a normal distribution
99.7% (virtually all) of the data within 3 SD from the mean
95% within 2 SD from the mean
A continuous distribution which forms a symmetrical bell curve
Standardising a normal variable
Z = (X - μ) / σ, where μ is the mean and σ is the SD
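A small sketch of standardising and then looking up a probability with Python's built-in NormalDist; the mean, SD and cut-off are invented:

```python
from statistics import NormalDist

mu, sigma = 50, 5
x = 60

z = (x - mu) / sigma                 # Z = (X - mu) / sigma
print(z)                             # 2.0

print(NormalDist(0, 1).cdf(z))       # P(Z < 2) ~ 0.977, so P(|Z| < 2) ~ 0.95
```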
Correlation
A measure of the strength of the relationship between two variables; greater correlation means the variables are more closely related
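One common measure is the product-moment correlation coefficient r = Sxy / sqrt(Sxx * Syy); a sketch on invented data:

```python
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 5, 4, 6, 7]
n = len(x)

sxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
sxx = sum(a * a for a in x) - sum(x) ** 2 / n
syy = sum(b * b for b in y) - sum(y) ** 2 / n

r = sxy / (sxx * syy) ** 0.5
print(r)                             # ~0.92: strong positive linear relationship
```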
Combinations and permutations differences
Combinations involve making a choice/ selection in which the order is unimportant
Permutations are ordered arrangements of a set of items
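A quick illustration with Python's built-in helpers, using invented numbers:

```python
from math import comb, perm

print(comb(8, 3))   # 56: choose 3 from 8, order unimportant (combination)
print(perm(8, 3))   # 336 = 56 * 3!: arrange 3 chosen from 8, order matters
```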
Discrete random variables: expected mean and E(X^2)
E(X) = sum of xp
E(X^2) = sum of x^2 p
DRV: Variance
Var (X) = E(X^2) - [E(X)]^2
Standard deviation is just the square root of the variance
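A short sketch of these three results for an invented probability distribution:

```python
from math import sqrt

xs = [1, 2, 3, 4]                    # invented values of X
ps = [0.1, 0.3, 0.4, 0.2]            # and their probabilities

ex = sum(x * p for x, p in zip(xs, ps))        # E(X) = sum of x*p
ex2 = sum(x * x * p for x, p in zip(xs, ps))   # E(X^2) = sum of x^2*p
var = ex2 - ex ** 2                            # Var(X) = E(X^2) - [E(X)]^2
sd = sqrt(var)

print(ex, ex2, var, sd)              # ~2.7, ~8.1, ~0.81, ~0.9 (up to float rounding)
```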
Mutually exclusive
When two events cannot happen at the same time
Independent
When one event has no effect on the other
Regression line x on y formula
x = a + by
where a = (mean of x) - b X (mean of y)
where b = Sxy/Syy
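A tiny sketch of plugging invented summary statistics into these formulas:

```python
sxy, syy = 45.0, 60.0                # invented Sxy and Syy
mean_x, mean_y = 12.0, 8.0           # invented means

b = sxy / syy                        # b = Sxy / Syy
a = mean_x - b * mean_y              # a = mean of x - b * mean of y
print(f"x = {a:.2f} + {b:.2f}y")     # x = 6.00 + 0.75y
```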
Geometric - P(X < x)
= 1 - q^(x-1)
How to tell if two events are mutually exclusive?
They can’t happen at the same time
Hence P(A ∩ B) = 0
P(A ∪ B) = P(A) + P(B)
How to tell if two events are independent?
One event has no effect on the other
P(A|B) = P(A)
so P(A ∩ B) = P(A) X P(B)
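A toy numerical check of both tests, with invented probabilities:

```python
from math import isclose

p_a, p_b = 0.5, 0.4                   # invented P(A) and P(B)
p_a_and_b = 0.2                       # invented P(A intersect B)

print(isclose(p_a_and_b, p_a * p_b))  # True  -> A and B are independent
print(p_a_and_b == 0)                 # False -> not mutually exclusive
```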