Maths - Stats Flashcards
Skewness
Mean = median = mode (symmetrical)
Mean > median > mode (positive skew)
Data is skewed to the right (long tail on the right)
Mean < median < mode (negative skew)
Data is skewed to the left (long tail on the left)
Skewness formula
3(mean - median) divided by standard deviation
0 = symmetrical
+ = positive skew
- = negative skew
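A minimal sketch of the skewness formula above, assuming the population standard deviation (dividing by n) is intended. The function name and the sample data are made up for illustration.

```python
import statistics

def pearson_skewness(data):
    """Return 3 * (mean - median) / standard deviation."""
    mean = statistics.fmean(data)
    median = statistics.median(data)
    sd = statistics.pstdev(data)  # assumption: population SD (divide by n)
    return 3 * (mean - median) / sd

symmetric = [1, 2, 3, 4, 5]        # mean = median, so the result is 0
long_right_tail = [1, 2, 2, 3, 10]  # mean pulled above the median
long_left_tail = [1, 8, 9, 9, 10]   # mean pulled below the median

for data in (symmetric, long_right_tail, long_left_tail):
    print(data, round(pearson_skewness(data), 3))
```

The sign of the result is what matters for the card: zero, positive or negative.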
Linear interpolation
Can be used to calculate median, UQ and LQ from a frequency table
= LB + (position in group / number in group) X class width, where LB is the lower boundary of the class containing the value
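The interpolation formula can be sketched as below. The table layout (lower boundary, class width, frequency) and the use of n/2, n/4 and 3n/4 as positions are assumptions; some courses use (n + 1)/2 instead, so check which convention applies.

```python
def interpolate(classes, position):
    """Estimate the value at a cumulative position in a grouped table.

    classes: list of (lower_boundary, class_width, frequency), in order.
    position: e.g. n/2 for the median, n/4 for the LQ, 3n/4 for the UQ.
    """
    cumulative = 0
    for lower, width, freq in classes:
        if freq > 0 and position <= cumulative + freq:
            position_in_group = position - cumulative
            # LB + (position in group / number in group) x class width
            return lower + position_in_group / freq * width
        cumulative += freq
    raise ValueError("position is beyond the total frequency")

# Hypothetical table: classes 0-10, 10-20, 20-30 with frequencies 5, 8, 7
table = [(0, 10, 5), (10, 10, 8), (20, 10, 7)]
n = sum(freq for _, _, freq in table)

print("LQ    ", interpolate(table, n / 4))
print("median", interpolate(table, n / 2))  # 10 + (5 / 8) * 10 = 16.25
print("UQ    ", interpolate(table, 3 * n / 4))
```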
Standard deviation
Square root of (sum of x^2 divided by n, minus mean squared)
For a frequency table: square root of (sum of f X x^2 divided by sum of f, minus mean squared)
Sensitive to outliers
Measure of spread
Not affected by adding a constant to every value; multiplying every value by k multiplies the SD by k (for positive k)
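A short sketch of the frequency-table form of the standard deviation, using made-up values and frequencies. It also checks that adding the same constant to every value leaves the result unchanged.

```python
import math

def sd_from_table(values, freqs):
    """sqrt( sum(f * x^2) / sum(f) - mean^2 ) for a frequency table."""
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(values, freqs)) / n
    mean_of_squares = sum(x * x * f for x, f in zip(values, freqs)) / n
    return math.sqrt(mean_of_squares - mean ** 2)

values = [1, 2, 3]   # hypothetical data values
freqs = [1, 2, 1]    # hypothetical frequencies

original = sd_from_table(values, freqs)
shifted = sd_from_table([x + 10 for x in values], freqs)

print(original)                         # sqrt(0.5)
print(math.isclose(original, shifted))  # adding 10 does not change the spread
```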
Regression line
Residual - distance from a data point to the regression line
The regression line minimises the sum of the squares of these residuals
For y on x it’s vertical residuals (distances measured in the y direction)
For x on y it’s horizontal residuals (distances measured in the x direction)
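As an illustration, here is a sketch of the y on x line found by least squares, where each residual is measured in the y direction. The data and function names are invented for the example; the gradient is Sxy / Sxx and the line passes through the mean point.

```python
def regression_y_on_x(xs, ys):
    """Return (gradient, intercept) of the least-squares y on x line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x  # line goes through (mean_x, mean_y)
    return gradient, intercept

def sum_squared_residuals(xs, ys, gradient, intercept):
    """Sum of squared y-direction distances from the points to the line."""
    return sum((y - (gradient * x + intercept)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4]  # hypothetical data
ys = [2, 4, 5, 8]

m, c = regression_y_on_x(xs, ys)
best = sum_squared_residuals(xs, ys, m, c)
nudged = sum_squared_residuals(xs, ys, m + 0.1, c)

print(m, c)
print(best <= nudged)  # any other line gives a larger sum of squares
```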
Extrapolation
Estimating a value outside the data you have
Extrapolated values are unreliable
Outliers
Data points more than 2 SD from the mean
Data points more than 1.5 X IQR above the UQ or below the LQ
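Both outlier rules can be sketched as follows. The data set is invented, and the quartiles are passed in by hand because quartile conventions vary between courses and software; the population SD is assumed for the first rule.

```python
import statistics

def outliers_by_sd(data, k=2):
    """Values more than k standard deviations from the mean."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)  # assumption: population SD
    return [x for x in data if abs(x - mean) > k * sd]

def outliers_by_iqr(data, lq, uq):
    """Values more than 1.5 x IQR above the UQ or below the LQ."""
    iqr = uq - lq
    return [x for x in data if x > uq + 1.5 * iqr or x < lq - 1.5 * iqr]

data = [2, 3, 3, 4, 4, 5, 5, 6, 30]  # hypothetical data

print(outliers_by_sd(data))
print(outliers_by_iqr(data, lq=3, uq=5.5))  # quartiles assumed for this example
```

The two rules will not always agree, so it is worth stating which one is being used.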
Standardising scores
A means of comparison between data values from different data sets
Standardised scores = (X - mean)/SD
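A quick sketch of comparing marks from two different tests by standardising them. The marks, means and SDs are made up.

```python
def standardised_score(x, mean, sd):
    """(X - mean) / SD: how many SDs the value lies above the mean."""
    return (x - mean) / sd

# Hypothetical marks: which result is better relative to its own class?
maths = standardised_score(70, mean=60, sd=10)    # 1.0
physics = standardised_score(65, mean=50, sd=5)   # 3.0

print(maths, physics)
print("physics is the stronger result" if physics > maths else "maths is the stronger result")
```

Although the raw maths mark is higher, the physics mark is further above its mean once spread is taken into account.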
Conditions of binomial distributions
Two possible outcomes
Fixed number of trials - n
Trials are independent
The probability of success is the same for every trial - p
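When these conditions hold, the probability of exactly r successes is nCr X p^r X q^(n - r), with q = 1 - p. This formula is not on the card itself; the sketch below is one way to evaluate it, with illustrative values of n and p.

```python
from math import comb

def binomial_pmf(n, p, r):
    """P(X = r) for X ~ B(n, p)."""
    q = 1 - p
    return comb(n, r) * p ** r * q ** (n - r)

n, p = 5, 0.5  # illustrative values only

print(binomial_pmf(n, p, 2))  # 10 * 0.5^5 = 0.3125
print(sum(binomial_pmf(n, p, r) for r in range(n + 1)))  # all outcomes together
```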
Conditions of geometric distributions
2 possible outcomes - success and failure
Outcome of each trial is independent of the outcome of all the other trials
Probability of success is constant for each trial
The trials are repeated until a success occurs
Geometric - P(X > x)
= q^x, where q = 1 - p is the probability of failure
Geometric - P(X ≤ x)
= 1 - q^x
Geometric - P(X ≥ x)
= q^(x-1)
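The three geometric results can be checked against a direct sum, assuming X counts the trial on which the first success occurs, so that P(X = x) = q^(x-1) X p. The function names and the value of p are illustrative.

```python
import math

def geo_pmf(x, p):
    """P(X = x): x - 1 failures followed by one success."""
    return (1 - p) ** (x - 1) * p

def geo_greater_than(x, p):
    """P(X > x) = q^x: the first x trials all fail."""
    return (1 - p) ** x

def geo_at_most(x, p):
    """P(X <= x) = 1 - q^x."""
    return 1 - (1 - p) ** x

def geo_at_least(x, p):
    """P(X >= x) = q^(x - 1): the first x - 1 trials all fail."""
    return (1 - p) ** (x - 1)

p, x = 0.5, 3  # illustrative values only

direct = sum(geo_pmf(k, p) for k in range(1, x + 1))  # P(X = 1) + ... + P(X = x)
print(math.isclose(direct, geo_at_most(x, p)))
print(geo_greater_than(x, p), geo_at_least(x, p))
```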
Conditions for a normal distribution
About 99.7% of the data lies within 3 SD of the mean
About 95% lies within 2 SD of the mean
A continuous distribution which forms a symmetrical bell curve
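The proportion of a normal distribution within k SD of the mean can be checked with Python's standard library. This is a small sketch using `statistics.NormalDist` (Python 3.8 or later); the helper name is made up.

```python
from statistics import NormalDist

def proportion_within(k):
    """Proportion of a normal distribution within k SD of the mean."""
    z = NormalDist()  # standard normal: mean 0, SD 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(proportion_within(k), 4))
```

Because the curve is symmetrical, the same proportions apply whatever the mean and SD are.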
Standardising a normal variable
Z = (X - μ)/σ, where μ is the mean and σ is the SD
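A brief sketch of why standardising is useful: a probability for X can be read from the standard normal distribution once X is converted to Z. The mean, SD and value used here are hypothetical.

```python
from statistics import NormalDist

def standardise(x, mu, sigma):
    """Convert a value of X ~ N(mu, sigma^2) to a standard normal Z value."""
    return (x - mu) / sigma

mu, sigma = 100, 15  # hypothetical mean and SD
x = 130

z = standardise(x, mu, sigma)            # (130 - 100) / 15 = 2.0
p_direct = NormalDist(mu, sigma).cdf(x)  # P(X < 130) from the original distribution
p_via_z = NormalDist().cdf(z)            # P(Z < 2) from the standard normal

print(z)
print(round(p_direct, 4), round(p_via_z, 4))  # the two probabilities agree
```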