Statistics Flashcards
What are the properties of a normal distribution?
Most common distribution
Most observations are near the centre
Extends to ±∞
Mean = Median = Mode
Completely described by 2 parameters, mean (μ) and standard deviation (σ)
What is the coefficient of variation, CV?
It is the standard deviation expressed as a percentage of the mean.
The smaller the value of CV, the greater the precision of measurements.
CV = (S / x̄) × 100
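A minimal sketch of the CV formula in Python, assuming S is the sample standard deviation (n - 1 denominator). The function name and the readings are hypothetical, chosen only for illustration.

```python
import statistics


def coefficient_of_variation(values):
    """Return the sample standard deviation as a percentage of the mean."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample SD, n - 1 denominator (assumption)
    return s / mean * 100


# Hypothetical repeated measurements of the same quantity.
readings = [9.8, 10.1, 10.0, 10.3, 9.8]
cv = coefficient_of_variation(readings)
print(f"CV = {cv:.2f}%")
```

A smaller CV from a second set of readings of the same quantity would indicate the more precise set.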
What is the standard error of the mean, SEM?
It estimates how much the sample mean is likely to vary from the population mean.
SEM = S / √n
SEM < SD (for n > 1)
The larger the sample size n, the smaller the value of SEM, so the more precisely the sample mean estimates the population mean.
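A short sketch of SEM = S / √n in Python, again assuming S is the sample standard deviation. The function name and data are hypothetical.

```python
import math
import statistics


def standard_error_of_mean(values):
    """Return S / sqrt(n) for a list of observations."""
    s = statistics.stdev(values)  # sample SD, n - 1 denominator (assumption)
    return s / math.sqrt(len(values))


# Hypothetical sample; the SEM is smaller than the SD because n > 1.
sample = [4.1, 3.9, 4.4, 4.0, 4.2, 3.8]
sd = statistics.stdev(sample)
sem = standard_error_of_mean(sample)
print(f"SD = {sd:.3f}, SEM = {sem:.3f}")
```

Because n sits under a square root, quadrupling the sample size only halves the SEM if S stays the same.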
What is a one-tailed hypothesis?
It makes a unidirectional prediction about the outcome of an experiment.
It predicts a difference between experimental conditions in one particular direction.
What is a two-tailed hypothesis?
The scientist at the outset makes no prediction about the direction of the outcome of the experiment.
Because it does not assume a direction, it can detect a difference either way, so it is the more commonly used hypothesis.
What is a 2 sample t-test?
A test that compares the mean of the same variable in 2 different (independent) samples.
Unequal sample sizes → df = nA + nB - 2
Equal sample sizes → df = 2n - 2
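The degrees of freedom above correspond to the pooled-variance form of the 2 sample t-test, so the sketch below assumes equal population variances. The function name and data are hypothetical, and the t value would still need to be compared with a critical value from t tables.

```python
import math
import statistics


def two_sample_t(sample_a, sample_b):
    """Return (t, df) for a pooled-variance 2 sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    var_a = statistics.variance(sample_a)  # sample variances
    var_b = statistics.variance(sample_b)
    # Pooled variance: each sample variance weighted by its own df.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / se_diff
    return t, df


# Hypothetical measurements from two independent groups of unequal size.
group_a = [5.1, 4.8, 5.6, 5.0]
group_b = [5.9, 6.2, 5.7, 6.0, 6.4]
t_stat, df = two_sample_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {df}")
```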
What is a paired t-test?
A test that compares the means of a given variable in the same sample under different conditions or after different lengths of time.
df = n - 1
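A sketch of the paired calculation, assuming the usual approach of working on the within-subject differences, so that n is the number of pairs. The function name and data are hypothetical.

```python
import math
import statistics


def paired_t(before, after):
    """Return (t, df) for a paired t-test on matched observations."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    # Test whether the mean difference departs from zero.
    se = statistics.stdev(diffs) / math.sqrt(n)
    t = statistics.mean(diffs) / se
    return t, n - 1


# Hypothetical measurements on the same 5 subjects under two conditions.
before = [72, 68, 75, 80, 66]
after = [70, 65, 74, 76, 66]
t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.3f}, df = {df}")
```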
What is Pearson’s product moment correlation coefficient, r?
r measures the linear correlation between x and y.
r does not have any units.
Linear transformations of the data (a change of origin, or scaling by a positive constant such as a change of units) have no effect on the value of r.
The range of r is -1 ≤ r ≤ 1
→ When r is close to 0 there is little or no linear correlation.
→ When r is close to 1 there is a strong positive correlation.
→ When r is close to -1 there is a strong negative correlation.
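A minimal sketch of r computed from sums of squares and products. The function name and data are hypothetical. The last lines illustrate that a change of units (here, hypothetically, Celsius to Fahrenheit) leaves r unchanged.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's product moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired observations.
temp_c = [10, 15, 20, 25, 30]
growth = [1.2, 1.9, 2.4, 3.1, 3.3]
r = pearson_r(temp_c, growth)

# Rescaling x linearly (a change of units) gives the same r.
temp_f = [c * 9 / 5 + 32 for c in temp_c]
r_rescaled = pearson_r(temp_f, growth)
print(f"r = {r:.3f}, after rescaling r = {r_rescaled:.3f}")
```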
When is Spearman’s rank correlation, rs, used?
When the assumption of a normal distribution is clearly violated.
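One common way to obtain rs, sketched here as an assumption rather than the only method, is to replace each value by its rank (tied values share the average rank) and then calculate Pearson's r on the ranks. The helper names and data are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Pearson's r from sums of squares and products."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman_rs(xs, ys):
    """Spearman's rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Hypothetical skewed data: y rises with x, but not in a straight line.
dose = [1, 2, 3, 4, 5]
response = [1, 4, 9, 16, 100]
print(f"rs = {spearman_rs(dose, response):.3f}")
```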
What is the line of best fit by least squares (LS)?
LS estimates a and b of y = a + bx by minimising the sum of the squared errors.
The values of a and b that make Q (Q = ∑ eᵢ², where eᵢ is the error of the i-th point from the line) the smallest are the least squares estimates.
This gives the line of best fit, which always passes through the point (mean of x, mean of y).
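A sketch of the least squares estimates using the standard closed-form solution, b = Sxy / Sxx and a = ȳ - b x̄. The function name and data are hypothetical. The final check illustrates that the fitted line passes through the two means.

```python
def least_squares_fit(xs, ys):
    """Return (a, b) for the line y = a + bx that minimises Q."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # gradient
    a = mean_y - b * mean_x  # intercept
    return a, b


# Hypothetical observations.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = least_squares_fit(xs, ys)

# The fitted value at the mean of x equals the mean of y.
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
fitted_at_mean = a + b * mean_x
print(f"a = {a:.3f}, b = {b:.3f}")
```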
What is the regression coefficient, b?
b is the gradient calculated by least squares.
-∞ < b < +∞
The units of b are (units of y)/(units of x).
The significance tests for r and b are equivalent.
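A sketch of why the two tests agree, assuming the standard t statistics t = r√(n - 2) / √(1 - r²) for r and t = b / SE(b) for b, both with n - 2 degrees of freedom. These formulas, the function name and the data are not given in the cards and are assumptions for illustration.

```python
import math


def t_for_r_and_b(xs, ys):
    """Return (t from r, t from b, df) for a simple linear regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    df = n - 2

    # Test statistic based on the correlation coefficient.
    r = sxy / math.sqrt(sxx * syy)
    t_r = r * math.sqrt(df) / math.sqrt(1 - r ** 2)

    # Test statistic based on the regression coefficient.
    b = sxy / sxx
    a = mean_y - b * mean_x
    q = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # residual sum of squares
    se_b = math.sqrt(q / df / sxx)
    t_b = b / se_b
    return t_r, t_b, df


# Hypothetical observations.
t_r, t_b, df = t_for_r_and_b([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(f"t from r = {t_r:.4f}, t from b = {t_b:.4f}, df = {df}")
```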
What are the 3 assumptions of regression analysis?
Linearity → the linear relationship y = a + bx is correct
Normality → the errors (the scatter of the observations about the line) are normally distributed
Variance homogeneity → those normal distributions have the same variance