2. Statistical inference Flashcards
What is a sample?
SAMPLE
- A subset of units
- From a population of interest
- Used to estimate a population parameter
What is a random sample?
RANDOM
- Every unit in the population must have an equal chance of being sampled
- Selection of units must be independent
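A minimal Python sketch of drawing a simple random sample. The population of unit IDs 1 to 100 and the sample size of 10 are made up for illustration. `random.sample` draws without replacement, so every possible set of 10 units is equally likely.

```python
import random

# Hypothetical population: unit IDs 1..100 (illustrative only).
population = list(range(1, 101))

# Fixed seed so the draw can be reproduced.
rng = random.Random(42)

# Draw 10 distinct units; each unit has the same chance of being chosen.
sample = rng.sample(population, k=10)

print(sample)
```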
What is a sample of convenience?
A collection of units that are easily available to the researcher
- E.g. volunteers with a predefined set of traits.
Limits how much the sample reflects the population
What is sampling error?
The chance difference between an estimate and the population parameter it estimates
What is bias?
A systematic discrepancy between the estimates obtained by sampling and the true population parameter
What is precision?
The spread of estimates resulting from sampling error
What is accuracy?
How close an estimate is to the true population parameter
- An accurate estimate is free of sampling bias
What is a frequency distribution?
The number of times each value (or range of values) occurs in a sample
- Built from actual data from a sample
- Often displayed as a histogram, which groups the data into discrete bins
What is a probability density function?
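A smooth curve describing how a continuous variable is distributed in the whole population
- The area under the curve over a range of values gives the probability of a value falling in that range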
What are the main measures of central tendency of a normal (Gaussian) distribution?
Mean
- Sample mean notation: x̄ (x-bar)
- Population mean notation: μ
Median
Mode
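A short Python sketch of the three measures, using the standard `statistics` module. The data are made up for illustration. In a symmetric normal distribution the mean, median and mode coincide; in the right-skewed sample below they separate.

```python
import statistics

# Made-up, right-skewed sample (illustrative only).
data = [2, 3, 3, 3, 4, 5, 6, 8, 11]

mean = statistics.mean(data)      # sum of values / number of values
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most frequent value

print(mean, median, mode)
```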
What are measures of distribution spread?
Range
Variance
Standard deviation
What is range?
The difference between the maximum value (Xmax) and the minimum value (Xmin)
What is variance?
The sum of squared deviations from the mean, divided by the population size (N) or, for a sample, by n-1
Measures the average squared distance of values from the mean
What is standard deviation?
The square root of the variance
Expresses the spread around the mean in the original units of the data
How do you calculate the sample variance and standard deviation?
Sample variance
s^2 = Σ(x_i - x̄)^2 / (n - 1)
Sample standard deviation
s = √(s^2)
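A worked sketch of the calculation in Python, written out step by step. The data are made up for illustration. The standard library's `statistics.variance` and `statistics.stdev` use the same n-1 divisor and can be used as a check.

```python
import math

# Made-up sample (illustrative only).
data = [4.0, 7.0, 8.0, 9.0, 12.0]

n = len(data)
mean = sum(data) / n

# Sum of squared deviations from the sample mean.
sum_of_squares = sum((x - mean) ** 2 for x in data)

# Sample variance divides by n - 1; standard deviation is its square root.
s2 = sum_of_squares / (n - 1)
s = math.sqrt(s2)

print(s2, s)
```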
How do you calculate the population variance?
Same as the sample variance, but divide by the population size rather than (n-1)
- N instead of n-1
Why is n-1 used?
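Dividing by n makes the sample variance underestimate the population variance on average, because deviations are measured from the sample mean rather than the true mean
Dividing by n-1 corrects this, giving an unbiased estimate of the population variance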
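One way to see the effect of n-1 is to average each estimator over every possible sample. The sketch below assumes a made-up population of four values and samples of size 3 drawn with replacement; both are illustrative choices, not part of the original card.

```python
import itertools
import statistics

# Hypothetical tiny population (illustrative values only).
population = [2.0, 4.0, 6.0, 8.0]
pop_variance = statistics.pvariance(population)  # divides by N

n = 3  # sample size
# Every possible ordered sample of size n, drawn with replacement.
samples = list(itertools.product(population, repeat=n))

# Average each estimator over all possible samples.
mean_div_n_minus_1 = statistics.mean(statistics.variance(s) for s in samples)
mean_div_n = statistics.mean(statistics.pvariance(s) for s in samples)

print(pop_variance, mean_div_n_minus_1, mean_div_n)
```

Averaged over all samples, the n-1 version matches the population variance, while the n version falls short by a factor of (n-1)/n.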
How do you calculate the standard error?
Sample standard deviation (s) over the square root of the sample size (√n)
SE = s / √n
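A minimal Python sketch of the standard error of the mean, using made-up data and the standard library's `statistics.stdev` (which divides by n-1).

```python
import math
import statistics

# Made-up sample (illustrative only).
data = [4.0, 7.0, 8.0, 9.0, 12.0]

s = statistics.stdev(data)      # sample standard deviation
se = s / math.sqrt(len(data))   # standard error of the mean

print(round(se, 3))
```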
How do you calculate the CV?
Sample standard deviation (s) over the sample mean (x̄), often expressed as a percentage
CV = 100% × s / x̄
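A minimal Python sketch of the coefficient of variation, using made-up data. The result is reported here as a percentage, which is a common convention rather than something the card specifies.

```python
import statistics

# Made-up sample (illustrative only).
data = [4.0, 7.0, 8.0, 9.0, 12.0]

s = statistics.stdev(data)    # sample standard deviation
mean = statistics.mean(data)  # sample mean

cv = s / mean                 # as a proportion
cv_percent = 100 * cv         # as a percentage

print(round(cv_percent, 1))
```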
How do you calculate the Interquartile range?
The difference between the third quartile and the first quartile
- If 0.25n or 0.75n is an integer, take the average of the value at that position in the sorted data and the next one
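A Python sketch of the position rule described above. The function names `quartile` and `iqr` are hypothetical, and the data are made up. The card only covers the case where 0.25n or 0.75n is an integer; rounding the position up otherwise is an assumption made here. Other quartile conventions exist and can give slightly different values.

```python
import math

def quartile(sorted_values, p):
    """Quartile by the position rule: j = p * n, counted from 1."""
    n = len(sorted_values)
    j = p * n
    if j == int(j):
        # Integer position: average the j-th value and the one after it.
        j = int(j)
        return (sorted_values[j - 1] + sorted_values[j]) / 2
    # ASSUMPTION (not stated in the card): otherwise round j up
    # and take the value at that position.
    return sorted_values[math.ceil(j) - 1]

def iqr(values):
    """Interquartile range: third quartile minus first quartile."""
    ordered = sorted(values)
    return quartile(ordered, 0.75) - quartile(ordered, 0.25)

# Made-up sample with n = 8, so 0.25n = 2 and 0.75n = 6 are integers.
data = [1, 3, 4, 6, 7, 9, 12, 15]
q1 = quartile(sorted(data), 0.25)  # average of 2nd and 3rd values
q3 = quartile(sorted(data), 0.75)  # average of 6th and 7th values

print(q1, q3, iqr(data))  # prints: 3.5 10.5 7.0
```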
What is a sampling distribution?
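The probability distribution of the values an estimate (e.g. the sample mean) would take over repeated random samples from the same population
A distribution may be symmetric or skewed
- Skewed distributions show asymmetry towards one direction
- (Median tends to be a better measure of centre for skewed frequency distributions)
- Data may appear skewed due to bias, error, or a small sample size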
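A sampling distribution can be approximated by simulation: draw many random samples from the same population and record the estimate from each. The sketch below assumes a hypothetical normal population (mean 10, standard deviation 2), samples of size 25, and 5000 repeats; all of these values are illustrative.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed for reproducibility

# Hypothetical population: normal with mean 10 and standard deviation 2.
POP_MEAN, POP_SD = 10.0, 2.0
n = 25            # size of each sample
n_samples = 5000  # number of repeated samples

# One sample mean per repeated sample.
sample_means = [
    sum(rng.gauss(POP_MEAN, POP_SD) for _ in range(n)) / n
    for _ in range(n_samples)
]

# Centre and spread of the simulated sampling distribution of the mean.
centre = statistics.mean(sample_means)
spread = statistics.stdev(sample_means)

print(round(centre, 2), round(spread, 2))
```

The centre should sit close to the population mean, and the spread close to POP_SD / √n = 0.4, which is the standard error of the mean.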
How do you calculate a 95% confidence interval?
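Approximately: mean ± 2(SE)
- This is the 2SE rule of thumb; the exact multiplier comes from the t-distribution and is about 1.96 for large samples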
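A minimal Python sketch of the 2SE rule, using made-up data. With a sample this small (n = 5) the rule is only a rough approximation; an exact t-based interval would be wider.

```python
import math
import statistics

# Made-up sample (illustrative only).
data = [4.0, 7.0, 8.0, 9.0, 12.0]

mean = statistics.mean(data)
se = statistics.stdev(data) / math.sqrt(len(data))  # standard error of the mean

# Approximate 95% confidence interval by the 2SE rule of thumb.
lower = mean - 2 * se
upper = mean + 2 * se

print(f"approx. 95% CI: {lower:.2f} to {upper:.2f}")
```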