Biostats - Lecture 12,13,14,15 Flashcards
What is sampling
Taking a smaller, more manageable subset (a sample) of a larger group
The sample should reflect the characteristics of the larger population
What is wrong with studying a WHOLE population
It is costly and difficult to investigate an entire population
Why do we have statistics in health science
To understand the health of the population, among other uses
Sampling distribution with a sample size of 30
More tightly clustered (less spread out)
Sampling distribution with a sample size of 10
More spread out
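The two cards above can be checked with a small simulation. This is only a sketch: the population (10,000 values with mean around 100 and SD around 15), the number of repeats, and the function name `sample_means` are all made up for illustration.

```python
import random
import statistics

def sample_means(population, n, repeats, rng):
    """Draw `repeats` random samples of size n and return the mean of each."""
    return [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population, not real data
population = [rng.gauss(100, 15) for _ in range(10_000)]

# The SD of the sample means describes how spread out the sampling distribution is
spread_n10 = statistics.stdev(sample_means(population, 10, 2000, rng))
spread_n30 = statistics.stdev(sample_means(population, 30, 2000, rng))

print(f"Spread of sample means, n=10: {spread_n10:.2f}")
print(f"Spread of sample means, n=30: {spread_n30:.2f}")
```

The n=30 spread should come out smaller than the n=10 spread, matching "more tightly clustered" versus "more spread out".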
Categorical variables
Type of data that may be divided into groups
Continuous variables
A variable that can take any value within a range (an uncountable, infinite set of values), e.g. height, weight, age
Standard deviation (SD)
- larger spread = larger SD
- it is a common measure of the amount of variation in a set of values; it tells us how far observations typically are from the mean
- a higher SD indicates greater variability in the dataset, while a lower SD indicates less variability
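As a rough illustration of "larger spread = larger SD", the sketch below compares two invented datasets that share the same mean. The values are hypothetical, and `statistics.stdev` computes the sample SD (dividing by n - 1).

```python
import statistics

# Two hypothetical datasets with the same mean (50) but different spread
tight = [48, 49, 50, 51, 52]
wide = [30, 40, 50, 60, 70]

sd_tight = statistics.stdev(tight)  # sample SD, divides by n - 1
sd_wide = statistics.stdev(wide)

print(f"mean={statistics.mean(tight)}, SD={sd_tight:.2f}")  # small SD
print(f"mean={statistics.mean(wide)}, SD={sd_wide:.2f}")    # large SD
```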
How can errors occur
When the sample is too small or when the sample is biased
Population mean
The average value of the variable in the entire population
Population SD
A measure of spread of the variable in the entire population
Sample mean
The average value calculated from a sample of the population
Sample SD
A measure of spread of the variable calculated from a sample of the population
Sampling distribution
When there is no bias, the sampling distribution is centred on the population mean. The standard error is the SD of the sampling distribution; it measures the variability of sample means around the population mean
Population proportion
The proportion of individuals or items in the entire population that possess a particular characteristic or belong to a particular category
Sample proportion
The proportion of individuals or items in a sample of the population that possesses a particular characteristic or belong to a particular category
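A sample proportion is simply a count divided by the sample size. The sketch below uses an invented sample of 10 people, where `True` marks someone who has the characteristic of interest.

```python
# Hypothetical sample: True means the person has the characteristic
sample = [True, False, False, True, False, False, False, True, False, False]

# Number with the characteristic divided by the sample size
sample_proportion = sum(sample) / len(sample)

print(sample_proportion)  # 3 out of 10
```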
Populations with different variability (different population SD)
As the population SD increases, the distribution gets more spread out
Normal distribution
Symmetric bell curve
What is in a normal distribution
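About 95% of values lie within 1.96 standard deviations of the mean; for a sampling distribution, 95% of sample means lie within 1.96 standard errors of the population mean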
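The 1.96 figure can be checked numerically. This sketch uses `NormalDist` from Python's standard library and assumes a standard normal distribution (mean 0, SD 1); the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Area under the bell curve between -1.96 and +1.96
within = standard_normal.cdf(1.96) - standard_normal.cdf(-1.96)

print(f"{within:.4f}")  # approximately 0.95
```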
Standard error
When the sample size is large (greater than 30), the sampling distribution tends towards a normal distribution. This is known as the CENTRAL LIMIT THEOREM
SE = s / √n (s = sample SD, n = sample size)
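A minimal sketch of the standard error formula, assuming `s` is the sample SD. The function name `standard_error` and the height values are invented for illustration.

```python
import math
import statistics

def standard_error(sample):
    """SE of the mean: sample SD divided by the square root of n."""
    s = statistics.stdev(sample)  # sample SD
    n = len(sample)               # sample size
    return s / math.sqrt(n)

# Hypothetical measurements, not real data
heights_cm = [160, 165, 170, 175, 180]
se = standard_error(heights_cm)

print(f"SE = {se:.2f} cm")
```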
95% confidence interval
General formula = estimate ± 1.96 × SE
For means this becomes = sample mean ± 1.96 × s / √n
We also say: we are 95% confident that the true population mean lies between the lower and upper confidence limits
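The formula for means can be worked through as below. This is a sketch with made-up summary numbers (mean 120, SD 15, n = 100); the function name `ci95_mean` is hypothetical, and the 1.96 multiplier assumes a large sample, as in the cards above.

```python
import math

def ci95_mean(mean, s, n):
    """95% confidence interval for a mean: mean ± 1.96 × s / √n."""
    se = s / math.sqrt(n)
    margin = 1.96 * se
    return mean - margin, mean + margin

# Hypothetical summary statistics
lower, upper = ci95_mean(mean=120, s=15, n=100)

print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

Here the SE is 15 / 10 = 1.5, so the margin is 1.96 × 1.5 = 2.94 either side of the mean.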
Confidence intervals with increasing sample size
The interval gets narrower (the lower and upper limits move closer together)
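The narrowing follows from dividing by √n. A small sketch, assuming the sample SD stays at a made-up value of 15 while only the sample size changes; `ci95_width` is an illustrative helper name.

```python
import math

def ci95_width(s, n):
    """Total width of a 95% CI for a mean: 2 × 1.96 × s / √n."""
    return 2 * 1.96 * s / math.sqrt(n)

s = 15  # hypothetical sample SD, held fixed
widths = {n: ci95_width(s, n) for n in (25, 100, 400)}

for n, width in widths.items():
    print(f"n={n}: width={width:.2f}")
```

Each fourfold increase in sample size halves the width of the interval.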