Week 5 Flashcards
what is bias?
a systematic error that makes the data less reliable than a random sample would be, often because people make wrong assumptions when choosing who or what to measure.
what are the two types of errors in biostats?
give a new example of both.
- errors that make our answers more uncertain (random error), ie. more variability. eg. not sure if the finger landed on the water or the land (globe). Usually unavoidable.
- errors that move us further away from the truth (bias), ie. we get the wrong answer. Often avoidable by using random sampling.
discuss bias and sample size:
it doesn't matter how big the sample size is, if the sample is skewed then the data will be wrong.
eg. if you interview on the street, the type of people who will come up to speak are going to be the more strongly opinionated people, so interviewing many of them will not change the fact that it is a skewed sample.
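A small simulation can make the street interview point concrete. This is only a sketch with made-up numbers: the population size, the 30% figure and the "4x as likely to talk" weighting are all assumptions for illustration, not values from the lecture.

```python
import random

random.seed(1)  # fixed seed so the sketch gives the same numbers each run

# Made-up population of 100,000 people: 30% hold a strong opinion (1), 70% do not (0).
population = [1] * 30_000 + [0] * 70_000
true_proportion = sum(population) / len(population)

# Assumption for illustration: opinionated people are 4x as likely to stop and talk.
weights = [4 if person == 1 else 1 for person in population]

def street_sample(n):
    """Biased sample: people are picked in proportion to how likely they are to talk."""
    return random.choices(population, weights=weights, k=n)

def random_sample(n):
    """Unbiased sample: everyone has the same chance of being picked."""
    return random.sample(population, n)

results = {}
for n in (100, 10_000):
    biased = sum(street_sample(n)) / n
    fair = sum(random_sample(n)) / n
    results[n] = (biased, fair)
    print(f"n={n}: street={biased:.2f}, random={fair:.2f}, truth={true_proportion:.2f}")
```

The street estimate should stay well above the truth even at n = 10,000, while the random sample settles close to it: a bigger sample shrinks the random error but not the bias.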
bias often comes into play when:
we are choosing the participants
what is important to consider when choosing the sample from the population?
to include people of every type in your population. eg. if the population is NZ residents, include all ages, genders, backgrounds and ethnicities.
what are the 3 groups of people that will be in your population/sample?
- people who are in your population but won't participate in the sample
- people in the population who are in your sample
- people who aren't in your population but end up in your sample (ineligible)
what is important to consider when gathering data from your sample?
the method you use to collect it. eg. don't use landlines to collect data from uni students.
what is a convenience sample? what is its flaw?
only collecting a sample that is easy to get, eg. from 1 location. what if some of the population does not frequent that location? even if people at that location are picked at random, it is not a random sample of the whole population, so it is biased.
what is the flaw of random sampling?
it can miss out groups of people by chance, as it does not consciously include them.
eg. Thalidomide was found to help nausea but was not tested on pregnant women; it was later found to cause birth defects.
what are the different terms and their signs?
- sample mean: x̄ (x with a bar on top)
- sample size: n
- standard deviation of all observations in sample: s
- standard error: SE
- population mean: μ (mu, looks like a backwards u with a long stalk)
- population standard deviation: σ (sigma)
population and sample are described by:
sampling distribution is described by:
proportion.
- the sampling distribution is centred on population proportion (when there is no bias)
standard error:
the variability (standard deviation) of the sampling distribution.
how does sample size affect the bell curve?
small sample size = more skewed/less smooth, asymmetric curve.
big sample size = small standard deviation, symmetric curve.
the more variability in the population:
the more uncertainty in sampling distribution. wider curve.
the bigger the sample size:
the smaller the standard error (the standard deviation of the sampling distribution).
what do you need to know to know what the bell curve will look like?
- variability of population
- sample size
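To see both ingredients at work, here is a rough simulation sketch. The population mean of 50, the standard deviations and the 2000 repeats are arbitrary choices for illustration, and `spread_of_sample_means` is a made-up helper name.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is repeatable

def spread_of_sample_means(pop_sd, n, repeats=2000):
    """Draw `repeats` samples of size n from a normal population (mean 50)
    and return the standard deviation of the sample means."""
    means = []
    for _ in range(repeats):
        sample = [random.gauss(50, pop_sd) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.stdev(means)

# Same population variability, different sample sizes.
small_n = spread_of_sample_means(pop_sd=10, n=10)
big_n = spread_of_sample_means(pop_sd=10, n=100)

# Same sample size, different population variability.
low_var = spread_of_sample_means(pop_sd=5, n=25)
high_var = spread_of_sample_means(pop_sd=20, n=25)

print(f"n=10: {small_n:.2f}   n=100: {big_n:.2f}")
print(f"pop sd 5: {low_var:.2f}   pop sd 20: {high_var:.2f}")
```

The spread of the sample means should come out narrower for the bigger sample and wider for the more variable population, which matches the two cards above.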
what are some examples of 2 group populations?
drug a vs drug b
male vs female
drug vs placebo
smoker vs non smoker
Describe normal distribution:
- symmetric bell curve
- we know certain things about the distribution
- if we have the mean and standard deviation then we can draw the curve.
- requires large sample size
- 95% of sample averages will lie within 1.96 standard errors of the mean
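Where the 1.96 comes from can be checked with Python's built-in `statistics.NormalDist`. This is just a quick check on the standard normal curve (mean 0, standard deviation 1), not something from the lecture slides.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area under the curve between -1.96 and +1.96 standard deviations.
within = z.cdf(1.96) - z.cdf(-1.96)

# Going the other way: the cutoff that leaves 2.5% in the upper tail.
cutoff = z.inv_cdf(0.975)

print(round(within, 4), round(cutoff, 2))
```

So roughly 95% of the curve sits within 1.96 standard deviations of the mean, with 2.5% left in each tail.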
what is the formula for standard error?
SE = s / √n (s = sample standard deviation, n = sample size)
this means you can find the range or spread of different samples you might have gotten.
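A worked example of the standard error formula, as a sketch. The resting heart rates are invented numbers, chosen only so the sample mean comes out as a whole number.

```python
import math
import statistics

# Made-up sample of 9 resting heart rates (beats per minute).
heart_rates = [62, 70, 58, 75, 68, 81, 65, 72, 61]

n = len(heart_rates)                 # sample size
mean = statistics.fmean(heart_rates) # sample mean
s = statistics.stdev(heart_rates)    # sample standard deviation (divides by n - 1)
se = s / math.sqrt(n)                # standard error of the mean

print(f"n={n}, mean={mean:.1f}, s={s:.2f}, SE={se:.2f}")
```

Note that `statistics.stdev` is the sample standard deviation (it divides by n - 1), which is the s in the formula.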
what is the formula for 95% confidence interval?
estimate ± 1.96 × SE (eg. sample mean ± 1.96 × SE)
this formula ensures that if we did repeated sampling 95% of the intervals would contain the true population mean
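The repeated sampling idea can be tried out in a rough simulation. Everything here is assumed for illustration: a normal population with mean 120 and standard deviation 15, samples of 100, and 2000 repeats. `ci95` is a made-up helper that applies sample mean ± 1.96 × SE.

```python
import math
import random
import statistics

random.seed(3)  # fixed seed so the sketch is repeatable

def ci95(sample):
    """Return (lower, upper) for the 95% CI: sample mean -/+ 1.96 x SE."""
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - 1.96 * se, mean + 1.96 * se

TRUE_MEAN = 120  # made-up population mean
repeats = 2000
hits = 0
for _ in range(repeats):
    sample = [random.gauss(TRUE_MEAN, 15) for _ in range(100)]
    lower, upper = ci95(sample)
    if lower <= TRUE_MEAN <= upper:
        hits += 1  # this interval contains the true population mean

coverage = hits / repeats
print(f"{coverage:.1%} of the intervals contain the true mean")
```

The coverage should land close to 95%, though not exactly, since each run is itself a random experiment.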
what can you start from if you only have this thing?
the SE
what is the confidence interval made of?
lower confidence interval (using -)
upper confidence interval (using +)
what is the formula for the 95% confidence interval of a proportion?
proportion ± 1.96 × SE
bigger sample size =
smaller gap between lower/upper confidence intervals.
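A sketch of the proportion interval at different sample sizes. The notes don't give the SE for a proportion, so this assumes the usual large-sample formula SE = √(p(1 − p)/n); the counts (40% "yes" each time) are made up, and `proportion_ci95` is a hypothetical helper name.

```python
import math

def proportion_ci95(successes, n):
    """95% CI for a proportion: p -/+ 1.96 x SE,
    assuming SE = sqrt(p * (1 - p) / n) (large-sample formula)."""
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)
    return p - 1.96 * se, p + 1.96 * se

# Same observed proportion (40%) at three made-up sample sizes.
widths = []
for successes, n in [(20, 50), (200, 500), (2000, 5000)]:
    lower, upper = proportion_ci95(successes, n)
    widths.append(upper - lower)
    print(f"n={n}: {lower:.3f} to {upper:.3f} (width {upper - lower:.3f})")
```

The gap between the lower and upper limits shrinks as n grows, even though the proportion itself stays at 0.4.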