chapter 18 - statistical hypothesis Flashcards
census
measures information about the entire population - all individuals of interest
- takes a lot of time and money - so more suitable for larger organisations
when is it impossible to carry out a census
- when it is impossible to identify or get access to all members of a population e.g. all the ants in the world
- if the process of collecting data destroys the object being measured e.g. finding the max load that can be placed on a shelf would mean breaking every shelf
sample
measures only part of a population
population parameter
a numerical characteristic of a population e.g. its mean or variance
simple random sampling
every possible sample of a given size has an equal chance of being selected e.g. numbering the population and picking members with a random number generator
+unbiased sample
- hard - need a list of the whole pop and everyone to respond
- time consuming and expensive
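A minimal sketch of simple random sampling, assuming a numbered list of the whole population is available. The population size (500), sample size (20) and seed are made-up values for illustration only.

```python
import random

# Hypothetical sampling frame: ID numbers for a population of 500 members.
population = list(range(1, 501))

# Fixed seed only so the sketch is repeatable; drop it for a real draw.
rng = random.Random(18)

# random.sample draws without replacement, so every possible
# group of 20 members is equally likely to be chosen.
sample = rng.sample(population, 20)
print(sorted(sample))
```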
opportunity sampling
sampling only from the individuals willing to take part - choosing respondents based on their availability and convenience
(non random)
+cheap and convenient
- may introduce bias and not be generalisable
systematic sampling
taking participants at regular intervals from a list of the population (starting point is chosen at random)
+avoids unwanted clustering of data
+easier than using random no. generators
- needs a list of whole pop
- less random as selections are no longer independent (once the start is chosen, every other member is fixed)
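A rough sketch of systematic sampling, assuming the population list is already in some order and the sample size is no bigger than the list. The function name `systematic_sample` and the numbers are illustrative, not from the flashcards.

```python
import random

def systematic_sample(frame, n, rng=random):
    """Take every k-th member of an ordered list, from a random start.

    Assumes 1 <= n <= len(frame).
    """
    k = len(frame) // n        # sampling interval
    start = rng.randrange(k)   # random starting index within the first interval
    return frame[start::k][:n]

frame = list(range(1, 101))    # hypothetical list of 100 members
sample = systematic_sample(frame, 10, random.Random(1))
print(sample)
```

Only the starting point is random here; after that the interval decides everything, which is why the selections are not independent.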
stratified sampling
splitting the population into groups based on factors relevant to the research then random sampling from each group in proportion to the size of that group
+sample is representative of the factors
- needs list of whole pop with info about each member
- time consuming and expensive
- determining factors is not always obvious
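A possible sketch of stratified sampling with proportional allocation. The strata names, their sizes and the helper `stratified_sample` are assumptions made up for the example.

```python
import random

def stratified_sample(strata, n, rng=random):
    """strata maps a group name to the list of that group's members."""
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Share of the sample in proportion to the group's size.
        # Rounding can make the overall total differ slightly from n.
        share = round(n * len(members) / total)
        sample[name] = rng.sample(members, share)
    return sample

# Hypothetical population: 120 members in one group, 80 in the other.
strata = {"year 12": list(range(120)), "year 13": list(range(120, 200))}
chosen = stratified_sample(strata, 20, random.Random(2))
print({name: len(members) for name, members in chosen.items()})
```

With a sample of 20, the groups contribute 12 and 8 members, matching their 60% and 40% shares of the population.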
quota sampling
splitting the population into groups based on relevant factors then opportunity sampling from each group until a required no. participants are found
(non random)
+ensures sample is representative over the factors identified
- may introduce bias and not be generalisable
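A small sketch of quota sampling, assuming people are taken in whatever order they happen to be available until each group's quota is full. The groups, quotas and the name `quota_sample` are hypothetical.

```python
def quota_sample(arrivals, quotas):
    """arrivals: (person, group) pairs in the order people become available."""
    remaining = dict(quotas)
    sample = []
    for person, group in arrivals:
        if remaining.get(group, 0) > 0:   # this group's quota is not yet full
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):   # every quota has been filled
            break
    return sample

arrivals = [("A", "under 30"), ("B", "under 30"), ("C", "30 and over"),
            ("D", "under 30"), ("E", "30 and over"), ("F", "30 and over")]
picked = quota_sample(arrivals, {"under 30": 2, "30 and over": 2})
print(picked)
```

No random selection happens anywhere, which is why the method is non random: who gets in depends only on who turns up first.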
cluster sampling
splitting the pop into clusters based on convenience then randomly choosing clusters to study further
+cheaper and easier
- less accurate - clusters may be non representative
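A sketch of cluster sampling, assuming the clusters are towns and that every member of a chosen cluster is then studied (one common version of the method). All names and data are made up.

```python
import random

# Hypothetical clusters: members grouped by convenience (e.g. by town).
clusters = {
    "town A": ["a1", "a2", "a3"],
    "town B": ["b1", "b2"],
    "town C": ["c1", "c2", "c3", "c4"],
    "town D": ["d1", "d2"],
}

rng = random.Random(3)  # fixed seed only so the sketch is repeatable

# Randomly choose whole clusters rather than individual members.
chosen_names = rng.sample(sorted(clusters), 2)

# Assumption: study every member of each chosen cluster.
sample = [member for name in chosen_names for member in clusters[name]]
print(chosen_names, sample)
```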
lower tail test
take H1 as p < p0 (p0 = value of p under H0)
and use P(X <= x), where x is the observed value
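A minimal sketch of the lower tail calculation, assuming X ~ B(n, p). The numbers (n = 20, H0: p = 0.5, observed x = 5, 5% significance level) are invented for illustration, and `binom_cdf` is a hand-written helper rather than a library function.

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

# Hypothetical test: H0: p = 0.5, H1: p < 0.5, n = 20, observed x = 5.
p_value = binom_cdf(5, 20, 0.5)

# Reject H0 at the (assumed) 5% level if the lower tail probability is below 0.05.
print(p_value, p_value < 0.05)
```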
upper tail test
take H1 as p > p0 (p0 = value of p under H0)
and use P(X >= x), so calculate 1 - P(X <= x - 1)
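A quick check of why the upper tail uses x - 1: X only takes whole-number values, so "X >= x" is the opposite of "X <= x - 1". The sketch assumes X ~ B(n, p) with made-up values n = 20, p = 0.5, x = 15; the helper names are illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p)."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

n, p, x = 20, 0.5, 15   # hypothetical values

direct = sum(binom_pmf(k, n, p) for k in range(x, n + 1))  # P(X >= 15) added up term by term
via_cdf = 1 - binom_cdf(x - 1, n, p)                       # 1 - P(X <= 14)
print(direct, via_cdf)
```

Both routes give the same probability; using 1 - P(X <= x) instead would wrongly leave out P(X = x).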
two tailed
take H1 as p ≠ p0 (p0 = value of p under H0)
e.g. if you are told that 45 out of 100 people have a car and the question asks whether the proportion has changed from 36%, use P(X >= 45), because 45% > 36% puts the observation in the upper tail, and halve the significance level when comparing (e.g. compare with 2.5% for a 5% test)
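The car example above can be sketched as follows, assuming X ~ B(100, 0.36) under H0. The 5% significance level is an assumption (the flashcard does not give one), and `binom_cdf` is a hand-written helper.

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

n, p0, observed = 100, 0.36, 45
alpha = 0.05   # assumed significance level, not stated in the flashcard

# 45% > 36%, so the observation sits in the upper tail: P(X >= 45).
p_upper = 1 - binom_cdf(observed - 1, n, p0)

# Two-tailed test: compare the single tail with half the significance level.
reject = p_upper < alpha / 2
print(p_upper, reject)
```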
hypothesis test answers layout
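- define the random variable X
- distribution assumed (define p) - X~B(n,p)
- state H0 and H1
- rejection criteria
- test statistic - the observed value x, e.g. find P(X <= x) for a lower tail test
- p-value = the probability found from the test statistic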
- conclusion (sufficient evidence to reject H0?)
- context (means there is sufficient evidence to suggest…)
critical value/ region
to have sufficient evidence to reject H0 there would need to be x or fewer (for a lower tail test)
x = critical value
critical region is X <= x
acceptance region is X >= x+ 1
use binomial CD list (cumulative probabilities for a list of x values) to find x
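A sketch of finding the lower tail critical value by listing cumulative probabilities, much like scanning a binomial CD list. It assumes X ~ B(n, p); the values n = 20, p = 0.5 and the 5% level are made up, and `lower_critical_value` is a hypothetical helper name.

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def lower_critical_value(n, p, alpha):
    """Largest x with P(X <= x) <= alpha, or None if no value qualifies."""
    critical = None
    for x in range(n + 1):
        if binom_cdf(x, n, p) <= alpha:
            critical = x   # still inside the critical region
        else:
            break          # cumulative probability has passed alpha
    return critical

x = lower_critical_value(20, 0.5, 0.05)   # hypothetical n, p and significance level
print("critical region: X <=", x)
print("acceptance region: X >=", x + 1)
```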