Statistics Flashcards
discrete vs continuous data
discrete:
- can only take certain separate values, e.g. shoe size
continuous:
- can take any value within a range, e.g. height
definition
population
the whole set of items of interest, from which the sample could be selected
definition
sampling unit
a single member of the population
definition
sample
a selection of sampling units observed in order to draw conclusions about the population as a whole
definition
sampling frame
a list of all members of the population
advantages and disadvantages:
sample
advantages
- less time consuming/ expensive
- fewer people to respond
- less data to process than census
disadvantages
* data may not be as accurate as a census
* may not be large enough to give information about small subgroups of the population
advantages and disadvantages:
census
pros
* should give accurate results
cons
* time consuming / expensive
* can’t be used when the testing process destroys the item
* hard to process a large quantity of data
Systematic sampling definition
A sample formed by choosing members of the population at regular intervals from an ordered list (e.g. every kth member, starting from a randomly chosen member)
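A minimal sketch of the idea, assuming the list is held as a Python list and the starting point is picked at random within the first interval. The function name `systematic_sample` and the 100-member list are hypothetical, for illustration only.

```python
import random

def systematic_sample(population, n, seed=None):
    """Take every k-th member of an ordered list, from a random start."""
    if n < 1 or n > len(population):
        raise ValueError("sample size must be between 1 and the population size")
    rng = random.Random(seed)
    k = len(population) // n      # sampling interval
    start = rng.randrange(k)      # random start position within the first interval
    return [population[start + i * k] for i in range(n)]

# Hypothetical sampling frame: members numbered 1 to 100
members = list(range(1, 101))
sample = systematic_sample(members, 20, seed=1)   # interval k = 100 // 20 = 5
```

Here every 5th member is chosen, so consecutive sampled numbers always differ by 5.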
stratified sampling
- population divided into specific groups (strata) & a random sample taken from each group
- the proportion sampled from each group equals n/N (sample size n out of population size N), so number sampled from a group = group size × n/N
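The proportional rule above can be sketched as follows. This is an illustration under assumptions: the function name `stratified_sample`, the year-group names and the group sizes are all made up, and rounding is used when the share is not a whole number (so the total may differ slightly from n).

```python
import random

def stratified_sample(groups, n, seed=None):
    """Randomly sample from each group in proportion to its size."""
    rng = random.Random(seed)
    N = sum(len(members) for members in groups.values())   # total population
    sample = {}
    for name, members in groups.items():
        size = round(len(members) * n / N)    # group size x n/N, rounded
        sample[name] = rng.sample(members, size)
    return sample

# Hypothetical population of N = 200 split into two groups
groups = {
    "Year 12": [f"Y12-{i}" for i in range(120)],
    "Year 13": [f"Y13-{i}" for i in range(80)],
}
picked = stratified_sample(groups, 40, seed=1)
# Year 12: 120 * 40 / 200 = 24 members, Year 13: 80 * 40 / 200 = 16 members
```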
pros and cons of stratified sampling
PROS
* useful when there are very different groups in the population
* sample representative of population structure
* members selected randomly
CONS
* can’t be used if the population cannot be split into specific groups
* same cons as simple random sampling (e.g. needs a sampling frame)
opportunity sampling
sample is formed using available members of the population who fit the criteria
Pros and cons of opportunity sampling
PROS
* Quick and easy
* useful when a list of the population (sampling frame) is not available
CONS
* unlikely to be representative of population structure
* likely to produce biased results
pros and cons of quota sampling
PROS
* useful when sampling frame not available
* sample will be representative of population structure
CONS
* may introduce bias as some members of the population may choose not to be sampled
in a data set
outliers are
any data points more than 2 standard deviations above or below the mean
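A short sketch of this rule, assuming the population standard deviation (`statistics.pstdev`) is the one intended. The function name `sd_outliers` and the data values are hypothetical.

```python
import statistics

def sd_outliers(data, k=2):
    """Return values more than k standard deviations from the mean."""
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)   # assumption: population standard deviation
    return [x for x in data if abs(x - mean) > k * sd]

values = [10, 12, 11, 13, 12, 11, 30]   # made-up data set
outliers = sd_outliers(values)
# mean is about 14.1 and sd about 6.5, so only 30 lies more than 2 sd from the mean
```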
in a box plot
outliers are
any data point more than 1.5 × IQR above the upper quartile or below the lower quartile
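The box plot rule can be sketched like this. Quartile conventions vary between textbooks and software; this sketch assumes the default method of Python's `statistics.quantiles`, and the function name `iqr_outliers` and the data are hypothetical.

```python
import statistics

def iqr_outliers(data):
    """Return values beyond 1.5 x IQR from the lower or upper quartile."""
    q1, _, q3 = statistics.quantiles(data, n=4)   # lower quartile, median, upper quartile
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in data if x < lower_fence or x > upper_fence]

values = [2, 4, 5, 6, 7, 8, 9, 25]   # made-up data set
outliers = iqr_outliers(values)
# Here Q1 = 4.25, Q3 = 8.75, IQR = 4.5, so the fences are -2.5 and 15.5
```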