Chapter 4: Collecting Data Flashcards
Explanation for an outcome
- by chance
- discrimination
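-> run a simulation to see whether chance alone could plausibly produce the outcome; if it rarely does, that is convincing evidence against chance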
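The simulation idea can be sketched in Python. The scenario below is hypothetical and not from the chapter (a pool of 15 men and 10 women, 8 people chosen, none of them women); it only illustrates the logic. If a fair random selection almost never produces the observed outcome, chance is not a convincing explanation.

```python
import random

def simulate_chance(n_men=15, n_women=10, n_chosen=8, trials=10_000, seed=1):
    """Estimate how often a fair random selection picks zero women.

    All default numbers are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)
    pool = ["M"] * n_men + ["W"] * n_women
    hits = 0
    for _ in range(trials):
        chosen = rng.sample(pool, n_chosen)  # fair selection, no repeats
        if "W" not in chosen:
            hits += 1
    return hits / trials

proportion = simulate_chance()
print(f"Proportion of trials with no women chosen: {proportion:.4f}")
```

A very small proportion suggests the outcome would be surprising if only chance were at work.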
Define Population
the entire group of individuals you want information about
Define Sample
a subset of the population from which data are actually collected
Define Census
Collects data from the entire population
Types of Bad Sample
- Convenience samples
- Bias
- Voluntary Response Sample
Define Convenience Samples
- made up of the individuals who are easiest to reach
- tends to over/underestimate what you want to find from the population
- introduces bias
- produces samples that don’t reflect the population
Define Bias
- systematically favoring a certain outcome
- when a study is very likely to underestimate/overestimate what is being looked at
Define Voluntary Response Sample
- made up of people who choose to answer a general appeal
- usually people with strong emotions
Types of Good Sample
- Simple Random Sample
- Stratified Random Sample
- Cluster Sample
- Systematic Random Sample
Define Simple Random Sample
An SRS of size __n__ is chosen so that every __GROUP__ of __n__ individuals has an equal chance to be selected as the sample
You must
1. numerically label the population
2. use technology or a random digit table to get random numbers
Define Sampling WITH replacement
an individual can be selected more than once; repeats are allowed
Define Sampling WITHOUT replacement
an individual cannot be selected more than once; repeats are ignored and are not part of the sample.
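The two definitions can be contrasted with a short Python sketch. The population of labels 1 to 10 and the sample size of 5 are assumptions made only for illustration.

```python
import random

rng = random.Random(42)
population = list(range(1, 11))  # hypothetical labels 1..10

# WITH replacement: every draw comes from the full population,
# so the same label may appear more than once.
with_replacement = rng.choices(population, k=5)

# WITHOUT replacement: once a label is drawn it cannot be drawn again.
without_replacement = rng.sample(population, k=5)

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```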
How to select an SRS
put everything in a pile and pick randomly
- label [ex. 001-100, ignore 000 and 101-999]
- use a random number generator [RandInt(1,100,# that you want)] to select the # of individuals you want, stated in context
- no repeats
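The steps above (label, generate random numbers, skip repeats) can be sketched in Python. The function name `select_srs` and the population of 100 with a sample of 10 are hypothetical; `rng.randint` plays the role of the calculator's RandInt.

```python
import random

def select_srs(population_size, n, seed=None):
    """Pick n distinct labels from 1..population_size, ignoring repeats."""
    if n > population_size:
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)
    chosen = []
    while len(chosen) < n:
        label = rng.randint(1, population_size)  # like RandInt(1, 100)
        if label not in chosen:  # a repeat is ignored, not counted
            chosen.append(label)
    return chosen

sample = select_srs(100, 10, seed=7)
print(sample)
```

In practice `random.sample(range(1, 101), 10)` gives the same kind of result in one call; the loop is written out here only to mirror the by-hand procedure.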
Define Stratified Random Sample
divide the population into strata (homogeneous groups), choose an SRS from each stratum, and combine them into one sample.
- more precise
Difference between Simple and Stratified Random Sample
Simple: every individual, and every group of n individuals, has an equal chance of being selected
Stratified: the population is divided into groups (strata) based on a shared characteristic, and then a random sample is taken from each stratum
How to select a stratified random sample
explain your choice of strata
- explain the strata [ex. strata can be types of books]
- randomly select __same number__ of __context__ from each __strata context__
“Not every GROUP of the same size has the same chance of being picked”
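A minimal Python sketch of the procedure, using the book-type example as the strata. The stratum names, their sizes, and the choice of 5 books per stratum are assumptions for illustration only.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Take an SRS of the same size from every stratum and combine them."""
    rng = random.Random(seed)
    combined = []
    for members in strata.values():
        combined.extend(rng.sample(members, n_per_stratum))
    return combined

# Hypothetical strata: books grouped by type
strata = {
    "fiction": [f"fiction-{i}" for i in range(1, 41)],
    "nonfiction": [f"nonfiction-{i}" for i in range(1, 31)],
    "reference": [f"reference-{i}" for i in range(1, 21)],
}

books = stratified_sample(strata, 5, seed=3)
print(books)
```

Because every stratum contributes exactly 5 books, a group of 15 fiction books can never be the sample, which is why not every group of the same size has the same chance of being picked.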
Define Cluster Sample
create clusters (group) that “are located near each other”
randomly select a few clusters and include each member of the cluster.
- saves money and time
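Cluster sampling can be sketched the same way. Here the clusters are hypothetical library shelves of 20 books each, and 3 shelves are chosen; none of these numbers come from the chapter.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and include every member of each."""
    rng = random.Random(seed)
    chosen_names = rng.sample(sorted(clusters), n_clusters)
    members = []
    for name in chosen_names:
        members.extend(clusters[name])  # every member of a chosen cluster
    return chosen_names, members

# Hypothetical clusters: 10 shelves, 20 books on each
clusters = {
    f"shelf-{s}": [f"shelf-{s}-book-{b}" for b in range(1, 21)]
    for s in range(1, 11)
}

chosen_shelves, cluster_books = cluster_sample(clusters, 3, seed=11)
print(chosen_shelves, len(cluster_books))
```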
Difference between cluster and stratified random sample
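cluster: the groups are heterogeneous (each looks like a mini version of the population) since they are formed by physical location; you don’t select from every cluster, but you include every member of the clusters chosen
stratified: the groups are homogeneous; you select an SRS from every stratum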
How to select a systematic random sample
- calculation: (total amount)/(amount you want) = #
- pick a number randomly from 1 to # as the starting point, then add # each time.
ex. arrange all books in order. Randomly select 1 book from the first 40. Then choose every 40th after that.
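The calculation and the book example can be sketched in Python. A total of 400 books and a desired sample of 10 are assumed here so that the interval works out to 40, matching the example; the function name is hypothetical.

```python
import random

def systematic_sample(population_size, n, seed=None):
    """Return the positions chosen by a systematic random sample."""
    rng = random.Random(seed)
    k = population_size // n   # (total amount) / (amount you want)
    start = rng.randint(1, k)  # random start among the first k
    return [start + i * k for i in range(n)]  # then every kth after that

positions = systematic_sample(400, 10, seed=5)
print(positions)
```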
Define Systematic Random Sample
Your population is somehow ordered
randomly select one of the first k individuals, then choose every kth individual after that
- good when population is ordered
- easier to conduct
What can go wrong
- undercoverage [some members of the population are less likely to be chosen or are left out entirely]
- nonresponse [chosen individuals can’t be contacted or refuse to participate -> big issue]
- response bias [individuals lie or give inaccurate answers, such as guessing at a question they don’t know]
- question wording bias [the way a question is worded or asked influences the response from an individual]
Define Experiment
deliberately imposes a treatment on individuals
an experiment is needed to show CAUSATION
Define Observational Study
no treatment is imposed; individuals are only observed
an observational study can show correlation (association), but not causation
Define Confounding Variable
a variable other than the explanatory variable that also affects the response variable, so its effect can’t be separated from the effect of the explanatory variable