M9: Gathering Data Flashcards
Exploratory Data Analysis
How to look at data and summarize it
Statistical inference
Make a statement about the population based on a random, representative sample, and include a measure of how confident we are in that statement.
Experiments
a) Definition
b) Pros
c) Cons
a) Assign subjects to certain experimental treatments
b) Good to determine causation
c) Difficult / unethical to conduct experiments
Observational studies
a) Definition
b) Pros
c) Cons
a) Do nothing to the subjects but observe x and y
b) Fewer restrictions than in experiments
c) Difficult to determine causation due to confounding effects
Sources to find data (3)
Anecdotal evidence - Not good
Census - Official government data collected from the entire population at a national level
Take a sample
Random sample (GOOD)
Subjects chosen by chance
Representative of the entire population
a) Stratified
b) Cluster
c) SRS
Simple random sample
a) Every set of n individuals has the same chance of being selected
b) Usually done with a computer (sampling frame)
c) All the statistical inference procedures require an SRS
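A minimal sketch of drawing a simple random sample by computer, as described above. The sampling frame here (ID numbers 1 to 1000) and the function name simple_random_sample are hypothetical, chosen only for illustration. Python's random.Random.sample draws without replacement, so every set of n individuals has the same chance of being selected.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sampling frame.

    Sampling is without replacement, so every set of n units
    is equally likely to be chosen.
    """
    rng = random.Random(seed)  # seeded generator makes the draw reproducible
    return rng.sample(frame, n)


# Hypothetical sampling frame: ID numbers for a population of 1000 people.
frame = list(range(1, 1001))

sample = simple_random_sample(frame, n=10, seed=42)
print(sample)
```

Fixing the seed is only for reproducibility when studying; in a real survey the draw would normally be left unseeded.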
Non-probability samples
a) Do not attempt to select participants at random from the population of interest
b) Easy / inexpensive methods to collect data
c) Not representative of the population
Biased Samples (BAD)
a) Systematically favor certain outcomes
b) Not representative of the population of interest
Examples of biased samples (2)
a) Volunteer - Restaurants / Websites / Call-in shows
b) Convenience - In the class / At the mall
Sample surveys
a) Personal interview (more expensive, but people are more likely to respond)
b) Telephone interview
c) Questionnaires
Formula for the margin of error with a random sample of size n
1/sqrt(n)
a) Add it to and subtract it from the estimate
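A short worked example of the formula above. The survey numbers (n = 1600, an estimate of 0.60) and the function names are made up for illustration. The 1/sqrt(n) rule is usually read as a rough, conservative margin of error for a sample proportion at about 95% confidence; that interpretation is an assumption here, since the card does not state a confidence level.

```python
import math


def quick_margin_of_error(n):
    """Approximate margin of error 1/sqrt(n) for a random sample of size n."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return 1 / math.sqrt(n)


def interval(estimate, n):
    """Add and subtract the margin of error around the estimate."""
    moe = quick_margin_of_error(n)
    return estimate - moe, estimate + moe


# Hypothetical survey: 1600 respondents, 60% answer "yes".
moe = quick_margin_of_error(1600)
low, high = interval(0.60, 1600)
print(f"margin of error: {moe:.3f}")
print(f"interval: {low:.3f} to {high:.3f}")
```

Because the margin shrinks with the square root of n, quadrupling the sample size only halves the margin of error.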
Potential bias in sample surveys (4)
a) Undercoverage
- Missing parts of the population
b) Nonresponse bias
- People who do not respond may hold different views from those who do
c) Response bias
- Bad memory
- Don’t tell the truth
d) Wording of questions
- Avoid leading/confusing questions