Module 1 Flashcards
What is an individual?
A thing (person, object, event, etc.) we collect data from or about (AKA observations or subjects)
What is a variable?
A characteristic of an individual
What is statistics?
the science of data
What does statistics help us do?
- Describe data visually and numerically
- Estimate values with a measure of confidence
- Predict values of unobserved variables
- Decide when a difference or effect is significant
- Understand patterns in randomness
- Learn from data
What is a population?
The set of all individuals we are interested in, usually at a fixed point in time
(number of individuals in a population is denoted by N)
What is a population parameter?
A number summarizing some aspect of the population. GOAL! Population parameters are fixed and usually unknowable.
What are the four main population parameters?
- μ, mu, the mean
- p, the proportion
- σ, sigma, the standard deviation
- ρ, rho, the correlation b/w variables
What is a census?
When every individual in the population is studied
What is a sample?
It is a subset of the population that data is actually collected from
What are reasons sampling is hard?
- Expensive in time, money or other reasons
- Lack of access, private records, nonresponse
- Collecting data may be destructive
- The entire population doesn’t exist
What is a sample statistic?
A number summarizing some aspect of the sample. It is used to estimate the population parameter. It is knowable but random
What are the four main sample statistics?
- sample mean, x-bar
- sample proportion, p-hat
- sample standard deviation, s
- sample correlation, r or R
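As a rough illustration of the four sample statistics, here is a minimal Python sketch. The data (hours studied and exam scores for five students) and the cutoff of 75 used for the proportion are made up for the example.

```python
import statistics

# Hypothetical sample: hours studied and exam score for 5 students.
hours = [2, 4, 5, 7, 9]
scores = [65, 70, 78, 85, 92]
n = len(hours)

x_bar = statistics.mean(hours)               # sample mean, x-bar
s = statistics.stdev(hours)                  # sample standard deviation, s (divides by n - 1)
p_hat = sum(sc >= 75 for sc in scores) / n   # sample proportion, p-hat (share scoring 75+)

# Sample correlation, r, computed from its definition.
y_bar = statistics.mean(scores)
s_y = statistics.stdev(scores)
r = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, scores)) / ((n - 1) * s * s_y)

print(x_bar, s, p_hat, r)
```

A different sample of five students would give different values, which is what "knowable but random" means.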
What is stratified sampling?
Partitioning pop. into non-overlapping groups (strata). Take an SRS from each
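One possible sketch of stratified sampling in Python. The function name `stratified_sample`, the student population, and the choice of the same sample size in every stratum are assumptions made for illustration.

```python
import random

def stratified_sample(population, stratum_of, n_per_stratum, seed=None):
    """Partition the population into strata, then take an SRS from each one."""
    rng = random.Random(seed)
    strata = {}
    for individual in population:
        strata.setdefault(stratum_of(individual), []).append(individual)
    sample = []
    for members in strata.values():
        # rng.sample draws without replacement, i.e. an SRS within the stratum.
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

# Hypothetical population: 60 freshmen and 40 seniors as (id, class year) pairs.
students = [(i, "freshman" if i < 60 else "senior") for i in range(100)]
picked = stratified_sample(students, stratum_of=lambda s: s[1], n_per_stratum=5, seed=1)
print(picked)
```

Taking 5 from each stratum gives a 50/50 sample from a 60/40 population, so an estimate for the whole population would need reweighting.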
What are the pros and cons of stratified sampling?
Pros: Ensures significant subgroups are represented. Allows for comparisons b/w groups as well
Cons: Need proportions of subgroups in the sample to be similar to proportions in the population. If not, results need to be reweighted.
What is cluster sampling?
Partitioning pop. into non-overlapping groups. Randomly select a few groups. Sample everything in those groups.
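A minimal sketch of cluster sampling in Python, assuming the clusters are already known. The dorm names, residents, and the helper `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick a few whole clusters and sample everyone inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # pick cluster names at random
    sample = [ind for name in chosen for ind in clusters[name]]
    return chosen, sample

# Hypothetical clusters: residents grouped by dorm.
dorms = {
    "North": ["n1", "n2", "n3"],
    "South": ["s1", "s2"],
    "East": ["e1", "e2", "e3", "e4"],
    "West": ["w1", "w2", "w3"],
}
chosen_dorms, residents = cluster_sample(dorms, n_clusters=2, seed=7)
print(chosen_dorms, residents)
```

Unlike stratified sampling, only some groups are visited, and every individual in a chosen group is included.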
What are the pros and cons of cluster sampling?
Pros: Convenient and cost-effective. Allows for increased sample size
Cons: If clusters are too homogeneous there may not be enough diversity to represent the entire population. Increases the probability of an inaccurate answer.
What is systematic sampling?
Pick an integer k. Arrange indiv. in some order. Choose a random starting indiv. b/w 1 and k. Sample every kth indiv. after that
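A short sketch of systematic sampling in Python. It assumes the random start is drawn from positions 1 through k (1-based) so that every individual has a chance of selection. The roster and the function name `systematic_sample` are made up for the example.

```python
import random

def systematic_sample(ordered_individuals, k, seed=None):
    """Pick a random start among the first k positions, then take every kth individual."""
    rng = random.Random(seed)
    start = rng.randint(1, k)                  # 1-based start, both ends inclusive (assumption)
    return ordered_individuals[start - 1::k]   # convert to a 0-based index, step by k

# Hypothetical ordered roster of 100 individuals, numbered 1 to 100.
roster = list(range(1, 101))
every_tenth = systematic_sample(roster, k=10, seed=3)
print(every_tenth)
```

With 100 individuals and k = 10 the sample size is always 10, which shows how the choice of k fixes the sample size.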
What are the pros and cons of systematic sampling?
Pros: If order is random, mimics simple random sampling, but doesn’t need a frame. If order is sorted, mimics stratified sampling.
Cons: If k is too big, your sample will be too small. If k is too small, you could waste resources on a larger sample than is necessary
A sample is representative if…
It has characteristics similar to the pop. from which it is drawn
What is convenience sampling?
Researchers choose individuals that are easy to access
What are the pros and cons of convenience sampling?
Pros: none, except for ease
Cons: Individuals tend to be too similar and lack diversity present in the larger population
What is voluntary response sampling?
Individuals self-select into the sample
What are the pros and cons of voluntary response sampling?
Pros: none except for ease
Cons: Biased in favor of individuals with strong opinions
The best sampling techniques are
based on random selection of individuals
The Law of Large Numbers
Estimates from random samples will approach the true value as the sample size increases
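A small simulation can make this concrete. The sketch below assumes a fair coin (true proportion of heads 0.5) and tracks the sample proportion as the number of flips grows. The function name `running_proportion` and the seed are arbitrary choices.

```python
import random

def running_proportion(n_flips, seed=None):
    """Return the proportion of heads after each flip of a simulated fair coin."""
    rng = random.Random(seed)
    heads = 0
    proportions = []
    for i in range(1, n_flips + 1):
        heads += rng.random() < 0.5   # True counts as 1, False as 0
        proportions.append(heads / i)
    return proportions

props = running_proportion(100_000, seed=42)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, round(props[n - 1], 4))
```

The early proportions can wander far from 0.5, while the later ones should settle close to it.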
Simple Random Sampling (SRS)
Every sample of a given size is equally likely to be selected
Sampling is done with replacement if… and without replacement if…
Individuals can be selected more than once/individuals cannot be selected again
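A minimal sketch of simple random sampling with and without replacement, using Python's standard `random` module. The frame of 20 students is hypothetical.

```python
import random

rng = random.Random(0)

# Hypothetical frame: a list of every individual in the population.
frame = [f"student_{i}" for i in range(1, 21)]

# Without replacement: no individual can be selected twice.
without_replacement = rng.sample(frame, k=5)

# With replacement: the same individual may appear more than once.
with_replacement = rng.choices(frame, k=5)

print(without_replacement)
print(with_replacement)
```

`random.sample` gives every subset of size 5 the same chance of being chosen, which matches the SRS definition.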
Pros and Cons of Simple Random Sampling
Pros: simple, but effective. The probability of a representative sample increases with size
Cons: Requires access to a frame (a list of all individuals in the population)
What is anecdotal evidence?
evidence from personal observations made in a casual nonsystematic manner
What is bias?
Certain outcomes are systematically favored
What is sampling bias?
Bias due to who or what is in the sample
What is undercoverage?
Some individuals in the pop. have 0 probability of being selected
What is nonresponse bias?
Individuals intended for the sample do not respond
What is voluntary response bias?
Individuals w/ strong opinions self-select into the sample
What is response bias?
Something about the questions or how they are asked/worded/ordered influences the responses
What are the 3 main types of response bias?
Wording of questions, ordering of questions (priming), and pride/shame/interviewer effect
How do you control pride/shame/interviewer effect?
- Assure respondents of anonymity and confidentiality
- Self-administer sensitive questions
What is an explanatory/independent variable?
One that explains a change in other variables. Also known as the input variable
What is a response/dependent variable?
one that changes in response to other variables
What is an observational study?
Researchers do not manipulate the explanatory variables
What are the pros and cons of observational studies?
Pros: Observational studies allow us to look for relationships when the explanatory variable is difficult or unethical to manipulate
Cons: No matter how strong the correlation is, you cannot conclude causation.
What is a controlled experiment?
Where researchers intentionally manipulate the explanatory variable
What does the principle of controlling mean?
Eliminate variability in other variables that might affect the response
What does the principle of blocking mean?
Group individuals according to a characteristic that might affect the response
What does the principle of randomization mean?
Randomly assign individuals to treatment groups
What does the principle of replication mean?
Study as many individuals as possible under each treatment
Confounding occurs when…
the effects of variables on the response cannot be separated
What is the completely randomized design?
Individuals randomly assigned to treatment groups
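One way to sketch a completely randomized design in Python: shuffle the individuals, then deal them out to the treatments in turn. The subject labels, treatment names, and the helper `completely_randomized` are assumptions made for illustration.

```python
import random

def completely_randomized(individuals, treatments, seed=None):
    """Shuffle all individuals, then deal them out to treatment groups in turn."""
    rng = random.Random(seed)
    shuffled = list(individuals)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, individual in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(individual)
    return groups

# Hypothetical subjects and treatments.
subjects = [f"subj_{i}" for i in range(12)]
assignment = completely_randomized(subjects, ["drug", "placebo"], seed=5)
print(assignment)
```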
What is the randomized block design?
Individuals are partitioned into similar “blocks” before random assignment to treatment
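A rough sketch of a randomized block design in Python: individuals are grouped into blocks first, and the random assignment happens separately inside each block. The patients, the age-group blocking variable, and the helper `randomized_block` are hypothetical.

```python
import random

def randomized_block(individuals, block_of, treatments, seed=None):
    """Group individuals into blocks, then randomize treatments within each block."""
    rng = random.Random(seed)
    blocks = {}
    for ind in individuals:
        blocks.setdefault(block_of(ind), []).append(ind)
    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)   # random order within the block only
        for i, ind in enumerate(members):
            assignment[ind] = treatments[i % len(treatments)]
    return assignment

# Hypothetical patients as (name, age group) pairs, 8 per age group.
patients = [(f"p{i}", "under_40" if i < 8 else "over_40") for i in range(16)]
plan = randomized_block(patients, block_of=lambda p: p[1],
                        treatments=["drug", "placebo"], seed=2)
print(plan)
```

Each age group ends up split evenly between the treatments, so age cannot be confounded with the treatment effect.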
What is the matched pairs design?
Individuals are matched into similar pairs. Each member gets a different treatment
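A minimal sketch of the assignment step in a matched pairs design, assuming the pairs have already been formed. A simulated coin flip decides which member of each pair gets which treatment. The twin labels and the helper `matched_pairs` are made up for the example.

```python
import random

def matched_pairs(pairs, treatments=("treatment", "control"), seed=None):
    """Within each pair, randomly decide which member gets which treatment."""
    rng = random.Random(seed)
    assignment = {}
    for a, b in pairs:
        if rng.random() < 0.5:   # coin flip: swap the order half of the time
            a, b = b, a
        assignment[a] = treatments[0]
        assignment[b] = treatments[1]
    return assignment

# Hypothetical pairs, e.g. twins matched on age and background.
twins = [("twin_1a", "twin_1b"), ("twin_2a", "twin_2b"), ("twin_3a", "twin_3b")]
pair_plan = matched_pairs(twins, seed=11)
print(pair_plan)
```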