Midterm 1 Flashcards
population
sample
producing data
–pop. - the entire group of individuals that is the target of our interest
–sample - a subgroup of the pop.
–producing data - choosing a sample and analyzing data
individual
variable
–indiv. - an entity that is observed (human, classroom, mouse)
–var. - characteristic that is measured on each individual (height)
quantitative variable
categorical variable
measurement
quant. - var. whose possible values are meaningful numbers (cost, height)
cat. - var. whose possible values are non-quantitative categories (gender, opinion)
measurement - value of a variable for an individual (textbook cost for Nathan)
distribution of a variable
–single variable pattern - summary of data one variable at a time
sample survey
observational study where indiv. report variable values themselves, freq. opinions - make (uncertain) inferences about the pop. from sample
- explicitly describe pop.
- explicitly describe variable
- select representative sample
larger samples have LESS uncertainty
–sample facts only approx. pop. (uncertainty)
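A minimal simulation sketch of "larger samples have LESS uncertainty". The pop. of textbook costs, the sample sizes, and the helper name spread_of_sample_means are all made up for illustration; the point is only that sample means from bigger samples vary less from sample to sample.

```python
import random
import statistics


def spread_of_sample_means(population, n, repeats=500, seed=0):
    """Std. dev. of the sample mean across many random samples of size n."""
    rng = random.Random(seed)
    means = [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)


# Hypothetical pop.: textbook costs (dollars) for 10,000 students.
pop_rng = random.Random(1)
population = [pop_rng.uniform(20, 200) for _ in range(10_000)]

small = spread_of_sample_means(population, n=10)
large = spread_of_sample_means(population, n=400)

print(f"spread of sample means, n=10:  {small:.2f}")
print(f"spread of sample means, n=400: {large:.2f}")
```

The spread for n=400 should come out much smaller than for n=10, which is the "less uncertainty" on the card.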
Parameter
statistic
Parameter - numerical fact about the var. in the POP. (usually unknown)
Statistic - the corresponding numerical fact in the SAMPLE (rarely exactly equal to the parameter but should be a good representation)
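A small sketch of parameter vs. statistic, assuming a made-up pop. of 5,000 heights (the numbers are invented, not real data). The pop. mean is the parameter; the mean of one random sample of 100 is the statistic that estimates it.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical pop.: heights in cm for 5,000 individuals (invented values).
population = [rng.gauss(170, 10) for _ in range(5_000)]

# One random sample of 100 individuals from that pop.
sample = rng.sample(population, 100)

parameter = statistics.mean(population)  # fact about the POP.
statistic = statistics.mean(sample)      # corresponding fact in the SAMPLE

print(f"parameter (pop. mean):    {parameter:.2f}")
print(f"statistic (sample mean):  {statistic:.2f}")
```

The two numbers should be close but not identical, and a different sample would give a different statistic while the parameter stays fixed.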
- BAD SAMPLING
- Convenience sampling
- -easiest way (stop ppl in will, pick the 1st truck, 1st 25 chickens)
- Volunteer response sampling
- -indiv. select themselves (television polls, online, rate prof.)
- quota sampling - force sample to meet specified quotas
- -ex. recruit 200 females and 300 males btw 45-65
- -not random, not good representation
bias - sample favors certain outcomes and is not a good representation of the pop.
- SRS
GOOD sampling
- SRS - probability sample - simple random sample
- -each indiv. has a known probability of being selected
- -names in a hat
- -random digit table
- -random # generator
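The "random # generator" option can be sketched with Python's standard random module. The roster of 200 students and the function name simple_random_sample are hypothetical; random.sample draws without replacement, so it plays the role of pulling names from a hat.

```python
import random


def simple_random_sample(individuals, n, seed=None):
    """Draw n distinct individuals, like pulling names from a hat."""
    rng = random.Random(seed)
    return rng.sample(individuals, n)


# Hypothetical sampling frame: a roster of 200 students.
roster = [f"student_{i:03d}" for i in range(1, 201)]

srs = simple_random_sample(roster, 25, seed=7)
print(srs[:5])
```

The seed is only there so the draw can be repeated; leave it as None for a fresh sample each run.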
- Cluster sample
GOOD sampling
- cluster - “all of some”
- –used when pop. is naturally divided into groups called clusters (YSA wards, households in city blocks)
- -each cluster is rep. of pop. as a whole
- -random sample of clusters taken - all indiv. inside clusters are included in the sample
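A rough sketch of "all of some", assuming a made-up city with 20 blocks of 8 households each. The dictionary layout and the name cluster_sample are illustrative choices, not a standard API.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters, then keep EVERY indiv. inside them.

    clusters: dict mapping cluster name -> list of individuals.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = [person for name in chosen for person in clusters[name]]
    return chosen, sample


# Hypothetical pop.: 20 city blocks with 8 households each.
blocks = {
    f"block_{b}": [f"block_{b}_house_{h}" for h in range(1, 9)]
    for b in range(1, 21)
}

chosen_blocks, sample = cluster_sample(blocks, 4, seed=3)
print(chosen_blocks)
print(len(sample))
```

Randomness enters only when choosing the clusters; nobody inside a chosen cluster is left out.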
- Stratified random sample
GOOD sampling
- Stratified random sample - “some of all”
- –classify pop. into groups (Strata) that are diff. from each other (Age, gender)
- -indiv. within a group (Stratum) share a similar characteristic (all males, all children)
- -select SRS from EVERY group - then combine SRSs
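A rough sketch of "some of all", assuming three made-up age strata. Taking the same number from each stratum is an assumption made here for simplicity; sampling in proportion to stratum size is another common choice.

```python
import random


def stratified_sample(strata, n_per_stratum, seed=None):
    """Take an SRS from EVERY stratum, then combine the SRSs.

    strata: dict mapping stratum name -> list of individuals.
    """
    rng = random.Random(seed)
    combined = []
    for name in sorted(strata):
        combined.extend(rng.sample(strata[name], n_per_stratum))
    return combined


# Hypothetical pop. split into strata by age group.
strata = {
    "children": [f"child_{i}" for i in range(300)],
    "adults": [f"adult_{i}" for i in range(500)],
    "seniors": [f"senior_{i}" for i in range(200)],
}

picked = stratified_sample(strata, 10, seed=5)
print(len(picked))
```

Unlike the cluster sample, every group is represented, but only some indiv. from each group are chosen.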
- Multistage sample
GOOD sampling
- Multistage sample - “some of some of some”
- -U.S. = states (SRS and choose 5) -> counties (SRS choose 5) -> people (SRS choose 5)
- -church - areas - stakes - wards - members (SRS randomly choose certain number each time break it down)
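The states -> counties -> people example can be sketched as nested SRSs. The nested dictionary, the sizes (10 states, 8 counties each, 30 people each), and the name multistage_sample are all hypothetical.

```python
import random


def multistage_sample(nation, n_per_stage, seed=None):
    """SRS of states, then SRS of counties in each, then SRS of people.

    nation: dict state -> dict county -> list of people.
    """
    rng = random.Random(seed)
    people = []
    for state in rng.sample(sorted(nation), n_per_stage):
        counties = nation[state]
        for county in rng.sample(sorted(counties), n_per_stage):
            people.extend(rng.sample(counties[county], n_per_stage))
    return people


# Hypothetical pop.: 10 states, 8 counties per state, 30 people per county.
nation = {
    f"state_{s}": {
        f"county_{s}_{c}": [f"person_{s}_{c}_{p}" for p in range(30)]
        for c in range(8)
    }
    for s in range(10)
}

chosen = multistage_sample(nation, 5, seed=11)
print(len(chosen))
```

With 5 chosen at each of the three stages, the final sample has 5 x 5 x 5 = 125 people.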
Samples have problems due to BIAS
- -undercoverage
- -non-response
- -misleading response
- -interviewer influences response
- -question ordering
- -question structure
- -wording of ?
undercoverage
–ind. with no chance of being selected (homeless, phoneless)
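A toy simulation of how leaving out phoneless indiv. can bias a phone survey. All numbers are invented, and the phoneless group is deliberately given lower incomes so the effect is visible; this is a sketch of the idea, not real data.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical pop. of (has_phone, monthly_income) pairs.
# Assumption for illustration: phoneless indiv. have lower incomes.
population = (
    [(True, rng.uniform(2000, 6000)) for _ in range(9_000)]
    + [(False, rng.uniform(0, 1500)) for _ in range(1_000)]
)

true_mean = statistics.mean([income for _, income in population])

# The sampling frame only contains indiv. who can be reached by phone.
frame = [income for has_phone, income in population if has_phone]
phone_survey = rng.sample(frame, 500)
estimate = statistics.mean(phone_survey)

print(f"pop. mean income:      {true_mean:.0f}")
print(f"phone survey estimate: {estimate:.0f}")
```

Even though the sample is random within the frame, the estimate sits above the pop. mean because the phoneless never had a chance of being selected.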
non-response
–selected ind. refuse to answer (hangups, on vacation, don’t mail back)
misleading response
–ind. give inaccurate answer - have you cheated? do you wash hands? (private surveys avoid this better)
interviewer influences response
–rude, intimidating, subtle clues
question ordering
–happiness question precedes debt question - vice versa
question structure
–open ended (unlimited answers - what is your fav. music?), closed question (limits responses - What is your fav. music btw country and rap?)
wording of ?
–leading phrases, loaded words, ambiguities that influence response
problems with observational studies 2
- -subjects choose which treatment to rec. or which group to belong to
- -lurking variables - influence the response variable
- -passive data collection: observing, measuring, counting, subjects are undisturbed
- -media often improperly attribute cause-effect conclusions to these
Experiment vocabulary
–subject
impose treatments on people rather than observing
–we determine if treatments cause change in response
subject - being tested - indiv. to which treatment is applied
response variable
explanatory variable
response - characteristic measured on each subject (whether has cancer or not)
explanatory - used to predict or explain changes in the response variable (drug tested to see if it works on cancer patients)
factor
treatment
factor - planned explanatory variable
treatment - experimental condition applied to subject = value of factor
lurking variables
control
confounding
lurking variables - variables not included in the study that affect the response variable
control - effort to REDUCE effects of lurking variables
confounding - situation in which effects of lurking variables cannot be distinguished from effects of factors (if there is a lurking variable = confounding exists)
Historical comparison experiment
- a study involves only one treatment
- treated subjects compared to untreated subjects from another study
- -not good bc LOTS of lurking variables and diff. time periods