Collecting Data Flashcards
Numerical variable
variable that takes a range of numerical values, for which averages, sums, and differences are meaningful
Discrete
numerical variable that can only take values with jumps between them, such as counts (0, 1, 2, ...)
Ordinal variable
categorical value with natural ordering
Nominal variable
categorical variable without natural ordering
Associated/Dependent variable
two variables that show some connection with each other
Independent variables
two variables that are not associated with each other
Continuous
all values within a range are possible
Explanatory Variable
variable that is used to predict/explain differences in another variable; also called a factor (has levels like yes/no or low/medium/high)
Response Variable
variable that is predicted or explained by the explanatory variable; the outcome/result
Sample
subset of cases, often a small fraction of the population
Variable
characteristic measured for each individual/case
Anecdotal evidence
data collected in a haphazard fashion
Observational Study
collecting data without interfering with how the data arises
Cohort
group of subjects that share a defining characteristic
Randomized Experiment
individuals are randomly assigned to a group
Placebo
fake treatment
Control group
a group that does not receive the treatment (no independent variable is applied to it); used as a baseline to compare with treatment groups
Treatment group
a group that receives the treatment, that is, is affected by the independent variable(s); experimental group
Confounding/Lurking variable
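a variable not accounted for in the study that is associated with both the explanatory and response variables; the missing piece of information that can explain an apparent association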
Prospective Study
identifies individuals and collects info as events unfold
Retrospective Study
collects data after events have taken place
Undercoverage bias
occurs when some individuals of the population are inherently less likely to be included in the sample than others
Simple random sample
each case in the population has an equal chance of being included, and there is no implied connection between the cases in the sample
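A minimal sketch of a simple random sample, assuming the population is just a list of case IDs. The function name `simple_random_sample` and the IDs 1 to 100 are made up for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n cases without replacement; every case is equally likely."""
    rng = random.Random(seed)  # seeded only so the draw can be repeated
    return rng.sample(population, n)

population = list(range(1, 101))  # hypothetical case IDs 1..100
srs = simple_random_sample(population, 10, seed=1)
print(srs)
```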
Convenience sample
individuals who are easily accessible are more likely to be included in the sample
Volunteer sample
people's responses are solicited and only those who choose to participate respond
Systematic random sample
choosing from the population using a random starting point and then selecting members according to a fixed, periodic interval
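A minimal sketch of a systematic random sample, assuming the population is an ordered list and the interval `k` is chosen by the researcher. The name `systematic_sample` and the data are hypothetical.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k cases, then take every k-th case."""
    rng = random.Random(seed)
    start = rng.randrange(k)      # random starting index in 0..k-1
    return population[start::k]   # fixed, periodic interval of k

population = list(range(1, 101))  # hypothetical ordered list of 100 cases
sys_sample = systematic_sample(population, 10, seed=2)
print(sys_sample)
```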
Stratified random sample
randomly sampling from every stratum (the population is first divided into groups called strata); strata should correspond to a variable thought to be associated with the variable of interest; each individual stratum should be homogeneous
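A minimal sketch of a stratified random sample, assuming each case carries a label that says which stratum it belongs to. The student data, the class-year strata, and the name `stratified_sample` are illustrative assumptions.

```python
import random
from collections import defaultdict

def stratified_sample(cases, stratum_of, n_per_stratum, seed=None):
    """Group cases into strata, then draw a simple random sample from each one."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for case in cases:
        strata[stratum_of(case)].append(case)
    sample = []
    for label in sorted(strata):  # every stratum is sampled
        sample.extend(rng.sample(strata[label], n_per_stratum))
    return sample

years = ["first-year", "sophomore", "junior", "senior"]
students = [(i, years[i % 4]) for i in range(40)]  # hypothetical (id, class year)
strat = stratified_sample(students, lambda s: s[1], 3, seed=3)
print(strat)
```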
Cluster random sample
randomly selecting a set of clusters/groups and then collecting data on all individuals in the selected clusters; individual clusters should be heterogeneous
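A minimal sketch of a cluster random sample, assuming the population is already organized as a dictionary mapping each cluster name to its members. The village data and the name `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every individual in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

# hypothetical population: 8 villages with 5 people each
clusters = {f"village_{v}": [f"v{v}_person_{p}" for p in range(5)] for v in range(8)}
picked = cluster_sample(clusters, 3, seed=4)
print(picked)
```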
Multistage sampling or Multistage cluster sampling
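sampling strategy with 2 or more steps, for example randomly selecting clusters and then randomly sampling individuals within each selected cluster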
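A minimal sketch of a two-stage version, assuming the first stage picks clusters at random and the second stage draws a simple random sample inside each chosen cluster. The school data and the name `two_stage_sample` are hypothetical.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: pick clusters at random. Stage 2: sample within each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

# hypothetical population: 10 schools with 30 students each
schools = {f"school_{s}": [f"s{s}_student_{i}" for i in range(30)] for s in range(10)}
staged = two_stage_sample(schools, 4, 5, seed=5)
print(staged)
```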
Blind/Single blind
keeping subjects uninformed about their treatment
Placebo effect
placebo results in a slight but real improvement in subjects
Direct control
researchers control variables and any other differences between the groups, making the groups as identical as possible except for the treatment
Randomization
subjects are randomized and put into treatment groups to account for variables that are not controlled
Replication
replicating the experiment multiple times to get accurate results; gives more data and decreases the likelihood that treatment groups differ on some characteristics due to chance alone
Completely randomized experiment
subjects are randomly assigned to different treatment groups
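A minimal sketch of a completely randomized assignment, assuming two groups of equal size. The subject IDs and the name `randomize` are made up for illustration.

```python
import random

def randomize(subjects, groups=("treatment", "control"), seed=None):
    """Shuffle the subjects, then deal them out to the groups in turn."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    assignment = {g: [] for g in groups}
    for i, subject in enumerate(shuffled):
        assignment[groups[i % len(groups)]].append(subject)
    return assignment

subjects = [f"subject_{i}" for i in range(20)]  # hypothetical subject IDs
design = randomize(subjects, seed=6)
print(design)
```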
Blocked experiment
subjects are first separated by a variable (thought to affect the response variable) into blocks; within each block, subjects are randomly assigned to treatment groups
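A minimal sketch of a blocked assignment, assuming a single blocking variable (here a hypothetical high/low risk label). Randomization happens separately inside each block, so every block is split evenly between the groups.

```python
import random
from collections import defaultdict

def blocked_assignment(subjects, block_of, groups=("treatment", "control"), seed=None):
    """Split subjects into blocks, then randomize to groups within each block."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for subject in subjects:
        blocks[block_of(subject)].append(subject)
    assignment = {}
    for label in sorted(blocks):
        members = blocks[label]
        rng.shuffle(members)  # randomization is done block by block
        for i, subject in enumerate(members):
            assignment[subject] = groups[i % len(groups)]
    return assignment

# hypothetical patients: 8 high-risk and 12 low-risk
patients = [(f"p{i}", "high" if i < 8 else "low") for i in range(20)]
assigned = blocked_assignment(patients, lambda p: p[1], seed=7)
print(assigned)
```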
Matched pairs experiment
pairs of subjects are matched on as many variables as possible so that the comparison happens between very similar cases
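A minimal sketch of a matched pairs design. As a simplification, it assumes subjects are matched on just one variable (a hypothetical age) by sorting and pairing neighbors; real matching would use as many variables as possible. Within each pair, a random shuffle decides who gets the treatment.

```python
import random

def matched_pairs(subjects, key, seed=None):
    """Pair up the most similar subjects, then randomize within each pair."""
    rng = random.Random(seed)
    ordered = sorted(subjects, key=key)  # neighbors are the closest matches
    pairs = []
    for i in range(0, len(ordered) - 1, 2):
        pair = [ordered[i], ordered[i + 1]]
        rng.shuffle(pair)  # random choice of who is treated
        pairs.append({"treatment": pair[0], "control": pair[1]})
    return pairs

ages = [34, 61, 25, 58, 37, 27]  # hypothetical matching variable
people = [(f"s{i}", age) for i, age in enumerate(ages)]
pairs = matched_pairs(people, key=lambda s: s[1], seed=8)
print(pairs)
```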