Chapters 1 and 2: Collecting, Summarizing, and Organizing Data Flashcards
population
the entire group of individuals to be studied.
population > sample > individual
individual
a person or object that belongs to the population being studied and from which we want to collect data.
population > sample > individual
sample
a subset of the population being studied.
population > sample > individual
statistic
a numerical summary of a sample.
(vs. a parameter, which is a numerical summary of a population)
descriptive statistics
organizing and summarizing statistical (sample-based) data, for example with numerical summaries, tables, and graphs.
parameter
a numerical summary of a population
(vs. a statistic, which is a numerical summary of a sample)
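The statistic vs. parameter distinction can be illustrated with a short Python sketch. The population below (ages of 1,000 club members) is made up purely for illustration, and the seed is fixed only so the run is repeatable.

```python
import random
import statistics

# Hypothetical population: ages of all 1,000 members of a club (made-up values).
rng = random.Random(0)
population = [rng.randint(18, 80) for _ in range(1000)]

# Parameter: a numerical summary of the entire population.
population_mean = statistics.mean(population)

# Statistic: a numerical summary of a sample drawn from that population.
sample = rng.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two numbers will usually be close but not equal; that gap is the uncertainty that inferential statistics tries to quantify.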
convenience samples
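samples in which the individuals are chosen because they are easy to obtain rather than by a random process. Results from convenience samples are not reliable.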
inferential statistics
extend results from a sample to a population.
Then, measure the reliability of the result to determine level of confidence.
Margin of error accounts for uncertainty.
the statistical process
- identify the research objective
- collect the data
- describe the data
- perform inference
qualitative variable
categorical; classification based on attribute or characteristic
quantitative variable
numerical; can be added or subtracted to provide meaningful results. How much, how many, how often.
Two types of quantitative variables:
1. DISCRETE variables have a finite or countable number of possible values; they result from counting (likely integer values).
2. CONTINUOUS variables have infinitely many possible values that are not countable; the value is measured (rather than counted).
variable
a variable is a characteristic of the individual being studied.
data
specific values of the variables
levels of measurement
(NOIR)
If the variable
…categorizes, then NOMINAL measurement.
…categorizes AND allows ranking, then ORDINAL.
…has differences in value that are meaningful, but zero does NOT mean absence of the quantity, then INTERVAL (ex. temperature in °C or °F).
…has differences in value that are meaningful AND a true zero that means absence of the quantity, then RATIO (ex. number of days).
observational study
measures the value of the response variable without attempting to influence the value of either the explanatory or response variables.
Three types of observational studies:
1. cross-sectional studies = snapshot
2. case-control studies = retrospective
3. cohort studies = prospective over a long period of time.
explanatory variable
independent variable
(can affect the value of a response variable)
response variable
dependent variable
(can be affected by an explanatory variable)
confounding vs. lurking variable
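a confounding variable is an explanatory variable that was considered in the study but whose effect cannot be separated from that of another explanatory variable; a lurking variable is an explanatory variable that was NOT considered in the study but still affects the response variable.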
census
list of all individuals in a population along with certain characteristics of each individual.
stratified sample
separate population into non-overlapping groups, then obtain simple random sample from each group.
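A minimal Python sketch of stratified sampling, assuming a made-up list of students grouped by class year. The function name `stratified_sample` and its parameters are hypothetical, chosen for this illustration only.

```python
import random

def stratified_sample(population, stratum_of, n_per_stratum, seed=None):
    """Split the population into strata, then take a simple random sample from each."""
    rng = random.Random(seed)
    strata = {}
    for individual in population:
        # Each individual lands in exactly one stratum (non-overlapping groups).
        strata.setdefault(stratum_of(individual), []).append(individual)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample

# Hypothetical population: (student_id, class_year) pairs.
years = ["freshman", "sophomore", "junior", "senior"]
students = [(i, years[i % 4]) for i in range(200)]

chosen = stratified_sample(students, stratum_of=lambda s: s[1], n_per_stratum=5, seed=1)
print(len(chosen))  # 20: five students from each of the four strata
```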
cluster sample
select ALL individuals within a randomly selected collection or group of individuals.
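A minimal Python sketch of cluster sampling, assuming a made-up city of 10 blocks with 8 households each. The function name `cluster_sample` is hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters, then keep EVERY individual in the picked clusters."""
    rng = random.Random(seed)
    chosen_keys = rng.sample(sorted(clusters), n_clusters)
    return [individual for key in chosen_keys for individual in clusters[key]]

# Hypothetical clusters: 10 city blocks, each containing 8 households.
blocks = {
    f"block_{b}": [f"block_{b}_house_{h}" for h in range(8)]
    for b in range(10)
}

surveyed = cluster_sample(blocks, n_clusters=3, seed=2)
print(len(surveyed))  # 24: all 8 households in each of 3 randomly chosen blocks
```

Unlike a stratified sample, which takes some individuals from every group, this takes all individuals from only some groups.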
systematic sample
select every kth individual from the population. (ex. surveying grocery store customers)
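A minimal Python sketch of systematic sampling. Choosing a random starting point among the first k individuals is a common convention and is an assumption added here; the card itself only says "every kth individual". The customer list is made up.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k individuals, then take every kth one."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # 0-based index of the first selected individual
    return population[start::k]

# Hypothetical population: customers numbered 1..100 in order of arrival.
customers = list(range(1, 101))

picked = systematic_sample(customers, k=10, seed=3)
print(len(picked))  # 10 customers, each 10 arrivals apart
```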
sampling error
error that results from using a sample (a subset of the population) to describe characteristics of the whole population. A result of incomplete information.
(vs. NON-SAMPLING ERROR, which is error from other factors, such as a flawed survey process.)
sampling bias
the technique used to obtain the individuals for the sample tends to favor one part of the population over another.
non-response bias
exists when individuals selected to be in the sample who do NOT respond to the survey have different opinions than those who do.
response bias
present when responses do not reflect the true feelings of the respondent.
Can be due to interviewer error, wording of questions, order of questions, etc.
randomized block design experiment
an experimental design in which the experimental units are divided into homogeneous groups called blocks; within each block, the experimental units are randomly assigned to treatments
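A minimal Python sketch of a randomized block assignment, assuming made-up subjects blocked by sex and two hypothetical treatments ("drug" and "placebo"). The function name is also hypothetical.

```python
import random

def randomized_block_assignment(units, block_of, treatments, seed=None):
    """Group units into blocks, then randomly assign treatments WITHIN each block."""
    rng = random.Random(seed)
    blocks = {}
    for unit in units:
        blocks.setdefault(block_of(unit), []).append(unit)
    assignment = {}
    for members in blocks.values():
        shuffled = members[:]
        rng.shuffle(shuffled)  # random order inside the block
        for i, unit in enumerate(shuffled):
            # Cycle through treatments so each block is split as evenly as possible.
            assignment[unit] = treatments[i % len(treatments)]
    return assignment

# Hypothetical experimental units: (subject_id, sex), blocked by sex.
subjects = [(i, "F" if i < 6 else "M") for i in range(12)]

plan = randomized_block_assignment(
    subjects, block_of=lambda u: u[1], treatments=["drug", "placebo"], seed=4
)
for unit, treatment in sorted(plan.items()):
    print(unit, treatment)
```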
matched-pair design experiment
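an experimental design in which the experimental units are paired up because they are related in some way (ex. the same person before and after, twins, spouses). There are only two levels of treatment, and each unit in a pair receives a different one.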
completely randomized design experiment
an experimental design in which each experimental unit is randomly assigned to a treatment group
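A minimal Python sketch of a completely randomized assignment, assuming nine made-up plants and three hypothetical treatments labeled "A", "B", and "C". Keeping the group sizes equal is a choice made for this sketch, not a requirement stated on the card.

```python
import random

def completely_randomized_assignment(units, treatments, seed=None):
    """Shuffle all units together, then deal them out to the treatment groups."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {unit: treatments[i % len(treatments)] for i, unit in enumerate(shuffled)}

# Hypothetical experimental units.
plants = [f"plant_{i}" for i in range(9)]

groups = completely_randomized_assignment(plants, ["A", "B", "C"], seed=5)
for plant, treatment in sorted(groups.items()):
    print(plant, treatment)
```

There is no blocking or pairing step here: every unit has the same chance of landing in any treatment group.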
class width
the difference between consecutive lower class limits
class midpoint
sum of consecutive lower class limits, divided by 2.
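A short worked example in Python of class width and class midpoint, using made-up lower class limits of 20, 30, 40, 50, and 60.

```python
# Hypothetical lower class limits for a frequency table.
lower_limits = [20, 30, 40, 50, 60]

# Class width: difference between consecutive lower class limits.
class_width = lower_limits[1] - lower_limits[0]

# Class midpoint: sum of consecutive lower class limits, divided by 2.
midpoints = [(a + b) / 2 for a, b in zip(lower_limits, lower_limits[1:])]

print(class_width)  # 10
print(midpoints)    # [25.0, 35.0, 45.0, 55.0]
```

The last class (starting at 60) has no midpoint in this list because the formula needs the lower limit of the class that would follow it.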