semester test unit 1-5 review Flashcards
steps to data analysis
- collecting data - asking questions
- describing the data - research and organizing
- making an inference
samples
smaller groups representing a population
sampling variability
difference between data from two or more samples
if sampling variability is…
low = samples are representative; high = look at other samples or change the way you are selecting samples
types of sampling bias
- undercoverage bias
- nonresponse bias
- self-selection bias
undercoverage bias
certain members of the population are excluded
nonresponse bias
only a small portion of the people sampled respond
self-selection bias
people with certain opinions are more likely to respond
convenience sampling
researchers use an easily available group to form a sample
simple random sample
each member has an equal chance of being chosen
census
all members are included
response bias
poorly designed questions affect the answer people give
population parameter
piece of data about a characteristic of an entire population
primary ways to collect data
- surveys
- observational studies
- experiments
stratified random sample
population is broken up into subgroups and then a random sample is taken from each subgroup
strata
subgroups
clusters
groups the population is divided into; entire groups are chosen at random and their members are sampled
systematic sampling
sample is chosen from the population using a fixed pattern (for example, every 10th member)
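A minimal Python sketch of three of the sampling methods on these cards (simple random, stratified random, and systematic). The population of 20 numbered members, the two strata, and the function names are all made up for illustration.

```python
import random

# Hypothetical population: 20 numbered members (made up for illustration).
population = list(range(1, 21))

def simple_random_sample(members, n, rng):
    # Every member has an equal chance of being chosen.
    return rng.sample(members, n)

def stratified_random_sample(strata, n_per_stratum, rng):
    # Take a separate random sample from each subgroup (stratum).
    return {name: rng.sample(group, n_per_stratum) for name, group in strata.items()}

def systematic_sample(members, step, start=0):
    # Choose members with a fixed pattern: every `step`-th member from `start`.
    return members[start::step]

rng = random.Random(0)  # fixed seed so a run can be repeated
strata = {"first_half": population[:10], "second_half": population[10:]}

srs = simple_random_sample(population, 5, rng)
stratified = stratified_random_sample(strata, 2, rng)
systematic = systematic_sample(population, step=5, start=2)
print(srs)
print(stratified)
print(systematic)
```

The systematic sample always picks the same members for a given step and start, while the other two change with the random seed.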
ways to design experiments to reduce bias
- direct control
- blocking
- randomization
- replication
direct control
factors other than the one being tested are kept the same for all subjects
blocking
subjects are divided into groups that share a characteristic
randomization
subjects are assigned to groups randomly
replication
experiments are repeated
relative frequency distribution
f/n (frequency of a value divided by the total number of data values)
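As a small illustration of f/n, this Python sketch counts how often each value occurs (f) and divides by the total number of values (n). The data list and the function name are made up for the example.

```python
from collections import Counter

def relative_frequencies(data):
    # f = how often each value occurs, n = total number of data values.
    n = len(data)
    counts = Counter(data)
    return {value: f / n for value, f in counts.items()}

# Made-up data for illustration.
rel = relative_frequencies(["a", "b", "a", "c", "a", "b", "a", "c"])
print(rel)
```

The relative frequencies of all values always add up to 1.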
discrete data
countable number of possible values (for example, whole-number counts)
continuous data
infinitely many possible values within an interval (for example, measurements)
variance higher
= values in data have a wider spread
negatively skewed
= median is greater than mean
positively skewed
= mean is greater than median
symmetric
= median is the same as mean
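The mean-versus-median comparisons on the skew cards can be checked with a short Python sketch. It applies only the rule of thumb from the cards, not a formal skewness measure, and the three data sets are made up.

```python
import statistics

def describe_skew(data):
    # Rule of thumb from the cards: compare the mean with the median.
    mean = statistics.mean(data)
    median = statistics.median(data)
    if mean > median:
        return "positively skewed"
    if mean < median:
        return "negatively skewed"
    return "symmetric"

# Made-up data sets for illustration.
right_tail = [1, 2, 3, 4, 20]    # one large value pulls the mean up
left_tail = [1, 17, 18, 19, 20]  # one small value pulls the mean down
balanced = [1, 2, 3, 4, 5]

print(describe_skew(right_tail))
print(describe_skew(left_tail))
print(describe_skew(balanced))
```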
if a constant is added to, subtracted from, multiplied by, or divided into every data value, the mean will…
change in the same way
if a constant is added to, subtracted from, multiplied by, or divided into every data value, the median will…
change in the same way
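if a constant is added to or subtracted from every data value, the standard deviation…
won’t change
if every data value is multiplied or divided by a constant, the standard deviation…
is multiplied or divided by that constant (by its absolute value if the constant is negative)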
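A quick Python check of how a constant affects the mean, median, and standard deviation. The data values are made up, and the population standard deviation (statistics.pstdev) is used as an assumption; the sample standard deviation behaves the same way.

```python
import statistics

def summary(values):
    # Returns (mean, median, population standard deviation).
    return (
        statistics.mean(values),
        statistics.median(values),
        statistics.pstdev(values),
    )

# Made-up data for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]
shifted = [x + 3 for x in data]  # add a constant to every value
scaled = [x * 2 for x in data]   # multiply every value by a constant

print(summary(data))
print(summary(shifted))  # mean and median go up by 3, standard deviation is unchanged
print(summary(scaled))   # mean, median, and standard deviation are all doubled
```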
residual
actual y-value minus predicted y-value
residual is +
point is above the line
residual is -
point is below the line
residual is 0
point is on the line
sum of residuals
is 0 for the least-squares regression line
smaller residual =
better model
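The residual cards can be worked through with a short Python sketch. It fits a least-squares line using the standard slope and intercept formulas, then computes each residual as actual y minus predicted y. The four data points and the function names are made up.

```python
def least_squares_line(xs, ys):
    # Standard least-squares formulas for slope and intercept.
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def residuals(xs, ys):
    # residual = actual y-value minus predicted y-value
    slope, intercept = least_squares_line(xs, ys)
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# Made-up data for illustration.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 4]

res = residuals(xs, ys)
print(res)       # negative entries are points below the line, positive are above
print(sum(res))  # approximately 0, up to floating-point rounding
```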
r^2 close to 0 =
least-squares regression line fits the data poorly
r^2 close to 1 =
least-squares regression line fits the data well
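To see r^2 values near 0 and near 1, this Python sketch computes r^2 as 1 minus (sum of squared residuals divided by total sum of squares) for a least-squares line. The data sets and the function name are made up for illustration.

```python
def r_squared(xs, ys):
    # Fit the least-squares line, then compare its squared residuals
    # with the squared deviations of y from its mean.
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1 - sse / sst

# Made-up data sets for illustration.
xs = [1, 2, 3, 4]
perfect = r_squared(xs, [3, 5, 7, 9])  # points lie exactly on a line
partial = r_squared(xs, [2, 4, 5, 4])  # points loosely follow a line
none = r_squared(xs, [1, 2, 2, 1])     # no linear trend at all

print(perfect, partial, none)
```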
logarithmic equation
log_b(a) = c
exponential equation
b^c = a
common logarithm
logarithm with base 10 (the base is not written): log(a) means log_10(a)
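The logarithmic and exponential forms describe the same relationship, which a short Python check can show. The numbers are made up, and results are compared with a small tolerance because floating-point logarithms are not always exact.

```python
import math

# Logarithmic form: log_b(a) = c, here with b = 2 and a = 8.
b, a = 2, 8
c = math.log(a, b)

# Exponential form: b^c = a gives back the original number.
back = b ** c

# Common logarithm: base 10, usually written without the base.
common = math.log10(1000)

print(c, back, common)
```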