non highlighted Flashcards
categorical variable
a categorical variable places an individual into one of several groups or categories
quantitative variable
a quantitative variable has numerical values and it makes sense to find the average value
association
there is an association between two variables if knowing the value of one variable helps predict the value of the other
mean
average value of the observations
median
midpoint of the values, also called Q2
first and third quartiles
Q1 has about one-fourth of the observations below it, and Q3 has about three-fourths of the observations below it
interquartile range
IQR is the range of the middle 50% of the observations: IQR = Q3 - Q1
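A minimal sketch of the quartiles and IQR, using Python's standard `statistics` module. The data values are made up for illustration, and the default quartile method in `statistics.quantiles` is only one of several conventions, so a textbook or calculator may give slightly different quartiles for small data sets.

```python
import statistics

# Hypothetical sample data, chosen only for illustration.
data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 15, 18]

# With n=4, quantiles() returns the three cut points Q1, Q2 (the median), Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)

# IQR = Q3 - Q1, the range of the middle 50% of the observations.
iqr = q3 - q1

print("Q1 =", q1, "median =", q2, "Q3 =", q3, "IQR =", iqr)
```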
standard deviation
measures the typical distance of the values in a distribution from the mean
variance
average squared deviation from the mean; the standard deviation is its square root
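A short sketch of how variance and standard deviation are computed, with made-up data. One detail the card leaves out: dividing the sum of squared deviations by n gives the population variance, while sample statistics usually divide by n - 1. Both are shown here so the difference is visible.

```python
import math
import statistics

# Hypothetical values, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)
squared_devs = [(x - mean) ** 2 for x in data]

# Population variance: average of the squared deviations (divide by n).
pop_var = sum(squared_devs) / len(data)
# Sample variance: divide by n - 1 instead.
samp_var = sum(squared_devs) / (len(data) - 1)

# Standard deviation is the square root of the variance.
pop_sd = math.sqrt(pop_var)
samp_sd = math.sqrt(samp_var)

print("population variance:", pop_var, "population SD:", pop_sd)
print("sample variance:", samp_var, "sample SD:", samp_sd)
```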
shape
typical shapes of a distribution are roughly symmetric, skewed left and skewed right
center
mean for roughly symmetric distributions, median for skewed distributions
spread
standard deviation for roughly symmetric distributions, IQR for skewed distributions. Range = max - min as a last resort
transforming data by adding/subtracting a
measures of center (median and mean) and location (quartiles and percentiles) change by a; measures of spread don't change
transforming data by multiplying/dividing by b
measures of center, location, and spread are all multiplied (or divided) by b (spread by |b|)
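The two transformation rules can be checked numerically. This is a sketch with made-up data and arbitrary constants `a = 10` and `b = 2`; the helper `summary` is a hypothetical name used only for this example.

```python
import statistics

# Hypothetical data and constants, chosen only for illustration.
data = [3, 5, 8, 10, 14]
a, b = 10, 2

def summary(values):
    """Return (mean, median, standard deviation) of the values."""
    return (statistics.mean(values),
            statistics.median(values),
            statistics.pstdev(values))

original = summary(data)
shifted = summary([x + a for x in data])   # add a to every value
scaled = summary([x * b for x in data])    # multiply every value by b

print("original:", original)
print("after adding a:", shifted)        # center moves by a, spread unchanged
print("after multiplying by b:", scaled)  # center and spread both scale by b
```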
density curve: mean and median
the mean is the balance point of the curve. The median divides the area under the curve in half
uniform distribution
a distribution that takes constant height over some interval of values
68-95-99.7 rule
approximate percent of observations that lie within one, two, and three standard deviations of the mean in a normal distribution (about 68%, 95%, and 99.7%)
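The three percentages can be recovered from the standard normal curve. A minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are just for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
z = NormalDist()

# Area under the curve within k standard deviations of the mean.
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, share in within.items():
    print(f"within {k} SD: {100 * share:.2f}%")
```

The exact areas are slightly different from the rounded 68, 95, and 99.7 figures, which is why the rule is stated as an approximation.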
normal probability plot
if the normal probability plot is roughly linear, then the data are approximately normal
if the normal probability plot is not roughly linear, then the data are not approximately normal
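A rough sketch of the points a normal probability plot would display: each ordered data value paired with the z-score expected at its position in a normal distribution. The data are made up, and the plotting position (i - 0.5)/n is an assumption; software packages use several slightly different conventions.

```python
from statistics import NormalDist

# Hypothetical data, chosen only for illustration.
data = [61, 64, 66, 67, 68, 69, 70, 71, 73, 76]
n = len(data)
ordered = sorted(data)

# Expected z-score for the i-th smallest value, using (i - 0.5)/n
# as the plotting position (one common convention, assumed here).
expected_z = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

# Each pair is one point on the plot: (expected z-score, observed value).
points = list(zip(expected_z, ordered))
for z_score, value in points:
    print(f"{z_score:6.2f}  {value}")
```

If these points fall close to a straight line when graphed, that supports treating the data as approximately normal.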
scatterplot
displays the relationship between two quantitative variables measured on the same individuals
explanatory variable, factor, response variable
if we think that a variable x may help explain, predict, or even cause changes in another variable y, we call x an explanatory variable and y a response variable
correlation r
measures the direction and strength of the linear relationship between two quantitative variables
r has no units, is between -1 and +1, and is not the value of the slope
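A small sketch of these properties of r, computed from standardized values with made-up data. The function name `correlation` is hypothetical, and the slope formula used for comparison (slope = r times s_y / s_x for the least-squares line) is a standard result that is not stated on the card.

```python
import statistics

# Hypothetical paired data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def correlation(xs, ys):
    """r = sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    return sum(((a - mx) / sx) * ((b - my) / sy)
               for a, b in zip(xs, ys)) / (n - 1)

r = correlation(x, y)

# Least-squares slope, shown only to contrast it with r.
slope = r * statistics.stdev(y) / statistics.stdev(x)

# Changing the units of x (multiplying by 100) leaves r unchanged.
r_rescaled = correlation([100 * v for v in x], y)

print("r =", round(r, 4))
print("slope =", round(slope, 4))
print("r after rescaling x =", round(r_rescaled, 4))
```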