Lecture 3 Flashcards
Central tendency
the tendency of data to cluster, or center, about certain numerical values
variability
the spread of the data
mean
the sum of observed values in a data set divided by the number of observations
median
middle number when the measurements are arranged in ascending (or descending) order; if the number of observations is odd, then the sample median is the observed value exactly in the middle of the ordered list; if the number of observations is even, then the sample median is the number halfway between the two middle observed values in the ordered list
mode
the most frequently occurring data element
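As a quick illustration of the three measures of center above, here is a minimal Python sketch using the standard-library statistics module. The data values are made up for the example.

```python
import statistics

# Hypothetical sample, invented purely for illustration
data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)      # sum of the values divided by the number of values
median = statistics.median(data)  # even n: halfway between the two middle values (3 and 5)
mode = statistics.mode(data)      # most frequently occurring value

print(mean, median, mode)
```

Because this sample has an even number of observations, the median falls halfway between the two middle values rather than on an observed value.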
x̄ (x with a line over it)
sample mean
μ (mu)
population mean
M
sample median
η (eta, which looks like an n with a long tail)
population median
the sample mean is often used to estimate…?
the population mean
the accuracy of using the sample mean to estimate the population mean depends on?
size of the sample and variability of data
if the data set is skewed to the right then?
typically the median is less than the mean
if the data set is symmetric then?
the mean equals the median
if the data set is skewed to the left then?
typically the mean is less than the median
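The relationship between skew and the mean/median ordering can be checked on small examples. The sketch below uses three made-up data sets (not from the lecture) chosen only to show each shape.

```python
import statistics

# Made-up data sets, chosen only to illustrate each shape
right_skewed = [1, 2, 2, 3, 3, 4, 20]      # long tail toward large values
symmetric = [1, 2, 3, 4, 5]
left_skewed = [1, 17, 18, 18, 19, 19, 20]  # long tail toward small values

for name, data in [("right", right_skewed),
                   ("symmetric", symmetric),
                   ("left", left_skewed)]:
    # The mean is pulled toward the long tail; the median is not
    print(name, statistics.mean(data), statistics.median(data))
```

The extreme value in each skewed set pulls the mean toward the tail while the median stays near the bulk of the data, which is why the median is usually preferred for skewed distributions.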
when to choose mode?
when calculating a measure of center for a qualitative variable
when to choose mean?
variable is quantitative with symmetric distribution
when to choose median?
quantitative variable with skewed distribution
range
the distance between the largest measurement and the smallest measurement in a data set
sample standard deviation
subtract the sample mean from each value, square each difference, add up the squared differences, divide the sum by n-1, and take the square root of the result: s = sqrt( sum of (x - x̄)^2 / (n-1) )
s^2 symbol
sample variance
σ^2 (sigma squared)
population variance
s symbol
sample standard deviation
σ (sigma)
population standard deviation
formula for sample variance
same as the sample standard deviation formula but without the square root: s^2 = sum of (x - x̄)^2 / (n-1)
what does dividing by n-1 produce?
an unbiased estimator of the population variance
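To make the sample variance and sample standard deviation formulas concrete, here is a minimal sketch that computes both by hand with the n-1 divisor and compares them with Python's statistics.variance and statistics.stdev, which also use n-1. The data set is invented for the example.

```python
import math
import statistics

# Hypothetical sample, invented for illustration
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
x_bar = sum(data) / n                                   # sample mean
squared_deviations = [(x - x_bar) ** 2 for x in data]   # (x - x_bar)^2 for each value
s_squared = sum(squared_deviations) / (n - 1)           # sample variance (divide by n-1)
s = math.sqrt(s_squared)                                # sample standard deviation

# The library functions should agree with the hand computation
print(s_squared, statistics.variance(data))
print(s, statistics.stdev(data))
```

Dividing by n instead (as statistics.pvariance does) would treat the data as the whole population rather than a sample.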
what does the empirical rule do?
relates the standard deviation to the proportion of the observed values of a variable in the data set that lie in an interval around the mean (mu)
empirical guideline for a symmetric, bell-shaped distribution
approximately 68% lie within 1 standard deviation of the mean; approximately 95% lie within 2 standard deviations of the mean; approximately 99.7% lie within 3 standard deviations of the mean
steps for solving empirical rule questions
draw a normal curve with a line down the middle and three lines on either side
write the values from your normal distribution at the bottom; start with the mean in the middle, then add standard deviations to get the values to the right and subtract standard deviations to get the values to the left
write the percent for each section, working from the outer tail in toward the mean on each side (0.15, 2.35, 13.5, 34)
determine the section of the curve the question is asking for and shade it in
add up the percents in the shaded sections
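The steps above can be sketched in Python. This is only an illustration under stated assumptions: the function names tick_values and percent_between are hypothetical, the mean of 100 and standard deviation of 15 are made-up values, and the sketch only handles boundaries that fall on whole standard deviations between -3 and +3 (the 0.15% tails are left out).

```python
# Percent in each one-standard-deviation-wide section of the curve,
# ordered from (mean - 3 SD) up to (mean + 3 SD)
SECTION_PERCENTS = [2.35, 13.5, 34.0, 34.0, 13.5, 2.35]


def tick_values(mean, sd):
    """Values to write under the curve: mean - 3 SD through mean + 3 SD."""
    return [mean + k * sd for k in range(-3, 4)]


def percent_between(z_low, z_high):
    """Add up the section percents between two whole-number SD boundaries."""
    if not (-3 <= z_low <= z_high <= 3):
        raise ValueError("boundaries must be whole SDs between -3 and 3")
    # Section i covers the interval from (i - 3) SDs to (i - 2) SDs
    return sum(SECTION_PERCENTS[z_low + 3 : z_high + 3])


ticks = tick_values(100, 15)      # hypothetical mean and standard deviation
shaded = percent_between(-1, 2)   # from 1 SD below the mean to 2 SDs above
print(ticks, shaded)
```

With these made-up numbers the shaded region runs from 85 to 130 and covers the 34, 34, and 13.5 sections.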
range/4 is approximately equal to
s
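The range/4 rule is only a rough approximation of s, which a small check makes visible. The sketch below compares it with the exact sample standard deviation on a made-up data set.

```python
import statistics

# Hypothetical sample, invented for illustration
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)   # largest value minus smallest value
rough_s = data_range / 4             # quick range/4 estimate of s
exact_s = statistics.stdev(data)     # sample standard deviation (n-1 divisor)

print(rough_s, exact_s)
```

For this sample the two values are in the same ballpark but not equal, so range/4 is best treated as a quick sanity check rather than a replacement for the formula.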