Chapter #7b - 10/11/24 Flashcards
summarizing and displaying measurement data
what are the mean and median used to describe ?
the center of a distribution, but only knowing the center may be misleading
what does measuring variability tell us ?
it tells us how spread out the distribution is
what percentage of observations in a distribution are greater than the first quartile ?
a) 25%
b) 50%
c) 75%
c) 75%
how to calculate quartiles ?
- Arrange the observations in increasing order and find the median, M
- The first quartile (lower quartile), Q1, is the median of all observations (in the ordered list) below M (exclusive)
- It is one quarter of the way from the bottom of the ordered list
- The third quartile (upper quartile), Q3, is the median of all observations (in the ordered list) above M (exclusive)
- It is one quarter of the way down from the top of the ordered list
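The steps above can be sketched in Python. This is only a minimal illustration of the median-of-halves rule described in this card; the function names `median` and `quartiles` are made up for the example, and other textbooks or software may use slightly different quartile rules.

```python
def median(sorted_values):
    """Median of a list that is already sorted (needs at least one value)."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]          # odd count: the middle value
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2  # even: average the two middle values


def quartiles(values):
    """Return (Q1, M, Q3) using the median-of-halves rule, excluding M itself.

    Assumes at least two observations, so both halves are non-empty.
    """
    ordered = sorted(values)               # arrange in increasing order
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                 # observations below the median position
    upper = ordered[half + (n % 2):]       # observations above the median position
    return median(lower), median(ordered), median(upper)


data = [1, 2, 5, 9, 12, 15, 17, 21, 23, 25, 29]
q1, m, q3 = quartiles(data)
print(q1, m, q3)  # 5 15 23
```

With an even number of observations there is no single middle value to exclude, so the list is simply split into its lower and upper halves.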
what do the quartiles provide ?
a resistant measure of variability
define “lowest” :
minimum
what are the 5 important terms in the “five-number summary display” ?
- lowest
- highest
- lower quartile
- upper quartile
- median
define “highest” :
maximum
define “median” :
number such that half of the values are at or above it and half are at or below it (middle value or average of two middle numbers in ordered list).
define “quartiles” :
medians of the two halves
how do you calculate median ?
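the value at position (n+1) / 2 in the ordered list, where n is the number of values (if that position is not a whole number, average the two values on either side of it)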
between mean and median which is the “average” ?
mean
in an example with the following numbers what is considered the lower quartile ?
1, 2, 5, 9, 12, 15, 17, 21, 23, 25, 29
the lower quartile, or 1st quartile, is the median of the values below 15. There are 5 such values (1, 2, 5, 9, 12), so Q1 is located at the (5+1) / 2 = 3rd of those values, which gives Q1 = 5
in an example with the following numbers what is considered the upper quartile ?
1, 2, 5, 9, 12, 15, 17, 21, 23, 25, 29
the upper quartile, or 3rd quartile, is the median of the values above 15. There are 5 such values (17, 21, 23, 25, 29), so Q3 is located at the (5+1) / 2 = 3rd of those values, which gives Q3 = 23
what is median ?
This is the middle value when all the numbers are arranged in order. If there’s an odd number of values, it’s the middle one. If there’s an even number, it’s the average of the two middle ones.
what is mode ?
This is the number that appears most often. If no number repeats, there’s no mode, and if more than one number repeats the same number of times, you can have more than one mode.
what is mean ?
This is the average of all numbers. Add up all the values and then divide by how many values there are.
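A small Python sketch of the three definitions (mean, median, mode). The data values and the helper name `modes` are made up for illustration; `modes` follows the convention on the mode card, returning no mode when nothing repeats.

```python
import statistics
from collections import Counter


def modes(values):
    """All values tied for the highest count; empty list if no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:                      # nothing repeats, so there is no mode
        return []
    return [v for v, c in counts.items() if c == top]


data = [2, 3, 3, 5, 7, 10]            # hypothetical example data

mean = statistics.mean(data)          # (2+3+3+5+7+10) / 6 = 5
median = statistics.median(data)      # even count: average of 3 and 5 = 4.0
print(mean, median, modes(data))
```

Here the mean (5) and median (4.0) differ because the largest value pulls the mean upward, while the median only depends on the middle of the ordered list.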
what does IQR stand for ?
Interquartile Range
what is the IQR ?
a measure of how spread out the middle 50% of your data is. It shows the range between the first quartile (Q1) and the third quartile (Q3) of a dataset.
- Q1 (First Quartile): The middle of the lower half of the data (25% mark).
- Q3 (Third Quartile): The middle of the upper half of the data (75% mark).
what term is used to describe the following: “a more resistant measure of variability is given by the distance between the quartiles” ?
IQR
what is the formula for the IQR ?
IQR = upper quartile (Q3) - lower quartile (Q1)
when do we call an observation an outlier ?
if it falls more than 1.5 × IQR above the upper quartile or below the lower quartile
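The 1.5 × IQR rule can be checked with a few lines of Python. This is a minimal sketch: the function name `find_outliers` is made up, and Q1 and Q3 are assumed to be already computed and passed in by the caller.

```python
def find_outliers(values, q1, q3):
    """Return the values lying more than 1.5 * IQR beyond either quartile."""
    iqr = q3 - q1                       # IQR = Q3 - Q1
    lower_fence = q1 - 1.5 * iqr        # anything below this is an outlier
    upper_fence = q3 + 1.5 * iqr        # anything above this is an outlier
    return [v for v in values if v < lower_fence or v > upper_fence]


data = [1, 2, 5, 9, 12, 15, 17, 21, 23, 25, 29]
# For this list Q1 = 5 and Q3 = 23, so IQR = 18 and the fences are -22 and 50.
print(find_outliers(data, 5, 23))            # []

# Adding 60 shifts the quartiles to Q1 = 7 and Q3 = 24 (median-of-halves rule),
# so IQR = 17 and the fences are -18.5 and 49.5.
print(find_outliers(data + [60], 7, 24))     # [60]
```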
what are boxplots ?
a simple way to visually show the spread and distribution of data using 5 key values
what are the 5 key values boxplots show ?
- Minimum: The smallest value (not counting outliers).
- Q1 (First Quartile): The 25% mark.
- Median: The middle value (50% mark).
- Q3 (Third Quartile): The 75% mark.
- Maximum: The largest value (not counting outliers).
how many steps to create a boxplot ?
7
what are the 7 steps in creating a boxplot ?
- Draw a horizontal (or vertical) line and label it with values from lowest to highest in the data.
- Draw a rectangle (box) with ends at the quartiles.
- Draw a line in the box at the value of the median.
- Compute the IQR (interquartile range) = distance between the quartiles.
- Compute 1.5(IQR); an outlier is any value more than this distance from the closest quartile.
- Draw a line (whisker) from each end of the box extending to the farthest data value that is not an outlier. (If no outlier, then to min and max.)
- Draw asterisks to indicate the outliers.
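The numbers needed for those steps can be worked out before drawing anything. The sketch below is a rough illustration, not a plotting routine: the names `median` and `boxplot_stats` and the dictionary keys are made up, the quartiles use the median-of-halves rule, and the data is a hypothetical list with one large value added.

```python
def median(sorted_values):
    """Median of an already sorted, non-empty list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def boxplot_stats(values):
    """Compute the pieces of a boxplot (assumes at least two values)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    q1 = median(ordered[:half])                 # lower quartile
    q3 = median(ordered[half + n % 2:])         # upper quartile
    iqr = q3 - q1                               # distance between quartiles
    reach = 1.5 * iqr                           # outlier cutoff distance
    low_fence, high_fence = q1 - reach, q3 + reach
    inside = [v for v in ordered if low_fence <= v <= high_fence]
    return {
        "axis_range": (ordered[0], ordered[-1]),   # labels for the axis line
        "box": (q1, q3),                           # ends of the rectangle
        "median": median(ordered),                 # line inside the box
        "iqr": iqr,
        "whiskers": (inside[0], inside[-1]),       # farthest non-outlier values
        "outliers": [v for v in ordered if v < low_fence or v > high_fence],
    }


stats = boxplot_stats([1, 2, 5, 9, 12, 15, 17, 21, 23, 25, 29, 60])
print(stats["box"], stats["median"], stats["whiskers"], stats["outliers"])
# (7.0, 24.0) 16.0 (1, 29) [60]
```

The upper whisker stops at 29 rather than 60 because 60 lies beyond Q3 + 1.5(IQR) = 49.5, so it would be drawn as an asterisk instead.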
how do you interpret boxplots ?
- Divide the data into fourths.
- Easily identify outliers.
- Useful for comparing two or more groups.
define “standard deviation” :
represents spread or variability in the values
define “variance” :
(standard deviation)^2
what are the mean and standard deviation most useful for ?
symmetric sets of data with no outliers
how to compute the standard deviation ?
- Find the mean.
- Find the deviation of each value from the mean. (Deviation = value – mean)
- Square the deviations.
- Sum the squared deviations.
- Divide the sum by (the number of values) – 1, resulting in the variance.
- Take the square root of the variance. The result is the standard deviation.
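The six steps above map almost line for line onto Python. This is a minimal sketch; the function name `sample_std` and the example data are made up for illustration.

```python
import math


def sample_std(values):
    """Standard deviation using the n - 1 divisor (needs at least two values)."""
    n = len(values)
    mean = sum(values) / n                       # 1. find the mean
    deviations = [v - mean for v in values]      # 2. deviation = value - mean
    squared = [d ** 2 for d in deviations]       # 3. square the deviations
    total = sum(squared)                         # 4. sum the squared deviations
    variance = total / (n - 1)                   # 5. divide by n - 1 to get the variance
    return math.sqrt(variance)                   # 6. square root gives the standard deviation


data = [2, 4, 4, 4, 5, 5, 7, 9]                  # hypothetical data: mean 5, squared deviations sum to 32
print(round(sample_std(data), 3))                # 2.138
print(sample_std([3, 3, 3]))                     # 0.0 (no spread at all)
```

Because the result is the square root of a sum of squares, it can never be negative; it is 0 only when every value is the same.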
what are all the values that a standard deviation can possibly take ?
a) 0 ≤ s
b) 0 ≤ s ≤ 1
c) -1 ≤ s ≤ 1
a) 0 ≤ s