Displaying Quantitative Data Flashcards
Distribution
…slices up all the possible values of a quantitative variable into equal-width bins and gives the number of values (or counts) falling into each bin.
Histogram
Uses adjacent bars to show the distribution of a quantitative variable. Each bar's height represents the frequency of values falling into that bin.
Relative Frequency Histogram
A relative frequency histogram uses adjacent bars to show the relative frequency of quantitative values falling in each bin.
Gap
A region of the distribution where there are no values.
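The binning idea behind the distribution, histogram, relative frequency histogram, and gap cards can be sketched in a few lines of Python. This is only an illustration: the function name histogram_counts, the bin width of 5, and the data values are all made up for the example, and the sketch assumes every value is at least as large as the chosen start.

```python
from collections import Counter

def histogram_counts(values, start, width):
    """Count how many values fall into each equal-width bin.

    Bin k covers the interval [start + k*width, start + (k+1)*width).
    Assumes every value is >= start.
    """
    counts = Counter(int((v - start) // width) for v in values)
    n_bins = max(counts) + 1
    return [counts.get(k, 0) for k in range(n_bins)]

# Made-up data, for illustration only.
data = [1, 2, 2, 3, 5, 6, 6, 7, 22]
width = 5

counts = histogram_counts(data, start=0, width=width)

# Relative frequency: each count divided by the total number of values.
relative = [c / len(data) for c in counts]

# Gaps: bins that contain no values at all.
gaps = [k for k, c in enumerate(counts) if c == 0]

for k, (c, r) in enumerate(zip(counts, relative)):
    low, high = k * width, (k + 1) * width
    print(f"[{low}, {high}): count={c}, relative frequency={r:.2f}")
```

The counts give the bar heights of a histogram, the relative values give the bar heights of a relative frequency histogram, and the empty bins between 10 and 20 form a gap.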
Stem-and-leaf display
A display that sketches the distribution of quantitative data by splitting each value into a stem and a leaf. The stem (the leading digit or digits) is written as a label to the left of a vertical bar. To the right of the bar are lists of leaf digits, one digit for each value in that bin.
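A stem-and-leaf display can be built by hand, but a short Python sketch may make the construction clearer. It assumes non-negative whole numbers, uses the tens as stems and the units as leaves, and the function name stem_and_leaf and the data are invented for the example.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the lines of a stem-and-leaf display.

    Assumes non-negative integers. The stem is the value with its
    last digit removed; the leaf is the last digit.
    """
    leaves = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        leaves[stem].append(leaf)

    lines = []
    # Include empty stems so that gaps in the data stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        digits = "".join(str(d) for d in leaves.get(stem, []))
        lines.append(f"{stem} | {digits}".rstrip())
    return lines

# Made-up data, for illustration only.
for line in stem_and_leaf([12, 15, 21, 23, 23, 38, 40]):
    print(line)
```

Each row acts like one bin of a histogram turned on its side, with the added benefit that the individual values can still be read off.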
Dotplot
Graphs a dot for each case against a single axis.
Shape
To describe this characteristic of a distribution, look for (1) single vs. multiple modes, (2) symmetry vs. skewness, and (3) outliers and gaps.
Mode
A hump or local high point in the shape of the distribution of a variable. The apparent location can change as the scale of a histogram is changed.
Unimodal
Having one mode. Describes the shape of a histogram when it’s generally mound-shaped.
Bimodal
Having two modes. Describes the shape of a histogram when it has two humps or mounds.
Multimodal
Having more than two modes. Describes the shape of a histogram when it has more than two humps or mounds.
Uniform
A distribution that doesn’t appear to have any mode and in which all the bars of its histogram are approximately the same height.
Symmetric
A distribution is this when the two halves on either side of the center look approximately like mirror images of each other.
Tails
The parts of a distribution that typically trail off on either side.
Long Tails
Distributions that straggle off for some distance.
Short Tails
Distributions have these if they don’t have data that straggle off for a long distance.
Skewed
A distribution is skewed if it’s not symmetric and one tail stretches out farther than the other.
Skewed Left
A distribution is this when a longer tail stretches to the left.
Skewed Right
A distribution is this when a longer tail stretches to the right.
Outliers
Extreme values that don’t appear to belong with the rest of the data.
Center
The place in the distribution of a variable that might be used to summarize the entire distribution with a single number, such as the mean or the median.
Median
The middle value, with half the data above and half the data below. It is usually paired with the IQR.
Spread (measures of…)
A numerical summary of how tightly the values are clustered around the center. IQR and standard deviation are some measures of this…
Range
The difference between the lowest and highest values in a data set.
Quartile
The median and quartiles (Q1 & Q3) divide data into four parts with equal numbers of data values.
Q1
The lower quartile (Q1) is the value with a quarter of the data below it.
Q3
The upper quartile (Q3) has three quarters of the data below it.
Percentile
The ith percentile is the number that falls above i% of the data.
IQR
The interquartile range is the difference between the third and first quartiles.
IQR = Q3 - Q1
Usually reported along with the median.
5-number Summary
A summary of a distribution that reports the minimum value, Q1, the median, Q3, and the maximum value.
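The 5-number summary, the range, and the IQR can be computed together, as in the rough sketch below. It assumes one common classroom convention for quartiles (the medians of the lower and upper halves, leaving out the middle value when the count is odd); software packages often use other conventions and may report slightly different quartiles. The function name five_number_summary and the data are made up for the example.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum).

    Q1 and Q3 are taken as the medians of the lower and upper halves
    of the sorted data. When the count is odd, the middle value is
    left out of both halves. Assumes at least two values.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[len(data) - half:]
    return data[0], median(lower), median(data), median(upper), data[-1]

# Made-up data, for illustration only.
data = [2, 4, 4, 5, 7, 8, 9, 11, 12]
low, q1, med, q3, high = five_number_summary(data)

iqr = q3 - q1             # IQR = Q3 - Q1
value_range = high - low  # range = maximum - minimum

print("5-number summary:", (low, q1, med, q3, high))
print("IQR:", iqr)
print("Range:", value_range)
```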
Boxplot
A boxplot displays the 5-number summary as a central box (from Q1 to Q3, with a line at the median) and whiskers that extend to the most extreme non-outlying data values.
Mean
The sum of all data values divided by the count. It is usually paired with the standard deviation.
Resistant
A calculated summary is said to be resistant if outliers have only a small effect on it.
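A small experiment can illustrate resistance: add one extreme value to a data set and compare how far the mean and the median move. The numbers below are invented purely for the demonstration.

```python
from statistics import mean, median

# Made-up data, for illustration only.
data = [10, 12, 13, 14, 16]
with_outlier = data + [100]  # the same data plus one extreme value

# How far each summary moves once the outlier is included.
mean_shift = mean(with_outlier) - mean(data)
median_shift = median(with_outlier) - median(data)

print("Mean:  ", mean(data), "->", mean(with_outlier))
print("Median:", median(data), "->", median(with_outlier))
print("Shift in mean:", mean_shift, "| shift in median:", median_shift)
```

In this example the median barely moves while the mean is pulled strongly toward the outlier, which is why the median is described as resistant and the mean is not.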
Variance
The sum of squared deviations from the mean divided by the count minus 1.
Standard Deviation
The square root of the variance. Roughly, the square root of the average squared difference between each data value and the mean.
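The variance and standard deviation definitions can be checked step by step against Python's statistics module, whose variance and stdev functions also divide by the count minus 1. The data and the variable names (ybar, s2, s) are chosen only for this illustration.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Made-up data, for illustration only.
data = [2, 4, 4, 5, 7, 8]

ybar = mean(data)  # the mean: sum of the values divided by the count

# Squared deviation of each value from the mean.
squared_deviations = [(y - ybar) ** 2 for y in data]

# Variance: sum of squared deviations divided by (count - 1).
s2 = sum(squared_deviations) / (len(data) - 1)

# Standard deviation: square root of the variance.
s = sqrt(s2)

print("Variance by hand:", s2, "| statistics.variance:", variance(data))
print("Std dev by hand: ", s, "| statistics.stdev:   ", stdev(data))
```

Because the standard deviation is a square root of squared units, it is in the same units as the data, which makes it easier to interpret than the variance.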