CHAPTER 4 Displaying and summarizing quantitative data Flashcards
How are quantitative data represented?
1. HISTOGRAM
. On the horizontal axis, the bins have equal widths and are joined together with no gaps between them
. It summarizes the distribution of a quantitative variable but doesn't show the data values themselves
2. STEM AND LEAF
3. DOT PLOTS
RELATIVE FREQUENCY HISTOGRAMS
Replaces the counts on the vertical axis with the percentage of the total number of cases falling in each bin.
. As in an ordinary histogram, the bins on the horizontal axis have equal widths and are joined together
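A minimal sketch of how the counts and percentages behind a histogram and a relative frequency histogram could be computed. The function name `histogram_counts`, the bin settings, and the data values are made up for illustration.

```python
def histogram_counts(values, start, width, n_bins):
    """Count how many values fall in each equal-width bin [left, left + width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)  # which bin this value lands in
        if 0 <= index < n_bins:            # values outside the bins are skipped
            counts[index] += 1
    return counts

data = [2, 3, 5, 7, 8, 11, 12, 14, 18, 19]  # made-up values

# Histogram: counts per bin for the bins 0-5, 5-10, 10-15, 15-20
counts = histogram_counts(data, start=0, width=5, n_bins=4)

# Relative frequency histogram: same bins, percentages instead of counts
relative = [100 * c / len(data) for c in counts]

print(counts)    # [2, 3, 3, 2]
print(relative)  # [20.0, 30.0, 30.0, 20.0]
```

Both displays have the same shape; only the scale on the vertical axis changes.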
STEM AND LEAF
Is like a histogram, but shows the individual values
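A rough sketch of a stem-and-leaf display, assuming non-negative whole numbers with the tens digit as the stem and the ones digit as the leaf. The function name `stem_and_leaf` and the data are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return display lines with the tens digit as stem and the ones digit as leaf."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    lines = []
    # Walk through every stem from smallest to largest so gaps stay visible
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem} | {leaves}")
    return lines

for line in stem_and_leaf([12, 15, 21, 23, 23, 38, 40]):  # made-up values
    print(line)
# 1 | 25
# 2 | 133
# 3 | 8
# 4 | 0
```

Each row works like a histogram bar, but the individual values can still be read off the leaves.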
DOT PLOTS
A simple display that places a dot along an axis for each case in the data (similar to a stem-and-leaf display)
QUANTITATIVE DATA CONDITION
The data are values of a quantitative variable whose units are known
MODE
The single value that appears most often (for categorical data)
. For quantitative data, the modes are the HUMPS in the histogram (UNIMODAL, BIMODAL, MULTIMODAL); a UNIFORM histogram has no clear hump
Uniform
A histogram in which all the bars are approximately the same height
TAILS
the thinner ends of a histogram
SKEWED
A distribution is skewed if one TAIL stretches out farther than the other. The graph is skewed toward the longer tail
OUTLIERS
Extreme values that don't appear to belong with the rest of the data: unusually high or low values
GAP
A region that has no values
MEASURE OF CENTER
A single, typical value of a data set
2 types of measures of center
1. Mean
2. Median
MEDIAN
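The middle value of the data when the values are put in order; with an even number of values, it is the average of the two middle values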
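A short sketch of finding the median by hand. The function name and data are for illustration only; Python's built-in `statistics.median` gives the same results.

```python
def median(values):
    """Return the middle value of the data after sorting."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]  # odd count: the single middle value
    # even count: average the two middle values
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([7, 1, 5]))     # 5
print(median([7, 1, 5, 3]))  # 4.0
```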
IQR
Upper quartile minus lower quartile (Q3 - Q1)
It is a reasonable summary of the spread of a distribution except when the data are strongly bimodal
Q1 (lower) and Q3 (upper)
Are also known as the 25th and 75th percentiles of the data (since the lower quartile falls above 25% of the data and the upper quartile falls above 75% of the data)
Q2
The median (50th percentile)
5-NUMBER SUMMARY of a distribution
Reports its median, quartiles, and extremes (maximum and minimum): 1. minimum 2. Q1 3. median 4. Q3 5. maximum
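A sketch of the 5-number summary and the IQR, assuming the quartiles are taken as the medians of the lower and upper halves of the ordered data, leaving the overall median out of both halves when the count is odd. Textbooks and software use slightly different quartile rules, so other tools may give slightly different Q1 and Q3. The function name and data are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return min, Q1, median, Q3, max using medians of the two halves."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two data values")
    half = n // 2
    lower = ordered[:half]          # values below the median position
    upper = ordered[half + n % 2:]  # values above the median position
    return {
        "min": ordered[0],
        "Q1": median(lower),
        "median": median(ordered),
        "Q3": median(upper),
        "max": ordered[-1],
    }

summary = five_number_summary([1, 3, 4, 6, 7, 8, 10, 12, 15])  # made-up values
iqr = summary["Q3"] - summary["Q1"]

print(summary)  # {'min': 1, 'Q1': 3.5, 'median': 7, 'Q3': 11.0, 'max': 15}
print(iqr)      # 7.5
```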
N
The number of data values
Bar over a symbol
Means to find the mean of that variable (for example, y with a bar over it, read "y-bar", is the mean of y)
Why is the median considered to be RESISTANT to extreme values?
Because it is not pulled by values that are extraordinarily large or small; it ignores their distance from the center
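A small illustration of resistance, using made-up numbers: adding one extraordinarily large value moves the mean a long way but barely moves the median.

```python
from statistics import mean, median

incomes = [30, 32, 35, 38, 40]   # made-up values, in thousands
with_outlier = incomes + [1000]  # add one extraordinarily large value

# Without the outlier the mean and median agree
print(mean(incomes), median(incomes))  # 35 35

# With the outlier the mean jumps to about 195.8, the median only to 36.5
print(mean(with_outlier), median(with_outlier))
```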
STANDARD DEVIATION (s)
Takes into account how far each value is from the mean.
Like the mean, the standard deviation is appropriate only for symmetric data
VARIANCE (s^2)
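The (almost) average of the squared deviations of the individual data values from the mean: the sum of the squared deviations divided by n - 1. The standard deviation s is its square root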
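A minimal sketch of the variance and standard deviation calculation, dividing by n - 1 as is usual for sample data. The function names and data are made up; `statistics.variance` and `statistics.stdev` in Python's standard library compute the same quantities.

```python
from math import sqrt

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data values")
    mean = sum(values) / n
    squared_deviations = [(v - mean) ** 2 for v in values]
    return sum(squared_deviations) / (n - 1)

def sample_sd(values):
    """Standard deviation: the square root of the variance."""
    return sqrt(sample_variance(values))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values with mean 5

# Squared deviations are 9, 1, 1, 1, 0, 0, 4, 16, which sum to 32
print(sample_variance(data))  # 32 / 7, about 4.57
print(sample_sd(data))        # about 2.14
```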
If data values are far from the center, what happens to the spread?
The spread (IQR and SD) will be large
If data values are close to the center, what happens to the spread?
The spread (IQR and SD) will be small
What do measures of spread tell?
How well other summaries describe the data
When data are skewed, what measure of center should be used?
The median (spread: IQR)
When data are symmetric, what measure of center should be used?
The mean (spread: SD)
STEPS TO REPRESENTING QUANTITATIVE DATA
1. Make a histogram or stem-and-leaf display
2. Discuss the center and spread
a. If the shape is skewed, report the median and IQR
b. If the shape is symmetric, report the mean and standard deviation (for unimodal symmetric data the IQR is usually larger than the standard deviation)
3. Discuss any unusual features (modes, outliers)
median is paired with
IQR
mean is paired with
standard deviation
A histogram's x and y axes
x axis - the WHAT that was measured
y axis - the COUNT
BAR GRAPH bars
Indicate how many cases (counts) of categorical data are piled into each category
HISTOGRAM bars
Represent counts of data piled into intervals of a quantitative variable
Which statistical summaries require the data to be in order?
1. Median
2. Quartiles/percentiles, IQR
Summary statistics that are resistant to outliers
1. Median
2. IQR
Summary statistics that are not resistant to outliers
Mean and standard deviation (they are sensitive to outliers, so they are used only on symmetric data)
3 things to look at in a histogram
1. Shape: symmetry, modes, skewness
2. Center
3. Spread
MIDRANGE
average of the minimum and maximum values
RANGE
. A single value
. Maximum minus minimum
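A quick sketch of the range and the midrange, with hypothetical function names and made-up data. Both depend only on the two extreme values, so neither is resistant to outliers.

```python
def data_range(values):
    """Range: a single number, the maximum minus the minimum."""
    return max(values) - min(values)

def midrange(values):
    """Midrange: the average of the minimum and maximum values."""
    return (max(values) + min(values)) / 2

data = [4, 8, 15, 16, 23, 42]  # made-up values

print(data_range(data))  # 38
print(midrange(data))    # 23.0
```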
MEASURES OF SPREAD
Range, IQR, variance, standard deviation
They tell how well other summaries describe the data