Chapter #7 - 10/04/24 Flashcards
summarizing and displaying measurement data
what is an outlier ?
a data point that is significantly different from the other data points in a dataset. It stands out because it is much higher or lower than the majority of the values.
what are the four kinds of useful information about a set of quantitative data ?
1) center (typical or average value)
2) unusual values (outliers)
3) variability
4) shape
what is a stemplot ?
quick and easy way to order numbers and get picture of shape and outliers
what is a histogram ?
better for larger data sets, also provides picture of shape and outliers
how big should a good stemplot be? how many stems ?
6-15
how do you create a stemplot ?
1) create and define the stems
2) attach the leaves
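The two steps above can be sketched in Python. This is a minimal illustration, not a standard library routine: the function name stemplot and the scores list are made up, and it assumes non-negative whole numbers with stem = tens digit and leaf = ones digit.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot rows, assuming stem = tens digit and leaf = ones digit."""
    leaves = defaultdict(list)
    # Step 2: attach each leaf to its stem, in sorted order.
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        leaves[stem].append(leaf)
    # Step 1: the stems run from the smallest to the largest tens digit.
    first, last = min(leaves), max(leaves)
    return [f"{stem} | {''.join(str(d) for d in leaves[stem])}"
            for stem in range(first, last + 1)]

scores = [72, 75, 81, 83, 83, 90, 94, 68, 77, 85]  # made-up data
for row in stemplot(scores):
    print(row)
# 6 | 8
# 7 | 257
# 8 | 1335
# 9 | 04
```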
what does “splitting stems” mean ?
using each stem digit twice, for example once for leaves 0-4 and once for leaves 5-9, so the plot gets more rows
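A rough sketch of splitting stems, assuming the common convention that each stem appears twice: first for leaves 0-4, then for leaves 5-9. The function name split_stemplot and the data are hypothetical, and rows with no leaves are simply skipped here.

```python
def split_stemplot(values):
    """Stemplot rows where each stem is used twice (leaves 0-4, then 5-9)."""
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        half = 0 if leaf <= 4 else 1   # 0 = low half of the stem, 1 = high half
        rows.setdefault((stem, half), []).append(leaf)
    # Sort by (stem, half) so the low half of a stem prints before the high half.
    return [f"{stem} | {''.join(map(str, leaves))}"
            for (stem, _), leaves in sorted(rows.items())]

scores = [72, 75, 81, 83, 83, 90, 94, 68, 77, 85]  # made-up data
for row in split_stemplot(scores):
    print(row)
# 6 | 8
# 7 | 2
# 7 | 57
# 8 | 133
# 8 | 5
# 9 | 04
```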
define “shape” (as part of a common language for describing data) :
Are most values clumped in middle with values tailing off at each end? Are there two distinct groupings? Pictures of data will provide this info
define “symmetric” :
if draw line through center, picture on one side would be mirror image of picture on other side
define “unimodal” :
single prominent peak
define “bimodal” :
two prominent peaks
define “skewed to the right” :
higher values more spread out than lower values
define “skewed to the left” :
lower values more spread out and higher ones tend to be clumped
define “presence of outliers” :
refers to the existence of data points in a dataset that are significantly different or distant from the majority of the other observations
define the units : in the example 3.4, which is the leaf and which is the stem ?
stem = 3
leaf = 4
how to create a histogram ?
1) Divide range of data into intervals
2) Count how many values fall into each interval
3) Draw bar over each interval with height = count (or proportion)
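The three steps above, sketched as a text histogram. The helper histogram_counts, the interval width of 10, and the scores are assumptions chosen for illustration; each interval includes its left endpoint and excludes its right one.

```python
def histogram_counts(values, low, width, n_bins):
    """Count values in n_bins equal-width intervals starting at low."""
    counts = [0] * n_bins
    for v in values:
        i = int((v - low) // width)   # index of the interval holding v
        if 0 <= i < n_bins:           # values outside the range are ignored
            counts[i] += 1
    return counts

scores = [72, 75, 81, 83, 83, 90, 94, 68, 77, 85]  # made-up data
counts = histogram_counts(scores, low=60, width=10, n_bins=4)

# Bar height = count; the proportion is count / total.
for i, c in enumerate(counts):
    start = 60 + 10 * i
    print(f"{start}-{start + 9}: {'#' * c} ({c}, proportion {c / len(scores):.1f})")
# 60-69: # (1, proportion 0.1)
# 70-79: ### (3, proportion 0.3)
# 80-89: #### (4, proportion 0.4)
# 90-99: ## (2, proportion 0.2)
```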
what are the 3 numerical summaries ?
1) center (typical or average value)
2) unusual values (outliers)
3) variability
what is “mean” :
The average of a set of numbers
what is “median” :
The middle value of a set of numbers when they are arranged in order
what is “mode” :
The number that appears most frequently in a set of numbers
which measure of center is the best summary when the data contain outliers ?
median
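A small check of why the median holds up better, using made-up numbers: adding one extreme value moves the mean a lot but barely moves the median.

```python
import statistics

clean = [10, 12, 13, 14]        # made-up values
with_outlier = clean + [100]    # same values plus one outlier

def mean(values):
    return sum(values) / len(values)

print(mean(clean), statistics.median(clean))                # 12.25 12.5
print(mean(with_outlier), statistics.median(with_outlier))  # 29.8 13
```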
how to find the mean ?
You find it by adding all the numbers together and then dividing by how many numbers there are
how to find the median ?
Arrange the values in order and take the middle one. If there’s an even number of values, the median is the average of the two middle numbers
how to find the mode ?
Find the number that appears most often. There can be one mode, more than one mode, or no mode at all if no number repeats
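The three calculations side by side, on made-up data. The helper modes is hypothetical and follows the convention on this card (no mode when nothing repeats); Python's own statistics.multimode would instead return every value in that case.

```python
import statistics
from collections import Counter

def modes(values):
    """Most frequent value(s); empty list if no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []                      # nothing repeats, so no mode
    return sorted(v for v, c in counts.items() if c == top)

data = [2, 3, 3, 5, 7, 10]             # made-up data, already in order

print(sum(data) / len(data))           # mean: 30 / 6 = 5.0
print(statistics.median(data))         # even count: (3 + 5) / 2 = 4.0
print(statistics.median([1, 3, 9]))    # odd count: middle value = 3
print(modes(data))                     # [3]
print(modes([1, 1, 2, 2, 3]))          # two modes: [1, 2]
print(modes([1, 2, 3]))                # no mode: []
```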
what are outliers ?
values far removed from rest of data
what is variability ?
how spread out are the values? a score of 80 compared to mean of 76 has different meaning if scores ranged from 72 to 80 versus 32 to 98
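To make the example concrete, here are two made-up score lists that both have mean 76, one ranging from 72 to 80 and one from 32 to 98. The range (highest minus lowest) is used as a simple stand-in for variability.

```python
tight = [72, 74, 76, 78, 80]   # made-up scores, 72 to 80
wide = [32, 70, 86, 94, 98]    # made-up scores, 32 to 98

for name, scores in (("tight", tight), ("wide", wide)):
    mean = sum(scores) / len(scores)
    spread = max(scores) - min(scores)            # range = highest - lowest
    higher = sum(1 for s in scores if s > 80)     # scores that beat an 80
    print(f"{name}: mean {mean}, range {spread}, scores above 80: {higher}")
# tight: mean 76.0, range 8, scores above 80: 0
# wide: mean 76.0, range 66, scores above 80: 3
```

In the tight list an 80 is the top score; in the wide list three of the five scores beat it, even though the mean is the same.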
if a graph is skewed to the right what does that mean for mean and median ?
mean > median
if a graph is skewed to the left what does that mean for mean and median ?
mean < median
if a graph is symmetric what does that mean for mean and median ?
mean and median are approximately equal
If a distribution is skewed to the left,
a) the mean is less than the median
b) the mean and median are equal
c) the mean is greater than the median
a) the mean is less than the median
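A quick check of the mean versus median rules on three small made-up data sets, one for each shape. The lists are illustrative only; real data will not always follow the rule this neatly.

```python
import statistics

def mean(values):
    return sum(values) / len(values)

right_skewed = [1, 2, 2, 3, 3, 4, 15]       # long tail of high values
left_skewed = [1, 12, 13, 13, 14, 14, 15]   # long tail of low values
symmetric = [1, 2, 3, 4, 5]

for name, data in (("right", right_skewed),
                   ("left", left_skewed),
                   ("symmetric", symmetric)):
    print(f"{name}: mean {mean(data):.2f}, median {statistics.median(data)}")
# right: mean 4.29, median 3
# left: mean 11.71, median 13
# symmetric: mean 3.00, median 3
```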
define “lowest” :
minimum
define “highest” :
maximum
define “median” :
number such that half of the values are at or above it and half are at or below it (middle value or average of two middle numbers in ordered list)
define “quartiles” :
medians of the lower and upper halves of the ordered data
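The lowest, quartiles, median, and highest can be computed together. This sketch assumes one common textbook convention: when the number of values is odd, the overall median is left out of both halves. Other conventions (and software packages) can give slightly different quartiles. The function name five_number_summary is made up, and it needs at least two values.

```python
import statistics

def five_number_summary(values):
    """Return (lowest, lower quartile, median, upper quartile, highest)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Assumption: with an odd count, skip the middle value in both halves.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (ordered[0],
            statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper),
            ordered[-1])

scores = [72, 75, 81, 83, 83, 90, 94, 68, 77, 85]  # made-up data
print(five_number_summary(scores))                 # (68, 75, 82.0, 85, 94)
print(five_number_summary([1, 2, 3, 4, 5, 6, 7]))  # (1, 2, 4, 6, 7)
```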
how to create a boxplot ?
1) Draw horizontal (or vertical) line, label it with values from lowest to highest in data
2) Draw rectangle (box) with ends at quartiles
3) Draw line in box at value of median
4) Compute IQR (interquartile range) = distance between quartiles.
5) Compute 1.5(IQR); outlier is any value more than this distance from closest quartile.
6) Draw line (whisker) from each end of box extending to farthest data value that is not an outlier (If no outlier, then to min and max)
7) Draw asterisks to indicate the outliers
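Steps 4 to 7 are mostly arithmetic, so they can be sketched without drawing anything. The function name boxplot_stats and the data are hypothetical, and the quartiles again assume the convention of leaving the median out of both halves when the count is odd.

```python
import statistics

def boxplot_stats(values):
    """Compute the numbers needed to draw a boxplot by hand."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    q1 = statistics.median(ordered[:half])
    q3 = statistics.median(ordered[half + 1:] if n % 2 else ordered[half:])
    iqr = q3 - q1                        # step 4: distance between quartiles
    reach = 1.5 * iqr                    # step 5: outlier cutoff distance
    outliers = [v for v in ordered if v < q1 - reach or v > q3 + reach]
    inside = [v for v in ordered if q1 - reach <= v <= q3 + reach]
    return {
        "median": statistics.median(ordered),
        "quartiles": (q1, q3),
        "iqr": iqr,
        "whiskers": (inside[0], inside[-1]),  # step 6: farthest non-outliers
        "outliers": outliers,                 # step 7: drawn as asterisks
    }

scores = [32, 68, 72, 75, 77, 81, 83, 83, 85, 90, 94]  # made-up data
stats = boxplot_stats(scores)
print(stats["quartiles"], stats["iqr"])    # (72, 85) 13
print(stats["whiskers"], stats["outliers"])  # (68, 94) [32]
```

Here 1.5(IQR) = 19.5, so anything below 72 - 19.5 = 52.5 or above 85 + 19.5 = 104.5 counts as an outlier; only 32 qualifies.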
how to interpret boxplots ?
- divide the data into fourths
- easily identify outliers
- useful for comparing two or more groups