Module 2 - Section 2 Flashcards
What graphs are best for smaller data sets of numerical variables?
Stem plots and dot plots
What graphs are best for large data sets of quantitative data?
histograms
appearance of a dot plot?
y-axis: frequency
x-axis: name of variable and the values that the data will fall between
[sketch: dots stacked in columns above a number line labeled "values"]
-one dot above the value of each data point
-dots stack above a value that occurs more than once (frequency greater than one)
-(see class notes if this is unclear)
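A rough sketch of the dot plot idea in Python, assuming the data are small non-negative whole numbers. The function name dot_plot and the scores list are made up for illustration, not from the notes.

```python
from collections import Counter

def dot_plot(values):
    """Return a text dot plot: one dot per data point, stacked above its value."""
    counts = Counter(values)                    # frequency of each value
    xs = range(min(values), max(values) + 1)    # integer number line for the x-axis
    width = len(str(max(values)))               # column width so labels line up
    rows = []
    # Build rows from the tallest stack down to the axis.
    for level in range(max(counts.values()), 0, -1):
        cells = [("." if counts[x] >= level else " ").rjust(width) for x in xs]
        rows.append(" ".join(cells).rstrip())
    rows.append(" ".join(str(x).rjust(width) for x in xs))  # x-axis labels
    return "\n".join(rows)

scores = [1, 2, 2, 3, 3, 3, 5]  # made-up data
print(dot_plot(scores))
```

The height of each stack is the frequency of that value, so every data point is still visible in the plot.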
stem
the leading digits of the number in the data
ex: 75 has leading digit or stem 7
100 could have stem 10 (with leaf 0) or stem 1 (depending on the data and the key)
leaf
the last digit of the number in the data
ex: 75 has leaf 5
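A minimal sketch of the stem / leaf split, assuming whole-number data where the leaf is the ones digit. The helper name split_stem_leaf is hypothetical.

```python
def split_stem_leaf(value):
    """Split a whole number into (stem, leaf): leading digits and last digit."""
    return divmod(value, 10)  # quotient = stem, remainder = leaf

print(split_stem_leaf(75))   # (7, 5)
print(split_stem_leaf(100))  # (10, 0)
```

With a different leaf unit (for example data rounded to the nearest ten) the split changes, which is why the key matters.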
a key is required for …
a stemplot
bins
equal-width intervals that group together data values that are close to each other
ex: 70-79 is one bin if 7 is the stem and 0-9 are the leaves
appearance of stemplot
stem | leaves
4 | 0
5 |
6 | 05588
7 | 00000455
8 | 5
9 | 05
Price of Walking shoes
8|5 represents $85
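A sketch of how the walking shoe stemplot could be built in Python. The prices list is read back from the leaves in the plot above; the function name stemplot is made up, and it assumes whole-dollar prices with the ones digit as the leaf.

```python
from collections import defaultdict

def stemplot(values):
    """Return a text stemplot with one row per stem, including empty stems."""
    leaves = defaultdict(list)
    for v in sorted(values):          # sorted so leaves increase along each row
        stem, leaf = divmod(v, 10)
        leaves[stem].append(str(leaf))
    lines = []
    for stem in range(min(leaves), max(leaves) + 1):
        # Empty stems (like 5 here) still get a row so gaps stay visible.
        lines.append(f"{stem} | {''.join(leaves[stem])}".rstrip())
    return "\n".join(lines)

prices = [40, 60, 65, 65, 68, 68, 70, 70, 70, 70, 70, 74, 75, 75, 85, 90, 95]
print(stemplot(prices))
print("Key: 8|5 represents $85")
```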
back-to-back stem plots
-used for the comparison of the distribution of two groups
leaves | stem | leaves
-still require key
-leaves increase in value as you move away from the stem, so the left-side group reads right to left
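A possible sketch of a back-to-back stem plot, assuming two-digit whole numbers. Both groups are made-up data and back_to_back is a hypothetical helper. Note how the left leaves are reversed so they grow away from the stem.

```python
from collections import defaultdict

def back_to_back(left, right):
    """Return a text back-to-back stem plot: leaves | stem | leaves."""
    l, r = defaultdict(list), defaultdict(list)
    for v in sorted(left):
        l[v // 10].append(str(v % 10))
    for v in sorted(right):
        r[v // 10].append(str(v % 10))
    stems = range(min(min(l), min(r)), max(max(l), max(r)) + 1)
    width = max(len(x) for x in l.values())   # widest left row, for alignment
    lines = []
    for s in stems:
        left_leaves = "".join(reversed(l[s]))  # biggest leaf ends up farthest from the stem
        right_leaves = "".join(r[s])
        lines.append(f"{left_leaves.rjust(width)} | {s} | {right_leaves}".rstrip())
    return "\n".join(lines)

group_a = [62, 65, 71, 78, 78, 84]  # made-up data
group_b = [58, 63, 63, 70, 75, 91]  # made-up data
print(back_to_back(group_a, group_b))
print("Key: 5 | 6 | 3 represents 65 (left) and 63 (right)")
```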
left inclusion
-interval notation as [a,b)
so a (the left endpoint) is included but b (the right endpoint) is not
-used for histograms along the x-axis to organize bins
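A small sketch of left inclusion, assuming equal-width bins that begin at some chosen start value. The names bin_index and bin_label, and the start of 60 with width 10, are illustrative assumptions.

```python
def bin_index(x, start, width):
    """Index of the left-inclusive bin [a, b) that contains x."""
    return int((x - start) // width)

def bin_label(x, start, width):
    """Interval notation for the bin that contains x."""
    a = start + bin_index(x, start, width) * width
    return f"[{a}, {a + width})"

print(bin_label(70, 60, 10))  # [70, 80)  70 is the left endpoint, so it is included
print(bin_label(79, 60, 10))  # [70, 80)
print(bin_label(80, 60, 10))  # [80, 90)  80 is excluded from [70, 80)
```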
histogram appearance
- bins on x-axis
- frequency or relative frequency on y-axis
- bars with no spaces between (unless there is an empty bin)
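A rough sketch of the numbers behind a histogram, using the walking shoe prices from the stemplot card and left-inclusive bins of width 10 starting at 40 (the bin choice is an assumption). histogram_counts is a made-up helper and it assumes no value is below start.

```python
from collections import Counter

def histogram_counts(values, start, width):
    """Return rows of (a, b, frequency, relative frequency) for bins [a, b)."""
    counts = Counter((v - start) // width for v in values)
    n = len(values)
    table = []
    for i in range(max(counts) + 1):      # include empty bins between start and the last bin
        a = start + i * width
        table.append((a, a + width, counts[i], counts[i] / n))
    return table

prices = [40, 60, 65, 65, 68, 68, 70, 70, 70, 70, 70, 74, 75, 75, 85, 90, 95]
for a, b, freq, rel in histogram_counts(prices, 40, 10):
    # Text bar, then frequency, then relative frequency.
    print(f"[{a}, {b})  {'#' * freq:<8}  {freq}  {rel:.2f}")
```

Only the count in each bin is kept, so the individual prices cannot be read back from the bars.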
Of dot plots, stem plots, and histograms, which retain all data values and which do not?
dot plots and stem plots retain all data values; histograms do not
how can we describe the distribution of a plot?
shapes - modes, symmetry or skewness, deviation or outliers
center
spread
mode(s)
number of bumps / humps / peaks
uniform
no modes, square / rectangle appearance
unimodal
a single peak
bimodal
two peaks
ex: heights of adults and children will have two peaks, one for adults and one for children
multimodal
more than two peaks; rarely occurs (except for covid?)
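A crude sketch of counting peaks from a list of bin frequencies, assuming a peak is a bin strictly taller than both neighbors. Real data are noisier, so this is only an illustration; count_peaks, describe_modes, and the frequency lists are made up.

```python
def count_peaks(freqs):
    """Count bins that are strictly taller than both neighboring bins."""
    padded = [0] + list(freqs) + [0]   # treat the space outside the plot as empty
    return sum(
        1
        for i in range(1, len(padded) - 1)
        if padded[i - 1] < padded[i] > padded[i + 1]
    )

def describe_modes(freqs):
    """Map the number of peaks to the vocabulary used on these cards."""
    names = {0: "uniform", 1: "unimodal", 2: "bimodal"}
    return names.get(count_peaks(freqs), "multimodal")

print(describe_modes([4, 4, 4, 4]))           # uniform
print(describe_modes([1, 3, 6, 3, 1]))        # unimodal
print(describe_modes([2, 6, 2, 1, 5, 1]))     # bimodal
print(describe_modes([1, 5, 1, 5, 1, 5, 1]))  # multimodal
```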
symmetry
when the left and right sides of a graph are approximately mirror images of each other about the center
if you didn’t get this…I am ashamed lol
non-symmetric graphs are
skewed
skewed to the right
positively skewed
peaks quickly and then slowly trickles down to the right
as if the tail end of the peak on the right has been pulled to the right
negatively skewed
skewed to the left
the left tail is extended and longer than the right tail (if the peak is essentially symmetric)
sketch: . . . . . . ^ . . (long left tail, peak, short right tail)
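A very rough sketch of telling skew direction from bin frequencies by comparing how far each tail stretches from the peak. This tail-length comparison is a simplification assumed here for illustration, not a formal test, and skew_direction and the frequency lists are made up.

```python
def skew_direction(freqs):
    """Compare tail lengths on each side of the tallest bin (first one if tied)."""
    peak = freqs.index(max(freqs))
    left_tail = peak                      # number of bins to the left of the peak
    right_tail = len(freqs) - 1 - peak    # number of bins to the right of the peak
    if right_tail > left_tail:
        return "skewed to the right (positively skewed)"
    if left_tail > right_tail:
        return "skewed to the left (negatively skewed)"
    return "no clear skew"

print(skew_direction([2, 8, 5, 3, 1, 1]))  # skewed to the right (positively skewed)
print(skew_direction([1, 1, 3, 5, 8, 2]))  # skewed to the left (negatively skewed)
print(skew_direction([1, 4, 8, 4, 1]))     # no clear skew
```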
Outlier
an observation that deviates from the overall pattern of the graph
numerical summaries
a few important and meaningful numbers that preserve the relevant features of the data set so that you can draw useful conclusions
y
variable of interest
the variable for which we have sample data
n
the sample size / number of observations of the variable y