Stats M. 2 Flashcards
relative frequency distribution
listing of distinct values and their relative frequencies (proportions or percentages)
Numerical summary for categorical data
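A relative frequency distribution can be sketched in a few lines of Python. The sample data and the function name `relative_frequencies` are made up for illustration.

```python
from collections import Counter

def relative_frequencies(values):
    """Map each distinct value to its proportion of all observations."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}

# Hypothetical sample: eye colors of 8 students.
colors = ["brown", "blue", "brown", "green", "brown", "blue", "brown", "blue"]
freqs = relative_frequencies(colors)
# brown -> 4/8, blue -> 3/8, green -> 1/8; the proportions sum to 1
```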
Bar Chart
Graphical Summary for categorical data
-bars do not touch each other
Pareto diagram
graphical summary for categorical data
pie chart
graphical summary for categorical data
Mean
sum of observations divided by the number of observations
numerical summary for quantitative data
sensitive to/affected by extreme values
median
the number that divides the bottom 50% of the data from the top 50% of the data
numerical summary for quantitative data
not sensitive to/not affected by extreme values
mode
any value that occurs with the greatest frequency
numerical summary for quantitative data
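The sensitivity of the mean (and the resistance of the median) to extreme values can be seen in a small sketch using Python's standard `statistics` module. The data are hypothetical.

```python
import statistics

data = [2, 3, 3, 4, 5]         # hypothetical sample
with_extreme = data + [100]    # same sample plus one extreme value

mean_before = statistics.mean(data)              # 17 / 5
mean_after = statistics.mean(with_extreme)       # 117 / 6
median_before = statistics.median(data)          # middle value
median_after = statistics.median(with_extreme)   # average of the two middle values
modes = statistics.multimode(with_extreme)       # every value tied for highest frequency
```

Here one extreme value moves the mean from 3.4 to 19.5, while the median only moves from 3 to 3.5.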
percentiles
indicate the point below which a certain percentage of observations fall
quartiles
special type of percentile that divides data into quarters
Q1
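25th percentile (25% of observations fall below it)
Q2
median, 50th percentile (50% of observations fall below it)
Q3
75th percentile (75% of observations fall below it)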
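As a rough illustration, quartiles can be computed with `statistics.quantiles`. Textbooks and software use slightly different interpolation rules, so the `method="inclusive"` choice below is an assumption and may not match a hand calculation from a particular course. The data are hypothetical.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]   # hypothetical sample

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
```

For this sample the cut points are 3, 5, and 7, and Q2 agrees with the median.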
standard deviation
tells us whether the observations within the data set tend to be close to the mean or far away from the mean
numerical summary for quantitative data
IQR
the difference between Q3 and Q1
tells us about the variability of the middle 50%
numerical summary for quantitative data
range
difference between the maximum and minimum value
numerical summary for quantitative data
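The three measures of spread above can be sketched together. This example assumes the sample standard deviation (dividing by n - 1) and the `inclusive` quartile method; neither choice is specified by the cards. The data are hypothetical.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample

sd = statistics.stdev(data)        # sample standard deviation (n - 1 in the denominator)

q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1                      # spread of the middle 50%

data_range = max(data) - min(data) # maximum minus minimum
```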
Dotplot
graphical summary for quantitative data
histogram
graphical summary for quantitative data
-bars touch each other
density plot
graphical summary for quantitative data
box plot
graphical summary for quantitative data
displays the five-number summary (minimum, Q1, median, Q3, maximum)
time plots
graphical summary for quantitative data
S.O.C.S
Shape
Outliers
Center
Spread
Shape
Unimodal, bimodal, multimodal
skewed or symmetric
-left skewed = tail extends toward the negative (lower) side
-right skewed = tail extends toward the positive (higher) side
Outliers
unusual values that fall far from the rest of the data
Center
-symmetric + no outliers
Report the mean
Center
-skewed and/or outliers
report the median
Spread
-symmetric + no outliers
report the standard deviation
Spread
-skewed and/or outliers
report the IQR
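The center and spread rules above can be written as a small helper. The function `center_and_spread` is hypothetical, and it does not detect shape or outliers on its own: the caller supplies that judgment after looking at a graph. The `inclusive` quartile method is also an assumption.

```python
import statistics

def center_and_spread(data, symmetric_no_outliers):
    """Pick summary statistics following the center/spread cards.

    `symmetric_no_outliers` is a judgment made by the analyst from a graph;
    this helper only applies the reporting rule.
    """
    if symmetric_no_outliers:
        return {
            "center": ("mean", statistics.mean(data)),
            "spread": ("standard deviation", statistics.stdev(data)),
        }
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "center": ("median", statistics.median(data)),
        "spread": ("IQR", q3 - q1),
    }

# Hypothetical right-skewed sample with one outlier.
skewed = [1, 2, 2, 3, 3, 4, 50]
report = center_and_spread(skewed, symmetric_no_outliers=False)
```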
Comparative graphical displays (quantitative + categorical)
describe and compare each group's distribution using SOCS
e.g., side-by-side histograms or box plots
Bivariate Data
data that contains 2 variables
Association
(Bivariate Data)
a relationship between two variables
Response Variable
(Bivariate Data)
the outcome variable; measured to make comparisons between groups
Explanatory Variable
(Bivariate Data)
explains the value of the response variable
contingency table
a frequency distribution for bivariate data (also called a two-way or cross-tabulation table)
conditional proportions
(Bivariate Data)
proportions of each response category, computed within each category of the explanatory variable
(when the explanatory variable defines the rows, divide each cell count by the corresponding row total)
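A minimal sketch of the row-total calculation, assuming the explanatory variable defines the rows of the contingency table. The counts, category names, and the function `conditional_proportions` are all hypothetical.

```python
def conditional_proportions(table):
    """Divide each cell count by its row total.

    `table` maps each explanatory category (row) to a dict of
    response category (column) -> count.
    """
    result = {}
    for row_label, row in table.items():
        row_total = sum(row.values())
        result[row_label] = {col: count / row_total for col, count in row.items()}
    return result

# Hypothetical counts: explanatory = class year, response = owns a car.
counts = {
    "freshman": {"yes": 20, "no": 80},
    "senior": {"yes": 60, "no": 40},
}
props = conditional_proportions(counts)
```

In this made-up table the "yes" proportion is 0.2 for freshmen and 0.6 for seniors, so the conditional proportions differ across the explanatory categories.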
No association
(Bivariate Data)
conditional percentages within each column (or heights of same-colored bars) are similar
Yes association
(Bivariate Data)
conditional percentages within each column (or heights of same-colored bars) are different
comparative bar chart
a chart that compares the conditional proportion of the response variable within each category of the explanatory variable
Mosaic plots
another comparative chart for two categorical variables; tile areas are proportional to the cell counts
Scatterplots
summarize bivariate quantitative data
Positive Association
(bivariate quantitative data)
as values of one variable increase, so do values of the other
Negative association
(bivariate quantitative data)
as values of one variable increase, values of the other variable decrease
No association
(bivariate quantitative data)
no apparent relationship between the two variables
correlation
measure of the strength and direction of the linear relationship between two quantitative variables (denoted r; always between -1 and 1)
weak correlation
positive: 0 < r < 0.4
negative: -0.4 < r < 0
moderate correlation
positive: 0.4 < r < 0.8
negative: -0.8 < r < -0.4
strong correlation
positive: 0.8 < r < 1
negative: -1 < r < -0.8
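The correlation cutoffs above can be applied in code. The sketch below computes Pearson's r from its usual definition and then labels it. The cards do not assign the exact values 0.4, 0.8, 1, or 0 to a category, so placing 0.4 and 0.8 in the stronger band and labeling r = 0 as "no linear association" are assumptions of this example. The function names and data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def describe(r):
    """Label r using the weak / moderate / strong cutoffs from the cards."""
    if r == 0:
        return "no linear association"  # assumption: not covered by the cards
    size = abs(r)
    # Assumption: boundary values 0.4 and 0.8 go to the stronger band.
    strength = "weak" if size < 0.4 else "moderate" if size < 0.8 else "strong"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"

# Hypothetical paired data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
label = describe(r)
```

For these data r is about 0.77, which the cutoffs label a moderate positive correlation.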