Unit 2: Exploring and Understanding Data Flashcards
Symmetric and Unimodal
CUSS
Center, Unusual Features, Shape, Spread
Contingency Table
Used to compare two categorical variables
Marginal Distribution
Found by looking at the “totals” of one variable in a contingency table.
Conditional Distribution
Found by looking at a single row or column of a contingency table: the distribution of one variable among only the cases in one category of the other variable.
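As a rough illustration of the two cards above, the sketch below computes a marginal and a conditional distribution from a small contingency table. The counts and the category names (class year, preferred pet) are made up for the example.

```python
# Hypothetical counts: rows are class year, columns are preferred pet.
table = {
    "Junior": {"Dog": 30, "Cat": 20},
    "Senior": {"Dog": 15, "Cat": 35},
}

grand_total = sum(sum(row.values()) for row in table.values())

# Marginal distribution of class year: each row total divided by the grand total.
marginal_year = {
    year: sum(row.values()) / grand_total for year, row in table.items()
}

# Conditional distribution of pet given "Junior":
# the cells of one row divided by that row's own total.
junior = table["Junior"]
junior_total = sum(junior.values())
pet_given_junior = {pet: count / junior_total for pet, count in junior.items()}

print(marginal_year)
print(pet_given_junior)
```

The marginal distribution uses only the totals, while the conditional distribution restricts attention to one row before converting to proportions.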
Relative Frequency Distribution
Shows percentages (or proportions) instead of counts.
Types of Categorical Displays
Bar Chart, Pie Chart, Segmented Bar Chart, Frequency Table, Contingency Table
Types of Quantitative Displays
Dot Plot, Histogram, Stem-and-leaf Display, Box Plot, Scatterplot
Mean ≈ Median
Data are (roughly) Symmetric
Mean < Median
Data are Skewed Left
Mean > Median
Data are Skewed Right
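The mean/median comparisons above can be checked on small made-up data sets. This is only a sketch of the rule of thumb; the two lists are hypothetical and were chosen so that one extreme value pulls the mean toward the tail.

```python
from statistics import mean, median

# Hypothetical data with one large value (long right tail).
right_skewed = [1, 2, 2, 3, 3, 4, 20]
# Hypothetical data with one small value (long left tail).
left_skewed = [1, 17, 18, 18, 19, 19, 20]

# The mean is pulled toward the tail; the median resists it.
print(mean(right_skewed), median(right_skewed))  # mean 5, median 3
print(mean(left_skewed), median(left_skewed))    # mean 16, median 18
```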
Median and IQR
Used to describe center and spread when data are skewed or have outliers
Mean and Standard Deviation
Used to describe center and spread when data are roughly symmetric with no outliers
Standard Deviation
The “average” distance of the individual data values from the mean.
µ
population mean
σ
population standard deviation
Fences
UF = Q3 + 1.5*IQR
LF = Q1 - 1.5*IQR
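A minimal sketch of the fence formulas, used to flag outliers. The data set is hypothetical, and Q1 = 5 and Q3 = 10 are taken as given rather than computed, since quartile conventions differ between textbooks and software. The function names `fences` and `outliers` are made up for this example.

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) using the 1.5*IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def outliers(data, q1, q3):
    """Return the values that fall outside the fences."""
    lower, upper = fences(q1, q3)
    return [x for x in data if x < lower or x > upper]


# Hypothetical data; Q1 and Q3 are assumed values for this data set.
data = [2, 4, 5, 6, 7, 8, 9, 10, 12, 30]
lower, upper = fences(q1=5, q3=10)
print(lower, upper)                    # -2.5 17.5
print(outliers(data, q1=5, q3=10))     # [30]
```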
Q1
1st Quartile
25% of data less than Q1
75% of data greater than Q1
Q2
Median
50% of data are less/greater than Q2
Q3
3rd Quartile
75% of data less than Q3
25% of data greater than Q3
Two categorical variables are independent if…
The conditional distributions are NOT different among categories.
The segmented bar chart has (roughly) the same percentages in every bar.
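One way to picture this check is to compare the conditional distribution in each row of a contingency table. The sketch below is an informal comparison, not a formal test of independence; the tables, the function names, and the `tolerance` cutoff are all assumptions made for illustration.

```python
from math import isclose


def row_percentages(cells):
    """Conditional distribution for one row: each cell over the row total."""
    total = sum(cells.values())
    return {category: count / total for category, count in cells.items()}


def looks_independent(table, tolerance=0.05):
    """True if every row has about the same conditional distribution.

    The tolerance is an arbitrary, hypothetical cutoff for "about the same".
    """
    distributions = [row_percentages(cells) for cells in table.values()]
    first = distributions[0]
    return all(
        isclose(dist[category], first[category], abs_tol=tolerance)
        for dist in distributions[1:]
        for category in first
    )


# Hypothetical tables: rows are class year, columns are preferred pet.
same_split = {"Junior": {"Dog": 30, "Cat": 20}, "Senior": {"Dog": 60, "Cat": 40}}
different_split = {"Junior": {"Dog": 30, "Cat": 20}, "Senior": {"Dog": 15, "Cat": 35}}

print(looks_independent(same_split))       # True: both rows are 60% / 40%
print(looks_independent(different_split))  # False: 60% / 40% versus 30% / 70%
```

With real data the percentages are rarely identical, so "not different" in practice means "close enough that the difference could be due to chance".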
Advantages of Stem-And-Leaf
Original data values can be seen
Can still be used for CUSS
Disadvantages of Stem-And-Leaf
Not good for LARGE sample sizes
Can be tedious to make
Descriptions of Shape
Modality - Unimodal, Bimodal, Uniform
Symmetry - Skewed Left/Right or Symmetric
Variance
Square of the Standard Deviation
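The relationship between variance and standard deviation can be seen with Python's `statistics` module. The data set is hypothetical and is treated as a whole population (matching µ and σ), so the population versions `pvariance` and `pstdev` are used; the sample versions `variance` and `stdev` divide by n - 1 instead of n.

```python
from statistics import mean, pstdev, pvariance

# Hypothetical data, treated as an entire population.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mu = mean(data)             # population mean
sigma = pstdev(data)        # population standard deviation
sigma_sq = pvariance(data)  # population variance

print(mu, sigma, sigma_sq)  # 5 2.0 4

# Variance by hand: average squared distance from the mean.
by_hand = sum((x - mu) ** 2 for x in data) / len(data)
print(by_hand)              # 4.0
```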
5-Number Summary
Min, Q1, Median, Q3, Max
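A sketch of one way to compute the 5-number summary, assuming the common classroom convention that Q1 and Q3 are the medians of the lower and upper halves of the sorted data (with the overall median left out of both halves when the count is odd). Other conventions give slightly different quartiles. The function name and data are hypothetical.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule.

    Assumes at least two data values.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Skip the middle value when n is odd so it belongs to neither half.
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return (data[0], median(lower), median(data), median(upper), data[-1])


# Hypothetical data set.
print(five_number_summary([2, 4, 5, 6, 7, 8, 9, 10, 12, 30]))
# (2, 5, 7.5, 10, 30)
```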