Chapter 2 Descriptive Statistics Flashcards
Percentile
Percentile - A data point’s percentile is the percentage of the data that is smaller than or equal to it, once the data is ordered.
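The definition above can be checked with a few lines of Python. This is only a sketch: the helper name `percentile_of` and the sample scores are made up for illustration, and other textbooks define percentiles slightly differently.

```python
def percentile_of(value, data):
    """Percentage of data points that are smaller than or equal to value."""
    count = sum(1 for x in data if x <= value)
    return 100 * count / len(data)

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # hypothetical data, not from the chapter
print(percentile_of(7, scores))  # 70.0 (7 of the 10 values are <= 7)
```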
Stem and Leaf Graphs
Stem and Leaf Graphs (Stem plots) - Come from the field of exploratory data analysis. They are a good choice when the data sets are small.
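As a rough illustration, a stem plot for small whole numbers can be built by splitting each value into a stem (the tens) and a leaf (the ones digit). The function name `stem_plot` and the data are hypothetical, and the sketch assumes non-negative integers.

```python
from collections import defaultdict

def stem_plot(data):
    """Return stem plot lines; stem = tens, leaf = ones digit (non-negative ints)."""
    stems = defaultdict(list)
    for x in sorted(data):
        stems[x // 10].append(x % 10)
    lines = []
    # Walk every stem in range so that empty stems still show up as gaps.
    for stem in range(min(stems), max(stems) + 1):
        leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines

for line in stem_plot([12, 15, 21, 21, 27, 48]):  # hypothetical data
    print(line)
# 1 | 2 5
# 2 | 1 1 7
# 3 |
# 4 | 8
```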
Line graph
Bar Graph
Histogram
Histogram - A histogram has both a horizontal axis and a vertical axis. The horizontal axis is labeled with what the data represents (for instance, distance from your home to school). The vertical axis is labeled either frequency or relative frequency.
EXAMPLE
The following data are the number of books bought by 50 part-time students in SCSU. The number of books is discrete data since books are counted.
1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1
2; 2; 2; 2; 2; 2; 2; 2; 2; 2
3; 3; 3; 3; 3; 3; 3; 3; 3; 3; 3; 3; 3; 3; 3; 3
4; 4; 4; 4; 4; 4
5; 5; 5; 5; 5
6; 6
Eleven students buy 1 book. Ten students buy 2 books. Sixteen students buy 3 books. Six students buy 4 books. Five students buy 5 books. Two students buy 6 books.
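The counts in this example are the frequencies a histogram would plot; dividing each by 50 gives the relative frequencies. A minimal sketch using the book data above (the variable names are illustrative only):

```python
from collections import Counter

# Number of books bought by each of the 50 students in the example.
books = [1] * 11 + [2] * 10 + [3] * 16 + [4] * 6 + [5] * 5 + [2 * 3] * 2

freq = Counter(books)                          # frequency of each value
n = len(books)                                 # 50 students
rel_freq = {v: freq[v] / n for v in freq}      # relative frequency of each value

for value in sorted(freq):
    print(value, freq[value], rel_freq[value])
```

The relative frequencies add up to 1, which is a quick way to check the table.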
Boxplots
Boxplots - give a good graphical image of the concentration of the data. They also show how far from most of the data the extreme values are. The box plot is constructed from five values: the smallest value, the first quartile, the median, the third quartile, and the largest value.
Quartiles
Quartiles - are numbers that separate the data into quarters. Quartiles may or may not be part of the data.
Q1 = 25th percentile
Even number of data values: 1,2,3,4,5,6,7,8
[1,2] Q1 [3,4],[5,6],[7,8]
Q1 = (2+3)/2 = 2.5
Odd number of data values: 1,2,3,4,5,6,7,8,9
(.25)*(number of data) = (.25)*(9) = 2.25, which rounds up to 3
Q1 = 3rd item on the list, which is 3
Q2 = 50th percentile = median
Q3 = 75th percentile
Even number of data values: 1,2,3,4,5,6,7,8
[1,2],[3,4],[5,6] Q3 [7,8]
Q3 = (6+7)/2 = 6.5
Odd number of data values: 1,2,3,4,5,6,7,8,9
(.75)*(number of data) = (.75)*(9) = 6.75, which rounds up to 7
Q3 = 7th item on the list, which is 7
Q4 = 100th percentile = maximum value
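Both worked cases above fit one position rule: multiply the fraction by the number of data values; if the result is a whole number, average that item and the next one, otherwise round up and take that item. The sketch below assumes this rule (other books and software use different quartile rules, so their answers can differ), and the function name `quartile` is made up for illustration.

```python
def quartile(sorted_data, k):
    """k-th quartile (k = 1, 2, or 3) of already-sorted data, round-up position rule."""
    n = len(sorted_data)
    whole, remainder = divmod(k * n, 4)   # position k*n/4 split into whole part and leftover
    if remainder == 0:
        # Position is a whole number: average that item and the next one.
        return (sorted_data[whole - 1] + sorted_data[whole]) / 2
    # Otherwise round the position up; 1-based position whole + 1 is index whole.
    return sorted_data[whole]

even = [1, 2, 3, 4, 5, 6, 7, 8]
odd = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(quartile(even, 1), quartile(even, 3))  # 2.5 6.5
print(quartile(odd, 1), quartile(odd, 3))    # 3 7
```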
Median
Median: A number that separates ordered data into halves.
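Python's standard `statistics.median` sorts the data and returns the middle value, or the average of the two middle values when the count is even. A short check on two small lists (the lists are illustrative only):

```python
import statistics

even_count = [8, 1, 6, 3, 2, 7, 4, 5]      # 8 values, deliberately unordered
odd_count = [9, 1, 8, 2, 7, 3, 6, 4, 5]    # 9 values, deliberately unordered

print(statistics.median(even_count))  # 4.5 (average of 4 and 5)
print(statistics.median(odd_count))   # 5 (the middle value)
```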
Boxplot Example
1; 1; 2; 2; 4; 6; 6.8; 7.2; 8; 8.3; 9; 10; 10; 11.5
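The five values a box plot needs can be pulled from this data set as sketched below. The quartiles here assume the round-up position rule (multiply the fraction by the number of values; average two items if the position is whole, otherwise round up), so software using another rule may report slightly different quartiles. The helper names are hypothetical.

```python
def quartile(sorted_data, k):
    """k-th quartile (k = 1, 2, or 3) of sorted data, round-up position rule (assumed)."""
    whole, remainder = divmod(k * len(sorted_data), 4)
    if remainder == 0:
        return (sorted_data[whole - 1] + sorted_data[whole]) / 2
    return sorted_data[whole]

def five_number_summary(data):
    """Smallest value, Q1, median, Q3, largest value."""
    s = sorted(data)
    return s[0], quartile(s, 1), quartile(s, 2), quartile(s, 3), s[-1]

data = [1, 1, 2, 2, 4, 6, 6.8, 7.2, 8, 8.3, 9, 10, 10, 11.5]
print(five_number_summary(data))  # (1, 2, 7.0, 9, 11.5)
```

With these values the box runs from 2 to 9, the median line sits at 7, and the whiskers reach 1 and 11.5.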
Interquartile Range
Interquartile Range: IQR = Q3 − Q1
The IQR can help to determine potential outliers. A value is suspected to be a potential outlier if it is
- more than 1.5*IQR below the first quartile
or
- more than 1.5*IQR above the third quartile
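The outlier check amounts to two cutoffs, often called fences: Q1 minus 1.5 times the IQR, and Q3 plus 1.5 times the IQR. A minimal sketch, where the function names, the quartiles Q1 = 2 and Q3 = 9, and the sample values are all assumed inputs for illustration:

```python
def outlier_fences(q1, q3):
    """Lower and upper cutoffs: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def potential_outliers(data, q1, q3):
    """Values lying beyond either fence."""
    low, high = outlier_fences(q1, q3)
    return [x for x in data if x < low or x > high]

print(outlier_fences(2, 9))                               # (-8.5, 19.5)
print(potential_outliers([1, 2, 4, 7, 9, 11.5, 25], 2, 9))  # [25]
```

A flagged value is only a suspect; it still needs a closer look before being discarded.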
Mean
Mean: The sum of the data values divided by the number of values. A common name for the mean is ‘average’.
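As a quick worked case, the mean can be computed for the book data from the histogram example (11 ones, 10 twos, 16 threes, 6 fours, 5 fives, 2 sixes). The variable names are illustrative only.

```python
# Number of books bought by each of the 50 students in the histogram example.
books = [1] * 11 + [2] * 10 + [3] * 16 + [4] * 6 + [5] * 5 + [6] * 2

total = sum(books)            # 140 books in all
mean = total / len(books)     # 140 / 50
print(mean)                   # 2.8
```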
Mode
Mode: The value that appears most frequently in a set of data
Ex.
1,2,3,4,5: No mode
1,1,2,3,4: Mode is 1 (unimodal)
1,1,2,2,3: Modes are 1 and 2 (bimodal)
1,1,2,2,3,3,4: Modes are 1, 2, and 3 (multimodal)
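The four cases above can be reproduced by counting how often each value occurs. This sketch assumes the convention used in the examples, that a data set in which no value repeats has no mode; the function name `modes` is made up for illustration.

```python
from collections import Counter

def modes(data):
    """All values tied for the highest count; empty list if no value repeats."""
    counts = Counter(data)
    top = max(counts.values())
    if top == 1:
        return []  # assumed convention: every value appears once, so no mode
    return sorted(v for v, c in counts.items() if c == top)

print(modes([1, 2, 3, 4, 5]))        # []
print(modes([1, 1, 2, 3, 4]))        # [1]
print(modes([1, 1, 2, 2, 3]))        # [1, 2]
print(modes([1, 1, 2, 2, 3, 3, 4]))  # [1, 2, 3]
```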
symmetrical distribution
symmetrical distribution - In a perfectly symmetrical distribution, the mean and the median are the same.
If the distribution also has a single mode: Mean = Median = Mode
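A small symmetric data set with one peak shows the three measures landing on the same value. The data below is hypothetical, chosen so the left and right sides mirror each other around 3.

```python
import statistics

symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]  # hypothetical, mirror image around 3

print(statistics.mean(symmetric))    # 3
print(statistics.median(symmetric))  # 3
print(statistics.mode(symmetric))    # 3
```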
skewed to the left
skewed to the left - A distribution that is not symmetrical. The right-hand side seems “chopped off” compared to the left side. The shape of the distribution is called skewed to the left because it is pulled out to the left.
skewed to the right
skewed to the right - A distribution that is not symmetrical. The left-hand side seems “chopped off” compared to the right side. The shape of the distribution is called skewed to the right because it is pulled out to the right.