Module 2 Flashcards
Population Mean
Represented as µ. The population mean applies when the data represent all of the items in a population.
Sample Mean
Represented as x̄ (read "x-bar"). The sample mean is used when the data consist of just a sample taken from the overall population.
Median
Another measure of location. The median is the value with as many values above it as below it when the data are arranged in order from lowest to highest. If the number of data items is odd, the median is the data item in the middle of the list. If the number of data items is even, the median is the average of the two middle data items. A quick way to find the position of the median in the ordered list is to use the equation (n + 1)/2, where n is the number of data items. This computation gives the position in the list where the median is located; when n is even, the result ends in .5, meaning the median lies halfway between the two items on either side of that position.
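The position rule (n + 1)/2 can be sketched in Python as follows. This is a minimal illustration, not part of the lecture material; the function name `median_by_position` is made up for the example.

```python
def median_by_position(values):
    """Find the median using the (n + 1) / 2 position rule."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")

    position = (n + 1) / 2             # 1-based position in the ordered list
    lower = ordered[int(position) - 1]  # item at the position, rounded down

    if n % 2 == 1:
        # Odd n: the position is a whole number, so this item is the median.
        return lower

    # Even n: the position ends in .5, so average the two neighbouring items.
    upper = ordered[int(position)]
    return (lower + upper) / 2


odd_result = median_by_position([7, 1, 5])      # ordered: 1, 5, 7
even_result = median_by_position([7, 1, 5, 3])  # ordered: 1, 3, 5, 7
```

With three items the position is 2, so the median is 5. With four items the position is 2.5, so the median is the average of 3 and 5.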
Mode
The value that occurs with the greatest frequency; it is the final measure of location discussed here. If no data items are repeated, the data set has no mode. If more than one data value shares the highest frequency, the data set is multimodal.
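A small sketch of the mode rules on this card, assuming the convention stated here that a data set with no repeated items has no mode. The function name `modes` is hypothetical.

```python
from collections import Counter


def modes(values):
    """Return the list of modes; an empty list means the data set has no mode."""
    counts = Counter(values)
    if not counts:
        return []

    highest = max(counts.values())
    if highest == 1:
        # No data item is repeated, so by this card's definition there is no mode.
        return []

    # More than one value may share the highest frequency (multimodal data).
    return sorted(value for value, count in counts.items() if count == highest)


single = modes([1, 2, 2, 3])
none = modes([1, 2, 3])
multi = modes([1, 1, 2, 2, 3])
```

Here `single` holds one mode, `none` is empty, and `multi` holds two modes.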
What does Skewness of the data mean?
Skewness describes the relationship between the mean and the median of a data set. If the median exceeds the mean, the data are skewed to the left; if the mean exceeds the median, the data are skewed to the right. If the mean is equal to the median, the data are symmetrical.
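The mean-versus-median comparison can be checked with a short Python sketch. It applies only the rule stated on this card; the function name `skew_direction` is made up for illustration, and the exact equality test is an assumption that suits small examples but can be too strict for real floating-point data.

```python
from statistics import mean, median


def skew_direction(values):
    """Classify skew by comparing the mean with the median."""
    m = mean(values)
    med = median(values)
    if m > med:
        return "right"       # mean exceeds median
    if m < med:
        return "left"        # median exceeds mean
    return "symmetrical"     # mean equals median


right = skew_direction([1, 2, 3, 10])  # mean 4, median 2.5
left = skew_direction([1, 8, 9, 10])   # mean 7, median 8.5
even = skew_direction([1, 2, 3])       # mean 2, median 2
```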
Second Quartile (Q2)
Is the median of the data set.
First Quartile (Q1)
Is the 25th percentile and can be thought of as the median of the lower half of the data set.
Third Quartile (Q3)
Is the 75th percentile and can be thought of as the median of the upper half of the data set.
Five-Number Summary
To put everything together, we can create what is called a five-number summary, which summarizes the data using the following familiar measures:
- Smallest value
- First quartile
- Median
- Third quartile
- Largest value
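As a rough sketch, the five values can be computed in Python using the "median of the lower half / upper half" description of Q1 and Q3 from the quartile cards. Other quartile conventions exist and can give slightly different answers. The function name `five_number_summary` is hypothetical, and leaving the middle item out of both halves when n is odd is an assumption of this sketch.

```python
from statistics import median


def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest) for a data set."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two data items")

    half = n // 2
    lower_half = ordered[:half]
    # When n is odd, skip the middle item so it belongs to neither half.
    upper_half = ordered[half + (n % 2):]

    return (
        ordered[0],
        median(lower_half),
        median(ordered),
        median(upper_half),
        ordered[-1],
    )


even_case = five_number_summary([1, 2, 3, 4, 5, 6, 7, 8])
odd_case = five_number_summary([1, 2, 3, 4, 5, 6, 7])
```

For the eight-item list the summary is 1, 2.5, 4.5, 6.5, 8; for the seven-item list it is 1, 2, 4, 6, 7.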
Interquartile Range (IQR)
Q3-Q1
Box Plot
A box plot is a convenient way of graphically depicting groups of numerical data through their quartiles. It is a graphical representation of the five-number summary: minimum, first quartile (Q1), median (Q2), third quartile (Q3), and maximum. The five-number summary must be calculated before you construct the box plot. The interquartile range, IQR = Q3 - Q1, is also used.
Relative Frequency Distribution
If you let the number of samples get very large (say, 300 million or more), the relative frequency table becomes a relative frequency distribution.
Range
The difference between the highest and lowest values in a set of data.
Interquartile Range (IQR)
A measure of variability that overcomes the dependency on extreme values. This measure of variability is the difference between the third quartile and the first quartile. In other words, the interquartile range is the range for the middle 50% of the data.
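To see why the IQR is less dependent on extreme values than the range, here is a minimal Python comparison. It assumes Q1 and Q3 are taken as the medians of the lower and upper halves of the ordered data; the function names and the sample data are invented for illustration.

```python
from statistics import median


def value_range(values):
    """Range: highest value minus lowest value."""
    return max(values) - min(values)


def iqr(values):
    """Interquartile range: Q3 - Q1, using medians of the two halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    q1 = median(ordered[:half])
    q3 = median(ordered[half + (n % 2):])  # skip the middle item when n is odd
    return q3 - q1


data = [1, 2, 3, 4, 5, 6, 7, 8]
with_extreme = [1, 2, 3, 4, 5, 6, 7, 800]  # same data, one extreme value

range_change = (value_range(data), value_range(with_extreme))
iqr_change = (iqr(data), iqr(with_extreme))
```

Replacing the largest value with 800 moves the range from 7 to 799, while the IQR stays at 4.0 because it depends only on the middle 50% of the data.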
Variance
Another common measure of dispersion, calculated with a formula. The sample variance is the sum of the squared differences between the n data values and the sample mean, divided by (n - 1). The population variance is the sum of the squared differences between the N data values and the population mean, divided by N, where N is the number of data values in the population (see the lecture for a worked example).
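The two variance formulas can be written out directly in Python. This is an illustrative sketch with made-up function names and data, not the lecture's worked example.

```python
def sample_variance(values):
    """Sum of squared differences from the sample mean, divided by (n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two data values")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


def population_variance(values):
    """Sum of squared differences from the population mean, divided by N."""
    big_n = len(values)
    if big_n == 0:
        raise ValueError("population variance needs at least one data value")
    mu = sum(values) / big_n
    return sum((x - mu) ** 2 for x in values) / big_n


data = [2, 4, 4, 4, 5, 5, 7, 9]   # mean is 5; squared differences sum to 32
as_population = population_variance(data)  # 32 / 8
as_sample = sample_variance(data)          # 32 / 7
```

The same eight values give a population variance of 4.0 and a slightly larger sample variance of 32/7, because the sample formula divides by n - 1 instead of N.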