CHAPTER 4 Flashcards
The difference between the actual value and the average value.
Dispersion
What are the MEASURES OF DISPERSION?
range
average deviation
variance
standard deviation
What are the MEASURES OF LOCATION?
Quartiles
Deciles
Percentiles
Midhinge
Interquartile range
Quartile Deviation
The difference of the highest value and the lowest value in the data set.
Range
It is the average of the absolute differences between each element and a given point (usually the mean).
Average Deviation
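As a rough illustration of the Average Deviation card, the sketch below averages the absolute differences from a chosen point. The function name `average_deviation`, the choice of the mean as the default point, and the sample data are assumptions made for this example.

```python
from statistics import mean

def average_deviation(values, center=None):
    """Mean of the absolute deviations from `center` (the mean by default)."""
    if center is None:
        center = mean(values)
    return mean(abs(x - center) for x in values)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data, mean = 5
print(average_deviation(data))    # 1.5
```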
It is a statistical term that provides a good indication of volatility.
It measures how widely values are dispersed from the average.
It is calculated as the square root of the variance.
Standard Deviation
It is the measure of risk
Volatility
It is used to approximate or to give a rough estimate of the standard deviation
Range Rule of Thumb
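A minimal sketch of the Range Rule of Thumb, assuming the common form s ≈ range / 4. The function name and data are illustrative, and the result is only a rough estimate of the standard deviation.

```python
def range_rule_of_thumb(values):
    """Rough estimate of the standard deviation: (max - min) / 4."""
    return (max(values) - min(values)) / 4

data = [2, 4, 4, 4, 5, 5, 7, 9]     # illustrative data, range = 7
print(range_rule_of_thumb(data))    # 1.75
```

For this data the population standard deviation is 2, so the estimate is in the right neighborhood but not exact.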
It is a mathematical expectation of the average squared deviations from the mean.
Variance
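The Variance and Standard Deviation cards can be sketched together. The version below assumes the population form (divide by n), matching "the average squared deviations from the mean"; the function names are hypothetical. For sample data, the standard library's `statistics.variance` and `statistics.stdev` divide by n - 1 instead.

```python
from math import sqrt
from statistics import mean

def population_variance(values):
    """Average of the squared deviations from the mean."""
    m = mean(values)
    return mean((x - m) ** 2 for x in values)

def population_sd(values):
    """Square root of the population variance."""
    return sqrt(population_variance(values))

data = [2, 4, 4, 4, 5, 5, 7, 9]     # illustrative data, mean = 5
print(population_variance(data))    # 4
print(population_sd(data))          # 2.0
```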
It is the mean of the first (Q1) and third (Q3) quartiles in the data set. It is used to overcome potential problems introduced by extreme values (or outliers) in the data set.
Midhinge
It is a measure of statistical dispersion, equal to the difference between the third and first quartiles.
Interquartile Range (IQR)
Interquartile Range is also called
midspread
middle fifty
It is a slightly better measure of absolute dispersion than the range.
It ignores the observations in the tails.
Quartile Deviation (QD)
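The quartile-based cards (Midhinge, Interquartile Range, Quartile Deviation) can be sketched in one helper. This assumes the usual definition QD = (Q3 - Q1) / 2 and uses the "inclusive" quartile method from Python's `statistics` module; textbooks use several quartile conventions, so values may differ slightly from hand calculations. The function name is hypothetical.

```python
from statistics import quantiles

def quartile_measures(values):
    """Midhinge, IQR and quartile deviation from Q1 and Q3."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "midhinge": (q1 + q3) / 2,       # mean of Q1 and Q3
        "iqr": iqr,                      # Q3 - Q1
        "quartile_deviation": iqr / 2,   # half of the IQR
    }

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]       # illustrative data: Q1 = 3, Q3 = 7
print(quartile_measures(data))
```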
If we draw different samples from a population and calculate their quartile deviations, the values are quite likely to differ considerably. This is called
sampling fluctuation
A quartile deviation calculated from sample data does not help us draw this kind of conclusion about the quartile deviation in the population.
inference
It is used to compare the standard deviations of variables measured in different units.
Coefficient of Variation (CV)
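A minimal sketch of the Coefficient of Variation, assuming the common definition CV = (standard deviation / mean) × 100% with the sample standard deviation. The function name and both data sets are made up for illustration.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100

heights_cm = [150, 160, 170]   # hypothetical data in centimeters
weights_kg = [10, 20, 30]      # hypothetical data in kilograms

# Both sets have a standard deviation of 10 in their own units,
# but the CV shows the weights vary far more relative to their mean.
print(coefficient_of_variation(heights_cm))
print(coefficient_of_variation(weights_kg))
```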
It is a statistical tool that bounds the dispersion in a data population: no more than 1/k² of the distribution’s values will be more than k standard deviations away from the mean.
Chebychev’s Theorem
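The 1/k² bound can be checked against a small data set. In this sketch the function names and data are assumptions; the bound is only informative for k > 1, since for k ≤ 1 it allows every value to lie outside.

```python
from statistics import mean, pstdev

def chebyshev_max_outside(k):
    """Largest fraction of values allowed beyond k standard deviations."""
    if k <= 0:
        raise ValueError("k must be positive")
    return 1 / k ** 2

def fraction_outside(values, k):
    """Observed fraction of values more than k standard deviations from the mean."""
    m = mean(values)
    s = pstdev(values)
    outside = [x for x in values if abs(x - m) > k * s]
    return len(outside) / len(values)

data = [2, 4, 4, 4, 5, 5, 7, 9]        # illustrative data: mean 5, sd 2
print(chebyshev_max_outside(1.5))      # upper bound from the theorem
print(fraction_outside(data, 1.5))     # observed fraction, never above the bound
```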
It is a statistical measure used to describe the distribution of observed data around the mean. It measures the relative peakedness or flatness of a distribution (as compared to the normal distribution, which shows an excess kurtosis of zero).
Kurtosis
Three Types of Kurtosis
Leptokurtic
Mesokurtic
Platykurtic
These are distributions where values cluster heavily or pile up in the center (k > 0).
Leptokurtic
These are intermediate distributions which are neither too peaked nor too flat (k = 0).
Mesokurtic
These are flat distributions with values more evenly distributed about the center, with broad humps and short tails (k < 0).
Platykurtic
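To make the three kurtosis types concrete, the sketch below computes a moment-based excess kurtosis (fourth central moment over the squared variance, minus 3) and labels the result. This particular formula, the function names, and the `tol` parameter are assumptions for illustration; textbooks may use a sample-adjusted formula instead.

```python
from statistics import mean

def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3 (zero for a normal curve)."""
    m = mean(values)
    m2 = mean((x - m) ** 2 for x in values)   # second central moment (variance)
    m4 = mean((x - m) ** 4 for x in values)   # fourth central moment
    return m4 / m2 ** 2 - 3

def classify_kurtosis(k, tol=0.0):
    """Label k as leptokurtic (k > 0), platykurtic (k < 0) or mesokurtic."""
    if k > tol:
        return "leptokurtic"
    if k < -tol:
        return "platykurtic"
    return "mesokurtic"

flat = [1, 2, 3, 4, 5]                        # evenly spread values
peaked = [0, 0, 0, 0, 0, 0, 0, 0, -5, 5]      # piled up in the center

print(classify_kurtosis(excess_kurtosis(flat)))     # platykurtic
print(classify_kurtosis(excess_kurtosis(peaked)))   # leptokurtic
```

Real data almost never gives exactly zero, so a small positive `tol` is a practical way to treat near-zero values as mesokurtic.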
Measures the general shape of the distribution or the lack of symmetry of a distribution.
Ranges from –3 to +3.
It relates the difference between the mean and the median to the standard deviation.
The long tail of the distribution points in the direction of the skewness.
Coefficient of Skewness
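A hedged sketch of the Coefficient of Skewness, assuming Pearson's median-based form, 3 × (mean - median) / standard deviation, which fits the card's description of relating the mean-median difference to the standard deviation. The function name and data are illustrative.

```python
from statistics import mean, median, stdev

def pearson_skewness(values):
    """Pearson's median-based coefficient: 3 * (mean - median) / sd."""
    return 3 * (mean(values) - median(values)) / stdev(values)

symmetric = [1, 2, 3, 4, 5]             # mean equals median
right_tail = [2, 4, 4, 4, 5, 5, 7, 9]   # mean 5 is above the median 4.5

print(pearson_skewness(symmetric))      # 0.0
print(pearson_skewness(right_tail))     # positive, so skewed to the right
```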
Data values are evenly distributed.
The distribution is unimodal.
The mean, median, and mode are similar & are at the center of the distribution.
Symmetrical
Most of the values in the data fall to the left of the mean and group at the lower end of the distribution.
The tail is to the right.
The mean is to the right of the median, and the mode is to the left of the median
Positively Skewed (or Right-Skewed).
The mass of the data values falls to the right of the mean and groups at the upper end of the distribution.
The tail is to the left.
The mean is to the left of the median, and the mode is to the right of the median.
Negatively Skewed (or Left-Skewed).
A data set should be checked for extremely high or extremely low values. These values are called outliers. Outliers can strongly affect the mean and standard deviation of a variable. One method of identifying outliers is to flag any data value that is less than Q1 – 1.5(IQR) or greater than Q3 + 1.5(IQR).
Outliers
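The 1.5(IQR) rule can be sketched as follows. The function name and data are hypothetical, and the quartiles come from the "inclusive" method in Python's `statistics` module, which may differ slightly from a textbook's hand method.

```python
from statistics import quantiles

def find_outliers(values):
    """Values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in values if x < lower_fence or x > upper_fence]

data = [1, 2, 3, 4, 5, 6, 7, 8, 50]   # illustrative data: Q1 = 3, Q3 = 7
print(find_outliers(data))            # [50]
```

Here the fences are 3 - 6 = -3 and 7 + 6 = 13, so only 50 is flagged.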
Outliers can strongly affect the mean and standard deviation of a variable.
true or false
true
Introduced by John Tukey in the 1970s.
It gives the following information:
If the median is near the center of the box, the distribution is approximately symmetric.
If the lines are about the same length, the distribution is approximately symmetric.
If the median falls to the left of the center of the box, the distribution is positively skewed.
If the median falls to the right of the center of the box, the distribution is negatively skewed.
Boxplot (Box-and-Whisker plot)
If the left line is larger than the right line, the distribution is
negatively skewed
If the right line is larger than the left line, the distribution is
positively skewed.
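The whisker rule from the boxplot cards can be sketched numerically. This is a simplification that assumes the lines run all the way to the minimum and maximum (many boxplots stop the lines at the outlier fences instead); the function name and data are illustrative.

```python
from statistics import quantiles

def whisker_skew(values):
    """Compare the left line (Q1 - min) with the right line (max - Q3)."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    left_line = q1 - min(values)
    right_line = max(values) - q3
    if right_line > left_line:
        return "positively skewed"
    if left_line > right_line:
        return "negatively skewed"
    return "approximately symmetric"

print(whisker_skew([1, 2, 3, 4, 5, 6, 7, 8, 9]))    # approximately symmetric
print(whisker_skew([1, 2, 3, 4, 5, 6, 7, 8, 50]))   # positively skewed
```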