Chapter 4 - QMB2100 Flashcards
What is a dot plot?
It is a type of graph that summarizes the distribution of one variable by stacking dots at points on a number line that shows the values of the variable.
What is one advantage of the dot plot?
It groups the data as little as possible and doesn’t lose the identity of an individual observation.
When are dot plots more useful than histograms?
They are more useful for smaller data sets.
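As a rough illustration of how a dot plot stacks one mark per observation, here is a minimal text-only sketch in Python. The function name `text_dot_plot` and the sample scores are hypothetical, the sketch assumes integer values, and it stacks the marks sideways rather than upward to keep the code short.

```python
from collections import Counter

def text_dot_plot(values):
    """Return a sideways text dot plot: one row per integer value, one '*' per observation."""
    counts = Counter(values)
    rows = []
    # Walk the whole number line from the smallest to the largest value,
    # so values with no observations still get an (empty) row.
    for value in range(min(values), max(values) + 1):
        rows.append(f"{value:>3} | " + "*" * counts[value])
    return "\n".join(rows)

# Hypothetical quiz scores, invented for illustration.
scores = [7, 8, 8, 9, 9, 9, 10, 12]
print(text_dot_plot(scores))
```

Because every observation keeps its own mark, no individual value is hidden inside a class interval, which is the advantage noted above.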
What are quartiles, deciles, and percentiles?
They are measures of position.
What is a quartile?
Values that divide an ordered data set into four equal parts.
What are deciles?
Values that divide an ordered data set into ten equal parts.
What are percentiles?
Values that divide an ordered data set into one hundred equal parts.
What is the formula for location of a percentile?
Lp = (n + 1)(P/100); where n is the number of observations and P is the desired percentile.
What are the two methods to find the location of a percentile?
Exclusive method and inclusive method.
Describe the formula for the exclusive method.
Lp = (n + 1)(P/100); where n is the number of observations and P is the desired percentile.
Describe the formula for the inclusive method.
Lp = n(P/100) + 1 - (P/100), which simplifies to (n - 1)(P/100) + 1; where n is the number of observations and P is the desired percentile.
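The two location formulas can be compared side by side with a short Python sketch. The function names and the nine-value data set are hypothetical. Reading a fractional location by linear interpolation between the two neighboring observations, and clamping locations that fall outside the data, are assumptions of this sketch rather than rules stated on the cards.

```python
import math

def percentile_location(n, p, method="exclusive"):
    """Return the 1-based (possibly fractional) location of the p-th percentile."""
    if method == "exclusive":
        return (n + 1) * p / 100
    if method == "inclusive":
        return n * p / 100 + 1 - p / 100
    raise ValueError("method must be 'exclusive' or 'inclusive'")

def value_at_location(sorted_values, location):
    """Read the value at a 1-based location, interpolating between neighbors."""
    lower = math.floor(location)
    fraction = location - lower
    if lower < 1:                      # assumption: clamp to the smallest value
        return sorted_values[0]
    if lower >= len(sorted_values):    # assumption: clamp to the largest value
        return sorted_values[-1]
    below = sorted_values[lower - 1]
    above = sorted_values[lower]
    return below + fraction * (above - below)

# Hypothetical ordered data set with n = 9 observations.
data = [12, 15, 18, 20, 22, 25, 27, 30, 35]
for method in ("exclusive", "inclusive"):
    loc = percentile_location(len(data), 25, method)
    print(method, loc, value_at_location(data, loc))
```

For this data the exclusive method places the first quartile at location 2.5 (value 16.5), while the inclusive method places it at location 3 (value 18), so the two methods can give different quartiles for the same data.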
What is a box plot?
A graphic display that shows the general shape of a variable’s distribution. It is based on five descriptive statistics: the maximum, minimum, first and third quartiles and the median.
What is the interquartile range?
The distance between the first and third quartiles (Q3 - Q1); the middle 50% of a distribution’s values are located within this range.
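The five statistics behind a box plot, plus the interquartile range, can be computed with Python's standard `statistics` module. This is only a sketch: the helper name `five_number_summary` and the data are hypothetical, and the quartiles here use the `"exclusive"` option of `statistics.quantiles`, so a different quartile method may give slightly different values.

```python
import statistics

def five_number_summary(values):
    """Return the minimum, Q1, median, Q3, and maximum of a data set."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return {"min": min(values), "q1": q1, "median": median, "q3": q3, "max": max(values)}

# Hypothetical data set, invented for illustration.
data = [12, 15, 18, 20, 22, 25, 27, 30, 35]
summary = five_number_summary(data)
iqr = summary["q3"] - summary["q1"]   # interquartile range = Q3 - Q1
print(summary, "IQR =", iqr)
```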
What are outliers?
Data points that are unusually far from the others. A commonly accepted rule classifies an observation as an outlier if it lies more than 1.5 times the interquartile range above the third quartile or below the first quartile.
What is the formula for upper outlier boundary?
UOB = Q3 + 1.5(Q3 - Q1)
What is the formula for lower outlier boundary?
LOB = Q1 - 1.5(Q3 - Q1)
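A small Python sketch of the two boundary formulas, applied to a made-up data set. The function names are hypothetical, and the quartiles are passed in as given values (here Q1 = 16.5 and Q3 = 28.5, which is what the exclusive location formula yields for this data).

```python
def outlier_boundaries(q1, q3):
    """Return (lower, upper) outlier boundaries from the first and third quartiles."""
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr   # LOB = Q1 - 1.5(Q3 - Q1)
    upper = q3 + 1.5 * iqr   # UOB = Q3 + 1.5(Q3 - Q1)
    return lower, upper

def find_outliers(values, q1, q3):
    """Return the observations that fall outside the outlier boundaries."""
    lower, upper = outlier_boundaries(q1, q3)
    return [v for v in values if v < lower or v > upper]

# Hypothetical data; the last value is deliberately extreme.
data = [12, 15, 18, 20, 22, 25, 27, 30, 60]
print(outlier_boundaries(16.5, 28.5))
print(find_outliers(data, 16.5, 28.5))
```

With an interquartile range of 12, the boundaries are -1.5 and 46.5, so only the value 60 is flagged.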
What is skewness?
A measure of the lack of symmetry in a distribution, and another characteristic of its shape. Four shapes are commonly observed: symmetric, positively skewed, negatively skewed, and bimodal.
What are the 2 ways to calculate skewness?
Pearson’s coefficient of skewness and software coefficient of skewness.
What is the formula for Pearson’s coefficient of skewness?
Sk = [3(x̄-median)] / s; where s is the standard deviation and x̄ is the mean. The coefficient can only range from -3 to 3.
What is the formula for software coefficient of skewness?
Sk = n/((n-1)(n-2)) [ Σ ((x - x̄)/s)^3]; where x̄ is the mean, s is the standard deviation and n is the number of observations.
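Both skewness formulas can be tried out with a few lines of Python. This is a sketch under stated assumptions: the function names and data sets are hypothetical, and s is taken to be the sample standard deviation (`statistics.stdev`).

```python
import statistics

def pearson_skewness(values):
    """Pearson's coefficient: 3(mean - median) / s."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    s = statistics.stdev(values)          # sample standard deviation
    return 3 * (mean - median) / s

def software_skewness(values):
    """Software coefficient: n/((n-1)(n-2)) * sum(((x - mean)/s)^3)."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    cubed = sum(((x - mean) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed

# Hypothetical data sets: one symmetric, one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 3, 4, 10]
print(pearson_skewness(symmetric), software_skewness(symmetric))
print(pearson_skewness(right_tailed), software_skewness(right_tailed))
```

The symmetric set gives coefficients of essentially zero, while the set with one large value gives positive coefficients from both formulas, matching the idea of positive skew.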
What is univariate?
Describes the study of a single variable.
What is bivariate?
Describes the study of the relationship between two variables.
What is a scatter diagram?
A graphical technique used to show the relationship between two variables measured on interval or ratio scales.
What is the correlation coefficient?
A statistic that measures the direction and strength of the linear relationship between two variables. It ranges from -1 to 1; the closer the coefficient is to -1 or 1, the stronger the correlation, and a value of 0 means no linear correlation.
What is the formula for the correlation coefficient?
r = Σ [(x - x̄)(y - ȳ)] / [(n - 1) Sx Sy]; where x̄ is the mean of the x values, ȳ is the mean of the y values, n is the number of paired observations, Sx is the standard deviation of x, and Sy is the standard deviation of y.
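The correlation formula translates almost term by term into Python. This is a minimal sketch: the function name and the paired data are hypothetical, and Sx and Sy are assumed to be sample standard deviations (`statistics.stdev`).

```python
import statistics

def correlation_coefficient(xs, ys):
    """r = sum((x - x_bar)(y - y_bar)) / ((n - 1) * Sx * Sy)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must contain the same number of observations")
    n = len(xs)
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    sx = statistics.stdev(xs)   # sample standard deviation of x
    sy = statistics.stdev(ys)   # sample standard deviation of y
    cross_products = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross_products / ((n - 1) * sx * sy)

# Hypothetical paired observations, invented for illustration.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 60, 61, 70, 78]
print(correlation_coefficient(hours_studied, exam_score))
```

A perfectly linear increasing pair such as x = 1, 2, 3 and y = 2, 4, 6 gives r of 1 (up to floating-point rounding), and reversing y gives -1.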
What are contingency tables?
A table used to classify sample observations according to two identifiable characteristics.
What is the contingency table used for?
It is used to study the relationship between two variables when one or both are measured on a nominal or ordinal scale.
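To make the idea concrete, here is a small Python sketch that tallies paired observations into a contingency table. The function name, the two characteristics (shift and inspection result), and the observations are all hypothetical.

```python
from collections import Counter

def contingency_table(pairs):
    """Count how often each (row category, column category) pair occurs."""
    counts = Counter(pairs)
    row_labels = sorted({row for row, _ in pairs})
    col_labels = sorted({col for _, col in pairs})
    # Counter returns 0 for combinations that never occur.
    return {row: {col: counts[(row, col)] for col in col_labels} for row in row_labels}

# Hypothetical sample: each observation is classified by shift and inspection result.
observations = [
    ("day", "ok"), ("day", "defect"), ("night", "ok"),
    ("day", "ok"), ("night", "defect"), ("night", "defect"),
]
table = contingency_table(observations)
for shift, row in table.items():
    print(shift, row)
```

Each cell holds the number of observations sharing both characteristics, which is what lets two nominal or ordinal variables be compared.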