(7) Statistical Concepts and Market Returns Flashcards
LOS 8. a: Distinguish between descriptive statistics and inferential statistics, between a population and a sample, and among the types of measurement scales.
Descriptive statistics summarize the characteristics of a data set;
inferential statistics are used to make probabilistic statements about a population based on a sample.
LOS 8. a: Distinguish between descriptive statistics and inferential statistics, between a population and a sample, and among the types of measurement scales.
A population includes all members of a specified group,
while a sample is a subset of the population used to draw inference about the population
LOS 8. a: Distinguish between descriptive statistics and inferential statistics, between a population and a sample, and among the types of measurement scales.
Data may be measured using different scales:
Nominal scale – data is put into categories that have no particular order
Ordinal scale – data is put into categories that can be ordered with respect to some characteristic.
Interval scale – differences in data values are meaningful, but ratios, such as twice as much or twice as large, are not meaningful
Ratio scale – ratios of values, such as twice as much or half as large, are meaningful, and zero represents the complete absence of the characteristic being measured.
LOS 8. b: Define a parameter, a sample statistic, and a frequency distribution.
Any measurable characteristic of a population is called a parameter
A characteristic of a sample is given by a sample statistic
A frequency distribution groups observation into classes, or intervals. An interval is a range of values.
LOS 8. c. Calculate and interpret relative frequencies and cumulative relative frequencies, given a frequency distribution.
Relative frequency is the percentage of total observations falling within an interval.
Cumulative relative frequency for an interval is the sum of the relative frequencies for all values les than or equal to the interval’s maximum value.
LOS 8. d: Describe the properties of a data set presented as a histogram or a frequency polygon.
A histogram is a bar chart of data that has been grouped into a frequency distribution.
A frequency polygon plots the midpoint of each interval on the horizontal axis and the absolute frequency for that interval on the vertical axis, and connects the midpoints with straight lines.
The advantage of histograms and frequency polygons is that we can quickly see where most of the observations lie.
LOS 8. e: Calculate and interpret measures of central tendency, including the population mean, sample mean, arithmetic mean, weighted average or mean, geometric mean, harmonic mean, median and mode.
The arithmetic mean is the average. Population mean and sample mean are examples of arithmetic means.
The geometric mean is used to find a compound growth rate.
The weighted mean weights each value according to its influence
The harmonic mean can be used to find an average purchase price, such as dollars per share for eual periodic investments.
The median is the midpoint of a data set when the data is arranged from largest to smallest.
The mode of a data set is the value that occurs most frequently.
LOS 8. f: Calculate and interpret quartiles, quintiles, deciles, and percentiles.
Quantile is the general term for a value at or below which is stated proportion of the data in a distribution lies. Examples of quantiles include:
Quartiles – the distribution is divided into quarters.
Quintile – the distribution is divided into fifths.
Decile – the distribution is divided into tenths.
Percentile – the distribution is divided into hundredths (percent).
LOS 8. g: Calculate and interpret 1) a range and a mean absolute deviation and 2) the variance and standard deviation of a population and of a sample.
The range is the difference between the largest and smallest values in a data set.
Mean absolute deviation (MAD) is the average of the absolute values of the deviations from the arithmetic mean:
Variance is defined as the mean of the squared deviations from the arithmetic mean or from the expected value of a distribution.
Standard deviation is the positive square root of a variance and is frequently used as a quantitative measure of risk.
Biased estimator (give example)
In statistics, the bias (or bias function) of an estimator is the difference between this estimator’s expected value and the true value of the parameter being estimated. An estimator or decision rule with zero bias is called unbiased. Otherwise the estimator is said to be biased. In statistics, “bias” is an objective property of an estimator, and while not a desired property, it is not pejorative, unlike the ordinary English use of the term “bias”.
When we divide by (n −1) when calculating the sample variance, then it turns out that the average of the sample variances for all possible samples is equal the population variance. … Dividing by n does not give an “unbiased” estimate of the population standard deviation.
LOS 8. h: Calculate and interpret the proportion of observations falling within a specified number of standard deviations of the mean using Chebyshev’s inequality.
Chebyshev’s inequality states that the proportion of the observations within k standard deviations of the mean is at least 1 – 1/k^2 for all k > 1. It states that for any distribution, at least.
36% of observations lie within +/- 1.25 standard deviations of the mean.
56% of observations lie within +/- 1.5 standard deviations of the mean.
75% of observations lie within +/- 2 standard deviations of the mean.
89% of observations lie within +/- 3 standard deviations of the mean.
94% of observations lie within +/- 4 standard deviations of the mean.
LOS 8. i: Calculate and interpret the coefficient of variation and the Sharpe ratio.
The Coefficient of variation for sample data, CV = s / Xbar, is the ratio of the standard deviation of the sample to its mean (expected value of the underlying distribution).
The Sharpe ratio measures excess return per unit of risk:
Relative dispersion
The relative dispersion of a data set, more commonly referred to as its coefficient of variation, is the ratio of its standard deviation to its arithmetic mean. In effect, it is a measurement of the degree by which an observed variable deviates from its average value. It is a useful measurement in applications such as comparing stocks and other investment vehicles because it is a way to determine the risk involved with the holdings in your portfolio.
LOS 8. j: Explain skewness and the meaning of the positively or negatively skewed return distribution.
Skewness describes the degree to which a distribution is not symmetric about its mean. A right-skewed distribution has positive skewness. A left-skewed distribution has negative skewness.
Sample skew with an absolute value greater than 0.5 is considered significantly different from zero.
LOS 8. k: Describe the relative locations of the mean, median, and mode for a unimodal, nonsymmetrical distribution.
For a positively skewed, unimodal distribution, the mean is greater than the median, which is greater than the mode.
For a negatively skewed, unimodal distribution, the mean is less than the median, which is less than the mode.