Statistical Concepts and Market Returns Flashcards
1. Distinguish between descriptive statistics and inferential statistics, between a population and a sample, and among the types of measurement scales. 2. Define a parameter, a sample statistic, and a frequency distribution. 3. Calculate and interpret relative frequencies and cumulative relative frequencies, given a frequency distribution. 4. Describe the properties of a data set presented as a histogram or a frequency polygon. 5. Calculate and interpret measures of central tendency,
Measures of Central Tendency
Measures of central tendency provide an indication of an investment’s expected return.
1. arithmetic mean 2. geometric mean 3. weighted mean 4. median 5. mode
Measures of central tendency identify the center, or average, of a data set.
Measures of Dispersion
Measures of dispersion indicate the riskiness of an investment.
1. range 2. mean absolute deviation 3. variance
Two Categories of statistics
1. descriptive statistics 2. inferential statistics
Descriptive statistics
Descriptive statistics are used to summarize the important characteristics of large data sets.
Inferential statistics
Inferential statistics pertain to the procedures used to make forecasts, estimates, or judgments about a large set of data on the basis of the statistical characteristics of a smaller set (a sample).
Population
The set of all possible members of a stated group. A cross section of the returns of all of the stocks traded on the New York Stock Exchange (NYSE) is an example of a population.
Sample
A sample is a subset of the population of interest.
Types of Measurement Scales
1. Nominal scales 2. Ordinal scales 3. Interval scales 4. Ratio scales
Nominal scales
Nominal scales are the level of measurement that contains the least information. Observations are classified or counted with no particular order. An example would be assigning the number 1 to a municipal bond fund, the number 2 to a corporate bond fund, and so on for each fund style.
Ordinal scales
Ordinal scales represent a higher level of measurement than nominal scales. When working with an ordinal scale, every observation is assigned to one of several categories. Then these categories are ordered with respect to a specified characteristic. For example, the ranking of 1,000 small cap growth stocks by performance may be done by assigning the number 1 to the 100 best performing stocks, the number 2 to the next 100 best performing stocks, and so on, assigning the number 10 to the 100 worst performing stocks. Based on this type of measurement, it can be concluded that a stock ranked 3 is better than a stock ranked 4, but the scale reveals nothing about performance differences or whether the difference between a 3 and a 4 is the same as the difference between a 4 and a 5.
Interval scale
Interval scale measurements provide relative ranking, like ordinal scales, plus the assurance that differences between scale values are equal. Temperature measurement in degrees is a prime example. Certainly, 49°C is hotter than 32°C, and the temperature difference between 49°C and 32°C is the same as the difference between 67°C and 50°C. The weakness of the interval scale is that a measurement of zero does not necessarily indicate the total absence of what we are measuring. This means that interval-scale-based ratios are meaningless. For example, 30°F is not three times as hot as 10°F.
Ratio scales
Ratio scales represent the most refined level of measurement. Ratio scales provide ranking and equal differences between scale values, and they also have a true zero point as the origin. Order, intervals, and ratios all make sense with a ratio scale. The measurement of money is a good example. If you have zero dollars, you have no purchasing power, but if you have $4.00, you have twice as much purchasing power as a person with $2.00.
frequency distribution
A frequency distribution groups observations into classes, or intervals. An interval is a range of values. Frequency distributions summarize statistical data by assigning it to specified groups, or intervals. Also, the data employed with a frequency distribution may be measured using any type of measurement scale.
How to construct a frequency distribution
- Define the intervals.
- Tally the observations.
- Count the observations.
Modal interval
For any frequency distribution, the interval with the greatest frequency is referred to as the modal interval.
Relative Frequency
Relative frequency is the percentage of total observations falling within an interval. It is calculated by dividing the absolute frequency of each return interval by the total number of observations.
cumulative absolute frequency
Cumulative absolute frequency is calculated by adding the absolute frequencies of all intervals at or below that point.
cumulative relative frequency
Cumulative relative frequency for an interval is the sum of the relative frequencies for all values less than or equal to that interval’s maximum value. Equivalently, it is the running sum of the relative frequency percentages.
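The frequency cards above can be tied together with a short sketch. It assumes equal-width intervals whose upper bound is exclusive, and the return figures are made up for illustration; the function name `frequency_table` is hypothetical.

```python
from collections import Counter


def frequency_table(returns, lower, width, n_intervals):
    """Build a frequency distribution over equal-width intervals.

    Interval i covers [lower + i*width, lower + (i+1)*width), upper bound exclusive
    (an assumption of this sketch). Each row is:
    (low, high, absolute, relative, cumulative absolute, cumulative relative).
    """
    counts = Counter()
    for r in returns:
        idx = int((r - lower) // width)  # which interval the observation falls in
        if 0 <= idx < n_intervals:
            counts[idx] += 1             # tally the observation

    total = sum(counts.values())
    rows, running = [], 0
    for i in range(n_intervals):
        absolute = counts[i]
        running += absolute              # cumulative absolute frequency
        rows.append((
            lower + i * width,
            lower + (i + 1) * width,
            absolute,
            absolute / total,            # relative frequency
            running,
            running / total,             # cumulative relative frequency
        ))
    return rows


# Hypothetical annual returns in percent (illustrative data only).
sample_returns = [-12, -4, 3, 5, 7, 8, 11, 14, 18, 27]
table = frequency_table(sample_returns, lower=-20, width=10, n_intervals=5)

# The modal interval is the row with the greatest absolute frequency.
modal_row = max(table, key=lambda row: row[2])

for low, high, absolute, relative, cum_abs, cum_rel in table:
    print(f"{low:>4} to {high:>3}: abs={absolute} rel={relative:.0%} "
          f"cum_abs={cum_abs} cum_rel={cum_rel:.0%}")
```

In this made-up data set the modal interval is 0 to 10, and the cumulative relative frequency of the last interval is 100%.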
Histogram
A histogram is the graphical presentation of the absolute frequency distribution.
Population Mean Formula
Sum all observed values in the population and divide by the number of observations in the population.
Sample Mean Formula
Sum all observed values in the sample and divide by the number of observations in the sample.
mode
The mode is the value that occurs most frequently in a data set.
Harmonic Mean
A harmonic mean is used for certain computations, such as the average cost of shares purchased over time
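A minimal sketch of the average-cost use case, assuming the same dollar amount is invested at each purchase date. The prices and the amount are hypothetical; `statistics.harmonic_mean` is from the Python standard library.

```python
from statistics import harmonic_mean

# Hypothetical share prices on three purchase dates (illustrative only).
prices = [8.0, 9.0, 10.0]
amount_per_purchase = 1000.0  # assumed: equal dollars invested each time

# Direct calculation: total dollars spent divided by total shares bought.
total_shares = sum(amount_per_purchase / p for p in prices)
average_cost = amount_per_purchase * len(prices) / total_shares

# Harmonic mean of the prices: n / (1/p1 + 1/p2 + ... + 1/pn).
hm = harmonic_mean(prices)

print(f"average cost per share: {average_cost:.4f}")
print(f"harmonic mean of prices: {hm:.4f}")
```

The two figures agree, and both sit below the arithmetic mean of the prices because more shares are bought when the price is low.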
Geometric Mean
The geometric mean is often used when calculating investment returns over multiple periods or when measuring compound growth rates.
G = [(1 + X1) * (1 + X2) * ... * (1 + Xn)]^(1/n) - 1, where X1 ... Xn are the periodic returns
Properties of arithmetic mean
All interval and ratio data sets have an arithmetic mean.
All data values are considered and included in the arithmetic mean computation.
A data set has only one arithmetic mean (i.e., the arithmetic mean is unique).
The sum of the deviations of each observation in the data set from the mean is always zero.
arithmetic mean
The arithmetic mean is the sum of the observation values divided by the number of observations.
Quantile
Quantile is the general term for a value at or below which a stated proportion of the data in a distribution lies.
Quartile
Quartiles- the distribution is divided into quarters.
Quintile
Quintile- the distribution is divided into fifths.
Decile
Decile- the distribution is divided into tenths.
Percentile
Percentile- the distribution is divided into hundredths (percents)
Formula for the position of any observation at any given percentile
Ly = (n + 1) * y / 100
where n = number of observations and y = percentile
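A sketch of the position formula in use. The data must be sorted in ascending order first. Linear interpolation between neighboring observations when Ly is not a whole number is an assumption of this sketch (a common convention, not stated on the card), and the function names and data are hypothetical.

```python
def percentile_position(n, y):
    """Position Ly = (n + 1) * y / 100 in a sorted data set of n observations."""
    return (n + 1) * y / 100


def percentile_value(data, y):
    """Value at the y-th percentile, interpolating linearly when the
    position falls between two observations (assumed convention)."""
    ordered = sorted(data)
    pos = percentile_position(len(ordered), y)
    whole = int(pos)          # 1-based position of the lower neighbor
    frac = pos - whole        # distance toward the next observation
    if whole < 1:
        return ordered[0]
    if whole >= len(ordered):
        return ordered[-1]
    low, high = ordered[whole - 1], ordered[whole]
    return low + frac * (high - low)


# Hypothetical returns in percent (illustrative only), 11 observations.
returns = [8, 10, 12, 13, 15, 17, 17, 18, 19, 23, 24]

third_quartile_pos = percentile_position(len(returns), 75)
third_quartile = percentile_value(returns, 75)
median_of_four = percentile_value([1, 2, 3, 4], 50)
```

With 11 observations the third quartile sits at position 9, so it is the ninth value in the sorted list. With four observations the 50th percentile sits at position 2.5, halfway between the second and third values.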
Dispersion
Dispersion is defined as the variability around the central tendency.
Range
range = maximum value - minimum value
Mean Absolute Deviation (MAD)
The mean absolute deviation (MAD) is the average of the absolute values of the deviations of individual observations from the arithmetic mean.
Population Variance
Population variance is defined as the average of the squared deviations from the mean.
Population Standard Deviation
The population standard deviation is the square root of the population variance
Sample Variance
The sample variance is the measure of dispersion that applies when we are evaluating a sample of n observations from a population. It is the sum of the squared deviations from the sample mean divided by n - 1 rather than n.
Sample Standard Deviation
The sample standard deviation is calculated by taking the square root of the sample variance.
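The dispersion measures above can be compared on one small data set. This is a sketch with hypothetical return figures; `pvariance`, `pstdev`, `variance`, and `stdev` are from the Python standard library `statistics` module.

```python
from statistics import mean, pstdev, pvariance, stdev, variance

# Hypothetical returns in percent (illustrative only).
returns = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

avg = mean(returns)

# Range: maximum value minus minimum value.
data_range = max(returns) - min(returns)

# Mean absolute deviation: average absolute distance from the arithmetic mean.
mad = sum(abs(r - avg) for r in returns) / len(returns)

# Population measures divide the squared deviations by n.
pop_var = pvariance(returns)
pop_sd = pstdev(returns)

# Sample measures divide the squared deviations by n - 1.
samp_var = variance(returns)
samp_sd = stdev(returns)

print(f"range={data_range} MAD={mad}")
print(f"population variance={pop_var} sd={pop_sd}")
print(f"sample variance={samp_var:.4f} sd={samp_sd:.4f}")
```

For the same data, the sample variance is larger than the population variance because the divisor is n - 1 instead of n.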
Chebyshev’s inequality
Chebyshev’s inequality states that for any set of observations, whether sample or population data and regardless of the shape of the distribution, the percentage of the observations that lie within k standard deviations of the mean is at least 1 - 1/k^2 for all k > 1.
Percentage of observation within standard deviations of the mean
According to Chebyshev’s inequality, the following relationships hold for any distribution. At least:
36% of observations lie within ±1.25 standard deviations of the mean.
56% of observations lie within ±1.50 standard deviations of the mean.
75% of observations lie within ±2 standard deviations of the mean.
89% of observations lie within ±3 standard deviations of the mean.
94% of observations lie within ±4 standard deviations of the mean.
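The percentages in this list follow directly from the bound 1 - 1/k^2. A minimal sketch, with a hypothetical function name:

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations
    of the mean, valid for k > 1 and any distribution."""
    if k <= 1:
        raise ValueError("Chebyshev's inequality is informative only for k > 1")
    return 1 - 1 / k**2


bounds = {k: chebyshev_lower_bound(k) for k in (1.25, 1.5, 2, 3, 4)}

for k, bound in bounds.items():
    print(f"k = {k}: at least {bound:.0%} of observations")
```

These are minimums. A specific distribution can have far more of its observations inside each band.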
Relative dispersion
Relative dispersion is the amount of variability in a distribution relative to a reference point or benchmark. Relative dispersion is commonly measured with the coefficient of variation.
Coefficient of variation
Coefficient of variation (CV) = standard deviation of X / average value of X
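A sketch of the coefficient of variation for two hypothetical funds. Using the sample standard deviation is an assumption of this example, and the return figures are made up.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Standard deviation divided by the average value.
    The sample standard deviation is used here (an assumption)."""
    return stdev(values) / mean(values)


# Hypothetical monthly returns in percent (illustrative only).
fund_a = [1.0, 2.0, 3.0]
fund_b = [4.0, 5.0, 6.0]

cv_a = coefficient_of_variation(fund_a)
cv_b = coefficient_of_variation(fund_b)

print(f"fund A: CV = {cv_a:.2f}")
print(f"fund B: CV = {cv_b:.2f}")
```

Both funds have the same standard deviation, but fund B has less dispersion per unit of average return, so its CV is lower.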
Sharpe Ratio
(Portfolio return - Risk-free rate of return) / Standard deviation of portfolio returns
Limitations of the Sharpe ratio
(1) If two portfolios have negative Sharpe ratios, it is not necessarily true that the higher Sharpe ratio implies superior risk-adjusted performance. Increasing risk moves a negative Sharpe ratio closer to zero (i.e., higher). (2) The Sharpe ratio is useful when standard deviation is an appropriate measure of risk. However, investment strategies with option characteristics have asymmetric return distributions, reflecting a large probability of small gains coupled with a small probability of large losses. In such cases, standard deviation may underestimate risk and produce Sharpe ratios that are too high.
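Limitation (1) is easy to see numerically. The sketch below takes the Sharpe ratio as excess return over the risk-free rate divided by the standard deviation of portfolio returns; all input figures are hypothetical.

```python
def sharpe_ratio(portfolio_return, risk_free_rate, stdev_returns):
    """Excess return per unit of total risk."""
    return (portfolio_return - risk_free_rate) / stdev_returns


# A portfolio with a positive excess return (hypothetical numbers).
positive = sharpe_ratio(0.10, 0.03, 0.14)

# Two portfolios with the same negative excess return of -2%.
low_risk = sharpe_ratio(0.01, 0.03, 0.10)
high_risk = sharpe_ratio(0.01, 0.03, 0.20)

print(f"positive excess return: {positive:.2f}")
print(f"negative, 10% stdev:    {low_risk:.2f}")
print(f"negative, 20% stdev:    {high_risk:.2f}")
```

The riskier portfolio ends up with the higher (less negative) Sharpe ratio even though it earned the same return with twice the volatility, which is why negative Sharpe ratios cannot be ranked in the usual way.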
Skewness affects the location of the mean, median, and mode of a distribution.
Positively skewed, unimodal distribution: mode < median < mean.
Negatively skewed, unimodal distribution: mean < median < mode.
The key to remembering how measures of central tendency are affected by skewed data is to recognize that skew affects the mean more than the median and mode, and the mean is “pulled” in the direction of the skew.
Kurtosis
Kurtosis is a measure of the degree to which a distribution is more or less “peaked” than a normal distribution.
Leptokurtic
Leptokurtic describes a distribution that is more peaked than a normal distribution.
Platykurtic
Platykurtic refers to a distribution that is less peaked, or flatter than a normal distribution.
mesokurtic
A distribution is mesokurtic if it has the same kurtosis as a normal distribution.
Sample Skewness
Sample skewness is equal to the sum of the cubed deviations from the mean, divided by the cubed standard deviation and by the number of observations.
Sample kurtosis
Sample kurtosis is equal to the sum of the deviations from the mean raised to the fourth power, divided by the standard deviation raised to the fourth power and by the number of observations.
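The two definitions above translate almost word for word into code. This sketch assumes "the standard deviation" means the sample standard deviation; the function names and data are hypothetical. A normal distribution has a kurtosis of 3, so values above 3 indicate a leptokurtic distribution and values below 3 a platykurtic one.

```python
from statistics import mean, median, stdev


def sample_skewness(values):
    """Sum of cubed deviations / (n * s^3), with s the sample standard deviation."""
    n, avg, s = len(values), mean(values), stdev(values)
    return sum((x - avg) ** 3 for x in values) / (n * s**3)


def sample_kurtosis(values):
    """Sum of fourth-power deviations / (n * s^4), with s the sample standard deviation."""
    n, avg, s = len(values), mean(values), stdev(values)
    return sum((x - avg) ** 4 for x in values) / (n * s**4)


# Hypothetical data sets (illustrative only).
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]

skew_symmetric = sample_skewness(symmetric)
skew_right = sample_skewness(right_skewed)
kurt_symmetric = sample_kurtosis(symmetric)

print(f"symmetric:    skewness={skew_symmetric:.3f} kurtosis={kurt_symmetric:.3f}")
print(f"right-skewed: skewness={skew_right:.3f} "
      f"mean={mean(right_skewed):.2f} median={median(right_skewed)}")
```

The right-skewed set has positive skewness and a mean above its median, which matches the ordering for positively skewed distributions.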
Geometric mean return
= [(1 + R1) * (1 + R2) * ... * (1 + Rn)]^(1/n) - 1, where R1 ... Rn are the annual returns
Arithmetic mean return
= (R1 + R2 + ... + Rn) / n, where R1 ... Rn are the annual returns
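A short sketch contrasting the two averages, taking the arithmetic mean return as the simple average of the annual returns and the geometric mean return as the compound annual rate. The return series is hypothetical; `math.prod` requires Python 3.8 or later.

```python
from math import prod


def arithmetic_mean_return(returns):
    """Simple average of the periodic returns."""
    return sum(returns) / len(returns)


def geometric_mean_return(returns):
    """Compound rate: n-th root of the product of (1 + R), minus 1."""
    growth = prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1


# Hypothetical annual returns: +100% in year 1, then -50% in year 2.
annual_returns = [1.00, -0.50]

arithmetic = arithmetic_mean_return(annual_returns)
geometric = geometric_mean_return(annual_returns)

print(f"arithmetic mean return: {arithmetic:.2%}")
print(f"geometric mean return:  {geometric:.2%}")
```

An investment that doubles and then halves ends where it started, so the geometric mean return is zero while the arithmetic mean return is 25%. The geometric mean is never greater than the arithmetic mean for the same series.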
Parameter
Any measurable characteristic of a population