Statistical Concepts and Market Returns Flashcards
Statistics used to summarize important characteristics of large data sets
Descriptive Statistics
The procedures used to make forecasts, estimates, or judgments about a large data set on the basis of the statistical characteristics of a smaller set (a sample).
Inferential Statistics
The set of all possible members of a stated group
Population
A subset of the population of interest
Sample
A measurement scale that contains the least information. Observations are classified or counted with no particular order. (e.g., a binary category)
Nominal Scale
A measurement scale that has every observation assigned to one of several categories. The categories are then ordered with respect to a certain characteristic. (e.g., ranking the top 100 stocks of the S&P 500)
Ordinal Scale
A measurement scale that provides relative ranking, plus the assurance that the differences between scale values are equal. (e.g., temperature)
Interval Scale
A measurement scale that provides ranking, equal differences between scale values, and a true zero point at the origin. (e.g., money or purchasing power)
Ratio Scale
A measure used to describe a characteristic of a population
Parameter
A measure used to describe a characteristic of a sample
Sample Statistic
A tabular presentation of statistical data that aids the analysis of large data sets. (For frequency distributions, the interval with the greatest frequency is the modal interval)
Frequency Distribution
Calculated by dividing the absolute frequency of each return interval by the total number of observations
Relative Frequency
*The percentage of total observations that fall within each interval
Summing the absolute frequencies starting at the lowest interval and progressing through the highest
Cumulative Absolute Frequencies
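The three frequency cards above can be tied together with a small sketch. The return data and the interval edges below are made-up illustration values, and the rule that the last interval includes its upper edge is an assumption.

```python
def frequency_table(returns, edges):
    """Return absolute, relative, and cumulative absolute frequencies.

    Interval i covers edges[i] <= x < edges[i + 1]; as an assumption,
    the last interval also includes its upper edge.
    """
    counts = [0] * (len(edges) - 1)
    for x in returns:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= x < edges[i + 1] or (is_last and x == edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    # Relative frequency: absolute frequency / total number of observations.
    relative = [c / total for c in counts]
    # Cumulative absolute frequency: running sum from the lowest interval up.
    cumulative = []
    running = 0
    for c in counts:
        running += c
        cumulative.append(running)
    return counts, relative, cumulative


# Hypothetical annual returns and interval edges.
returns = [-0.12, -0.03, 0.01, 0.04, 0.07, 0.08, 0.15, 0.22]
edges = [-0.20, -0.10, 0.00, 0.10, 0.20, 0.30]
counts, relative, cumulative = frequency_table(returns, edges)
modal_interval = counts.index(max(counts))  # index of the modal interval
```

Here the modal interval is the third one (0% to 10%), which holds half of the observations.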
Calculation for Population/Sample Mean
SUM(Xi) / N
Calculation for Sum of Mean Deviations
Sum(Xi-Xbar) = 0
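A quick numeric check of the two cards above, using made-up observations: the deviations from the arithmetic mean cancel out.

```python
import math


def arithmetic_mean(xs):
    """SUM(Xi) / N."""
    return sum(xs) / len(xs)


xs = [4.0, 7.0, 10.0, 15.0]           # hypothetical observations
xbar = arithmetic_mean(xs)            # 36 / 4
deviations = [x - xbar for x in xs]   # Xi - Xbar for each observation
total_deviation = sum(deviations)     # zero, up to floating-point rounding
```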
Calculation for Weighted Mean
Xbar_w = Sum(wi*Xi), where Sum(wi) = 1
wi == weight for each Xi
- example: a portfolio return weighted by the percentages held in stocks, bonds, and cash
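A minimal sketch of the weighted mean as the sum of wi*Xi with weights that sum to 1. The portfolio weights and asset returns are hypothetical.

```python
import math


def weighted_mean(weights, values):
    """Sum(wi * Xi); the weights are required to sum to 1."""
    if not math.isclose(sum(weights), 1.0, abs_tol=1e-9):
        raise ValueError("weights must sum to 1")
    return sum(w * x for w, x in zip(weights, values))


# Hypothetical portfolio: 50% stocks, 40% bonds, 10% cash.
weights = [0.50, 0.40, 0.10]
asset_returns = [0.12, 0.07, 0.03]
portfolio_return = weighted_mean(weights, asset_returns)
```

With these numbers the portfolio return is 0.06 + 0.028 + 0.003 = 9.1%.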
Calculation for Geometric Mean
G = (X1 * X2 * … * Xn)^(1/n)
*Always less than or equal to the arithmetic mean (equal only when all observations are the same)
Calculation for Geometric Mean Return
1 + Rg = ((1+R1) * (1+R2) * … * (1+Rn))^(1/n)
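The geometric mean return can be sketched as the n-th root of the compounded growth factor, minus 1. The three annual returns are made up; `math.prod` needs Python 3.8 or later.

```python
import math


def geometric_mean_return(returns):
    """Compound the (1 + R) factors, take the n-th root, subtract 1."""
    growth = math.prod(1.0 + r for r in returns)
    return growth ** (1.0 / len(returns)) - 1.0


annual = [0.10, -0.05, 0.20]               # hypothetical annual returns
rg = geometric_mean_return(annual)         # compound annual rate
ra = sum(annual) / len(annual)             # arithmetic mean, for comparison
```

For this series the geometric mean return (about 7.8%) is below the arithmetic mean (about 8.3%), as expected when returns vary.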
Calculation for Harmonic Mean
N / SUM(1/Xi)
*ex. average cost per share
harmonic mean <= geometric mean <= arithmetic mean (for positive values; the inequalities are strict unless all observations are equal)
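A sketch of the average-cost-per-share use of the harmonic mean: investing the same amount each period at different prices gives an average cost equal to the harmonic mean of the prices. Prices and the amount invested are hypothetical.

```python
import math


def harmonic_mean(xs):
    """N / SUM(1 / Xi)."""
    return len(xs) / sum(1.0 / x for x in xs)


prices = [8.0, 10.0, 12.5]       # hypothetical share prices in three periods
invested_per_period = 1000.0     # same amount invested each period

shares_bought = sum(invested_per_period / p for p in prices)
average_cost = invested_per_period * len(prices) / shares_bought

h = harmonic_mean(prices)
g = math.prod(prices) ** (1.0 / len(prices))   # geometric mean
a = sum(prices) / len(prices)                  # arithmetic mean
```

Here 3,000 buys 305 shares, so the average cost and the harmonic mean are both 3000/305, and the ordering harmonic < geometric < arithmetic holds.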
Percentile Calculation
Ly = (n+1) * (y/100) == the position of the y-th percentile in the data sorted in ascending order (e.g., y = 75 gives the third quartile)
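A sketch of the location formula in use. The card only gives the position Ly; interpolating linearly when Ly is not a whole number, and clamping positions that fall outside 1..n, are assumptions made here. The data are made up.

```python
def percentile(data, y):
    """Value at position Ly = (n + 1) * y / 100 in the sorted data."""
    xs = sorted(data)
    n = len(xs)
    loc = (n + 1) * y / 100.0        # 1-based position
    if loc <= 1:
        return xs[0]                 # assumption: clamp to the smallest value
    if loc >= n:
        return xs[-1]                # assumption: clamp to the largest value
    lower = int(loc)                 # whole part of the position
    frac = loc - lower               # fractional part of the position
    # Assumption: linear interpolation between neighbouring observations.
    return xs[lower - 1] + frac * (xs[lower] - xs[lower - 1])


data = [8, 10, 12, 13, 15, 17, 17, 18, 19, 23, 24]   # n = 11
third_quartile = percentile(data, 75)   # position 9
median = percentile(data, 50)           # position 6
p30 = percentile(data, 30)              # position 3.6
```

With n = 11, the third quartile sits at position 12 * 0.75 = 9, which is the value 19.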
The variability around the Central Tendency
Dispersion
Mean Absolute Deviation
SUM(abs(Xi-Xbar)) / N
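A short check of the mean absolute deviation on made-up observations.

```python
def mean_absolute_deviation(xs):
    """SUM(abs(Xi - Xbar)) / N."""
    xbar = sum(xs) / len(xs)
    return sum(abs(x - xbar) for x in xs) / len(xs)


xs = [4.0, 7.0, 10.0, 15.0]          # hypothetical observations, mean 9
mad = mean_absolute_deviation(xs)    # (5 + 2 + 1 + 6) / 4
```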
Population Variance Calculation
sigma^2 = SUM((Xi - u)^2) / N
Calculation for Population Standard Deviation
sigma = (SUM((Xi - u)^2) / N)^(1/2)
Calculation for Sample Variance
s^2 = SUM((Xi - Xbar)^2) / (n-1)
If n is used instead of (n-1), the population variance sigma^2 is systematically underestimated
Biased Estimator
Sample Standard Deviation
s = (SUM((Xi - Xbar)^2) / (n-1))^(1/2)
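The population and sample formulas differ only in the divisor. A small sketch on made-up data shows that dividing by n gives the smaller number.

```python
import math


def population_variance(xs):
    """SUM((Xi - mu)^2) / N."""
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)


def sample_variance(xs):
    """SUM((Xi - Xbar)^2) / (n - 1)."""
    xbar = sum(xs) / len(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)


xs = [2, 4, 4, 4, 5, 5, 7, 9]        # hypothetical data, mean 5
pop_var = population_variance(xs)    # 32 / 8
pop_sd = math.sqrt(pop_var)
samp_var = sample_variance(xs)       # 32 / 7
samp_sd = math.sqrt(samp_var)
```

The standard library functions `statistics.pvariance` and `statistics.variance` compute the same two quantities.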
States that for any set of observations, whether sample or population data and regardless of the shape of the distribution, the percentage of the observations that lie within k standard deviations of the mean is at least (1-(1/k^2)) for all k > 1
Chebyshev’s Inequality
At least 36% of all observations lie within +/- 1.25 standard deviations of the mean; at least 56% within +/- 1.5; at least 75% within +/- 2; at least 89% within +/- 3; at least 94% within +/- 4
Using Chebyshev’s Inequality
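The percentages on the card above follow directly from the bound 1 - 1/k^2. A minimal sketch:

```python
import math


def chebyshev_lower_bound(k):
    """Minimum fraction of observations within k standard deviations."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1.0 - 1.0 / k ** 2


# Bounds for the k values used on the card.
bounds = {k: chebyshev_lower_bound(k) for k in (1.25, 1.5, 2, 3, 4)}
```

The results are 0.36, about 0.556, 0.75, about 0.889, and 0.9375, which round to the 36%, 56%, 75%, 89%, and 94% figures.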
The amount of variability in a distribution relative to a reference point or benchmark. It is measured with coefficient of variation
Relative Dispersion
Coefficient of Variation Calculation
- measures the amount of dispersion in a distribution relative to its mean
- risk per unit of return
CV = sx / Xbar = (std. dev. of x) / (average value of x)
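A sketch of the coefficient of variation using the standard library. The return series is made up, and using the sample standard deviation is an assumption consistent with the sx in the formula.

```python
import statistics


def coefficient_of_variation(xs):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(xs) / statistics.mean(xs)


returns = [0.02, 0.04, 0.06]              # hypothetical returns, mean 4%
cv = coefficient_of_variation(returns)    # sample std. dev. is 2%
```

A CV of 0.5 here means half a unit of risk per unit of mean return; a lower CV indicates less dispersion relative to the mean.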
Sharpe Ratio Calculation
*measures excess return per unit of risk
Sharpe ratio = (Rpbar - Rf) / sigmap
Rpbar = mean portfolio return; Rf = risk-free return; sigmap = std. dev. of portfolio returns
*T-bills are the conventional proxy for the risk-free return
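A sketch of the Sharpe ratio computed from a return series. The portfolio returns and the risk-free rate are hypothetical, and estimating sigmap with the sample standard deviation is an assumption.

```python
import statistics


def sharpe_ratio(portfolio_returns, risk_free):
    """(mean portfolio return - risk-free return) / std. dev. of returns."""
    excess = statistics.mean(portfolio_returns) - risk_free
    # Assumption: sigmap is estimated with the sample standard deviation.
    return excess / statistics.stdev(portfolio_returns)


portfolio_returns = [0.10, 0.14, 0.06, 0.10]   # hypothetical, mean 10%
risk_free = 0.03                                # hypothetical T-bill return
sr = sharpe_ratio(portfolio_returns, risk_free)
```

The excess return is 7% and the standard deviation is about 3.27%, so the ratio is roughly 2.14.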
2 limitations of Sharpe Ratio
- If two portfolios have negative Sharpe ratios, the higher Sharpe ratio does not necessarily imply superior risk-adjusted performance. Increasing risk moves a negative Sharpe ratio closer to zero.
- Investment strategies with option-like characteristics have asymmetric return distributions, so their Sharpe ratios tend to be too high and understate risk.
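The first limitation can be seen with two hypothetical portfolios that earn the same return below the risk-free rate. All numbers are made up for illustration.

```python
def sharpe_from_summary(mean_return, risk_free, std_dev):
    """Sharpe ratio from summary statistics."""
    return (mean_return - risk_free) / std_dev


# Same mean return (1%) and risk-free rate (3%), but B has twice the risk.
sharpe_a = sharpe_from_summary(0.01, 0.03, 0.10)
sharpe_b = sharpe_from_summary(0.01, 0.03, 0.20)
```

Portfolio B scores about -0.1 against about -0.2 for A, so it looks "better" even though it delivers the same return with double the standard deviation.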
Skewness refers to the extent to which a distribution is not symmetrical
- positively skewed == many outliers in the upper region, or right tail
- negatively skewed == many outliers in the lower region, or left tail
In a positively skewed distribution, (order the mean, median, and mode)
mode < median < mean
In a negatively skewed distribution, (order the mean, median, and mode)
mean < median < mode
The measure of the degree to which a distribution is more or less “peaked” than a normal distribution
Kurtosis
Leptokurtic
More peaked than a normal distribution
Platykurtic
Less peaked than a normal distribution
Mesokurtic
Same kurtosis as a normal distribution
This distribution has more returns clustered around the mean and more returns with large deviations from the mean (fatter tails). This is perceived as increasing risk.
Leptokurtic Distribution
What a distribution is said to exhibit if it has more or less kurtosis than the normal distribution
Excess Kurtosis
Kurtosis of a normal distribution = 3
Excess Kurtosis = kurtosis - 3
*Generally, more positive kurtosis and more negative skew signify increased risk.
excess kurtosis:
leptokurtic if >0
platykurtic if < 0
Sample Skewness Calculation
Sk = (1/n) * (SUM((Xi - Xbar)^3) / s^3)
s = sample standard deviation
Sample Kurtosis Calculation
Kurtosis = (1/n) * (SUM((Xi - Xbar)^4) / s^4)
Excess kurtosis with an absolute value greater than 1 is considered large
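A sketch of the sample skewness and kurtosis formulas as written on the cards, with s as the sample standard deviation. The two data sets are made up: one symmetric, one with a large outlier in the right tail.

```python
import math


def sample_std(xs):
    """Sample standard deviation with the (n - 1) divisor."""
    n = len(xs)
    xbar = sum(xs) / n
    return math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))


def sample_skewness(xs):
    """(1/n) * SUM((Xi - Xbar)^3) / s^3."""
    n = len(xs)
    xbar = sum(xs) / n
    return (1.0 / n) * sum((x - xbar) ** 3 for x in xs) / sample_std(xs) ** 3


def sample_kurtosis(xs):
    """(1/n) * SUM((Xi - Xbar)^4) / s^4."""
    n = len(xs)
    xbar = sum(xs) / n
    return (1.0 / n) * sum((x - xbar) ** 4 for x in xs) / sample_std(xs) ** 4


symmetric = [1, 2, 3, 4, 5]          # hypothetical, no skew
right_tail = [1, 2, 3, 4, 15]        # hypothetical, one large positive outlier

skew_symmetric = sample_skewness(symmetric)
skew_right = sample_skewness(right_tail)
excess_kurtosis = sample_kurtosis(symmetric) - 3.0
```

The symmetric set has zero skewness and negative excess kurtosis (platykurtic), while the set with the right-tail outlier has positive skewness.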
Use the geometric mean when measuring past (multi-period) performance.
Use the arithmetic mean when estimating the expected return for a future period.