3. Statistics Flashcards
Population vs. sample
Population = the complete set of all the elements of interest to a researcher
Sample = subset of elements drawn from the population to represent it
Parameter vs sample statistic
Parameter = measure used to describe a characteristic of population
Statistic = measure that describes a characteristic of the sample
Expected value
probability-weighted average of all possible outcomes; in finance, the anticipated value of an investment at some point in the future
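A minimal sketch of the probability-weighted average behind an expected value. The three scenarios and their probabilities are hypothetical numbers chosen for illustration, not taken from the notes.

```python
# Hypothetical scenarios: (probability, return). Illustrative values only.
scenarios = [(0.25, -0.10), (0.50, 0.05), (0.25, 0.20)]


def expected_value(outcomes):
    """Return the sum of probability * outcome over all scenarios."""
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * x for p, x in outcomes)


expected_return = expected_value(scenarios)
print(round(expected_return, 4))
```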
Key characteristics of normal distribution (5)
Mean = mode = median
Skewness = 0
Kurtosis = 3 –> excess kurtosis = 0
50% of values above mean, 50% below
68% of obs within 1 sd, 95% within 2 sd, 99.7% within 3 sd
Any normal distribution can be standardised by converting into
z-scores –> tell you how many standard deviations from the mean each value lies
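Standardising can be sketched as below: subtract the mean, divide by the standard deviation. The data set is made up, and the population standard deviation (divide by n) is an assumption of this sketch.

```python
import statistics


def z_scores(values):
    """Convert each value into its distance from the mean in standard deviations."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # population sd, assumed for this sketch
    return [(v - mu) / sigma for v in values]


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical observations
z = z_scores(data)
print(z)
```

After standardising, the z-scores have mean 0 and standard deviation 1, whatever the original units were.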
What is the distribution of stock prices, market capitalisation, income, etc?
Lognormal –> stock prices cannot fall below $0
* right skewed instead of bell curve
* if you can get a normal distribution by applying log function then original distribution = lognormal
Characteristics of lognormal distribution (3)
Becomes normal by taking log of all values, positively skewed (right skew), extreme values on positive side of dist.
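A small simulation sketch of the lognormal properties: values are strictly positive and right skewed, and taking logs removes the skew. The parameters (mu = 0, sigma = 0.5), the seed and the sample size are arbitrary assumptions.

```python
import math
import random


def skewness(xs):
    """Population skewness: the average cubed z-score."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return sum(((x - mean) / sd) ** 3 for x in xs) / n


random.seed(42)  # fixed seed so the sketch is repeatable
prices = [random.lognormvariate(0.0, 0.5) for _ in range(20000)]
log_prices = [math.log(p) for p in prices]

print(min(prices) > 0)       # lognormal values never reach zero
print(skewness(prices))      # clearly positive (right skew)
print(skewness(log_prices))  # close to zero (roughly normal)
```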
Chi-squared test
Compare sample variance with population variance
Formula = ((n-1) x s^2)/σ0^2, where s^2 = sample variance and σ0^2 = hypothesised population variance (n-1 degrees of freedom)
Possible hypothesis pairs for Chi-Squared Test
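- Two Tailed H0: σ^2 = σ0^2
- Right Tailed H0: σ^2 <= σ0^2
- Left Tailed H0: σ^2 >= σ0^2
(σ^2 = population variance, σ0^2 = hypothesised value; the sample variance s^2 is the evidence, not part of the hypothesis)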
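The test statistic in its standard form, (n-1) x sample variance / hypothesised population variance, can be sketched as follows. The sample and the hypothesised variance of 4.0 are hypothetical; the critical value would still have to be looked up in a chi-squared table with n-1 degrees of freedom.

```python
import statistics


def chi_squared_variance_stat(sample, hypothesised_var):
    """Return the chi-squared statistic and its degrees of freedom."""
    n = len(sample)
    sample_var = statistics.variance(sample)  # uses the n-1 denominator
    return (n - 1) * sample_var / hypothesised_var, n - 1


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical data
stat, df = chi_squared_variance_stat(sample, hypothesised_var=4.0)
print(stat, df)
```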
F-Test
Only one difference between F-Test and Chi-Squared
* checks whether the variances of two populations are equal, using the two sample variances
* F = variance 1 / variance 2
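A sketch of the variance ratio. Putting the larger sample variance on top (so F >= 1) is a common convention assumed here, not something the notes state; both samples are made up.

```python
import statistics


def f_statistic(sample_a, sample_b):
    """Ratio of the two sample variances, larger one in the numerator."""
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    return max(var_a, var_b) / min(var_a, var_b)


a = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical sample 1
b = [1.0, 2.0, 3.0, 4.0, 5.0]                  # hypothetical sample 2
f = f_statistic(a, b)
print(f)
```

An F close to 1 is consistent with equal variances; how far from 1 counts as significant depends on both samples' degrees of freedom.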
What are the critical values? (z, two-tailed, 5% significance)
|x| >= 1.96 - statistically significant
-1.96 < x < 1.96 - statistically insignificant
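Where 1.96 comes from: it is the z value that leaves 5% in the two tails of a standard normal distribution combined. A sketch using only the standard library; the function names are illustrative.

```python
import math


def two_tailed_p(z):
    """Probability that a standard normal value is at least |z| from zero."""
    return math.erfc(abs(z) / math.sqrt(2))


def is_significant(z, critical=1.96):
    """Apply the two-tailed 5% rule: significant when |z| >= critical."""
    return abs(z) >= critical


print(two_tailed_p(1.96))    # approximately 0.05
print(is_significant(-2.3))  # a large negative z is significant too
print(is_significant(1.2))
```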
t-distribution (6)
similar to normal (Z) distribution
* symmetric
* mean of 0
* no assumption that the population std dev is known
* defined by df
* most useful for small sample sizes when the population standard deviation is not known
* as sample size increases, becomes more similar to normal distribution
What are the 4 moments in finance?
Mean - average
Variance - degree to which returns vary over time
Skewness - lack of symmetry
Kurtosis - extreme values in either tail
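The four moments can be computed from a return series as sketched below. Population formulas (divide by n) are assumed, and the data are made up.

```python
import math


def four_moments(xs):
    """Return (mean, variance, skewness, kurtosis) using population formulas."""
    n = len(xs)
    mean = sum(xs) / n
    variance = sum((x - mean) ** 2 for x in xs) / n
    sd = math.sqrt(variance)
    skew = sum(((x - mean) / sd) ** 3 for x in xs) / n  # 0 for a normal
    kurt = sum(((x - mean) / sd) ** 4 for x in xs) / n  # 3 for a normal
    return mean, variance, skew, kurt


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical observations
mean, variance, skew, kurt = four_moments(data)
print(mean, variance, skew, kurt)
```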
+ve skewness (2)
Mean > median > mode
right skew (income)
-ve skew
mean < median < mode
left skewed (retired)
Why is skewness important for investors?
Shows extreme values not only average
Kurtosis (3)
Measures extreme values in either tail
High kurtosis = fat tails (kurtosis greater than 3)
Normal = 3
Name for high kurtosis
Leptokurtic - experiences occasional extreme returns (positive or negative)
Name for low kurtosis
Platykurtic - thin tails (less kurtosis than normal distribution)
You cannot reach a conclusion about the skew of a distribution if the skewness test (Zskew) falls within what values? –> kurtosis follows the same rules
+1.96 and -1.96
above +1.96 = +ve
Below -1.96 = -ve
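The notes do not give the formula for Zskew. The sketch below assumes the common large-sample approximation, where the standard error of skewness is sqrt(6/n) and that of excess kurtosis is sqrt(24/n). Treat both formulas and the example inputs as assumptions.

```python
import math


def z_skew(skewness, n):
    """Skewness divided by its approximate standard error, sqrt(6/n)."""
    return skewness / math.sqrt(6.0 / n)


def z_kurt(kurtosis, n):
    """Excess kurtosis (kurtosis - 3) divided by sqrt(24/n)."""
    return (kurtosis - 3.0) / math.sqrt(24.0 / n)


def verdict(z, critical=1.96):
    """Classify a test value against the +/-1.96 band."""
    if z > critical:
        return "+ve"
    if z < -critical:
        return "-ve"
    return "no conclusion"


print(verdict(z_skew(0.4, 600)))  # hypothetical skewness of 0.4, 600 obs
print(verdict(z_kurt(3.2, 600)))  # hypothetical kurtosis of 3.2, 600 obs
```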
How often do we experience a 2.5% drop in stock prices?
less than 5% probability (2.427 s.d. away from the mean)
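For reference, the probability a normal distribution assigns to a move 2.427 standard deviations below the mean can be checked as below. The figure 2.427 is taken from the card; everything else is a standard normal CDF built from math.erfc.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


tail = normal_cdf(-2.427)
print(tail)  # one-tailed probability, well under 5%
```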
Distribution of daily returns is more
peaked (more obs. close to mean) than normal distribution
actual data has fat tails (higher number of extreme observations)
Covariance
Determines the relationship between the movement of two asset prices, indicating the direction of the linear relationship
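unbounded and unit-dependent; dividing by the product of the two standard deviations gives correlation, which is on a -1.0 to +1.0 scale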
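A sketch contrasting covariance with correlation: rescaling one series changes the covariance but not the correlation. Sample formulas (n-1 denominator) and the data are assumptions of the example.

```python
import math


def covariance(x, y):
    """Sample covariance with an n-1 denominator."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)


def correlation(x, y):
    """Covariance scaled by both standard deviations; lies in [-1, +1]."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
y_scaled = [100.0 * v for v in y]  # same series in different units

print(covariance(x, y), covariance(x, y_scaled))
print(correlation(x, y), correlation(x, y_scaled))
```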
Autocorrelation
correlation between a series and the lagged version of itself
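A sketch of lag-k autocorrelation using the usual estimator (lagged cross-products around the overall mean, divided by the total sum of squares). The alternating series is a made-up extreme case.

```python
def autocorrelation(series, lag=1):
    """Correlation between a series and itself shifted by `lag` periods."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((x - mean) ** 2 for x in series)
    numerator = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numerator / denominator


alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(autocorrelation(alternating, lag=1))  # strongly negative
print(autocorrelation(alternating, lag=2))  # strongly positive
```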
Does the classic 60/40 (stocks/bonds) portfolio offer true diversification?
No - true diversification is when both assets are risky and have low or negative correlation
Are there diversification benefits of adding junk bonds to equity portfolio?
Yes - junk bonds = bonds of very risky companies, so a risky asset; hence, if there is low/negative correlation with equities, there is diversification
Any diversification benefits of adding gold to equity portfolio?
No - gold = safe asset
Any diversification benefits of adding treasury bonds to equity portfolio?
No - treasury bonds = risk-free
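Why correlation matters for the diversification cards above can be sketched with the standard two-asset portfolio variance formula. The 50/50 weights and 20% volatilities are hypothetical inputs.

```python
import math


def portfolio_volatility(weight_1, sd_1, sd_2, rho):
    """Volatility of a two-asset portfolio with correlation rho."""
    weight_2 = 1.0 - weight_1
    variance = (
        weight_1 ** 2 * sd_1 ** 2
        + weight_2 ** 2 * sd_2 ** 2
        + 2.0 * weight_1 * weight_2 * rho * sd_1 * sd_2
    )
    return math.sqrt(max(variance, 0.0))  # guard against tiny negative rounding


# Two risky assets, each with 20% volatility, held 50/50 (illustrative).
for rho in (1.0, 0.0, -1.0):
    print(rho, portfolio_volatility(0.5, 0.20, 0.20, rho))
```

With perfect positive correlation the portfolio is as volatile as each asset; as correlation falls, portfolio volatility drops below that of either asset.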
Z and t-tests (2)
both hypothesis tests for establishing if there is a significant difference between two groups
* t-test for small sample or when population standard deviation is unknown
Paired t-test (2)
Estimated using difference between paired observations
same sample sizes
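A sketch of the paired t statistic: the mean of the paired differences divided by its standard error, with n-1 degrees of freedom. The before/after figures are made up.

```python
import math
import statistics


def paired_t(before, after):
    """Return the t statistic and degrees of freedom for paired observations."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same size")
    differences = [a - b for a, b in zip(after, before)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)  # sample sd of the differences
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


before = [10.0, 12.0, 9.0, 11.0, 13.0]   # hypothetical first measurement
after = [12.0, 14.0, 10.0, 13.0, 16.0]   # hypothetical second measurement
t_stat, df = paired_t(before, after)
print(t_stat, df)
```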
Other two t-tests
independent samples, unpaired, equal variances
independent samples, unpaired, unequal variances
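The two unpaired variants can be sketched side by side. The unequal-variance version is commonly called Welch's t-test, with degrees of freedom from the Welch-Satterthwaite approximation; those names and the data are not from the notes.

```python
import math
import statistics


def pooled_t(a, b):
    """Independent samples, equal variances: pooled-variance t statistic."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled_var = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se, na + nb - 2


def welch_t(a, b):
    """Independent samples, unequal variances: Welch t statistic."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    se_sq = va / na + vb / nb
    df = se_sq ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se_sq), df


group_a = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical group 1
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]  # hypothetical group 2
print(pooled_t(group_a, group_b))
print(welch_t(group_a, group_b))
```

With equal sample sizes the two t statistics coincide; only the degrees of freedom differ, with Welch's being smaller when the variances differ.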
Spearman Rank Ordered Correlation (SROCC) (3)
Use when the relationship between two variables is not linear
* rank via a characteristic
* rank converts into linear relationship
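A sketch of the rank-then-correlate idea: Spearman's coefficient is the Pearson correlation of the ranks. Ties get the average rank, which is a conventional choice assumed here; the cubic data are made up to show a monotonic but non-linear relationship.

```python
import math


def ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]  # monotonic but not linear
print(pearson(x, y), spearman(x, y))
```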
Pearson's Product Moment Correlation Coefficient (PPMCC)
Use when unsure whether the relationship is linear or non-linear
* estimates a measure of linear relationship between X and Y
* plot empirical observations in a scatter plot and use OLS to fit a line of best fit
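The line of best fit and the Pearson coefficient can be sketched together with the standard least-squares formulas (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x). The observations are hypothetical and the plotting step is left out.

```python
import math


def ols_fit(x, y):
    """Return (slope, intercept, r) for a simple least-squares line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r


x_obs = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical X values
y_obs = [2.0, 4.0, 5.0, 4.0, 5.0]  # hypothetical Y values
slope, intercept, r = ols_fit(x_obs, y_obs)
print(slope, intercept, r)
```

In a simple regression like this one, r squared equals the share of the variation in Y explained by the fitted line.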