Reading 2.8 Flashcards
Test statistic formula
(Estimated value - hypothesized value) / standard error
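As a quick worked example of the formula on this card, here is a minimal sketch. The function name and the input numbers (estimate 1.2, hypothesized value 1.0, standard error 0.08) are made up for illustration and are not from the reading.

```python
def compute_test_statistic(estimate: float, hypothesized: float, std_error: float) -> float:
    """(estimated value - hypothesized value) / standard error."""
    if std_error <= 0:
        raise ValueError("standard error must be positive")
    return (estimate - hypothesized) / std_error


# Hypothetical inputs: estimated beta 1.2, null hypothesis beta = 1.0, standard error 0.08
t_stat = compute_test_statistic(1.2, 1.0, 0.08)
print(round(t_stat, 2))
```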
In ordinary least squares (OLS) regression, what are beta, alpha, and the residual?
Beta = slope
Alpha = vertical intercept
Residual = vertical distance from the fitted line to the actual value (data point)
What issues should an analyst be concerned about during regression analysis?
1) Outliers
2) Autocorrelation
3) Heteroskedasticity
What is heteroskedasticity?
Error term’s variance is correlated with an explanatory variable
OR
SD of the error term is not constant over time
What are leptokurtic distributions?
Fat-tailed distributions: they assign more probability to extreme (rare) events than the normal distribution does
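One way to see fat tails numerically is through excess kurtosis (fourth standardized moment minus 3), which is about 0 for a normal distribution and positive for a leptokurtic one. The sketch below is illustrative only: the Laplace distribution is chosen here as an example of a fat-tailed distribution, and the seed and sample size are arbitrary.

```python
import random


def excess_kurtosis(xs):
    """Population excess kurtosis: fourth standardized moment minus 3."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000

normal_sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
# The difference of two independent exponential draws follows a Laplace
# distribution, used here as an example of a fat-tailed distribution.
laplace_sample = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(n)]

normal_k = excess_kurtosis(normal_sample)
laplace_k = excess_kurtosis(laplace_sample)
print(f"normal: {normal_k:.2f}, Laplace: {laplace_k:.2f}")
```

The normal sample should come out near 0 and the Laplace sample clearly above it (its theoretical excess kurtosis is 3).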
When does OLS generate unbiased estimates of alpha and beta?
If error terms are:
1) normally distributed
2) uncorrelated
3) homoskedastic
What is homoskedasticity?
The variance (and therefore the SD) of the error term
is constant
When is null rejected?
If p-value < significance level (1%, 5%, or 10%)
OR
Absolute value of the t-statistic > critical value
When is null not rejected?
If p-value > significance level
OR
Absolute value of the t-statistic < critical value
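The two rejection rules on these cards (p-value versus significance level, test statistic versus critical value) give the same decision. A minimal sketch, assuming a two-tailed test and a standard normal approximation (large sample); the function names and the test statistic of 2.5 are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def two_tailed_p_value(stat: float) -> float:
    """Two-tailed p-value under a standard normal approximation."""
    return 2.0 * (1.0 - STD_NORMAL.cdf(abs(stat)))


def reject_by_p_value(stat: float, significance: float) -> bool:
    """Reject the null if the p-value is below the significance level."""
    return two_tailed_p_value(stat) < significance


def reject_by_critical_value(stat: float, significance: float) -> bool:
    """Reject the null if |statistic| exceeds the two-tailed critical value."""
    critical = STD_NORMAL.inv_cdf(1.0 - significance / 2.0)
    return abs(stat) > critical


stat = 2.5  # hypothetical test statistic
for significance in (0.01, 0.05, 0.10):
    print(
        significance,
        reject_by_p_value(stat, significance),
        reject_by_critical_value(stat, significance),
    )
```

With a small sample, a t-distribution would replace the normal one; that is outside the standard library and is not shown here.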
What is a p-value?
Probability of observing a result at least as extreme as the one actually observed if the null hypothesis is true. A small p-value means the result is unlikely to be due to chance alone.
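How is the p-value interpreted?
1) Indicates whether there is statistical evidence of a relationship (not how large the relationship is)
2) The larger the absolute t-statistic, the smaller the p-value
3) Reject the null if the p-value is less than the significance level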
How is the t-statistic interpreted?
1) Reject the null if the absolute t-statistic is larger than the critical value
2) The larger the absolute t-statistic, the smaller the p-value
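Main critical values (two-tailed, standard normal) at 1%, 2%, 5%, 10% significance
1% significance level (99% confidence) = 2.576
2% significance level (98% confidence) = 2.33
5% significance level (95% confidence) = 1.96
10% significance level (90% confidence) = 1.645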
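These values can be reproduced from the standard normal distribution. The sketch below assumes two-tailed tests, so each tail holds half of the stated percentage; the dictionary name is just for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Two-tailed critical value: the point that leaves level/2 in the upper tail.
critical_values = {
    level: std_normal.inv_cdf(1.0 - level / 2.0)
    for level in (0.01, 0.02, 0.05, 0.10)
}

for level, value in critical_values.items():
    print(f"{level:.0%} two-tailed: {value:.3f}")
```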
What does OLS do?
Finds the estimates of alpha and beta that minimize the sum of squared error terms (residuals)
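For a single explanatory variable, the minimizing values have a closed form: beta is the covariance of x and y divided by the variance of x, and alpha is chosen so the line passes through the point of means. A minimal sketch with made-up data; the function names and the numbers are illustrative only.

```python
def ols_fit(x, y):
    """Return (alpha, beta, residuals) for the line y = alpha + beta * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta = s_xy / s_xx               # slope
    alpha = y_bar - beta * x_bar     # vertical intercept
    residuals = [yi - (alpha + beta * xi) for xi, yi in zip(x, y)]
    return alpha, beta, residuals


def sum_squared_errors(x, y, alpha, beta):
    """Sum of squared vertical distances between the data and a given line."""
    return sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

alpha, beta, residuals = ols_fit(x, y)
best_sse = sum_squared_errors(x, y, alpha, beta)
print(f"alpha={alpha:.2f}, beta={beta:.2f}, SSE={best_sse:.3f}")

# Nudging either estimate away from the OLS values increases the SSE.
print(sum_squared_errors(x, y, alpha + 0.1, beta) > best_sse)
print(sum_squared_errors(x, y, alpha, beta + 0.1) > best_sse)
```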
9 sampling & testing problems
1) selection bias
2) self-selection bias
3) survivorship bias
4) data mining
5) data dredging
6) backtesting
7) backfilling
8) cherry picking
9) chumming
Data mining
Searching very large data sets for patterns.
Prone to finding spurious patterns that arise purely by chance.
Backtesting
Retroactively testing an investment strategy on historical data to see whether it would have produced good returns. Prone to cherry picking because the analyst already knows the data.
Data dredging
Conducting multiple tests on same data and selecting only the results that show statistically significant findings
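A small simulation can show why reporting only the significant results is misleading. In the sketch below every sample is pure noise with a true mean of zero, yet roughly 5% of the tests still cross the 5% critical value by chance. The seed, the number of tests, the sample size, and the use of 1.96 as the cutoff are all arbitrary choices for illustration.

```python
import random
from statistics import mean, stdev

rng = random.Random(42)  # fixed seed so the run is repeatable
n_tests = 1000           # number of separate tests run on noise
n_obs = 50               # observations per test

false_positives = 0
for _ in range(n_tests):
    # Pure noise: the true mean is zero, so the null hypothesis is true.
    sample = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
    std_error = stdev(sample) / n_obs ** 0.5
    t_stat = (mean(sample) - 0.0) / std_error
    if abs(t_stat) > 1.96:  # approximate 5% two-tailed critical value
        false_positives += 1

share = false_positives / n_tests
print(f"{false_positives} of {n_tests} noise tests look significant ({share:.1%})")
```

An analyst who ran all of these tests and reported only the ones that crossed the cutoff would be presenting chance results as findings.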