FRM Level 1 Part 2 Flashcards
chapter 1: probability
1. random event and probability
basic concept of probability
- outcome and sample space
2. relationships among events: mutually exclusive events, exhaustive events, independent events (the occurrence of B has no influence on the occurrence of A)
the types of probability
Joint probability is the probability of two events occurring simultaneously.
Marginal probability is the probability of an event irrespective of the outcome of another variable.
Conditional probability is the probability of one event occurring in the presence of a second event.
unconditional probability
p(A)
conditional probability
p(A|B)
joint probability
p(AB)
two important rules
multiplication rule
p(AB) = p(A|B)xp(B)
if they are independent
p(AB) = p(A)xp(B)
addition rule
p(A+B) = p(A) + p(B) - p(AB)
if mutually exclusive
p(A+B) = p(A) + p(B)
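A minimal Python sketch (all probabilities hypothetical) applying the two rules:

p_B = 0.4            # P(B), hypothetical value
p_A_given_B = 0.5    # P(A|B), hypothetical value
p_A = 0.3            # P(A), hypothetical value

p_AB = p_A_given_B * p_B       # multiplication rule: P(AB) = P(A|B) x P(B) = 0.2
p_A_or_B = p_A + p_B - p_AB    # addition rule: P(A+B) = 0.3 + 0.4 - 0.2 = 0.5
p_AB_if_indep = p_A * p_B      # joint probability if A and B were independent = 0.12
print(p_AB, p_A_or_B, p_AB_if_indep)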
- discrete and continuous random variable
discrete random variable
number of possible outcomes can be counted
continuous random variable
it can take on any value within a given finite or infinite range
P(X = x) = 0 even though the event X = x can occur
probability density function:
discrete random variable (probability that a discrete random variable will take on the value x)
continuous random variable
the PDF is f(x), the density value corresponding to each X
P(x1 <= X <= x2) is the area under the PDF over the interval [x1, x2]
cumulative distribution function
concept: the probability that a random variable will be less than or equal to a given value: F(x) = P(X <= x)
characteristics: monotonically increasing; bounded: F(x) approaches 0 as x approaches negative infinity and 1 as x approaches positive infinity; P(a < X <= b) = F(b) - F(a)
Chapter 2 Bayesian Analysis
Total probability theorem
if A1,….,An are mutually exclusive and exhaustive
p(B) = the sum of p(Aj)p(B|Aj) from j = 1 to n
Bayes' Theorem
p(A|B) = p(B|A) x p(A) / p(B)
p(A|B): updated (posterior) probability
p(A): prior probability
p(B) is computed with the total probability theorem
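A hedged numeric sketch of the theorem in Python; the scenario and all numbers are hypothetical (A = "manager is skilled", B = "fund beats its benchmark"):

p_A = 0.10             # prior P(A), hypothetical
p_B_given_A = 0.80     # P(B|A), hypothetical
p_B_given_notA = 0.40  # P(B|not A), hypothetical

# total probability theorem: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
p_B = p_B_given_A * p_A + p_B_given_notA * (1 - p_A)   # 0.44

# Bayes' theorem: posterior P(A|B) = P(B|A) x P(A) / P(B)
p_A_given_B = p_B_given_A * p_A / p_B
print(round(p_A_given_B, 4))   # 0.1818: the prior 0.10 is updated upward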
chapter 3: basic statistics
arithmetic mean
population mean: mu = (the sum of Xi from i=1 to N)/N
sample mean: x-bar = (the sum of Xi from i=1 to n)/n
median
the middle item of a set of items sorted into ascending or descending order
odd n: item (n+1)/2; even n: average of items n/2 and n/2 + 1
mode
most frequently occurring value of the distribution
expected value
definition: E(X) = x1*P(X=x1) + ... + xn*P(X=xn)
properties:
if c is any constant, then E(cX + a) = cE(X) + a
E(X + Y) = E(X) + E(Y)
if X and Y are independent random variables, then E(XY) = E(X)*E(Y)
E(X^2) != [E(X)]^2
- dispersion
variance for data:
population variance: sigma^2 = the sum of (Xi - mu)^2 from i=1 to N, divided by N
sample variance: s^2 = the sum of (Xi - x-bar)^2 from i=1 to n, divided by n - 1
standard deviation
population standard deviation: sigma = sqrt(population variance)
sample standard deviation: s = sqrt(sample variance)
variance for random variable
formula: Var(X) = E[(X - mu)^2] = E(X^2) - [E(X)]^2
properties: if c is any constant,
Var(X + c) = Var(X)
Var(cX) = c^2 * Var(X)
in general, for random variables X and Y:
Var(X+Y) = Var(X) + Var(Y) + 2Cov(X,Y)
Var(X-Y) = Var(X) + Var(Y) - 2Cov(X,Y)
if X and Y are independent, Cov(X,Y) = 0, so Var(X+Y) = Var(X-Y) = Var(X) + Var(Y)
Square root rule (Baumol model): the initial bond holding is (n-1)Y/n and the cash holding is Y/n; at the end of the period the bond holding is 0 and the cash holding is Y/n. The average bond holding is therefore (n-1)Y/2n. Consider the following maximization (maximize the interest earned on bonds, with the number of conversions n as the control variable): Max (n-1)Yr/2n - nb. The first-order condition is Yr/2n^2 - b = 0, which gives n = sqrt(Yr/2b). The average cash holding is Y/2n; substituting the result above gives Md = sqrt(Yb/2r). This is the mathematical statement of the square root rule, the maximization result of the Baumol model.
covariance
definition:
the relationship between the deviation of two variables
Cov(X, Y) = E{[X - E(X)][Y - E(Y)]} = E(XY) - E(X)E(Y)
properties
1. Cov(X, Y) ranges from negative infinity to positive infinity
2. if X and Y are independent, then E(XY) = E(X)E(Y) and Cov(X, Y) = 0
3. if X = Y, then Cov(X, X) = E{[X - E(X)][X - E(X)]} = Var(X)
4. Cov(a + bX, c + dY) = bdCov(X, Y)
5. Var(w1X1 + w2X2) = [w1^2]Var(X1) + [w2^2]Var(X2) + 2w1w2Cov(X1, X2) (verified numerically in the sketch below)
where w1 and w2 are the weights on X1 and X2
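Property 5 above can be checked numerically; a sketch using simulated returns (the weights and distribution parameters are hypothetical):

import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(0.05, 0.10, 100_000)
x2 = 0.5 * x1 + rng.normal(0.03, 0.08, 100_000)   # correlated with x1
w1, w2 = 0.6, 0.4

lhs = np.var(w1 * x1 + w2 * x2, ddof=1)
rhs = (w1**2 * np.var(x1, ddof=1) + w2**2 * np.var(x2, ddof=1)
       + 2 * w1 * w2 * np.cov(x1, x2)[0, 1])
print(lhs, rhs)   # the two sides agree up to floating-point error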
correlation
definition:
linear relationship between two variables: p(X,Y) = Cov(X,Y) / [std(X) x std(Y)]. p lies between -1 and 1 and has no units
properties
p = 0 indicate the absence of any linear relationship, but perhaps exist non-linear relationship
the bigger of the absolute value, the stronger linear relationship
correlation coefficient with interpretation
p = +1: perfect positive linear correlation
0 < p < 1: positive linear correlation
p = 0: no linear correlation
-1 < p < 0: negative linear correlation
p = -1: perfect negative linear correlation
- Skewness
definition
how symmetrical the distribution around the mean
skewness = E[(X - mu)^3]/std^3
properties
symmetrical distribution:
Skewness = 0
positively skewed distribution (right skew): Skewness>0
outliers in the right tail: mean > median > mode
negatively skewed distribution (left skew):
Skewness < 0
outliers in the left tail: many financial assets exhibit negative skew (more risky); mean < median < mode
- Kurtosis
definition: the degree of weight in the tails, i.e., the probability of extreme outcomes
Kurtosis =E[(X - mu)^4]/std^4
leptokurtic: kurtosis > 3, excess kurtosis > 0
mesokurtic: kurtosis = 3, excess kurtosis = 0
platykurtic: kurtosis < 3, excess kurtosis < 0
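A sketch computing all four moment-based measures on a simulated fat-tailed sample (the t(5) choice is arbitrary); note scipy's kurtosis returns excess kurtosis unless fisher=False:

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
returns = rng.standard_t(df=5, size=100_000)   # leptokurtic by construction

print(np.mean(returns), np.var(returns, ddof=1))   # mean and variance
print(stats.skew(returns))                     # ~0: the t-distribution is symmetric
print(stats.kurtosis(returns, fisher=False))   # kurtosis > 3: leptokurtic
print(stats.kurtosis(returns))                 # excess kurtosis > 0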
chapter 4: probability distributions
1. discrete probability distribution
Bernoulli distribution:
definition: a trial produces one of two outcomes (success or failure)
properties
E(X) = p x 1 + (1 - p) x 0 = p
Var(X) = p*(1-p)
Binomial Distribution
definition: the distribution of a binomial random variable, defined as the number of successes in n Bernoulli trials
properties
the probability of success is constant for all trials
the trials are all independent
E(X) = np
Var(X) = np(1-p)
p(x) = P(X = x) = n!/[(n - x)!x!] * p^x(1-p)^(n-x)
as n increases and p approaches 0.5, the binomial distribution approximates the normal distribution
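A sketch of the formulas above with hypothetical numbers (n = 10 independent trials, success probability p = 0.6):

from scipy import stats

n, p = 10, 0.6
binom = stats.binom(n, p)

print(binom.pmf(7))    # P(X = 7) = 10!/(3!7!) x 0.6^7 x 0.4^3 ~ 0.215
print(binom.cdf(7))    # P(X <= 7)
print(binom.mean())    # E(X) = np = 6
print(binom.var())     # Var(X) = np(1-p) = 2.4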
Poisson distribution
definition: used to model the occurrence of events over time
properties:
f(x) = P(X = x) = (v^x*e^(-v))/x!
v —> the average or expected number of events in the interval
x —> the number of events (successes) in the interval
continuous probability distribution
uniform distribution
definition:
the probabilities for all possible outcomes are equal
graph:
probability density function: f(x) = 1/(b - a) for a <= x <= b; 0 otherwise
cumulative distribution function: F(x) = 0 for x <= a; (x - a)/(b - a) for a < x < b; 1 for x >= b
properties
E(x) =(a + b)/2 Var(X) = (b - a)^2/12
for all a <= x1 < x2 <= b: P(x1 <= X <= x2) = (x2 - x1)/(b - a)
standard uniform distribution: a = 0, b = 1
normal distribution
properties:
completely described by mean and variance
X~N(mean, variance)
Skewness = 0, kurtosis = 3
1. linear combination of independent normal distributed random variables is also normally distributed
2. probabilities decrease further from the mean. But the tails go on forever
commonly used confidence intervals
68% confidence interval is [X - 1std, X + 1std]
90% confidence interval is [X - 1.65std, X + 1.65std]
95% confidence interval is [X - 1.96std, X + 1.96std]
98% confidence interval is [X - 2.33std, X + 2.33std]
99% confidence interval is [X - 2.58std, X + 2.58std]
standardization: a normal distribution with mean = 0 and std = 1 is the standard normal distribution; standardize with Z = (X - mean)/std
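The reliability factors above can be recovered from the standard normal inverse CDF; a quick scipy sketch:

from scipy import stats

# for a (1 - alpha) two-sided interval the multiplier is z(alpha/2)
for conf in (0.68, 0.90, 0.95, 0.98, 0.99):
    z = stats.norm.ppf(0.5 + conf / 2)
    print(f"{conf:.0%}: +/- {z:.2f} std")   # ~1.00, 1.64, 1.96, 2.33, 2.58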
lognormal distribution
definition:
if lnX is normal, then X is lognormal
if Y is normal, then e^Y is lognormal
chart
Right skewed
Bounded from below by zero
Sampling distribution
student distribution
definition:
if Z is a standard normal variable and U is a chi-square variable with k degrees of freedom, independent of Z, then the random variable X below follows a t-distribution with k degrees of freedom
X = Z/sqrt(U/K)
Z: standard normal variable
U: chi-square variable
K: degree of freedom
tips:
chi-square variable could be the sum of squares
Y = S1^2+…+Sn^2
where S1,…, Sn are independent standard normal random variables
properties:
1. symmetrical (bell shaped), skewness = 0
2. defined by a single parameter: degrees of freedom (df), and df = n - 1, where n is the sample size
3. comparison with normal distribution
fatter tails
as df increases, the t-distribution approaches the standard normal distribution
given a degree of confidence, the t-distribution has a wider confidence interval
as df increases, the t-distribution becomes more peaked with thinner tails, which means smaller probabilities for extreme values
Chi-Square (x2) distribution
definition:
if we have k independent standard normal variables, Z1, ..., Zk, then the sum of their squares, S, has a chi-square distribution
S = Z1^2 + ... + Zk^2
k is the degrees of freedom (df = n - 1 when sampling)
properties
asymmetrical, bounded below by zero
as df increases, it converges to the normal distribution
the sum of two independent chi-square variables with k1 and k2 degrees of freedom follows a chi-square distribution with k1 + k2 degrees of freedom
F-distribution
definition:
if U1 and U2 are two independent chi-square variables with k1 and k2 degrees of freedom, then X below follows an F-distribution
X = (U1/K1)/(U2/K2)
properties
as k1 and k2 approach infinity, the F-distribution approaches the normal distribution
if X follows t(k), then X^2 follows an F-distribution: X^2 ~ F(1, k)
when sampling, the degrees of freedom are n1 - 1 and n2 - 1
degrees of freedom
df = N - 1
df = degrees of freedom
N = sample size
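A sketch verifying two of the stated relationships numerically with scipy (k = 12 is an arbitrary choice): the square of a t(k) critical value equals the F(1, k) critical value, and t quantiles approach standard normal quantiles as df grows:

from scipy import stats

k = 12
t_crit = stats.t.ppf(0.975, df=k)           # two-tailed 5% critical value of t(k)
f_crit = stats.f.ppf(0.95, dfn=1, dfd=k)    # one-tailed 5% critical value of F(1, k)
print(t_crit**2, f_crit)                    # both ~4.747: X^2 ~ F(1, k)

print(stats.t.ppf(0.975, df=5),             # 2.57: fatter tails than the normal
      stats.t.ppf(0.975, df=1000),          # ~1.96: close to the normal value
      stats.norm.ppf(0.975))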
chapter 5 confidence interval and hypothesis testing
1. point estimation
statistical inference:
making forecasts, estimates or judgments about a population from the sample actually drawn from that population
draw a sample from the sampling population
a sample statistic uses the sample to estimate the population parameter
(with the full population, the parameter could be computed directly)
sample mean & Sample variance
sample mean: x-bar = (the sum of Xi from i=1 to n)/n
E(x-bar) = population mean
Var(x-bar) = population variance / n
sample variance: s^2 = the sum of (Xi - x-bar)^2 from i=1 to n, divided by n - 1
central limit theorem:
assumptions: simple random sample (i.e., i.i.d.), finite nonzero variance, sample size > 30
conclusion: the sample mean is approximately normal, x-bar ~ N(mean, variance/n)
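A simulation sketch of the theorem: sample means of an i.i.d. skewed (exponential) population are approximately normal with mean mu and variance sigma^2/n; the scale and sample sizes are arbitrary choices:

import numpy as np

rng = np.random.default_rng(2)
n, trials = 50, 20_000
samples = rng.exponential(scale=2.0, size=(trials, n))   # mu = 2, sigma^2 = 4
means = samples.mean(axis=1)                             # 20,000 sample means

print(means.mean())        # ~2.0, the population mean
print(means.var(ddof=1))   # ~4/50 = 0.08, population variance / n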
properties of estimator:
unbiased:
the expected value of the estimate equals the parameter
efficient (best):
the variance of the estimator is the smallest among all unbiased estimators
consistent: as n increases, the estimate converges to the true parameter value
linearity
- confidence interval
point estimate +or- reliability factor x standard error
known population variance
x +or- Z a/2 sigma/sqrt(n)
unknown population variance
x +or- t a/2 x s/sqrt(n), with n - 1 degrees of freedom
CI reliability factors with known and unknown population variance:
normal distribution, known variance: z-statistic (small sample, n < 30, or large sample)
normal distribution, unknown variance: t-statistic (small or large sample)
non-normal distribution, known variance: not available for small samples; z-statistic for large samples
non-normal distribution, unknown variance: not available for small samples; t-statistic for large samples
factors affecting the width of the confidence interval (effect on a z-interval / on a t-interval):
larger alpha: smaller / smaller
larger n: smaller / smaller
larger df: N/A / smaller
larger s: larger / larger
z-distribution
(x-mean) / standard deviation
Hypothesis test
Null hypothesis —–> Ho
Alternative hypothesis —–> Ha
we usually put the result we want to establish in the alternative hypothesis
one tail test vs. two tailed test
type I error vs. type II error
type I error
rejecting null hypothesis when it is true
the probability of making a type I error is equal to alpha, also known as the significance level of the test
type II error
failing to reject the null hypothesis when it is false
the probability of making a type II error is equal to beta
power of the test: the probability of rejecting the null hypothesis when it is false, equal to 1 - beta
test of population mean and variance
summary of hypothesis testing
1. mean hypothesis testing:
1.1 normally distributed population, known population variance
mean = mean0 (mean of null hypothesis)
Z = (sample mean - mean0) / [std/sqrt(sample size)]
tip: std = sqrt(population variance)
follows the standard normal distribution N(0, 1)
1.2 normally distributed population, unknown population variance
mean = mean0 (mean of null hypothesis)
t = (sample mean - mean0) / [s/sqrt(sample size)]
tip: s = sample standard deviation
follows a t-distribution with n - 1 degrees of freedom
2. variance hypothesis testing
2.1 normally distributed population
variance = variance0 (variance of null hypothesis)
X^2 = (n - 1) x sample variance / variance0
follows a chi-square distribution with n - 1 degrees of freedom
2.2 two independent normally distributed populations
variance of the first population = variance of the second population
F = sample variance of the first / sample variance of the second
follows F(n1 - 1, n2 - 1); the F-distribution is the ratio of two independent chi-square variables, each divided by its degrees of freedom
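A sketch of case 1.2 (unknown population variance) on simulated data, using scipy's one-sample t-test and then the formula above by hand; the true mean of 0.2 is a hypothetical choice that makes H0: mu = 0 false:

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.normal(0.2, 1.0, size=40)

t_stat, p_value = stats.ttest_1samp(x, popmean=0.0)
print(t_stat, p_value)    # reject H0 if p_value <= alpha

t_manual = (x.mean() - 0.0) / (x.std(ddof=1) / np.sqrt(len(x)))
print(t_manual)           # matches t_stat (df = n - 1 = 39)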
decision rule
1. p-value
definition: the smallest significance level at which the null hypothesis can be rejected
decision rule: reject the null hypothesis if p-value <= alpha (the rule is the same for one- and two-tailed tests)
2. if the test statistic > critical value, reject the null hypothesis
Chapter 6: Linear regression with one regressor
1. regression equation
population
Yi = Beta0 + Beta1Xi +ui
Y: dependent (explained) variable, the regressand
X: independent (explanatory) variable, the regressor
Beta0: regression intercept term
Beta1: regression slope coefficient
ui: error term (residual term)
sample
- Ordinary least Square (OLS)
assumption
E(ui|xi) = 0
all (X, Y) observations are independent and identically distributed (i.i.d.)
large outliers are unlikely
principle
minimize the sum of squared residuals (error terms)
formula
beta1 = Cov(X, Y)/Var(X)
beta0 = mean(Y) - beta1 x mean(X)
because the regression line always passes through (mean(X), mean(Y))
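A sketch computing the two formulas directly on simulated data (the true values beta0 = 1.5 and beta1 = 2.0 are hypothetical):

import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=200)
y = 1.5 + 2.0 * x + rng.normal(scale=0.5, size=200)

beta1 = np.cov(x, y)[0, 1] / np.var(x, ddof=1)   # Cov(X, Y)/Var(X)
beta0 = y.mean() - beta1 * x.mean()              # line passes through the means
print(beta0, beta1)   # ~1.5 and ~2.0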
- measure of fit
coefficient of determination (R^2)
R^2 = ESS/TSS = 1 - SSR/TSS
summing over i = 1 to n:
Total Sum of Squares (TSS): TSS = sum[(Yi - mean of Y)^2]
Explained Sum of Squares (ESS): ESS = sum[(fitted Yi - mean of Y)^2]
Sum of Squared Residuals (SSR): SSR = sum[(Yi - fitted Yi)^2]
characteristic:
R^2 ranges between 0 and 1; values near 1 indicate that X is good at predicting Y
for one independent variable: R^2 = [p(X,Y)]^2
standard error of regression
identification:
an estimator of the standard deviation of the regression error ui
formula: SER = sqrt(SSR/[n-2]) = sqrt(the sum of ui^2/[n-2])
judgement:
the smaller this measure, the better
chapter 7: testing hypothesis and confidence intervals in single regression
- testing hypothesis about coefficient and confidence interval
null hypothesis and alternative hypothesis
H0: beta1 = beta1,0 (the hypothesized value)
if beta1,0 = 0, this is a significance test
t-statistic:
t = (estimated beta1 - beta1,0)/SE(estimated beta1), with n - 2 degrees of freedom
decision rule:
reject H0 if t-statistic > t critical or t-statistic < -t critical
p-value < alpha
the meaning of rejecting H0
the regression coefficient is different from beta1,0, given a significance level alpha
common format for regression result
test score = 698.9 - 2.28 ClassSize
R^2 = 0.051 SER =18.6
a low R^2 does not by itself imply that the regression is good or bad, but it tells us that other important factors also influence the dependent variable
- Binary /Dummy/indicator variable
identification: it takes on only two values, 0 or 1
formula: Yi = Beta0 + Beta1Di + ui, where Di = 0 or 1
Beta0 indicates E(Y|Di = 0)
Beta1 indicates E(Y|Di = 1) - E(Y|Di = 0)
- Homoscedasticity and heteroskedasticity
Homoscedasticity
Var (ui|X) = σ^2
This means that the variance of the error term ui is the same regardless of the predictor variable X.
- Homoskedasticity occurs when the variance of the error term in a regression model is constant.
- If the error variance is homoskedastic, the model is well specified; if the error variance varies too much, the model may be poorly specified.
- Adding additional predictor variables can help explain the performance of the dependent variable.
- Conversely, heteroskedasticity occurs when the variance of the error term is not constant.
heteroskedasticity
If Homoscedasticity is violated,
e.g. if Var(ui|X) = σ^2(X), a function of X, then we say the error term is heteroskedastic.
consequences
1. the OLS estimator is still unbiased, consistent, and asymptotically normal, but it is no longer efficient
2. it distorts the standard errors of the coefficients
if the estimated standard error is too small, the t-statistic is too large, raising the chance of a Type I error
if the estimated standard error is too large, the t-statistic is too small, raising the chance of a Type II error
how to deal
calculate robust standard errors
use weighted least square (WLS)
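A sketch of both remedies with statsmodels on simulated heteroskedastic data (the error variance growing with x, and the 1/x^2 WLS weights, are illustrative assumptions):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
x = rng.uniform(1, 5, 300)
y = 1.0 + 2.0 * x + rng.normal(scale=x)          # error std grows with x

X = sm.add_constant(x)
ols = sm.OLS(y, X).fit()                         # classic standard errors
robust = sm.OLS(y, X).fit(cov_type="HC1")        # heteroskedasticity-robust SEs
wls = sm.WLS(y, X, weights=1.0 / x**2).fit()     # WLS, weights = 1/variance
print(ols.bse, robust.bse, wls.bse)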
Gauss-Markov Theorem
identification
if the three least squares assumptions hold and the error is homoscedastic, then the OLS estimator is the best linear conditionally unbiased estimator (BLUE)
limitation
its conditions might not hold in practice
there exist non-linear, conditionally unbiased estimators that are more efficient than OLS
if extreme outliers are not rare, use least absolute deviations (LAD)
In statistics, the Gauss-Markov theorem states that the ordinary least squares (OLS) estimator has the lowest sampling variance within the class of linear unbiased estimators, if the errors in the linear regression model are uncorrelated, have equal variances, and have an expectation value of zero. The errors do not need to be normal, nor do they need to be independent and identically distributed (only uncorrelated with mean zero and homoscedastic with finite variance). The requirement that the estimator be unbiased cannot be dropped, since biased estimators exist with lower variance. See, for example, the James-Stein estimator (which also drops linearity) or ridge regression.
Zero mean so that the noise does not present a net disturbance to the system. There’s as much positive noise as negative, so they cancel out in the long run. If the mean were not zero, then the noise would appear as an additional dynamic.
The Gauss-Markov assumptions concern the set of error random variables, ei:
They have mean zero: E[ei] = 0
They are homoscedastic, that is, all have the same finite variance: Var(ei) = σ^2 < infinity for all i
Distinct error terms are uncorrelated: Cov(ei, ej) = 0 for i != j
Chapter 8: linear regression with multiple regressors
1. omitted variable
identification
1. the omitted variable is correlated with the movement of the independent variable in model
2. the omitted variable is determinant of the independent variable
在统计中,当统计模型遗漏一个或多个相关变量时,会发生遗漏变量偏差(OVB)。 偏差导致模型将缺失变量的影响归因于所包含的变量。
更具体地说,OVB是在回归分析的参数估计中出现的偏差,当假定的规范不正确时,因为它忽略了一个自变量,该自变量是因变量的决定因素,并且与一个或多个包含的自变量相关 变量。
impact
- the assumption E(ui|xi) = 0 does not hold because Cov(ui, xi) != 0
- the OLS estimator is biased, and the bias does not vanish even in large samples
- the larger |p| (the correlation between the regressor and the omitted variable), the larger the bias
solution
add the omitted variable to the model
- multiple regression
formula
Yi = Beta0 + Beta1X1i + Beta2X2i + … + BetakXki + ui
the population regression line
E(Y|X) = beta0 + beta1X1i + Beta2X2i + … + BetakXki
the intercept term is the expected value of Yi when all Xki = 0
partial effect: beta1 = deltaY/deltaX1, holding X2, ..., Xk constant (the control variables)
Homoscedastic
var(ui|X1i, …, Xki) is constant
the OLS method can still be used
- multiple regression assumption
E(ui|X1i, …, Xki) = 0
(X1i,…,Xki, Yi), i = 1,…,n are independently and identically distributed (i.i.d.)
large outliers are unlikely
there is no perfect multicollinearity
- Multicollinearity
perfect multicollinearity
identification: one of the independent variables is a perfect linear combination of the other independent variables
impact: produces division by zero in the OLS estimates
example: the dummy variable trap:
without beta0 --> use N dummy variables
with beta0 --> use N - 1 dummy variables
imperfect multicollinearity
identification: two or more independent variables are highly correlated but not perfectly correlated
impact: does not pose any problems for the OLS estimators (they are still unbiased), but the estimates have high variance
methods to detect:
a t-test indicates that none of the individual coefficients is significantly different from zero, while the F-test indicates overall significance and R^2 is high;
the absolute value of the sample correlation is greater than 0.7
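A simulation sketch of these symptoms (all data hypothetical): two nearly collinear regressors yield small individual t-statistics but a large F-statistic, a high R^2, and a pairwise correlation far above 0.7:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
x1 = rng.normal(size=100)
x2 = x1 + 0.05 * rng.normal(size=100)            # almost a copy of x1
y = 1.0 + 2.0 * (x1 + x2) + rng.normal(size=100)

res = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
print(np.corrcoef(x1, x2)[0, 1])    # ~0.999, well above 0.7
print(res.tvalues[1:])              # individual slopes: mostly insignificant
print(res.fvalue, res.rsquared)     # overall F: large; R^2: high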
- measure of Fit
standard error of regression (SER)
the standard error of the regression provides an absolute measure of the typical distance of the data points from the regression line, in the units of the dependent variable
R-squared provides a relative measure: the percentage of the variance of the dependent variable explained by the model, ranging from 0 to 100%
SER = sqrt(SSR/(n - k - 1)) = sqrt(sum(e^2)/(n - k - 1))
(with k = 1 regressor this reduces to the chapter 6 formula with n - 2)
Sum of Squared Residuals (SSR): SSR = sum[(Yi - fitted Yi)^2]
tip: k = number of independent variables
coefficient of determination (R^2)
R^2 = ESS/TSS = 1 - SSR/TSS
summing over i = 1 to n:
Total Sum of Squares (TSS): TSS = sum[(Yi - mean of Y)^2]
Explained Sum of Squares (ESS): ESS = sum[(fitted Yi - mean of Y)^2]
R^2 increases whenever a regressor is added, unless the estimated coefficient on the added regressor is exactly zero
adjusted R^2
formula: adjusted R^2 = 1 - [(n-1)/(n-k-1) x (1 - R^2)] = 1 - [(n-1)/(n-k-1) x (SSR/TSS)]
nature:
- adjusted R^2 <= R^2
- adjusted R^2 can be negative
- adding a regressor has two opposing effects, so adjusted R^2 can increase or decrease
the R^2 or adjusted R^2 does not tell us whether:
- an included variable is statistically significant
- the regressors are a true cause of the movement in the dependent variable
- there is omitted variable bias
- you have chosen the most appropriate set of regressors
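A sketch computing R^2 and adjusted R^2 from the sums of squares per the formulas above; the fitted values here are stand-ins, not from a real regression:

import numpy as np

def r2_and_adjusted(y, y_hat, k):
    tss = np.sum((y - y.mean()) ** 2)        # total sum of squares
    ssr = np.sum((y - y_hat) ** 2)           # sum of squared residuals
    n = len(y)
    r2 = 1 - ssr / tss
    adj_r2 = 1 - (n - 1) / (n - k - 1) * (ssr / tss)
    return r2, adj_r2

rng = np.random.default_rng(7)
y = rng.normal(size=50)
y_hat = y + rng.normal(scale=0.5, size=50)   # hypothetical fitted values
print(r2_and_adjusted(y, y_hat, k=3))        # adjusted R^2 <= R^2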
chapter 9: hypothesis tests and confidence intervals in multiple regression
tips: t-statistic = (estimated regression coefficient - value of estimate under H0)/standard error of estimated coefficient;
tips: the t-statistic has n - k - 1 degrees of freedom, where k = number of independent variables (i.e., the number of regressors in the multiple regression)
confidence interval for a single coefficient
the confidence interval (CI) for a regression coefficient in multiple regression is calculated and interpreted the same way as in simple linear regression
CI = estimated regression coefficient + or - critical t-value x standard error of regression coefficient
joint hypothesis (F-test)
in a multiple regression, we cannot test the null hypothesis that all slope coefficients equal 0 using t-tests on each individual slope coefficient
why? individual tests do not account for the effects of interactions among the independent variables
for this reason, we conduct the F-test
the F-statistic, which is always a one-tailed test, is calculated as:
F = (ESS/k)/(SSR/[n-k-1])
n = number of observations
k = number of independent variables
ESS = explained sum of squares
SSR = sum of squared residuals
identification:
a hypothesis that imposes two or more restrictions on the regression
null hypothesis and alternative hypothesis
H0: beta1 = beta2 = beta3 = … = betak = 0
Ha: at least one Betaj != 0
the test assesses the effectiveness of the model as a whole in explaining the dependent variable
use the F-test with n - k - 1 denominator degrees of freedom; it cannot be replaced by separate t-tests
classification of F-statistics
with q = 1 restriction: F is the square of the t-statistic
with q = 2 restrictions: F = 1/2 x (t1^2 + t2^2 - 2 x p(t1,t2) x t1 x t2) / (1 - p(t1,t2)^2), where p(t1,t2) is the correlation between the two t-statistics
with q restrictions (valid only when the errors are homoskedastic):
F = [(SSR[restricted] - SSR[unrestricted])/q] / [SSR[unrestricted]/(n - k[unrestricted] - 1)]
F = [(R[unrestricted]^2 - R[restricted]^2)/q] / [(1 - R[unrestricted]^2)/(n - k[unrestricted] - 1)]
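A sketch of the R^2 version of the homoskedasticity-only F-statistic with hypothetical inputs, compared against the F critical value:

from scipy import stats

r2_u, r2_r = 0.50, 0.40   # unrestricted and restricted R^2 (hypothetical)
n, k_u, q = 100, 4, 2     # observations, unrestricted regressors, restrictions

f_stat = ((r2_u - r2_r) / q) / ((1 - r2_u) / (n - k_u - 1))
print(f_stat)                                     # 9.5
print(stats.f.ppf(0.95, dfn=q, dfd=n - k_u - 1))  # ~3.09: reject the restrictions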
R^2
to determine the accuracy with which the OLS regression line fits the data, we apply the coefficient of determination and the regression's standard error
the coefficient of determination, represented by R^2, is a measure of “goodness of fit” of the regression
it is interpreted as the percentage of variation in the dependent variable explained by the independent variables
R^2 = (total variation - unexplained variation)/total variation
adjusted R^2
However, R^2 is not a reliable indicator of the explanatory power of a multiple regression model
why? R^2 almost always increases as new independent variables are added to the model, even if the marginal contribution of the new variable is not statistically significant
thus, a high R^2 may reflect the impact of a large set of independent variables rather than how well the set explains the dependent variable
https://analystprep.com/study-notes/frm/part-1/hypothesis-tests-and-confidence-intervals-in-multiple-regression/
chapter 10: modeling trend
linear trend
Tt = Beta0 + Beta1 x TIMEt (it is a straight line)
non-linear trend:
quadratic trend:
Tt = Beta0 + Beta1 x TIMEt + Beta2 x TIMEt^2
= Beta2 x (TIMEt + Beta1/(2 x Beta2))^2 + C
monotonic when beta1 and beta2 share the same sign (beta1 > 0, beta2 > 0, or beta1 < 0, beta2 < 0)
log-linear trend:
Tt = beta0 x exp(Beta1 x TIMEt)
ln(Tt) = ln(beta0) + Beta1 x TIMEt
the growth rate is constant
estimating & forecasting
estimating:
(beta0-hat, beta1-hat) = argmin sum(yt - Beta0 - Beta1 x TIMEt)^2
forecasting:
y[T+ h] = Beta0 + Beta1 x TIME[T + h] + error[T+h]
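A sketch fitting the linear trend by least squares (np.polyfit solves the argmin above) and forecasting h steps ahead; the simulated series and horizon are arbitrary choices:

import numpy as np

T, h = 100, 12
time = np.arange(1, T + 1)
rng = np.random.default_rng(8)
y = 10 + 0.5 * time + rng.normal(scale=3, size=T)   # true trend: 10 + 0.5t

beta1, beta0 = np.polyfit(time, y, deg=1)   # polyfit returns highest degree first
forecast = beta0 + beta1 * (T + h)          # point forecast for period T + h
print(beta0, beta1, forecast)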
forecast model selection criteria
mean squared error (MSE)
MSE is related to two other diagnostics looked at previously, the sum of squared residuals (SSR) and the coefficient of determination:
MSE = SSR / T, where T = sample size
R^2 = 1 - SSR/TSS
tip: the following are penalized forms of the MSE ("penalty times MSE"); e = residual; T = sample size; k = degrees of freedom used
unbiased estimate of the MSE —– S^2
this is a degree-of-freedom penalty
S^2 = sum(e^2)/(T - k)
Akaike information criterion (AIC)
AIC = e^(2k/T) x sum(e^2)/T
Schwarz information criterion (SIC)
SIC = T^(k/T) x sum(e^2)/T
consistency
two cases are required for a model selection criterion to be considered consistent, based on whether the true model is included among the regression models being considered:
1. when the true model or data-generating process (DGP) is one of the defined regression models, the probability of selecting the true model approaches one as the sample size increases
2. when the true model is not one of the defined regression models being considered, the probability of selecting the best approximation model approaches one as the sample size increases
MSE --> S^2 --> AIC --> SIC (consistent)
the degree-of-freedom penalty increases in that order
the AIC is asymptotically efficient
degree-of-freedom penalties: various model selection criteria
for all of these criteria, smaller is better
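A sketch computing all four criteria from a residual vector, following the formulas above (T observations, k estimated parameters; the residuals are simulated placeholders):

import numpy as np

def selection_criteria(e, k):
    T = len(e)
    sse = np.sum(e ** 2)               # sum of squared residuals
    mse = sse / T
    s2 = sse / (T - k)                 # degree-of-freedom penalty
    aic = np.exp(2 * k / T) * mse      # Akaike information criterion
    sic = T ** (k / T) * mse           # Schwarz information criterion
    return mse, s2, aic, sic

rng = np.random.default_rng(9)
print(selection_criteria(rng.normal(size=80), k=4))   # smaller is better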
https://analystprep.com/study-notes/frm/part-1/modeling-and-forecasting-trend/
chapter 11: modeling and forecasting seasonality
definition: a seasonal pattern repeats itself every year
sources: weather, preference, social institution
how to deal with it:
macroeconomics: focus on the non-seasonal fluctuations (remove seasonality)
business forecasting: model the seasonal fluctuations
modeling:
yt = beta1 x TIMEt + sum(gamma_i x Dit) + errort
beware of multicollinearity among the dummy variables (see the sketch below):
with an intercept: use S - 1 seasonal dummies
without an intercept: use S seasonal dummies
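A sketch of the model above for quarterly data (S = 4): a time trend plus all S seasonal dummies and no intercept, fitted by least squares on a simulated series with hypothetical seasonal factors:

import numpy as np

T, S = 80, 4
time = np.arange(1, T + 1)
season = (time - 1) % S                      # 0..3, repeating each year
dummies = np.eye(S)[season]                  # D_it: one column per quarter
rng = np.random.default_rng(10)
y = 0.3 * time + np.array([5.0, 2.0, -1.0, 4.0])[season] + rng.normal(size=T)

X = np.column_stack([time, dummies])         # no intercept, so keep all S dummies
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)   # ~[0.3, 5, 2, -1, 4]: trend slope, then seasonal factors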
calendar effect :
holiday variation
trading day variation
https://analystprep.com/study-notes/frm/part-1/modeling-and-forecasting-seasonality/
chapter 12: cycles and modeling cycles: MA, AR, and ARMA models
https://analystprep.com/study-notes/frm/part-1/modeling-cycles-ma-ar-and-arma-models/
chapter 13: volatility
https://video.search.yahoo.com/yhs/search?fr=yhs-iba-1&hsimp=yhs1&hspart=iba&p=Book+2+volatility+in+FRM#id=1&vid=08caa00f5c3b92940c0e7ab7644337bd&action=click
chapter 14: correlation and copulas
https://analystprep.com/study-notes/frm/part-1/correlations-and-copulas/
chapter 15: simulation methods
https://www.youtube.com/watch?v=imxpWMAZMj4