FRM Level 1 Part 2 Flashcards

1
Q

chapter 1: probability
1. random event and probability

basic concept of probability

A
  1. outcome and sample space
2. relationships among events
mutually exclusive events
exhaustive events
independent events
the occurrence of B has no influence on the occurrence of A
2
Q

the types of probability

A

Joint probability is the probability of two events occurring simultaneously.
Marginal probability is the probability of an event irrespective of the outcome of another variable.
Conditional probability is the probability of one event occurring in the presence of a second event.

unconditional probability
p(A)

conditional probability
p(A|B)

joint probability
p(AB)

3
Q

two important rules

A

multiplication rule
p(AB) = p(A|B) x p(B)

if A and B are independent:
p(AB) = p(A) x p(B)

addition rule
p(A+B) = p(A) + p(B) - p(AB)

if A and B are mutually exclusive:
p(A+B) = p(A) + p(B)
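A minimal Python sketch of both rules on a toy example (all probabilities below are hypothetical numbers, not from the card):

```python
# Toy example: assumed values P(A) = 0.5, P(B) = 0.4, P(AB) = 0.2.
p_a, p_b, p_ab = 0.5, 0.4, 0.2

# Multiplication rule: P(AB) = P(A|B) * P(B)
p_a_given_b = p_ab / p_b
assert abs(p_a_given_b * p_b - p_ab) < 1e-12

# Here P(AB) = P(A) * P(B), so A and B happen to be independent.
independent = abs(p_ab - p_a * p_b) < 1e-12

# Addition rule: P(A + B) = P(A) + P(B) - P(AB)
p_a_or_b = p_a + p_b - p_ab
```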

4
Q
  1. discrete and continuous random variable
A

discrete random variable
number of possible outcomes can be counted

continuous random variable
it can take on any value within a given finite or infinite range
P(X=x) = 0 even though the event X = x can occur

probability density function (PDF):
discrete random variable: the probability that a discrete random variable will take on the value x

continuous random variable
the PDF is f(x), the function value corresponding to X
P(x1 <= X <= x2) is the area under the PDF over the interval [x1, x2]

cumulative distribution function
concept: the probability that a random variable will be less than or equal to a given value ----> F(x) = P(X <= x)
characteristics:
monotonically increasing
bounded: F(x) approaches 0 as x approaches negative infinity, and 1 as x approaches positive infinity
P(a < X <= b) = F(b) - F(a)
5
Q

Chapter 2 Bayesian Analysis

A

Total probability theorem
if A1, …, An are mutually exclusive and exhaustive
p(B) = the sum of p(Aj)p(B|Aj) from j = 1 to n

6
Q

Bayes’ Theorem

A

p(A|B) = p(B|A)/p(B) x p(A)

p(A|B): updated (posterior) probability
p(A): prior probability
p(B) is computed with the total probability theorem
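A minimal Bayes-update sketch in Python; the scenario and all numbers (prior, sensitivity, false-positive rate) are hypothetical:

```python
# Hypothetical setup: condition A with prior P(A) = 0.01, a signal B with
# P(B|A) = 0.95 and P(B|not A) = 0.05.
p_a = 0.01
p_b_given_a = 0.95
p_b_given_not_a = 0.05

# Total probability theorem: P(B) = P(B|A)P(A) + P(B|~A)P(~A)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes' theorem: P(A|B) = P(B|A)/P(B) * P(A)  (posterior probability)
p_a_given_b = p_b_given_a / p_b * p_a
```

Even with a strong signal, the small prior keeps the posterior modest (about 0.16 here).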

7
Q

3. basic statistics

A
arithmetic mean 
population mean 
mu = (the sum of Xi from i=1 to N)/N
sample mean
x = (the sum of Xi from i=1 to n)/n

median
the middle item of a set of items sorted into ascending or descending order
n odd: item (n+1)/2; n even: average of items n/2 and n/2 + 1

mode
most frequently occurring value of the distribution

expected value 
definition 
E(x) = X1*p(X=x1)+...+Xn*p(X=xn)
properties:
if c is any constant, then E(cx + a) = cE(x) + a
E(X+ Y) = E(X) + E(Y)
if X and Y are independent random variables, then 
E(XY) = E(X)*E(Y)
E(X^2) != [E(X)]^2
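A minimal Python check of these expectation properties on a hypothetical pmf:

```python
# Hypothetical discrete random variable: outcomes xs with probabilities ps.
xs = [1, 2, 3]
ps = [0.2, 0.5, 0.3]

# E(X) = x1*P(X=x1) + ... + xn*P(X=xn)
e_x = sum(x * p for x, p in zip(xs, ps))

# Linearity: E(cX + a) = c*E(X) + a
c, a = 3.0, 1.0
e_cx_a = sum((c * x + a) * p for x, p in zip(xs, ps))
assert abs(e_cx_a - (c * e_x + a)) < 1e-12

# E(X^2) != [E(X)]^2 in general (their difference is the variance)
e_x2 = sum(x * x * p for x, p in zip(xs, ps))
```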
8
Q
  1. dispersion
A

variance for data:
population variance
sample variance

standard deviation
population standard deviation
sample standard deviation

variance for random variable 
formula: var(x) = E[(X - mu)^2] = E(X^2) - [E(X)]^2
properties:
if c is any constant 
Var(X+c) = Var(X)
Var(cX) = c^2 *var(X)

in general,
Var(X+Y) = Var(X) + Var(Y) + 2Cov(X,Y)
Var(X-Y) = Var(X) + Var(Y) - 2Cov(X,Y)
if X and Y are independent, Cov(X,Y) = 0, so Var(X+Y) = Var(X-Y) = Var(X) + Var(Y)

square-root rule (Baumol model):
at the start of the period the bond holding is (n-1)Y/n and the cash holding is Y/n; at the end of the period the bond holding is 0 and the cash holding is Y/n. The average bond holding is therefore (n-1)Y/2n. Consider the following maximization (maximize the interest earned on bonds, with the number of conversions n as the control variable):
Max (n-1)Yr/2n - nb
the first-order condition is Yr/2n^2 - b = 0,
which gives n = sqrt(Yr/2b)
the average cash holding is Y/2n; substituting the result above gives
Md = sqrt(Yb/2r), which is the mathematical statement of the square-root rule and the solution of the Baumol model

covariance
definition:
the relationship between the deviations of two variables
Cov(X, Y) = E{[X - E(X)][Y - E(Y)]} = E(XY) - E(X)E(Y)

properties
1. Cov(X, Y) ranges from negative infinity to positive infinity
2. if X and Y are independent, then E(XY) = E(X)E(Y) and Cov(X, Y) = 0
3. if X = Y, then Cov(X, X) = E{[X - E(X)][X - E(X)]} = Var(X)
4. Cov(a+bX, c+dY) = bd Cov(X, Y)
5. Var(w1X1 + w2X2) = w1^2 Var(X1) + w2^2 Var(X2) + 2 w1 w2 Cov(X1, X2)
where w1 and w2 are the weights of X1 and X2

correlation
definition:
linear relationship between two variables: p(x,y) = Cov(x,y)/[std(x) x std(y)]. p lies between -1 and +1 and has no units
properties
p = 0 indicates the absence of any linear relationship, but a non-linear relationship may still exist
the bigger the absolute value, the stronger the linear relationship

correlation coefficient with interpretation
p = +1: perfect positive linear correlation
0 < p < 1: positive linear correlation
p = 0: no linear correlation
-1 < p < 0: negative linear correlation
p = -1: perfect negative linear correlation
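A minimal covariance/correlation sketch in Python on hypothetical data chosen to be perfectly linear:

```python
# Two hypothetical series; ys is an exact positive linear function of xs.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

n = len(xs)
mx = sum(xs) / n
my = sum(ys) / n

# Sample covariance and variances (divisor n - 1)
cov_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
var_x = sum((x - mx) ** 2 for x in xs) / (n - 1)
var_y = sum((y - my) ** 2 for y in ys) / (n - 1)

# Correlation: Cov(x,y) / [std(x) * std(y)]; +1 for a perfect positive line
corr = cov_xy / (var_x ** 0.5 * var_y ** 0.5)
```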

9
Q
  1. Skewness
A

definition
how symmetrical the distribution is around the mean
skewness = E[(X - mu)^3]/std^3

properties
symmetrical distribution:
Skewness = 0

positively skewed distribution (right skew): Skewness > 0
outliers in the right tail ----------> mean > median > mode

negatively skewed distribution (left skew):
Skewness < 0
outliers in the left tail: many financial assets exhibit negative skew (more risky) ----------> mean < median < mode
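A small Python sketch computing population skewness on a hypothetical right-skewed sample:

```python
# Hypothetical sample with one outlier in the right tail.
xs = [1.0, 2.0, 2.0, 3.0, 10.0]

n = len(xs)
mu = sum(xs) / n
var = sum((x - mu) ** 2 for x in xs) / n
std = var ** 0.5

# skewness = E[(X - mu)^3] / std^3
skew = sum((x - mu) ** 3 for x in xs) / n / std ** 3

# For right-skewed data: skew > 0 and mean > median.
median = sorted(xs)[n // 2]
```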

10
Q
  1. Kurtosis
A

definition: the degree of weight placed on extreme points in the tails
Kurtosis = E[(X - mu)^4]/std^4

                kurtosis   excess kurtosis
leptokurtic     > 3        > 0
mesokurtic      = 3        = 0
platykurtic     < 3        < 0
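A minimal Python sketch of the kurtosis formula on a hypothetical symmetric sample:

```python
# Hypothetical symmetric sample with no heavy tails.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]

n = len(xs)
mu = sum(xs) / n
var = sum((x - mu) ** 2 for x in xs) / n

# Kurtosis = E[(X - mu)^4] / std^4 = E[(X - mu)^4] / var^2
kurt = sum((x - mu) ** 4 for x in xs) / n / var ** 2

# Excess kurtosis = kurtosis - 3; negative here (platykurtic).
excess = kurt - 3.0
```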
11
Q

chapter 4

1. discrete probability distribution

A

Bernoulli distribution:
definition: a trial produces one of two outcomes (success or failure)

properties
E(X) = p x 1 + (1-p) x 0 = p
Var(X) = p(1-p)

Binomial Distribution
definition: the distribution of a binomial random variable, defined as the number of successes in n Bernoulli trials

properties
the probability of success is constant for all trials
the trials are all independent
E(X) = np
Var(X) = np(1-p)
p(x) = P(X = x) = n!/[(n - x)!x!] * p^x (1-p)^(n-x)
as n increases and p -----> 0.5, the binomial approximates the normal distribution

Poisson distribution
definition: used to model the occurrence of events over time
properties:
f(x) = P(X = x) = (v^x*e^(-v))/x!
v —> the average or expected number of events in the interval
X —> the number of events in the interval
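The two pmfs on this card can be sketched directly from the formulas (example parameters are hypothetical):

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    # P(X = x) = C(n, x) * p^x * (1-p)^(n-x)
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    # P(X = x) = lam^x * e^(-lam) / x!
    return lam ** x * exp(-lam) / factorial(x)

# Sanity check of E(X) = n*p for hypothetical n = 10, p = 0.3.
n, p = 10, 0.3
mean = sum(x * binom_pmf(x, n, p) for x in range(n + 1))
```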

12
Q

continuous probability distribution

A

uniform distribution
definition:
the probabilities for all possible outcomes are equal

graph: 
probability density function
f(x)= 
1/(b-a) for  (a <= x <= b) 
0         for  otherwise 
cumulative probability function 
F(x) =
0                    for  x <= a 
(x - a)/(b - a)   for  a < x < b
1                     for  x >=  b

properties
E(X) = (a + b)/2, Var(X) = (b - a)^2/12
For a <= x1 <= x2 <= b: P(x1 <= X <= x2) = (x2 - x1)/(b - a)
standard uniform distribution: a = 0, b = 1

normal distribution
properties:
completely described by mean and variance
X~N(mean, variance)
Skewness = 0, kurtosis = 3
1. linear combination of independent normal distributed random variables is also normally distributed
2. probabilities decrease further from the mean. But the tails go on forever

some special data
68% confidence interval is [X - 1std, X + 1std]
90% confidence interval is [X - 1.65std, X + 1.65std]
95% confidence interval is [X - 1.96std, X + 1.96std]
98% confidence interval is [X - 2.33std, X + 2.33std]
99% confidence interval is [X - 2.58std, X + 2.58std]

normal distribution, when mean = 0 and std = 1, becomes the standard normal distribution (standardize with Z = (X - mean)/std)
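A minimal sketch of standardizing and the 95% interval from this card (the mean and std below are hypothetical):

```python
from statistics import NormalDist

# Hypothetical normal distribution: mean 100, std 15.
mu, sigma = 100.0, 15.0

# Standardize: Z = (X - mean) / std
z = (130.0 - mu) / sigma

# 95% interval from the card: mean +/- 1.96 * std
lo, hi = mu - 1.96 * sigma, mu + 1.96 * sigma

# The 1.96 factor is the standard normal 97.5% quantile.
q = NormalDist().inv_cdf(0.975)
```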

13
Q

lognormal distribution

A

definition:
if ln X is normally distributed, then X is lognormal
equivalently, if Y is normal, then e^Y is lognormal

chart
Right skewed
Bounded from below by zero

14
Q

Sampling distribution

A

Student’s t-distribution
definition:
if Z is a standard normal random variable and U is a chi-square variable with K degrees of freedom, independent of Z, then the random variable X below follows a t-distribution with K degrees of freedom

X = Z/sqrt(U/K)
Z: standard normal variable
U: chi-square variable
K: degree of freedom

tips:
chi-square variable could be the sum of squares
Y = S1^2+…+Sn^2
where S1,…, Sn are independent standard normal random variables

properties:
1. symmetrical (bell shaped), skewness = 0
2. defined by a single parameter: degrees of freedom (df), where df = n - 1 and n is the sample size
3. comparison with the normal distribution
fatter tails
as df increases, the t-distribution approaches the standard normal distribution
given a degree of confidence, the t-distribution has a wider confidence interval
as df increases, the t-distribution becomes more peaked with thinner tails, which means smaller probabilities for extreme values

Chi-Square (x2) distribution
definition:
If we have k independent standard normal variables Z1,…,Zk, then the sum of their squares, S, has a chi-square distribution
S = Z1^2 + … + Zk^2
k is the degree of freedom (df = n-1 when sampling)

properties
Asymmetrical, bounded below by zero
as df increases, it converges to the normal distribution
the sum of two independent chi-square variables, with k1 and k2 degrees of freedom, follows a chi-square distribution with k1 + k2 degrees of freedom

F-distribution
definition:
if U1 and U2 are two independent chi-square variables with K1 and K2 degrees of freedom, then X follows an F-distribution
X = (U1/K1)/(U2/K2)

properties
as K1 and K2 -----> infinity, the F-distribution approaches the normal distribution
if X follows t(k), then X^2 has an F-distribution: X^2 ~ F(1, k)
when sampling, the degrees of freedom are n1 - 1 and n2 - 1

15
Q

degrees of freedom

A

Df =N−1
Df = degree of freedom
N = sample size

16
Q

chapter 5: confidence intervals and hypothesis testing

1. point estimation

A

statistical inference:
making forecasts, estimates, or judgments about a population from a sample actually drawn from that population
a sample is drawn from the sampling population
a sample statistic uses the sample to estimate the population parameter

sample mean & sample variance
sample mean: x-bar = (the sum of Xi)/n
E(x-bar) = mean
Var(x-bar) = var/n
sample variance: s^2 = the sum of (Xi - x-bar)^2/(n-1)

central limit theorem:
assumptions: simple random sample (i.e., i.i.d.), finite non-zero variance, sample size > 30
conclusion: x-bar ~ N(mean, var/n)

properties of estimator:
unbiased:
the expected value of the estimate equals the parameter
efficient (best):
the variance of the estimator is the smallest among all unbiased estimators
consistent: the larger the sample size n, the more accurate the parameter estimate
linearity
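The sample mean, sample variance, and standard error of the mean can be sketched on a small hypothetical sample:

```python
# Hypothetical sample.
xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(xs)
mean = sum(xs) / n

# Unbiased sample variance: divisor n - 1.
s2 = sum((x - mean) ** 2 for x in xs) / (n - 1)

# var(x-bar) = sigma^2 / n is estimated by s^2 / n; its square root is
# the standard error of the sample mean.
se = (s2 / n) ** 0.5
```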

17
Q
  1. confidence interval
A

point estimate +or- reliability factor x standard error

known population variance
x +or- z(a/2) x sigma/sqrt(n)
unknown population variance
x +or- t(a/2) x s/sqrt(n)

CI with known and unknown population variance

sampling from a:                    reliability factor
distribution   variance   small n (<30)   large n (>=30)
normal         known      z-statistic     z-statistic
normal         unknown    t-statistic     t-statistic
non-normal     known      not available   z-statistic
non-normal     unknown    not available   t-statistic

factors affecting the width of the confidence interval

change in factor   z-distribution   t-distribution
larger alpha       smaller          smaller
larger n           smaller          smaller
larger df          N/A              smaller
larger s           larger           larger

z-distribution
(x-mean) / standard deviation

18
Q

Hypothesis test

A

Null hypothesis —–> Ho
Alternative hypothesis —–> Ha
we usually put the result we want to show in the alternative hypothesis

page 7
one tail test vs. two tailed test

type I error vs. type II error
type I error
rejecting null hypothesis when it is true
the probability of making a type I error is equal to alpha, also known as the level of significance of the test

type II error
failing to reject the null hypothesis when it is false
the probability of making a type II error is equal to beta
Power of the test: the probability of rejecting the null hypothesis when it is false, which equals 1 - beta

summarize page 7

test of population mean and variance
page 8

summary of hypothesis testing
1. mean hypothesis testing:

1.1 normally distributed population, known population variance
mean = mean0 (mean of null hypothesis)
Z = (sample mean - mean0) / [std/sqrt(sample size)]
tip: std = sqrt(population variance)
Z ~ N(0, 1), the standard normal distribution

1.2 normally distributed population, unknown population variance
mean = mean0 (mean of null hypothesis)
t = (sample mean - mean0) / [s/sqrt(sample size)]
tip: s = sample standard deviation
t ~ t(n-1), the t-distribution with n - 1 degrees of freedom

  2. variance hypothesis testing
    2.1 normally distributed population
    variance = variance0 (variance of null hypothesis)
    X^2 = (n-1) x sample variance / variance0
    X^2 ~ chi-square distribution with n - 1 degrees of freedom

2.2 two independent normally distributed populations
variance of the first population = variance of the second population
F = sample variance of the first / sample variance of the second
F(n1 - 1, n2 - 1): the F-distribution is the ratio of two independent chi-square distributions

decision rule
1. p-value
definition: the smallest significance level at which the null hypothesis can be rejected
decision rule: reject the null hypothesis if p-value <= alpha (the rule is the same for one- and two-tailed tests)
2. if the sample test statistic > critical value, reject the null hypothesis
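The z-test of case 1.1 and its p-value decision rule can be sketched in Python (all inputs hypothetical):

```python
from statistics import NormalDist

# Hypothetical data: sample mean 52, H0 mean 50, known sigma 5, n = 25.
sample_mean, mu0, sigma, n = 52.0, 50.0, 5.0, 25

# Z = (sample mean - mean0) / (sigma / sqrt(n))
z = (sample_mean - mu0) / (sigma / n ** 0.5)

# Two-tailed p-value: smallest alpha at which H0 is rejected.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

reject_at_5pct = p_value <= 0.05
```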

19
Q

Chapter 6: linear regression

1. regression equation

A

population
Yi = Beta0 + Beta1Xi +ui
Y: dependent (explained) variable, the regressand
X: independent (explanatory) variable, the regressor
Beta0: regression intercept term
Beta1: regression slope coefficient
ui: error term (residual term)

sample

20
Q
  1. Ordinary Least Squares (OLS)
A

assumptions
E(ui|Xi) = 0
all (X, Y) observations are independent and identically distributed (i.i.d.)
large outliers are unlikely

principle
minimize the sum of squared residuals

formula
beta1 = Cov(X, Y)/Var(X)
beta0 = Y-bar - beta1 * X-bar
because the fitted regression line always passes through (X-bar, Y-bar)
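The two OLS formulas can be sketched on a small hypothetical data set:

```python
# Hypothetical data close to y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.1, 8.0, 9.9]

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n

# beta1 = Cov(X, Y) / Var(X)
cov_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
var_x = sum((x - mx) ** 2 for x in xs) / (n - 1)
beta1 = cov_xy / var_x

# beta0 = Y-bar - beta1 * X-bar: the line passes through (X-bar, Y-bar)
beta0 = my - beta1 * mx
```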

21
Q
  1. measure of fit
A

coefficient of determination (R^2)
R^2 = ESS/TSS = 1 - SSR/TSS
summing over i = 1 to n:
Total Sum of Squares (TSS):
TSS = sum[(Yi - Y-bar)^2]
Explained Sum of Squares (ESS):
ESS = sum[(Yi-hat - Y-bar)^2]
Residual Sum of Squares (SSR):
SSR = sum[(Yi - Yi-hat)^2]

characteristic:
R^2 ranges between 0 and 1; near 1 indicates X is good at predicting Y
for one independent variable: R^2 = p(x,y)^2

standard error of regression
identification:
an estimator of the standard deviation of the regression error ui
formula: SER = sqrt(SSR/[n-2]) = sqrt(the sum of ui^2/[n-2])
judgement:
the smaller this measure, the better
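R^2 and the SER can be sketched from hypothetical observed and fitted values:

```python
# Hypothetical observed values and fitted values from some regression line.
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
y_hats = [2.8, 3.4, 4.0, 4.6, 5.2]

n = len(ys)
y_bar = sum(ys) / n

tss = sum((y - y_bar) ** 2 for y in ys)                      # total
ess = sum((yh - y_bar) ** 2 for yh in y_hats)                # explained
ssr = sum((y - yh) ** 2 for y, yh in zip(ys, y_hats))        # residual

# R^2 = ESS/TSS = 1 - SSR/TSS
r2 = 1 - ssr / tss

# SER = sqrt(SSR / (n - 2)) for a single-regressor model
ser = (ssr / (n - 2)) ** 0.5
```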

22
Q

chapter 7: testing hypotheses and confidence intervals in single regression

A
  1. testing hypotheses about a coefficient, and confidence intervals
    null hypothesis and alternative hypothesis
    H0: beta1 = beta1,0 (the hypothesized value)
    if beta1,0 = 0, this is a significance test

t-statistic:
t = (estimated beta1 - beta1,0)/SE(estimated beta1), with n - 2 degrees of freedom
decision rule:
reject H0 if t-statistic > t critical or t-statistic < -t critical
p-value < alpha
the meaning of rejecting H0
the regression coefficient is different from beta1,0 at the given significance level alpha
common format for regression results
test score = 698.9 - 2.28 x ClassSize
R^2 = 0.051, SER = 18.6
a low R^2 does not imply that the regression is good or bad, but it does tell us that other important factors exist

page 9

23
Q
  1. Binary /Dummy/indicator variable
A
identification
it takes on only two values, 0 or 1
formula
Yi = Beta0 + Beta1 Di + ui, where Di = 0 or 1
Beta0 indicates E(Y|Di = 0)
Beta1 indicates E(Y|Di = 1) - E(Y|Di = 0)
24
Q
  1. Homoscedasticity and heteroskedasticity
A

Homoscedasticity
Var (ui|X) = σ^2
This means that the variance of the error term ui is the same regardless of the predictor variable X.

  1. Homoskedasticity occurs when the variance of the error term in a regression model is constant.
  2. If the error variance is homoskedastic, the model is well-specified; if there is too much unexplained variance, the model may be poorly specified.
  3. Adding additional predictor variables can help explain the performance of the dependent variable.
  4. Conversely, heteroskedasticity occurs when the variance of the error term is not constant.

heteroskedasticity
If Homoscedasticity is violated,
e.g. if Var (ui|X) = σ^2(X), a function of X, then we say the error term is heteroskedastic.

consequences
1. the OLS estimator is still unbiased, consistent, and asymptotically normal, but it is no longer efficient
2. it affects the standard errors of the coefficients
if the standard error of a coefficient is too small, the t-statistic is too large, leading to Type I errors
if the standard error of a coefficient is too large, the t-statistic is too small, leading to Type II errors

how to deal
calculate robust standard errors
use weighted least square (WLS)

25
Q

Gauss-Markov Theorem

A

identification
if the three least squares assumptions hold and the error is homoskedastic, then the OLS estimator is the best linear conditionally unbiased estimator (BLUE)

limitation
its conditions might not hold in practice
there are other estimators, not linear and not conditionally unbiased, that can be more efficient than OLS
if extreme outliers are not rare, use least absolute deviations

In statistics, the Gauss–Markov theorem states that the ordinary least squares (OLS) estimator has the lowest sampling variance within the class of linear unbiased estimators, if the errors in the linear regression model are uncorrelated, have equal variances and expectation value of zero.[1] The errors do not need to be normal, nor do they need to be independent and identically distributed (only uncorrelated with mean zero and homoscedastic with finite variance). The requirement that the estimator be unbiased cannot be dropped, since biased estimators exist with lower variance. See, for example, the James–Stein estimator (which also drops linearity) or ridge regression.

Zero mean so that the noise does not present a net disturbance to the system. There’s as much positive noise as negative, so they cancel out in the long run. If the mean were not zero, then the noise would appear as an additional dynamic.

The Gauss–Markov assumptions concern the set of error random variables,ei:
They have mean zero: E[ei] = 0
They are homoskedastic, that is, all have the same finite variance: Var(ei) = σ^2 < infinity for all i
Distinct error terms are uncorrelated: Cov(ei, ej) = 0 for i != j

26
Q

Chapter 8: linear regression with multiple regression

1. omitted variable

A

identification
1. the omitted variable is correlated with an included independent variable in the model
2. the omitted variable is a determinant of the dependent variable
In statistics, omitted variable bias (OVB) occurs when a statistical model leaves out one or more relevant variables. The bias causes the model to attribute the effect of the missing variables to the included variables.

More specifically, OVB is the bias that appears in the parameter estimates of a regression analysis when the assumed specification is incorrect, in that it omits an independent variable that is a determinant of the dependent variable and correlated with one or more of the included independent variables.

impact

  1. the assumption E(ui|xi) = 0 does not hold, because Cov(ui, xi) != 0
  2. the OLS estimator is biased, and the bias does not vanish even in large samples
  3. the larger |p|, the larger the bias

solution
add the omitted variable into the model

27
Q
  1. multiple regression
A

formula
Yi = Beta0 + Beta1X1i + Beta2X2i + … + BetakXki + ui
the population regression line
E(Y|X) = beta0 + beta1X1i + Beta2X2i + … + BetakXki
the intercept term is the expected value of Yi when all the Xki equal 0
partial effect: beta1 = deltaY/deltaX1, holding X2, …, Xk constant (the control variables)

Homoscedastic
var(ui|X1i, …, Xki) is constant
OLS method still can be used

28
Q
  1. multiple regression assumption
A

E(ui|X1i, …, Xki) = 0
(X1i,…,Xki, Yi), i = 1,…,n are independently and identically distributed (i.i.d.)
large outliers are unlikely
there is no perfect multicollinearity

29
Q
  1. Multicollinearity
A

perfect multicollinearity
identification: one of the independent variables is a perfect linear combination of the other independent variables
impact: produces division by zero in the OLS estimates
example: dummy variable trap:
without beta0 --> N dummy variables
with beta0 --> N - 1 dummy variables

imperfect multicollinearity
identification: two or more independent variables are highly correlated but not perfectly correlated
impact: does not pose any problems for the OLS estimators (still unbiased), but they have a high variance
methods to detect:
t-tests indicate that none of the individual coefficients is significantly different from zero, while the F-test shows overall significance and R^2 is high;
the absolute value of the sample correlation is greater than 0.7

30
Q
  1. measure of Fit
A

standard error of regression (SER)
the standard error of the regression provides an absolute measure of the typical distance that the data points fall from the regression line; S is in the units of the dependent variable.
R-squared provides a relative measure of the percentage of the dependent variable's variance that the model explains; R-squared can range from 0 to 100%.
SER = sqrt(SSR/(n-k-1)) = sqrt(sum(e^2)/(n-k-1))
for simple regression (k = 1) this reduces to SER = sqrt(SSR/(n-2))
(Residual sum of squares) SSR = sum[(Yi - Yi-hat)^2]

tip: k = number of independent variables

coefficient of determination (R^2)
R^2 = ESS/TSS = 1 - SSR/TSS
summing over i = 1 to n:
Total Sum of Squares (TSS):
TSS = sum[(Yi - Y-bar)^2]
Explained Sum of Squares (ESS):
ESS = sum[(Yi-hat - Y-bar)^2]
R^2 increases whenever a regressor is added, unless the estimated coefficient on the added regressor is exactly zero

adjusted R^2
formula: adjusted R^2 = 1 - [(n-1)/(n-k-1) x (1 - R^2)] = 1 - [(n-1)/(n-k-1) x (SSR/TSS)]

nature:

  1. adjusted R^2 <= R^2
  2. adjusted R^2 can be negative
  3. adding a regressor has two opposing effects, so adjusted R^2 can increase or decrease

the R^2 or adjusted R^2 does not tell us whether

  1. an included variable is statistically significant
  2. the regressors are a true cause of the movement in the dependent variable
  3. there is omitted variable bias
  4. you have chosen the most appropriate set of regressors
31
Q

chapter 9: hypothesis tests and confidence intervals in multiple regression

A

page 13

tips: t-statistic = (estimated regression coefficient - value of estimate under H0)/standard error of the estimated coefficient;
tips: the t-statistic has n - k - 1 degrees of freedom, where k = number of independent variables (i.e., the number of regressors in the multiple regression)

confidence interval for a single coefficient
the confidence interval (CI) for a regression coefficient in multiple regression is calculated and interpreted the same way as in simple linear regression

CI = estimated regression coefficient + or - critical t-value x standard error of regression coefficient

joint hypothesis (F-test)
in a multiple regression, we cannot test the null hypothesis that all slope coefficients equal 0 based on t-tests that each individual slope coefficient equals 0
why? individual tests do not account for the effects of interactions among the independent variables

for this reason, we conduct the F-test

the F-statistic, which is always a one-tailed test, is calculated as:
F = (ESS/k)/(SSR/[n-k-1])

n = number of observation 
k = number of independent variables 
ESS = explained sum of squares 
SSR = sum of squared residuals

identification:
a hypothesis that imposes two or more restrictions on the regression coefficients
null hypothesis and alternative hypothesis
H0: beta1 = beta2 = beta3 = … = betak = 0
Ha: at least one Betaj != 0
the test assesses the effectiveness of the model as a whole in explaining the dependent variable

use an F-test (with k and n - k - 1 degrees of freedom); it cannot be replaced by separate individual t-tests

classification of F-statistics
with q = 1 restriction: F is the square of the t-statistic
with q = 2 restrictions: F = 1/2 x (t1^2 + t2^2 - 2 x p(t1,t2) x t1 x t2)/(1 - p(t1,t2)^2)
with q restrictions (valid only when the errors are homoskedastic):

F = [(SSR[restricted] - SSR[unrestricted])/q] / [SSR[unrestricted]/(n - k[unrestricted] - 1)]

F = [(R[unrestricted]^2 - R[restricted]^2)/q] / [(1 - R[unrestricted]^2)/(n - k[unrestricted] - 1)]
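A numeric sketch of the homoskedasticity-only F-statistic in its R^2 form (every input below is a hypothetical number):

```python
# Hypothetical regression results.
r2_unrestricted = 0.175   # R^2 of the full (unrestricted) model
r2_restricted = 0.163     # R^2 after imposing the restrictions
q = 2                     # number of restrictions
n = 420                   # observations
k_unrestricted = 3        # regressors in the unrestricted model

# F = [(R2_u - R2_r)/q] / [(1 - R2_u)/(n - k_u - 1)]
f_stat = ((r2_unrestricted - r2_restricted) / q) / (
    (1 - r2_unrestricted) / (n - k_unrestricted - 1)
)
```

The statistic is compared against the F(q, n - k - 1) critical value in a one-tailed test.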

page 13

R^2
to determine the accuracy with which the OLS regression line fits the data, we apply the coefficient of determination and the regression’s standard error

the coefficient of determination, represented by R^2, is a measure of “goodness of fit” of the regression

it is interpreted as the percentage of variation in the dependent variable explained by the independent variables

R^2 = (total variation - unexplained variation)/total variation

adjusted R^2

However, R^2 is not a reliable indicator of the explanatory power of a multiple regression model

why? R^2 almost always increases as new independent variables are added to the model, even if the marginal contribution of the new variable is not statistically significant

thus, a high R^2 may reflect the impact of a large set of independent variables rather than how well the set explains the dependent variable

https://analystprep.com/study-notes/frm/part-1/hypothesis-tests-and-confidence-intervals-in-multiple-regression/

32
Q

chapter 10: modeling trend

A

linear trend
Tt = Beta0 + Beta1 x TIMEt (a straight line)
non-linear trends:
quadratic trend:
Tt = Beta0 + Beta1 x TIMEt + Beta2 x TIMEt^2
   = Beta2 x (TIMEt + Beta1/(2 x Beta2))^2 + C

----- monotonic (when beta1 > 0, beta2 > 0, or beta1 < 0, beta2 < 0)

log-linear trend:
Tt = Beta0 x exp(Beta1 x TIMEt)
ln(Tt) = ln(Beta0) + Beta1 x TIMEt

------ the growth rate is constant

estimating & forecasting

estimating:
(beta0-hat, beta1-hat) = argmin sum(yt - Beta0 - Beta1 x TIMEt)^2

forecasting:
y[T+ h] = Beta0 + Beta1 x TIME[T + h] + error[T+h]

forecast model selection criteria

mean squared error (MSE)
MSE is related to two other diagnostics previously looked at --- the sum of squared residuals (SSR) and the coefficient of determination:
MSE = SSR / T
T = sample size 
R^2 = 1 - SSR/TSS

tip: the following are penalized forms of MSE ("penalty times MSE"); e = error; T = sample size; k = degrees of freedom

unbiased estimate of the MSE —– S^2
this is a degree-of-freedom penalty
S^2 = sum(e^2)/(T - k)

Akaike information criterion (AIC)
AIC = e^(2k/T) x sum(e^2)/T

Schwarz information criterion (SIC)
SIC = T^(k/T) x sum(e^2)/T
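The criteria above can be compared numerically; the residuals and parameter count below are hypothetical:

```python
from math import exp

# Hypothetical regression residuals e, with k estimated parameters.
e = [0.5, -0.3, 0.2, -0.4, 0.1, 0.3, -0.2, -0.1]
T = len(e)
k = 2

sse = sum(x * x for x in e)        # sum of squared residuals

mse = sse / T                      # no penalty
s2 = sse / (T - k)                 # degree-of-freedom penalty
aic = exp(2 * k / T) * sse / T     # heavier penalty than s2
sic = T ** (k / T) * sse / T       # heaviest penalty of the three (for T > e^2)
```

The ordering MSE < S^2 < AIC < SIC on the same residuals illustrates the progressively larger penalties noted below.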

consistency
two considerations are required for a model selection criterion to be considered consistent, based on whether the true model is included among the regression models being considered
1. when the true model or data-generating process (DGP) is one of the defined regression models, the probability of selecting the true model approaches one as the sample size increases
2. when the true model is not one of the defined regression models being considered, the probability of selecting the best approximation model approaches one as the sample size increases

MSE ---> S^2 ---> AIC ---> SIC (consistent)
the penalty grows progressively larger

AIC is asymptotically efficient

degree-of-freedom penalties, various model selection criteria: page 13

for all these criteria, smaller is better

https://analystprep.com/study-notes/frm/part-1/modeling-and-forecasting-trend/

33
Q

chapter 11: modeling and forecasting seasonality

A

definition: a seasonal pattern that repeats itself every year
sources: weather, preferences, social institutions

how to deal:
macroeconomic: non-seasonal fluctuations
business forecast: seasonal fluctuations

modeling:
yt = beta1 x TIMEt + sum(yi x Dit) + errort

beware of dummy-variable multicollinearity:
with an intercept: use S - 1 seasonal dummies
without an intercept: use S seasonal dummies

calendar effects:
holiday variation
trading day variation

https://analystprep.com/study-notes/frm/part-1/modeling-and-forecasting-seasonality/

34
Q

chapter 12: cycles and modeling cycles: MA, AR, and ARMA models

A

page 14

https://analystprep.com/study-notes/frm/part-1/modeling-cycles-ma-ar-and-arma-models/

35
Q

chapter 13: volatility

A

page 15
https://video.search.yahoo.com/yhs/search?fr=yhs-iba-1&hsimp=yhs1&hspart=iba&p=Book+2+volatility+in+FRM#id=1&vid=08caa00f5c3b92940c0e7ab7644337bd&action=click

36
Q

chapter 14: correlation and copulas

A

page 17

https://analystprep.com/study-notes/frm/part-1/correlations-and-copulas/

37
Q

chapter 15: simulation method

A

page 17

https://www.youtube.com/watch?v=imxpWMAZMj4