Chapter 3: Conditional Expectation and Martingales Flashcards
Conditional expectation for discrete random variables X and Z, taking values in {x_1, ..., x_m} and {z_1, ..., z_n} respectively:
P[X = x_i | Z = z_j] =
E[X | Z = z_j] =
Y = E[X | Z], where if Z(ω) = z_j, then ...
Conditional expectation defined:
P[X = x_i | Z = z_j] = P[X = x_i, Z = z_j] / P[Z = z_j]
E[X | Z = z_j] = sum over i of x_i P[X = x_i | Z = z_j]
Y = E[X | Z], where if Z(ω) = z_j, then Y(ω) = E[X | Z = z_j]
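A minimal Python sketch of this definition; the joint pmf values below are made up purely for illustration.
```python
# Minimal sketch: conditional expectation for discrete X and Z.
# The joint pmf below is made up purely for illustration.
joint = {  # joint[(x, z)] = P[X = x, Z = z]
    (0, 0): 0.1, (0, 1): 0.2,
    (1, 0): 0.3, (1, 1): 0.4,
}

def cond_exp_given(z):
    """E[X | Z = z] = sum over i of x_i * P[X = x_i | Z = z]."""
    p_z = sum(p for (x, zz), p in joint.items() if zz == z)             # P[Z = z]
    return sum(x * p / p_z for (x, zz), p in joint.items() if zz == z)  # sum x_i P[X=x_i | Z=z]

# Y = E[X | Z] is itself a random variable: on the event {Z = z_j}
# it takes the value E[X | Z = z_j].
for z in (0, 1):
    print(f"E[X | Z = {z}] = {cond_exp_given(z):.4f}")
```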
Problems with this elementary definition:
1) It is not clear how discrete and continuous random variables interact.
2) What if our random variables are not all discrete or all continuous?
For a probability space (Ω, F, P), a random variable is a map X: Ω → R.
When F is large we want to work with a sub-sigma-field G of F: we want a random variable Y such that
1) Y is in mG, i.e. Y is G-measurable
(Y depends only on the information we have)
2) Y is the best way to approximate X with a G-measurable random variable
e.g. the best prediction for X given G, the information we have up to today.
We want a unique best prediction,
e.g. minimise E[|Y - X|],
minimise var(Y - X), etc.
Theorem 3.1.1- conditional expectation
Let X be an L^1 random variable on (Ω, F, P). Let G be a sub-sigma-field of F. Then there exists a random variable Y in L^1 such that
1) Y is G-measurable
2) For every event G in G, E[Y 1_G] = E[X 1_G]
(Here 1_G is the indicator function of the event G: it equals 1 on G and 0 off G.)
Moreover, if Y' in L^1 is a second random variable satisfying conditions 1) and 2), then P[Y = Y'] = 1.
(The theorem doesn't tell us what Y actually is, only that it exists and is almost surely unique.)
Definition 3.1.2 best way to approximate X given only info in G
We refer to Y as a version of the conditional expectation of X given G and write Y = E[X|G]
Idea behind Definition 3.1.2: Y as the conditional expectation
Average out all the missing information (everything not in G) using expectation, in order to predict X.
Y is a random variable, depending only on the information in G.
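A minimal Python sketch of the defining properties in Theorem 3.1.1, on an assumed toy example (a fair die roll conditioned on its parity), not one taken from the notes.
```python
# Assumed toy example (not from the notes): X is a fair die roll and
# G = sigma(parity of X) = {empty, {odd}, {even}, Omega}.
# Candidate conditional expectation: Y = 3 on {X odd} (mean of 1,3,5),
# Y = 4 on {X even} (mean of 2,4,6).  Y is G-measurable by construction;
# we check property 2) of Theorem 3.1.1: E[Y 1_G] = E[X 1_G] for all G in G.
outcomes = [1, 2, 3, 4, 5, 6]                        # each with probability 1/6
Y = {w: 3 if w % 2 == 1 else 4 for w in outcomes}

events = {
    "empty": set(),
    "odd":   {1, 3, 5},
    "even":  {2, 4, 6},
    "Omega": set(outcomes),
}
for name, G in events.items():
    E_X_1G = sum(w / 6 for w in outcomes if w in G)      # E[X 1_G]
    E_Y_1G = sum(Y[w] / 6 for w in outcomes if w in G)   # E[Y 1_G]
    print(f"{name:6s}  E[X 1_G] = {E_X_1G:.4f}   E[Y 1_G] = {E_Y_1G:.4f}")
# The two columns agree for every event in G, so Y is (a version of) E[X|G].
```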
Example: let X_1 and X_2 be independent random variables with
P[X_i = 1] = P[X_i = -1] = 0.5
Claim: E[(X_1 + X_2) | sigma(X_1)] = X_1
Note: X_1 + X_2 plays the role of X, and sigma(X_1) plays the role of G
(that is, the information is the sigma-field generated by X_1, not X_2)
and X_1 plays the role of Y.
Intuitively, E[(X_1 + X_2) | sigma(X_1)] = X_1 + 0:
the +0 is because E[X_2] = 0; we have no information about X_2, so it averages out.
Proof: we need to check properties 1) and 2) of Theorem 3.1.1.
1)
X_1 is in m sigma(X_1) by Lemma 2.2.5, so Y = X_1 is G-measurable (Y is in mG).
2) Take an event G in G (an event G which is G-measurable).
E[X 1_G] = E[(X_1 + X_2) 1_G]
= E[X_1 1_G] + E[X_2 1_G]
(Note: X_2 is in m sigma(X_2) by Lemma 2.2.5, i.e. it is sigma(X_2)-measurable.
We want to compare E[Y 1_G] = E[X_1 1_G] with E[X 1_G].
Similarly the indicator 1_G is in m sigma(X_1), i.e. in mG, by Lemma 2.2.4.
sigma(X_1) and sigma(X_2) are independent, since X_1 and X_2 are independent.)
= E[X_1 1_G] + E[X_2] E[1_G]   (since X_2 and 1_G are independent)
= E[X_1 1_G]   (since E[X_2] = 0)
which is exactly property 2) with Y = X_1.
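A Monte Carlo sanity check of the claim, sketched in Python; the sample size and the grouping-by-X_1 approach are just one way to illustrate it.
```python
# Monte Carlo sanity check of E[(X_1 + X_2) | sigma(X_1)] = X_1.
# We average X_1 + X_2 over samples grouped by the value of X_1; each group
# average should be close to that value of X_1 (the X_2 part averages out to 0).
import random

random.seed(0)
sums = {1: [], -1: []}
for _ in range(100_000):
    x1 = random.choice([1, -1])
    x2 = random.choice([1, -1])
    sums[x1].append(x1 + x2)

for x1, vals in sums.items():
    print(f"X_1 = {x1:+d}:  average of X_1 + X_2 over these samples = {sum(vals) / len(vals):+.3f}")
# Expected output: roughly +1.000 when X_1 = +1 and -1.000 when X_1 = -1.
```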
conditional expectation
E[X|G] = Y
X is the random variable we want to predict
G is the sigma-field (curly G) representing the information we currently know
Y is the conditional expectation of X given G, our best guess for X
Proposition 3.2.2 properties of conditional expectations
Let G, H (curly G and H) be sub-sigma-fields of F and X, Y, Z in L^1
a_1, a_2 in R
Then, almost surely
LINEARITY:
E[a_1 X_1 + a_2 X_2 | G] = a_1 E[X_1|G] + a_2 E[X_2|G]
ABSOLUTE VALUES:
|E[X|G]| less than or equal to E[ |X| |G]
MONOTONICITY:
If X is less than or equal to Y then
E[ X|G] less than or equal to E[Y |G]
CONSTANTS:
If a is in R (deterministic) then
E[ a|G] =a
MEASURABILITY:
If X is G-measurable (X depends only on information in G) (show that you've checked this condition)
then E[X|G] = X
INDEPENDENCE:
If X is independent of G ( X depends on info we don’t have)
Then E[X|G] = E[X]
TAKING OUT WHAT IS KNOWN:
(Z carries no new information beyond the conditioning, since it is G-measurable, so it can be pulled out like a constant)
If Z is G-measurable, then E[ZX|G] = ZE[X|G]
TOWER:
If H is a subset of G then E[E[X|G]|H] = E[X|H]
TAKING E:
It holds that E[E[X|G]] = E[X]
Ie interaction between conditional expectation and expectation
NO INFO: It holds that E[X | {∅, Ω}] = E[X]
i.e. conditioning on the smallest sigma-field {∅, Ω}, which gives no information: if we don't know anything, the best guess is E[X].
(Remember first 5 properties and always write which one you’ve used)
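A minimal Python sketch checking two of these properties (TOWER and TAKING OUT WHAT IS KNOWN) on an assumed toy example, a fair die roll conditioned on its parity; the example itself is not from the notes.
```python
# Assumed toy example: X a fair die roll, G = sigma(parity of X), H = {empty, Omega}.
# Conditional expectations are computed by averaging over the relevant event,
# as in the discrete definition; we check TOWER and TAKING OUT WHAT IS KNOWN.
outcomes = [1, 2, 3, 4, 5, 6]
odd, even = {1, 3, 5}, {2, 4, 6}

def cond_exp(f, event):
    """E[f(X) | event] for a uniform die: the average of f over the event."""
    return sum(f(w) for w in event) / len(event)

# E[X|G] as a function of the outcome w (it depends only on the parity of w).
E_X_given_G = {w: cond_exp(lambda v: v, odd if w % 2 else even) for w in outcomes}

# TOWER with H = {empty, Omega}: E[ E[X|G] | H ] = E[ E[X|G] ], which should equal E[X].
print(sum(E_X_given_G[w] for w in outcomes) / 6, sum(outcomes) / 6)   # 3.5  3.5

# TAKING OUT WHAT IS KNOWN: Z = 1_{X even} is G-measurable, so E[ZX|G] = Z E[X|G].
Z = {w: 1 if w % 2 == 0 else 0 for w in outcomes}
E_ZX_given_G = {
    w: cond_exp(lambda v: (1 if v % 2 == 0 else 0) * v, odd if w % 2 else even)
    for w in outcomes
}
print(all(E_ZX_given_G[w] == Z[w] * E_X_given_G[w] for w in outcomes))  # True
```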
Lemma 3.2.3
Expectations of X and conditional expectations
Let G be a sub-sigma-field of F. Let X be an F-measurable random variable and let Y=E[X|G]. Suppose that Y’ is a G-measurable random variable. Then
E[ (X-Y)^2] less than or equal to E[ (X-Y’)^2]
i.e. measuring the distance between X and Y (the conditional expectation of X) by mean squared error:
Y is at least as good an estimator of X as Y' is, i.e. no G-measurable random variable approximates X better than Y.
Lemma 3.2.3 PROOF
Let G be a sub-sigma-field of F. Let X be an F-measurable random variable and let Y=E[X|G]. Suppose that Y’ is a G-measurable random variable. Then
E[ (X-Y)^2] less than or equal to E[ (X-Y’)^2]
E[(X-Y')^2] = E[(X-Y + Y-Y')^2]
Expanding the square and using LINEARITY,
= E[(X-Y)^2] + 2E[(X-Y)(Y-Y')] + E[(Y-Y')^2]
The last term is greater than or equal to 0 by MONOTONICITY of expectation, since (Y-Y')^2 is greater than or equal to 0.
(Look at the middle term: by the TAKING E rule, E[(X-Y)(Y-Y')] = E[E[(X-Y)(Y-Y')|G]]; by TAKING OUT WHAT IS KNOWN (Y-Y' is G-measurable) this equals E[(Y-Y') E[(X-Y)|G]], and by LINEARITY and MEASURABILITY, E[(X-Y)|G] = E[X|G] - Y.
Since Y = E[X|G], we have E[X|G] - Y = 0, so the middle term vanishes.)
Thus E[(X-Y')^2] is greater than or equal to E[(X-Y)^2].
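A small numerical illustration of Lemma 3.2.3 in Python, reusing the earlier X = X_1 + X_2 example with G = sigma(X_1); the alternative estimators Y' below are arbitrary choices, included only for comparison.
```python
# Lemma 3.2.3 on the earlier example: X = X_1 + X_2, G = sigma(X_1), so
# Y = E[X|G] = X_1.  Any G-measurable estimator is a function of X_1.
import itertools

outcomes = list(itertools.product([1, -1], repeat=2))   # (x1, x2), each with prob 1/4

def mse(estimator):
    """E[(X - estimator(X_1))^2] with X = X_1 + X_2, exact over the 4 outcomes."""
    return sum(((x1 + x2) - estimator(x1)) ** 2 for x1, x2 in outcomes) / 4

candidates = {
    "Y  = X_1 (the conditional expectation)": lambda x1: x1,
    "Y' = 0                                ": lambda x1: 0,
    "Y' = 2 * X_1                          ": lambda x1: 2 * x1,
    "Y' = X_1 + 0.5                        ": lambda x1: x1 + 0.5,
}
for name, est in candidates.items():
    print(name, mse(est))
# Y = X_1 attains the smallest mean squared error (1.0); no G-measurable
# estimator does better, as Lemma 3.2.3 asserts.
```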
A stochastic process
A stochastic process (S_n)_{n=0}^infinity
(or sometimes starting at n=1)
is a sequence of random variables.
A stochastic process is bounded
A stochastic process is bounded if for some c in R we have
|S_n| less than or equal to c for all n
DEF 3.3.1 filtration
A sequence of sigma-fields (F_n)_{n=0}^infinity
is known as a filtration if F_0 subset F_1 subset ... subset F
DEF 3.3.2 adapted
A stochastic process X= (X_n) is adapted to the filtration (F_n) if for all n, X_n is F_n measurable
- if the filtration contains the information we have seen up to time n, then based on this information we can determine the value of X_n.
We "watch it happen": X_n is a random value that we know at time n, for all n in N.
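A minimal Python sketch of an adapted process, assuming the standard example of a simple random walk with its natural filtration (this example is not spelled out in the notes above).
```python
# Assumed standard example: a simple random walk S_n = X_1 + ... + X_n with the
# natural filtration F_n = sigma(X_1, ..., X_n).  Each S_n is computed from the
# first n steps only, so S_n is F_n-measurable, i.e. (S_n) is adapted to (F_n).
import random

random.seed(1)
steps = [random.choice([1, -1]) for _ in range(10)]   # X_1, ..., X_10

S = [0]                      # S_0 = 0: known with no information (F_0-measurable)
for x in steps:
    S.append(S[-1] + x)      # S_n depends only on X_1, ..., X_n: we "watch it happen"

print(S)
```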