Chapter 3: conditional expectation and martingales Flashcards
Conditional expectation for discrete random variables X and Z, taking values in {x_1, ..., x_m} and {z_1, ..., z_n}:
P[X = x_i | Z = z_j] = ?
E[X | Z = z_j] = ?
Y = E[X | Z], where if Z(ω) = z_j, then …
Conditional expectation defined:
P[X = x_i | Z = z_j] = P[X = x_i, Z = z_j] / P[Z = z_j]
E[X | Z = z_j] = Σ_i x_i P[X = x_i | Z = z_j]
Y = E[X | Z], where if Z(ω) = z_j, then Y(ω) = E[X | Z = z_j]
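A minimal numerical sketch of this discrete definition (the values and the joint pmf `p_xz` below are made up purely for illustration):

```python
import numpy as np

# Hypothetical discrete setup: X takes values x_vals, Z takes values z_vals,
# and p_xz[i, j] = P[X = x_i, Z = z_j] is their joint pmf.
x_vals = np.array([0.0, 1.0, 2.0])
z_vals = np.array([-1.0, 1.0])
p_xz = np.array([[0.10, 0.20],
                 [0.25, 0.15],
                 [0.05, 0.25]])  # rows: x_i, columns: z_j; entries sum to 1

p_z = p_xz.sum(axis=0)                     # P[Z = z_j]
p_x_given_z = p_xz / p_z                   # P[X = x_i | Z = z_j], column by column
e_x_given_z = (x_vals[:, None] * p_x_given_z).sum(axis=0)  # E[X | Z = z_j] for each j

for z, e in zip(z_vals, e_x_given_z):
    print(f"E[X | Z = {z}] = {e:.4f}")
# The random variable Y = E[X | Z] takes the value e_x_given_z[j] on the event {Z = z_j}.
```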
Problems:
1) It is not clear how discrete and continuous random variables interact.
2) What if our random variables are not all discrete, or not all continuous?
For a probability space (Ω, F, P), a random variable is a map X: Ω → R.
When F is large we want to work with a sub-σ-field G of F: we want a random variable Y such that
1) Y is in mG ie Y is G-measurable
Y depends on the info we have
2) Y is the best way to approximate X with a G-measurable random variable
Eg best prediction for X, given G, info we have up to today
Unique best prediction
Eg minimise E[ |Y-X|]
Minimise var(Y-X) etc
Theorem 3.1.1- conditional expectation
Let X be an L^1 random variable on (Ω, F, P). Let G be a sub-sigma-field of F. Then there exists a random variable Y ∈ L^1 such that
1) Y is G-measurable
2) For every event G ∈ G, E[Y 1_G] = E[X 1_G]
(Here 1_G is the indicator function of the event G: it equals 1 on G and 0 off G, so only the behaviour on G contributes to each expectation.)
Moreover, if Y' ∈ L^1 is a second random variable satisfying these conditions, then P[Y = Y'] = 1.
(The theorem doesn't tell us what Y is, only that it exists and is almost surely unique.)
Definition 3.1.2 best way to approximate X given only info in G
We refer to Y as a version of the conditional expectation of X given G and write Y = E[X | G].
Sketch of the idea behind Definition 3.1.2: Y as conditional expectation.
Look at all the missing information and use expectation to average it out, to make a prediction.
Y is a random variable, depending only on the information in G.
Example: let X_1 and X_2 be independent random variables
P[ X_i =1] = P [X_i = -1] = 0.5
Claim: E[(X_1 + X_2) | σ(X_1)] = X_1
Note: here X_1 + X_2 plays the role of X, and σ(X_1) plays the role of G.
That is, we are given the information in the σ-field generated by X_1 but not by X_2,
and X_1 plays the role of Y.
E[(X_1 + X_2) | σ(X_1)] = X_1 + 0
The "+0" arises because the expectation of X_2 is 0: we have no information about X_2, so it averages out.
Proof: we need to check properties one and two
1)
X_1 ∈ mσ(X_1) by Lemma 2.2.5, and so Y = X_1 is G-measurable.
2) Take an event G ∈ G (i.e. G is in the σ-field G = σ(X_1)).
E[ X 1_G] = E[ (X_1 + X_2) 1_G]
= E[ X_1 1_G] + E[X_2 1_G]
(Note: X_2 ∈ mσ(X_2) by Lemma 2.2.5, i.e. it is σ(X_2)-measurable.
We want to show E[Y 1_G] = E[X 1_G], i.e. E[X_1 1_G] = E[(X_1 + X_2) 1_G].
Similarly the indicator 1_G is in mG by Lemma 2.2.4.
σ(X_1) and σ(X_2) are independent, so X_2 and 1_G are independent.)
= E[X_1 1_G] + E[X_2] E[1_G] = E[X_1 1_G] + 0 = E[Y 1_G], since E[X_2] = 0. This verifies property 2).
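A quick Monte Carlo sanity check of this claim (sample size and seed are arbitrary): conditioning on σ(X_1) amounts to averaging over X_2 separately for each observed value of X_1.

```python
import numpy as np

rng = np.random.default_rng(0)     # seed chosen arbitrarily
n = 10**6
x1 = rng.choice([-1, 1], size=n)   # P[X_1 = ±1] = 1/2
x2 = rng.choice([-1, 1], size=n)   # independent of X_1
s = x1 + x2

# E[X_1 + X_2 | X_1 = a] is estimated by averaging S over the samples with X_1 = a.
for a in (-1, 1):
    print(a, s[x1 == a].mean())    # should be close to a, i.e. E[S | sigma(X_1)] = X_1
```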
conditional expectation
E[X | G] = Y
X is the RV we want to predict
G is a σ-field representing the info we currently know
Y is the conditional expectation of X given G, best guess for X
Proposition 3.2.2 properties of conditional expectations
Let G, H (curly G and H) be sub-sigma-fields of F and X, Y, Z ∈ L^1,
a_1, a_2 in R
Then, almost surely
LINEARITY:
E[a_1 X_1 + a_2 X_2 | G] = a_1 E[X_1 | G] + a_2 E[X_2 | G]
ABSOLUTE VALUES:
|E[X|G]| less than or equal to E[ |X| |G]
MONOTONICITY:
If X is less than or equal to Y then
E[ X|G] less than or equal to E[Y |G]
CONSTANTS:
If a is in R (deterministic) then
E[ a|G] =a
MEASURABILITY:
If X is G measurable ( X depends on info only in G) (show you’ve checked this condition)
Then E[X|G ] =X
INDEPENDENCE:
If X is independent of G ( X depends on info we don’t have)
Then E[X | G] = E[X]
TAKING OUT WHAT IS KNOWN:
(Given G we already know Z, so it can be treated as a constant and pulled out.)
If Z is G-measurable, then E[ZX|G] = ZE[X|G]
TOWER:
If H ⊆ G then E[E[X | G] | H] = E[X | H]
TAKING E:
It holds that E[E[X | G]] = E[X]
Ie interaction between conditional expectation and expectation
NO INFO: It holds that E[X | {∅, Ω}] = E[X]
If we take conditional expectation given the smallest σ-field, {∅, Ω}, we are given no information at all; if we know nothing, the best guess for X is E[X].
(Remember first 5 properties and always write which one you’ve used)
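A quick numerical sketch of a few of these rules, using a toy model chosen purely for illustration (Z uniform on {1, 2} generating G = σ(Z), and an independent W with mean 0.5):

```python
import numpy as np

rng = np.random.default_rng(1)           # arbitrary seed
n = 10**6
z = rng.choice([1.0, 2.0], size=n)       # Z generates G = sigma(Z)
w = rng.normal(0.5, 1.0, size=n)         # W independent of G, E[W] = 0.5
x = z * w

# TAKING OUT WHAT IS KNOWN + INDEPENDENCE: E[ZW | G] = Z * E[W | G] = Z * E[W].
cond_exp = z * 0.5                       # the claimed version of E[X | G]

# TAKING E: E[E[X | G]] should equal E[X] (both close to E[Z]*E[W] = 1.5*0.5 = 0.75).
print(cond_exp.mean(), x.mean())

# Defining property from Theorem 3.1.1, E[Y 1_G] = E[X 1_G], on the event G = {Z = 2}.
g = (z == 2.0)
print((cond_exp * g).mean(), (x * g).mean())
```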
Lemma 3.2.3
Expectations of X and conditional expectations
Let G be a sub-sigma-field of F. Let X be an F-measurable random variable and let Y=E[X|G]. Suppose that Y’ is a G-measurable random variable. Then
E[ (X-Y)^2] less than or equal to E[ (X-Y’)^2]
I.e. we measure the distance between X and Y (the conditional expectation of X) in mean square.
Y is at least as good an estimator of X as Y' is, i.e. there is no better G-measurable approximation of X than Y.
Lemma 3.2.3 PROOF
Let G be a sub-sigma-field of F. Let X be an F-measurable random variable and let Y=E[X|G]. Suppose that Y’ is a G-measurable random variable. Then
E[ (X-Y)^2] less than or equal to E[ (X-Y’)^2]
E[ (X-Y’)^2]= E [ (X-Y +Y-Y’)^2]
BY LINEARITY
= E[ (X-Y)^2] + 2E[(X-Y)(Y-Y’)] + E[ (Y-Y’)^2]
The last term satisfies E[(Y-Y')^2] ≥ 0 by MONOTONICITY of expectation, since (Y-Y')^2 ≥ 0.
(Look at the middle term: by the TAKING E rule, E[(X-Y)(Y-Y')] = E[ E[(X-Y)(Y-Y') | G] ]. Since Y-Y' is G-measurable, TAKING OUT WHAT IS KNOWN gives = E[(Y-Y') E[(X-Y) | G]], and by LINEARITY and MEASURABILITY E[(X-Y) | G] = E[X | G] - Y, so this equals E[(Y-Y')(E[X | G] - Y)].
But E[X | G] - Y = 0 by the definition of Y, so the middle term is 0.)
Thus E[(X-Y')^2] ≥ E[(X-Y)^2].
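An illustrative simulation of the lemma, under an assumed toy model X = Z + noise with G = σ(Z), so that Y = E[X | G] = Z; the competitor Y' = 0.9 Z is an arbitrary G-measurable alternative.

```python
import numpy as np

rng = np.random.default_rng(2)      # arbitrary seed
n = 10**6
z = rng.normal(size=n)              # Z generates G = sigma(Z)
noise = rng.normal(size=n)          # independent of Z, mean 0, variance 1
x = z + noise

y = z                               # version of E[X | G] in this toy model
y_prime = 0.9 * z                   # some other G-measurable estimator

print(np.mean((x - y) ** 2))        # approx 1 (the noise variance)
print(np.mean((x - y_prime) ** 2))  # larger: approx 1 + 0.01*E[Z^2] = 1.01
```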
A stochastic process
A stochastic process (S_n)_{n=0} ^ infinity
(Or sometimes n=1)
Is a sequence of RVs.
A stochastic process is bounded
A stochastic process is bounded if for some c ∈ R we have
|S_n| ≤ c for all n
DEF 3.3.1 filtration
A sequence of sigma-fields (F_n)_{n=0}^infinity
Is known as a filtration if F_0 ⊆ F_1 ⊆ F_2 ⊆ … ⊆ F
DEF 3.3.2 adapted
A stochastic process X= (X_n) is adapted to the filtration (F_n) if for all n, X_n is F_n measurable
- The filtration contains the information we see; based on this information we can determine the value of X_n.
We "watch X happen": its value is a random value that we know at each time n ∈ N.
DEF 3.3.3 martingale
A process M=(M_n)_{n=0} ^infinity
Is a martingale wrt (F_n) if
(M1) (M_n) is adapted
(M2) M_n in L^1 for all n
(M3) E[ M_{n+1} |F_n] = M_n
M3 is the martingale property of fairness
Def SUBMARTINGALE
A process M=(M_n)_{n=0} ^infinity
Is a submartingale wrt (F_n) if
(M1) (M_n) is adapted
(M2) M_n in L^1 for all n
(M3) E[ M_{n+1} |F_n] bigger than or equal to M_n
M3 is the submartingale property: on average the process does not decrease
DEF SUPERMARTINGALE
A process M=(M_n)_{n=0} ^infinity
Is a supermartingale wrt (F_n) if
(M1) (M_n) is adapted
(M2) M_n in L^1 for all n
(M3) E[ M_{n+1} |F_n] less than or equal to M_n
M3 is the supermartingale property: on average the process does not increase
Example: a martingale
Consider an i.i.d. sequence of random variables (X_n) with
P( X_i =1) = P( X_i =-1) = 0.5
S_n = Σ_{i=1}^n X_i
(Total win after n plays)
F_n = σ(X_1, ..., X_n)
F_n is the info from the first n rounds
Then (Sn) is a martingale:
We are equally likely to win or lose each round, and the outcomes are independent, so on average we expect no change: previous results don't help you.
Checking this:
(M1) S_n ∈ mF_n by p2.5, because each X_i is σ(X_i)-measurable, hence F_n-measurable for i ≤ n
(M2) |S_n| = |Σ_{i=1}^n X_i| ≤ Σ_{i=1}^n |X_i| = n, by the triangle inequality.
So S_n is bounded, hence S_n ∈ L^1.
(M3)
E[S_{n+1} | F_n]
= E[X_{n+1} + Σ_{i=1}^n X_i | F_n]
= E[X_{n+1} | F_n] + Σ_{i=1}^n X_i (the sum is F_n-measurable; X_{n+1} is independent of everything before time n+1, so independent of F_n)
= E[X_{n+1}] + S_n (using measurability and independence)
= S_n (as the expectation of X_{n+1} is 0)
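A simulation sketch of this example (path count, horizon and seed are arbitrary): grouping paths by the value of S_n and averaging S_{n+1} within each group should recover S_n, as the martingale property predicts.

```python
import numpy as np

rng = np.random.default_rng(3)                # arbitrary seed
paths, n = 200_000, 10
x = rng.choice([-1, 1], size=(paths, n + 1))  # X_1, ..., X_{n+1} for each path
s_n = x[:, :n].sum(axis=1)                    # S_n
s_next = s_n + x[:, n]                        # S_{n+1}

# Martingale property: E[S_{n+1} | F_n] = S_n. As a crude check, group paths
# by the value of S_n and compare the average of S_{n+1} within each group.
for value in np.unique(s_n):
    mask = s_n == value
    print(value, s_next[mask].mean())         # should be close to `value`
```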
Example 3.3.9: a martingale built from a filtration
2 general examples of martingales: 2
Example 3.3.9 Let Z ∈ L^1
be a random variable and let (F_n) be a filtration. Then
M_n = E[Z|F_n]
is a martingale: by the tower property, E[M_{n+1} | F_n] = E[ E[Z | F_{n+1}] | F_n ] = E[Z | F_n] = M_n.
- A sequence of better and better approximations to Z, obtained by taking conditional expectations wrt F_n.
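A concrete, purely illustrative instance of Example 3.3.9: take Z = X_1 + X_2 + X_3 with independent mean-zero X_i and F_n = σ(X_1, ..., X_n), so that M_n = E[Z | F_n] = X_1 + ... + X_n (the unseen terms average to 0).

```python
import numpy as np

rng = np.random.default_rng(4)                  # arbitrary seed
paths = 500_000
x = rng.choice([-1, 1], size=(paths, 3))        # X_1, X_2, X_3, each mean 0
z = x.sum(axis=1)                               # Z = X_1 + X_2 + X_3

# M_n = E[Z | F_n] = X_1 + ... + X_n (known terms) + 0 (mean of the unseen terms).
m = np.cumsum(x, axis=1)                        # M_1, M_2, M_3 for each path

# Crude check of M_2 = E[Z | F_2]: average Z over paths sharing (X_1, X_2).
for pair in [(-1, -1), (-1, 1), (1, 1)]:
    mask = (x[:, 0] == pair[0]) & (x[:, 1] == pair[1])
    print(pair, z[mask].mean(), m[mask, 1][0])  # empirical E[Z | X_1, X_2] vs M_2
```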
Lemma 3.3.6: expectation stays constant
Let (F_n) be a filtration and suppose that (M_n) is a martingale.
Then
E[M_n] = E[M_0]
for all n ∈ N
proof:
(M3) implies E[M_{n+1}|F_n] = M_n
And by taking expectation:
E[ E[M_{n+1} | F_n] ] = E[M_n]
and by the TAKING E property of conditional expectation, the left-hand side equals E[M_{n+1}], so
E[M_{n+1}] = E[M_n].
Then by induction the result follows:
E[M_n] = E[M_0]
Example of a sub-martingale:
Take (X_i) to be iid
P[X_i =2] =P[X_i =-1] =0.5
(ie not symmetrical and expectation of X_i not equal to 0)
Now, E[X_i] = 2(0.5) + (-1)(0.5) = 0.5 > 0.
S_n = Σ_{i=1}^n X_i
Check (M1): X_i ∈ mF_i (i.e. measurable wrt the generated filtration F_i = σ(X_1, ..., X_i)),
so X_i ∈ mF_n for i ≤ n, and hence S_n ∈ mF_n.
(M2) |X_i| ≤ 2,
so |S_n| ≤ Σ_{i=1}^n |X_i| ≤ Σ_{i=1}^n 2 = 2n < ∞, by the triangle inequality.
So S_n is bounded by 2n, so S_n ∈ L^1.
(M3) E[S_{n+1} | F_n]
= E[S_n + X_{n+1} | F_n] (by definition of S_{n+1})
= E[S_n | F_n] + E[X_{n+1} | F_n] (by linearity)
= S_n + E[X_{n+1}] (S_n ∈ mF_n and X_{n+1} is independent of F_n, using measurability and independence)
= S_n + 0.5 ≥ S_n
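An illustrative simulation of this submartingale: because E[X_i] = 0.5 > 0, the running sums drift upwards, and E[S_n] = 0.5 n (path count, horizon and seed below are arbitrary choices).

```python
import numpy as np

rng = np.random.default_rng(5)            # arbitrary seed
paths, n = 200_000, 20
x = rng.choice([2, -1], size=(paths, n))  # P[X_i = 2] = P[X_i = -1] = 1/2
s = np.cumsum(x, axis=1)                  # S_1, ..., S_n for each path

# Submartingale behaviour: the expectation increases, E[S_n] = 0.5 * n.
print(s.mean(axis=0)[[0, 4, 9, 19]])      # approx 0.5, 2.5, 5.0, 10.0
```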
lemma 3.3.6***
if (Mn) is a submartingale
If (M_n) is a submartingale, then by definition E[M_{n+1} | F_n] ≥ M_n, so taking
expectations gives us E[M_{n+1}] ≥ E[M_n].
(Hence E[M_0] ≤ E[M_n].)
lemma 3.3.6***
if (Mn) is a supermartingale
For supermartingales we get E[M_{n+1}] ≤ E[M_n].
(Hence E[M_0] ≥ E[M_n].)
In words: submartingales, on average, increase, whereas supermartingales, on average, decrease.
The use of super- and sub- is counterintuitive in this respect.
Remark 3.3.7 Sometimes we will want to make it clear which filtration is being used
Remark 3.3.7 Sometimes we will want to make it clear which filtration is being used in the definition of a martingale.
To do so we might say that
'(M_n) is an F_n-martingale',
or that
'(M_n) is a martingale with respect to F_n'.
We use the same notation for super/sub-martingales
2 general examples of martingales: 1
Take an i.i.d. sequence (X_n) such that E[X_n] = 1.
Suppose there exists c ∈ R such that |X_n| ≤ c for all n.
Then M_n = ∏_{i=1}^n X_i
(the product of the first n terms)
is a martingale wrt the filtration F_n = σ(X_1, ..., X_n).
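A minimal simulation sketch of this product martingale, with the illustrative choice of X_n uniform on {0.5, 1.5} (so i.i.d., bounded, and E[X_n] = 1):

```python
import numpy as np

rng = np.random.default_rng(6)               # arbitrary seed
paths, n = 200_000, 15
x = rng.choice([0.5, 1.5], size=(paths, n))  # i.i.d., bounded, E[X_i] = 1
m = np.cumprod(x, axis=1)                    # M_n = X_1 * X_2 * ... * X_n

# Martingale: E[M_{n+1} | F_n] = M_n * E[X_{n+1}] = M_n, so E[M_n] stays at 1.
print(m.mean(axis=0)[[0, 4, 9, 14]])         # all approx 1
```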