7 Random Processes: Discrete-time Markov Chains Flashcards
Evolutionary processes are random
We have looked @ where time is discrete (generations)
deterministic, maps and ODES
equations, given IC we could solve
but in reality chance is an important component of evolution
model using probabilistic tools
discrete vs continuous RVs
pmf
probability vector
probability density
cumulative functs
expected values
moments
discrete RV X with binomial distribution
parameter p
distribution function
generating function
expectation
variance
fⱼˣ
≡ Prob(X = j) =
(n)
(j) pʲ(1-p)ⁿ⁻ʲ
=Bin(n,j)
j=0,…,n
e.g. urn containing fraction p black balls fraction 1-p white balls
X is RV corresponding to drawing X=j black balls in n attempts (returning the balls)
(as you vary p small average (peak of distribution towards left) (p large peak towards right)
when p=0.5 distribution peak at RV x=10 when n=20)
generating funct: pgf
Gₓ(z) ≡ E(zˣ) ≡ Σⱼ₌₀ⁿ zʲ fⱼˣ = · · ·
= [pz + 1 − p]ⁿ
average:
differentiating pgf then setting z=1
E(X) = G′ₓ(z = 1) = np(pz + 1 − p)ⁿ−¹
=np
variance:
var(X) ≡ E(X²) − (E(X))²
= G′′ₓ(1) + G′ₓ(1) − [G′(1)]²
= np(1 − p)
stochastic process
A stochastic process is a collection of (discrete or continuous) random variables {Xₜ:t ∈ T}, where T is an index set and t ∈ T is here the discrete time index. The set of all
possible realizations (or range) of Xₜ
is S such that Xₜ ∈ S (see Sec. 7.5).
The sample sequence
{Xᵢ, . . . , Xⱼ} = {Xₜ ∈ S : t ∈ T˜
= [i, j] ⊂ T}
corresponds to the sample path
or stochastic realization of the process Xₜ on t ∈ T˜ = [i, j] which is a subset of T
e.g. Xₜ~ position each minute an organism moving one step left or right with equal probability
infinite line
over a period of 100 minutes, organism moves for total of 1000 minutes
This is an example of
a symmetric one-dimensional random walk,
the state space is S = Z and T˜ =
{0, 1, 2, . . . , 100} ⊂ T = {0, 1, 2, . . . , 1000}.
Typical sample paths, or stochastic realizations,
of this process for t ∈ T˜ .
{X_i,..,X_j}={X_t in S s.t t in [i,j]}
We notice that while S = Z, all paths
are here within {−20, . . . , −2, −1, 0, +1, +2, . . . , +20}.
There are relationships between the set of random variables {X_t}: they are not independent. The stochastic process X_t can be discrete or continuous in time and space, depending respectively on whether t is a discrete or continuous variable, and S is a discrete or continuous
set.
random walk
An example of discrete-time stochastic process is the one-dimensional “random walk” in which, at each discrete time increment ∆t random walker can take one discrete step to the right with prob. q or one step to left with prob. 1-q. The random walk is symmetric when q=1/2, and asymmetric otherwise.
position wrt origin
∆x
S= {. . . , −1, 0, 1, . . . }
position of random walker
probabilities
Position of the RWer is characterized by probability vector
p(t) = (. . . , p₋₁(t), p₀(t), p₁(t), . . .)ᵀ
where pⱼ(t) gives the probability for RWer to be @ position k after time t steps
t=0,1,…
and
Σ_{j∈S} pⱼ(t)=1
(probability conservation)
probability distribution
is the RW probability distribution
{p_{j∈S} (t)}
Discrete-time Markov chains (DTMC)
we want to know how the probability vector p(t) =
p_{i∈S} (t))
with
p_i∈S (t))=Prob {Xt = i ∈ S}
of X_t vary in time
Markov property
Markov process is s.t
Markov property, the process is
is “Markovian”, it is a “Markov process” (here a DTMC) and there is a systematic way to find p(t)
Consider DTMC and hence time is discrete
X_t has the Markov property if
Prob{Xₜ = iₜ|X₀ = i₀, . . . , Xₜ₋₁ = iₜ₋₁}
= Prob{Xₜ = iₜ|Xₜ₋₁ = iₜ₋₁}
iₖ ∈ S and k ∈ {0, 1, 2, . . . , t}
MEMORYLESS
MEMORYLESS
The Markov property therefore means that the value of Xₜ depends only of Xₜ₋₁ , its value at time t-1
=> a Markov process is often said to be memoryless
E.g DTMC
one step transition probability
1D random walk; each move is independent from the previous and the position at t depends on where the RWer is at t-1.
one step transition probability
pᵢⱼ(t)
one step transition probability
pᵢⱼ(t)
one step transition probability
pᵢⱼ(t)
the probability for X=i at time t+1 given that X=j at the previous time step t
when time is homogeneous
pᵢⱼ
one step transition matrix
one step transition matrix
P=(pᵢⱼ)
[p₁₁ p₁₂ p₁₃…]
[p₂₁ p₂₂ p₂₃…]
[p₃₁ p₃₂ p₃₃…]
all cols add to 1
with
pᵢᵢⱼ=1- sum_{i̸=j} pᵢⱼ
We assume that all are constant (time is homogeneous) and therefore , and we have
PROBABILITY CONSERVATION
sum_ i in S of pᵢⱼ=1
and pᵢⱼ≥ 0
stochastic matrix
one step transition matrix
Since all columns add up to 1, is a “stochastic matrix”.
P²,P³,… are also stochastic matrices
stochastic matrices are that all have eigenvalues , and at least
one eigenvalue is equal to 1.
l-step probability matrix
P⁽ˡ⁾
=Pˡ
gives the probability of transition from i to j in steps.
For DTMC, there is a simple relation between simple identity step probability matrices, or powers
of :Chapman-Kolmogorov equation (CKE):
Chapman-Kolmogorov equation (CKE):
Pˡ= Pˡ⁻ˢPˢ
for any ℓ > s > 0
=> Evolution of probability of probability vector
p(t)
=(p₀,p₁,…,pₙ)^T
of DTMC
X_t in S={0,1,…,n}
is obtained from the one-transition matrix using the CKE
p(t+1)=Pp(t)
gives
p(t)= Pᵗp(0)
{pᵢ(t)}ᵢ₌₀ᶦ⁼ⁿ is the probability distribution of X_tin S={0,1,…,n}
(note: pⱼ(t+1)= Σᵢ Pᵢⱼ pᵢ(t)
sum of prob. that X_t=i multiplied by prob. of one step transition i to j
Moments
From the probability distribution , or probability vector, we can obtain the moments of Xₜ
kth (“raw”) moment is the expected value of (Xₜ)ᵏ that is
E[( Xₜ)ᵏ] = Σ_{j∈S} jᵏ pⱼ(t)
Stationary probability distribution
In the long run
lim_t→∞ pᵢ(t)= πᵢ
and is the stationary probability distribution of
Xₜ∈ S = {0, 1, . . . }
The probability distribution π is stationary when it does not change in time
Pπ= π
HENCE π
is a right eigenvector of the one-step transition matrix P with eigenvalue 1
(In finite space S , the fact
that is a stochastic matrix P
of finite size ensures that there
is at least one such eigenvector.
However there can be more
than one.)
finite DTMC has a unique stat dist
A finite DTMC has unique stationary distribution π if X_t is irreducible (if it is always possible to reach any state from any other state) and aperiodic (if returns to any given state occur at irregular
intervals.)
Example from Q1(c) of Example Sheet 4:
X_t ∈ {1, 2, 3}
is a DTMC with 1-step transition matrix
P=(pᵢⱼ)
= Xₜ=1 Xₜ=2 Xₜ=3
Xₜ₊₁=1[ 0.5 0.5 0]
Xₜ₊₁=2[0.25 0.5 0.5]
Xₜ₊₁=3[0.25 0 0.5]
Stationary distribution obtained by solving the eigenvalue problem
Pπ= π =
(π₁)
(π₂)
(π₃)
solving gives
(2/5)
(2/5)
(1/5)
random walk
Random walk is broadly used to model movement in many biological applications, and in the continuum limits leads to the Brownian motion whose probability density obeys the diffusion equation.
In many applications, boundary conditions are important especially in the context of first-passage
problems.
Symmetric random walk on the infinite line Z
X_t
A random walker (RWer), initially at the origin (at t=0), moves along the infinite line.
At each time-step the RWer moves one step to the right with probability 1/2 or one step to the left also with probability 1/2
X_t: DTMC giving the position of the RWer after t time-steps (i.e. t jumps)
what is the probability that the RWer is at n∆x after k time steps?
We assume n and k of same parity and are interested in (both even/odd)
pₙ(k) = Prob{Xₖ= n∆x} = Probability{to be at n∆x after k time steps}
After k time steps: RWer is certainly between
between −k∆x and k∆x
(all k left, all k right)
Xₖ∈[-k∆x, k∆x]
At each time step, the RWer has two choices (left/right jump, L/R)
=> after t=1 time step, 2 possible paths with respect to the origin: either L or R
=> after t=2 time steps, 2x2 possible paths: L or R and again L or R => 4 possible paths: LL, LR, RL, RR
(-2,0,0,2)
=> after t=k , 2ᵏ possible paths: L or R at each step=> number of possible paths after k steps is 2ᵏ
the probability that the RWer is at n∆x after k time steps is
pₙ(k)=Prob(Xₖ= n∆x)
[#paths ending at n∆x after k jumps]
/
[total#of possible paths after k jumps]
for all the different paths
Say there are n+ jumps right and n- jumps to the left to reach n∆x in k steps
n₊+ n₋ =k total #moves
n₊- n₋ =n position in unit of ∆x w.r.t. the origin
n₊= {k+n}\2
and
n₋= (k-n)/2
combinatorics:
How many ways to take n₊ right steps and n₋ left steps
k!{n₊!n₋!}
ways to take n₊ right steps and n₋ left steps
with n₊ + n₋ =k
paths ending at n∆x after k steps
k!{n₊!n₋!}
n₊ + n₋ =k
Since these k steps need to land to the position n∆x
after k steps
k!/
[[[k+n]/2]! [[k-n]/2]!
pₙ(k)
pₙ(k)
= [Nb of paths ending at n∆x after k jumps]/
[Total number of possible paths after k jumps]
=
[k!]/
[(k+n)/2]! [(k-n)/2]! * (1/2ᵏ)
out of total possible paths
stirling formula and taylor expansion for large k show that
pₙ(k)
k! ≃ kᵏe⁻ᵏ√(2πk)
pₙ(k) approx Gaussian distribution
for large k
pₙ(k)≈ Probability{to be at n∆x after k time steps}
= square root(2/πk) * exp(-n²/(2k))
in the continuum limit
∆x → 0, ∆t → 0
with
D= (∆x)²/∆t
finite with x= n∆x and t=k∆t
the random walk on Z becomes the Brownian motion on R
and
lim_{∆x→0,∆t→0} pₙ(k)
=P(x,t)
exp(-x²/(4Dt))/(√(4πDt))
where P(x,t) is the probability density of the 1D Brownian Motion (with the Brownian particle
initially at x=0); as shown in Chapter 10 it obeys the 1D diffusion equation.
(optional appendix notes)
Brownian motion
Brownian motion on R
and
lim_{∆x→0,∆t→0} pₙ(k)
=P(x,t)
exp(-x²/(4Dt))/(√(4πDt))
where P(x,t) is the probability density of the 1D Brownian Motion (with the Brownian particle
initially at x=0); as shown in Chapter 10 it obeys the 1D diffusion equation.
Figures for:
Samples paths of 1D random walk
starting from the origin (t is in
unit of time steps) evolving
unrestricted on Z all start from origin but their own path
Continuous sample paths of the
symmetric 1D Brownian on
starting from the origin
7.3.2 The asymmetric 1D random walk with absorbing boundaries
In many applications, one-dimensional motion is restricted, e.g. particles move randomly within a finite domain that they cannot leave, e.g. a random walker of position is confined between 0 and N.
In this case, the state space is and we need to specify what happens at the boundaries 0 and N=> specify the boundary conditions (BCs).
absorbing boundaries at 0 and N
one the RWer reaches one of the ends, either X=0 or X=N, then it is “absorbed” there. It is stuck
and cannot move from that location. In this sense the boundaries 0 and N are “absorbing”.
and absorbing boundary conditions
p₀₀=pₙₙ=1 and p₁₀ =Pₙ-₁ₙ
asymmetric RW
transition probabilities
Asymmetric random walk: at each , the RWer moves ∆t 1 step to the left with prob q or one step to the right with probability p=1-q
The random walk is symmetric only when q=1/2.
transition probabilities
pᵢⱼ=p if i=j+1
pᵢⱼ=q if i = j-1
pᵢⱼ=0 if i ̸= j ± 1
with i=1,..,N-1
In general: Asymmetric RW
in finite Markov chains (with either discrete or continuous time), i.e. when N < ∞ with absorbing boundaries, absorption after a finite time (that may be large) is
guaranteed
The asymmetric RWer will eventually reach either state 0 or N, and at that point the dynamics
cease and X=0 or X=N.
what is the probability ϕₖ to be absorbed in state N if initially X₀=k∈S ?
ϕₖ = Prob{X_{t→∞} = N|X₀ = k ∈ S} =?
so Probability of being absorbed
at X = 0 is simply 1 − ϕₖ
1st-step analysis:
at each time-step, the RWer moves to the right with prob p=1-q or to the left with prob q =>
Recursion relation ϕₖ for (2nd-order linear map, see Chapter 2):
ϕₖ=p ϕₖ₊₁ + qϕₖ₋₁
Prob. to jump right
k → k + 1 *
Prob. of being absorbed
at N from k + 1
+
Prob. to jump left
k → k − 1 *
Prob. of being absorbed
at N from k − 1
with absorbing BCS:
ϕ₀=0 ϕₙ=1
Prob. of being absorbed at N
if we start at X0 = 0 (absorbing) is zero
with absorbing BCS:
ϕ₀=0 ϕₙ=1
starting then stay forever so will be absorbed there
Worked example (Example Sheet 4)
try
Solution of the form
ϕk = C1 + C2(q/p)
k, where C1, C2 are constants, with BCs:
ϕk = (q/p)
k − 1
(q/p)
N − 1
if q ̸= p and ϕk = k/N i
hence when q>p…
Absorption at N almost impossible if RWer is biased towards jumping to the left (q>p)
starting away from the ends of a long “path” (N»k»1)
lambda gives
not discussed: Reflecting bcs
if p_00 = q and p_10 = p, and p_NN = p and p_{N−1}=q
reaches one of its reflecting boundaries, the DTMC cannot move beyond it: it is “bounced”
towards the other boundary with probability p or q, or stays put at the boundary with the
complementary probabilities q and
Extra MATH5567M topic: recurrent, transient & absorbing states
It is often useful to know if and when a stochastic process returns to a previous state => notions of
recurrence and transience. In many applications, one is also interested in the mean time after which an absorbing state is reached.
e.g. will it return to the origin? After how long?
Extra MATH5567M topic: recurrent, transient & absorbing states:
a discrete-time Markov chain
{Xₘ}ₘ₌₀ᵐ⁼∞ with Xₘ∈ S
Extra MATH5567M topic: recurrent, transient & absorbing states
TRANSIENT
A state is transient if the probability to return to state i is less than 1 (i.e. return is not certain). i ∈ S
Formally: state i is transient if
Σ_n≥1
Prob{Xₙ=i, Xₘ ̸= i, m = 1, 2, . . . , n − 1|X₀=i} <1
≡first return probability to state i in n steps
as the return to state i does not occur with a probability 1.
Extra MATH5567M topic: recurrent, transient & absorbing states
first return property
We are interested in the first time we return (perhaps 0 or many times)
We want to find X_n s.t all previous steps are different to i
if the sum over all n <1
the probability that return to state i is not guranteed/may may not
hence it is transient
A state is recurrent
i ∈ S is reccurent
guaranteed/certain return
Σ_n≥1
Prob{Xₙ=i, Xₘ ̸= i, m = 1, 2, . . . , n − 1|X₀=i} =1
return to state i is certain as it occurs with a probability 1.
(doesn’t say how long it would take)
A state is recurrent if and only if
i ∈ S is recurent
IFF
Sum from n equals zero to infinity of pᵢᵢ⁽ⁿ⁾ equals infinity
the n-step transition probability pᵢᵢ⁽ⁿ⁾
(to return to state i in n
steps) is generally nonzero the sum over all possible number of steps diverges when i is recurrent
if state is reccurent: for any n the property is non 0
however if transient the sum of
(this probabiloty is different to first return probbility)
Theorem 7.1: state is recurrent if and only if
Σₙ₌₀ ∞ pᵢᵢ⁽ⁿ⁾ = ∞
Similarly, state is transient if and only if i ∈ S
Σₙ₌₀ ∞ pᵢᵢ⁽ⁿ⁾ < ∞
as non zro when we add we get divergence
but if transient then whenever we add will be finite
q5 ps4
Examples:
- Symmetric one-dimensional random walk on :
definitely try this one
all states are RECURRENT since Σₙ₌₀ ∞ pᵢᵢ⁽ⁿ⁾ = ∞
, see Q5 of Example sheet 4
diagram: Sample path of 1D random walk on starting from the origin. Z
At later times, the randomwalkers returns to the origin=> the state i=0 and all the other states are recurrent.
Note that pᵢᵢ⁽ⁿ⁾
Note that pᵢᵢ⁽ⁿ⁾
generally differs from
the first-return
probability to state i
when n>1. This is because
in the transition from
i to i at the step n, given
by , the first return to i
may have occurred at
any earlier step 1,2,…, n-1
⇒ pᵢᵢ⁽ⁿ⁾ ≥ first return
probability to i in n steps
{z }
Symmetric one-dimensional random walk on Z
all states are RECURRENT since Σₙ₌₀ ∞ pᵢᵢ⁽ⁿ⁾ = ∞
convergent
guaranteed
Asymmetric one-dimensional random walk on Z
all states are TRANSIENT since Σₙ₌₀ ∞ pᵢᵢ⁽ⁿ⁾ < ∞
Symmetric two-dimensional random walk on Z^2
: all states i∈Z²
are RECURRENT
Each sample paths of the 2D random walk on eventually
covers the entire 2D space
=> return to any state is
guaranteed
=> all states are recurrent
Symmetric two-dimensional random walk on Z^3
all states i∈Z^3 are TRANSIENT
Sample paths of the 3D random walk may not cross each
other and do not cover entirely the 3D space=>
return to any state is not guaranteed=> all states are
transient
In finite DTMC, all states are transient
true or false?
In finite DTMC, not all states are transient
false
asymmetric 1D random walk on [0,N] with absorbing boundaries at 0 and N:
recurrent? transient?
absorbing states 0 and N are recurrent since, by definition
p₀₀=p₀₀⁽ⁿ⁾ = 1 = pₙₙ= pₙₙ⁽ⁿ⁾
Σₙ₌₀∞ p₀₀⁽ⁿ⁾ = ∞ and
Σₙ₌₀∞ pₙₙ⁽ⁿ⁾= ∞
asymmetric 1D random walk on [0,N] with absorbing boundaries at 0 and N:
absorption?
We have seen that in the asymmetric 1D random walk
with absorbing boundaries at 0 and N, absorption is
guaranteed and we computed the probability
ϕₖ = Prob{Xₜ→∞= N|X₀ = k ∈ S} that absorption occurs at X=N given X₀= k ∈ S
(if i start at k probability i am absorbed at N)
We used the fi ϕk rst-step analysis to obtain a recursion relation for ϕₖ
What is the mean absorption time?
What is the mean time to reach either absorbing state 0 or N?
Using the first-step analysis:
move from k to k+1 with probability p=1-q
move from k to k-1 with probability q
1 time-step elapses between each moves k → k ± 1
If τₖ is the mean absorption time in state k, and τₖ±₁ are mean absorption times in states k ± 1
()
gives
RECCURRENCE RELATION
τₖ = p(τₖ₊₁ + 1) + q(τₖ₋₁ + 1)
i’m at k, take step to right end up at k+1 etc
1 time step
diagram mean absorption time
If q>1/2 (jumps to left
more likely)=> mean
abs. time is max close
to N
initial position of RWer vs mean absorption time
(0,0)
then skewed n shape to right (100,0) N=100
q=0.55
p=0.45
more likely Left than right
tau_k
when q>p then tau_k is max when value k is closer to n but not quite at n (not too close as otherwise right)
p=q=0.5 would be n shape with N/2=k takes longest in the middle
τₖ is the mean absorption time
BCs?
RECCURRENCE RELATION gives..
τₖ = p(τₖ₊₁ + 1) + q(τₖ₋₁ + 1)
= p + q + pτₖ₊₁ + qτₖ₋₁
= 1 + pτₖ₊₁ + qτₖ₋₁
=> inhomogeneous linear 2nd-order map:
pτₖ₊₁ + qτₖ₋₁ − τₖ = −1,
with abs. BCs τ_0 = τ_N = 0
(absorption time is 0 if…)
inhomogeneous linear 2nd-order map:
pτₖ₊₁ + qτₖ₋₁ − τₖ = −1,
with abs. BCs τ_0 = τ_N = 0
(absorption time is 0 if…)
q2 ps4
if q ̸= p try solution:
τₖ = C₁ + C₂k + C₃(q/p)ᵏ
and use the BCs =>
τₖ = [1/(q-p)] [k - N ([1-(q/p)ᵏ]/[1-(q/p)ᴺ]] if q ̸= p
τₖ = k(N − k) if q = p = 1/2