CHAPTER 5: Continuous time Markov chain models Flashcards
A continuous time stochastic process
A continuous time stochastic process is then a set of random variables X_t for real numbers t; often we assume time starts at 0 so that t ∈ R+
given known state at time 0
A continuous time stochastic process is a Markov chain
A continuous time stochastic process is a Markov chain if it has discrete states
and satisfies the condition: for
t_n bigger than t_{n-1} bigger than … bigger than t_1 ≥ 0,
P{X_{t_n} = i_n | X_{t_{n−1}} = i_{n−1}, . . . X_{t_1} = i_1} =
P{X_{t_n} = i_n | X_{t_{n−1}} = i_{n−1}}
for each n ≥ 2
- maintain current value for random length of time and then jump to another value
TIME HOMOGENEOUS
P(X_{t+τ} = j | X_t = i) = p_{ij} (τ )
not depending on t
For a continuous-time chain the Chapman-Kolmogorov equations are,
For a continuous-time chain the Chapman-Kolmogorov equations are,
pij (t + τ ) = SUM over k of
[p_{ik}(t)p_{kj}(τ )]
which may be written in matrix terms as
P(t + τ ) = P(t)P(τ )
where P(t) = (p_{ij}(t))
For a continuous-time chain :
matrix terms of chapman kolmogorov
P(t + τ ) = P(t)P(τ )
extending the chapman kolmogorov: we apply multiple times ie we only need to know how TM behaves for very small time intervals
because of this if we knew the value of P(t) for all t in a short
interval [0, δt] then we would be able to find it for any larger t too because the C-K equations would allow successive extensions to [0, 2δt], [0, 4δt] etc
we should be able to parametrize the chain in terms of the transition
probabilities over arbitrarily short times. As δt → 0 it is reasonable to insist that
P(δt) → I
(the unit matrix)
we focus on how pij → 0 (i ≠ j) and pii → 1.
*the zero step transition matrix should be identity- as 1 to 1 and any other state to itself with prob 1
Matrices For a continuous-time chain :
in discrete time we had nth powers of TM but in this case
we cant use this fact, we cant describe the chain by the one step TM
we can only apply the chapman kolmogorov
Suppose that for
δt very small the probability of a transition from i to j is
p_ij (δt) = g_{ij}δt + o(δt)
i ≠ j,
(infinitesimal transition scheme)
ie for small time step, if currently in state i transition prob tends to 0 as interval tends to 0 in linear way
ie constant (derivative) * δt
o(δt) (last term goes to 0 faster than δt goes)
Suppose that for
δt
pii(δt) = 1 + g_{ii}δt + o(δt)
(infinitesimal transition scheme
transition prob of i to itself
linear terms + term going to 0 faster than δt
(g_ii will be negative to ensure probs tend to 1)
where the gij are constants.
DEF: generator matrix of the continuous Markov chain
Matrix G with elements g_ij is the (infinitesimal) generator of the Markov chain
ie tends to I
st p_ij (δt) = g_{ij}δt + o(δt) pii(δt) = 1 + g_{ii}δt + o(δt)
diagonal and off-diag elements
(When a continuous time Markov chain is used for modelling, it is usually not the transition probabilities pij (t) that are specified, but the generator.)
Properties of generator matrix of the continuous Markov chain
its possible to find chains which break
**Σ_j [p_ij(δt)] = 1 as rows of TM
we expect that
pii(δt) = 1 - Σ_{i ≠ j} [g_{ij}δt + o(δt)]
ie pii(δt) = 1 - sum of non-diagonal entries
= 1 - Σ_{i ≠ j} [g_{ij}]δt + o(δt)
so g_ii= -Σ_{i ≠ j} [g_{ij}]
(valid if first sum is finite)
will certainly be true for chains with finite state space. In general it is
not guaranteed. However, most chains used in modelling satisfy it. They are called
conservative chains
A natural way to think about gij for j ≠ i is as the instantaneous rate at which, given
that the chain is in state i, a transition to state j is expected to occur.
Conservative chains
Satisfy g_ii= -Σ_{i ≠ j} [g_{ij}]
EXAMPLE 22 A two state chain (continuous)
Let S = {1, 2} and set
G=
[−α α
[ β −β]
α, β bigger than 0
(no constraints on being less than one but off diagonals non negative and diagonals negative)
G is the generator matrix for a continuous time Markov chain with two
states. (As previously, we could think of this as a weather model, with the states being “wet”
and “dry”.)
ie its raining now will it continue to rain for some time
EXAMPLE 23 The Poisson process (continuous)
Consider N_t, # events in (0,t] in a Poisson process with rate λ.
Consider N_t, # events in (0,t] in a Poisson process with rate λ.
P(Nt+δt = n + 1|Nt = n)
= λδt + o(δt)
and that
P(Nt+δt > n + 1|Nt = n) = o(δt)
(rows sum to 0)
we can consider a Poisson process as a continuous time Markov chain on N_0 with generator given by g_{n,n+1} = λ, g_{n,n} = −λ and other entries zero
The chain is not irreducible: values can only increase, so it is not possible to get from state i to state j if j less than i.
EXAMPLE 24 Linear birth process
and d so each population member reproduces at rate λ
Consider pop where each member reproduces at rate λ
each member of pop has probability of a reproduction event between times t and t + δt is
λt + o(δt))
(no other ways pop can change)
Then letting N_t be the size of the population at time t, P(N_{t+δt} = i + 1|N_t = i) = λ_iδ_t + o(δ_t) and P(N_{t+δt} = i|N_t = i) = 1 − λ_iδ_t + o(δ_t), with P(N_t+δt = j|N_t = i) being o(δt) if j is neither i nor i + 1
continuous markov chain : g_{ij} = {λi j=i+1 {-λi j=1 {0 o/w
(not irreducible as cant leave state 0)
poisson processes
are not irreducible
EXAMPLE 25: linear birth-death process
and d so each population member reproduces at rate λ
and dies at rate µ
Extend example 24: allow deaths so each population reproduces at rate λ:
P(Nt+δt = i + 1|Nt = i) =
λiδt + o(δt),
and dies at rate µ giving: P(N_{t+δt} = i − 1|N_t = i)
= µ_iδt + o(δt)
P(Nt+δt = i|Nt = i) = 1 − (λ + µ)iδt + o(δt)
(still not irreducible as cant leave state 0)
continuous MC: g_ij = {λ_i , j = i + 1 {−(λ + µ)i 'j = i {µi ,j = i − 1 {0 otherwise.
EXAMPLE 26: M/M/n queue
Imagine that we have a queue with n servers, and assume that:
• Arrivals of customers occur as a Poisson process of rate λ.
• There are n servers, who can each serve one customer at a time, and will automatically
move on to the next customer in the queue when their current customer leaves.
• Each customer currently being served completes their service at rate µ, so that the probability they leave in (t, t + δt] is µδt + o(δt).
Let N_t= number in the queue at time t.
*If f i ≤ n, then in (t + δt], we may have an
arrival, which is an event of a Poisson process, so has probability λδt + o(δt), or we may have a
service completion, which happens with probability µiδt + o(δt) as there are i customers being served, or we may have more than one event with probability o(δt)
gi,i+1 = λ, gi,i−1 = µi gii = −(λ + µi).
*if i is bigger than n thengi,i+1 = λ still but only n customers being served
g_i,i−1 = µn
and
g_ii = −(λ + µn)
This chain is irreducible.
EXAMPLE 27: Generalized birth-death process
prev examples generalized
This has transitions i →i + 1, λi δt + o(δt); → i, 1 − (λi + µi) δt + o(δt); → i − 1, µi δt + o(δt).
corresponding to gij = {λi if j = i + 1 {−λi − µi if j = i {µi if j = i − 1 {0 otherwise
continuous MC and transition probabilities
obtained from G
writing generator defs in matrix notation:
P(δt) − I = Gδt + o(δt),
δt → 0
(o(δt), goes to 0)
ie
limit as δt → 0 of
[ (P(δt)-I)/δt] = G
(one way of thinking about G is that its the derivative of t-step transition matrix at time 0)
S by S matrix or infinite state space
considering chain from starting state at time 0 to state at time t + δt
Chapman-Kolmogorov eq:
transitions over the period t + δt. expressed in terms of those between 0 and t and in final short time δt.
P(t + δt) = P(t) P(δt)
Then
[P(t + δt) − P(t)]/δt =
[P(t) (P(δt) − I)]/δt
Let δt → 0 and e limit of the left hand term is deriv P’(t) of P at time t.
ie P’(t) = P(t) G,
known as the forward differential equations.
backward differential equations from paths over 0, t+ into short (δt) and long (t)
sections rather than vice versa
P’(t) = G P(t).
note: generator matrix
*All continuous chains with finite state space rows sum to 0 (we will only see in course st true)
the forward differential equations.
P’(t) = P(t) G,
backward differential equations
P’(t) = G P(t).
Solutions of forwards and backward diff eqs
for finite chains:
the equations are matrix generalizations of the simple DE
y’(t)=g y(t) with constant g
solution: y(t)=y(0)e^{gt}
With P(0) = I, solution for the forward and backward equations
P(t) = exp(Gt),
where, for a matrix G, the exponential function is defined as
exp(Gt) =
Σ_{i=0,∞} [(Gt)^i/i!]
Spectral decomposition of G:
Assuming G is diagonalizable, the spectral representation of G (cf (1)) is
G = T DT −1
where T is a matrix whose columns are right eigenvectors vector(t) of G, so that
Gt = dt, (t is bold)
and D is a diagonal matrix of the corresponding eigenvalues d
D= [0 0 ... ...] [0 d_2 0 ... ] [0 0 d_3..] [... ... ... ...]
*conservatism of G guaranteeing that 0 is an eigenvalue (NOT 1)
G^2= TDT-1 TDT-1= TD^2T-1
G^n = T D^nT−1, n = 1, 2, . . .
where D^n= [0 0 ... ...] [0 d^n_2 0 ... ] [0 0 d^n_3..] [... ... ... ...]
Solutions of forwards and backward diff eqs
expressed in matrix
Spectral representation of G allows exp(Gt) to be found
P(t) = P(0) exp(Gt).
exp(Gt)= Σ_{n=0,∞} [(Gt)^n/n!]
= T ( Σ_{n=0,∞} [(Dt)^n/n!]T^-1
= T [ 1 0 ... ...] [0 Σ[(d_2 t)^n/n!] 0 ...] [0 0 Σ[(d_3 t)^n/n!] ...] [... ... ... ...] T^-1
=T [1 0 ... ...] [0 e^{d_2 t} 0 ...] [0 0 e^{d_3 t} ...] [... ... ... ...] T^-1
EXAMPLE 28
Two state chain
G=
[-1 1]
[2 -2]
- if currently in state 1 then prob of jumping to state 2 in δt is δt
- in state 2 prob of jump to state 2 in δt is 2δt
tend to jump faster away from state 2 than state 1
(expect more time in state 1 as t increases)
G has eigenvalues 0 and -3, diagonalize
G=
[1 1 ][0 0][2/3 1/3]
[1 -2][0 -3][1/3 -2/3]
Hence P(t)=
[1 1 ][0 0 ][2/3 1/3]
[1 -2][0 e^{-3}][1/3 -2/3]
=
[ (2/3) + (1/3)e^{-3t} (1/3) -(1/3)e^{-3t}]
[(2/3) - (2/3)e^{-3t} (1/3) +(2/3)e^{-3t}]
(TM rows sum to 1)
so P(t) tends to
[2/3 1/3]
[2/3 1/3]
as t tends to infinity ( as e^{-3t} tends to 0)
*prob of being in state 1 a long time in the future is 2/3 and 2 is 1/3
entries converge to some kind of stat dist?
Solution of forward equations for Poisson process
(missed calculations in lectures)
…
If we treat s as fixed, this is an ordinary differential equation with respect to t, and its solution
is
F_{N_t}(s) = C exp (λ(s − 1)t).
F_{N_t}(s) = exp (λ(s − 1)t).
A Poisson distribution with parameter µ has probability generating function given by
Σ{k=0,∞} [((µ^k e^−µ) /k!) s^k] = e^−µ * Σ{k=0,∞}
[(µs)^k /k!]
= e^−µ e^µs =
e^µ(s−1)
,
so we can recognise F_{N_t}(s) as the probability generating function of a Poisson distribution with
parameter λt. The probability generating function determines the distribution, so we do indeed have that Nt ∼ P o(λt).
limiting distributions
Assume that as t → ∞ the transition probabilities pij (t) converge to some limit πj for each j, whatever initial state i
In matrix each row of P(t) tends to row vector bold(π ) of the π_j
Given pij (t) converge to constants expect their time derivatives to converge to zero
i.e. P’(t) → 0
Thus forward equations
P’(t) = P(t) G
as t → ∞ the first term tends to 0 and each row of second term tends to bold(π)G st limiting probabilities satisfy
bold(π)G=0
(limiting distribution satisfies this)
Stationary dists
Say that a distribution bold(π) = (πj ) (a row vector) on the states is stationary if
πP(t) = π for all t > 0.
Stationary dists and limiting dists
*If P(t) can be expressed as exp(Gt) then a distribution π is a stationary distribution if and only if π satisfies
bold(π)G=0 condition In particular this identifies stationary
and limiting distributions for finite chains. The identification holds for many infinite chains
too.
skeleton chain
An interesting way to think about the limiting behaviour of a continuous time chain is to
imagine observing the chain only at discrete times 0, δt, 2δt, . . . . The resulting sequence is a discrete time Markov chain (called a skeleton chain
(obtain a discrete MC from a continuous one and deduce results from another )
*skeleton chains are automatically aperiodic so we will use existence of limiting dists
if continuous chain is irreducible (all states communicate in the sense that p_ij (t) bigger than 0 and p_ij(s) bigger than 0 for some t and s), then either
if the chain is irreducible there are possible routes from I to j and j to I so positive transition probs
1) (STABLE CASE)the system is stable and the limiting distribution is limit as t tends to infinity of [ p_ij (t)] = π_j is the unique solution of bold(π)G = 0. (we have a limiting dist=stat dist)
2) (corres to null recccurence or transience for discrete) pij(t) → 0 for all states and
bold(π)G = 0
has no solution
(in discrete case we required aperiodicity but In continuous we assume this)
EXAMPLE 30: two state chain,
find a stationary distribution for the chain in Example 22
Let S = {1, 2} and set
G=
[−α α
[ β −β]
α, β bigger than 0
We solve (π1 π2)* [−α α] [β −β] = (0 0)
−απ1 + βπ2 = 0
απ1 −βπ2 = 0
so:
π2 = (α/β)π1
and constraint π1 + π2 = 1 Stationary dist is bold(π) = ( (β)/(α+β) (α)/(α+β))
Example 31. Generalized birth-death process
finding stat dist
This has transitions i →i + 1, λi δt + o(δt); → i, 1 − (λi + µi) δt + o(δt); → i − 1, µi δt + o(δt).
corresponding to gij = {λi if j = i + 1 {−λi − µi if j = i {µi if j = i − 1 {0 otherwise
G= [−λ_0 λ_0 0 0 0 …] [ µ1 −(λ1 + µ1) λ1 0 0 …] [0 µ2 −(λ2 + µ2) λ2 0 …] [0 0 µ3 −(λ3 + µ3) λ3 …] [ ... ... ... ... ... …]
The equations for the stationary distribution
(π_0 π1 …)G = 0
are therefore
λ_0π_0 = µ_1π_1
(λj + µj)πj = λ_{j−1}π_{j−1} + µ_{j+1}π_{j+1}, j ≥ 1
π1 = (λ0/µ1) π0,
π2 = (1/µ_2){−λ_0π_0 + (λ_1 + µ_1)(λ_0π_0)/ (µ1)) =
(λ0λ1/µ1µ2)π0
in general:
πj =
(λ0 …λ_{j−1})/( µ1 …µj)π0
which gives a probability distribution if and only if λ_0 ≠ 0 and
Σ_{j=1, ∞} [(λ0 …λ_{j−1})/( µ1 …µj)] less than ∞
ie sum is finite
Then the distribution has prob funct
π0 =
1/
[1 + Σ_{j=1, ∞} [(λ0 …λ_{j−1})/( µ1 …µj)]]
πj =
(λ0 …λ_{j−1})/( µ1 …µj)π0
j = 1,2,….
Example 32. M/M/1 queue
For the M/M/1 queue, Example 26 with n = 1, we have λi = λ and µi = µ for all i.
So example 31 means:
*Σ_{j=1, ∞} [(λ^j)/( µ^j)] less than ∞
true iff λ less than µ.
(I.e. the service rate exceeds the arrival rate
summing the geometric series gives π_0 = (µ−λ)/µ , and so π_j =(µ−λ)/µ (λ/µ)^j
Hence the stationary distribution is a geometric distribution.
Example 33. Immigration-death process
Consider a generalized birth-death process with λi = λ for all i, representing a constant rate of immigration, and µi = µi, as in the linear birth-death process.
The criterion in Example 31 now becomes
Σ_{j=1, ∞} [(λ^j)/( µ^j j!)] less than ∞
which is always satisfied because
Σ_{j=1, ∞} [(λ^j)/( µ^j j!)] =
e^{λ/µ} -1
Then
π_0 = 1/e^{λ/µ} and
πj =
λ^j/(µ^{j} j! e^{λ/µ})
= ((λ/µ)^{j} e^{−λ/µ})/ j!
showing that the stationary distribution is Poisson with parameter λ/µ.
Evolution of the chain
How long does chain remain in current state for?
Holding times
Where does it go when it leaves that state?
First Jump Probabilities:
Holding times
Suppose chain is initially in state u. denote H_i as length of time after which it first leaves I (HOLDING TIME in i)
We consider :
F(t) = P(Hi > t)
(1−F(t) =cumulative distribution function of Hi)
From the markov property
(probability that chain is still in state i at time t + δt:
F(t + δt) =
F(t)(1 + g_{ii}δt + o(δt))
(probability still in state i at time t)*(probability that it doesn’t move away between time t and t + δt)
(g_ii is negative as diag entry)
Distribution of holding times
F(t + δt) =
F(t)(1 + g_{ii}δt + o(δt))
On subtracting F(t) from both sides, dividing by δt and taking the limit it follows that
F’(t) = g_{ii}F(t)
The solution is
F(t) = e^{g_ii t}
(we knew F_0 =1 so constant included in sol)
so that Hi has an exponential distribution with mean 1/(−gii) (recall that gii less than 0).
For notational convenience we can write −gkk, as gk. Then Hi is exponential with mean 1/gi.
First Jump Probabilities:
P(i ↝ j during (t,t + δt) | leave i during (t,t + δt))
=
gijδt + o(δt) /(−giiδt + o(δt))
for j ≠ i
→gij/gi
(ij entry of generator matrix/ ?)
= ¯ pij,
say, as δt → 0.
The ¯ pij are called first jump probabilities. For a conservative chain (rows sum to 0)
Σ_{j ≠ i} [ ¯ pij] =1, for each i
so the first jump probabilities form a stochastic matrix( with zero diagonal )
Corresp discrete time chain is the first jump chain
also true that if J is dest. state on leaving i
P(J=j | H_i=t)
= ¯ pij , j ≠ i}
*calculations missed, since the chance of 2 jumps in δt is o(δt)
etc
How the chain develops
The process remains in its initial state i for an exponentially distributed time with mean 1/gi
It then jumps to another state j ≠ i chosen according to the first jump probabilities ¯ pij
sequence of steps you pass through given by discrete jump chain
length of time/ holding times are exp dist and is independent of the jump chain
Continuous markov chain is described by:
A continuous markov chain is described by generator matrix G
off diagonal entries are rates of moving to state j given state i (different states) g_ij
entries on diagonal are negative so g_ii is -g_i
rows sum to zero usually so g_i’s are found by adding the other entries in the same row
- starts in a state and remains in that state for a time with exponential distribution with parameter g_i ie minus the diagonal entry in matrix
- mean of exponential is 1/g_i
Eventually leaves- holding time ends and moves to other state j with probability
¯ pij=
gij/gi
we think of these as transition probabilities as we have they sum to 1, we call the jump chain
To simulate a continuous markov chain
we can study the discrete time chain and dont need new methods for inference: we study the jump chain
We use an exponential dist for RV and discrete markov chain for jump chain
holding times with exponential distributions
To see how, note that the jump chain, the discrete time chain obtained
by observing the continuous time process only when it changes state, passes through the
same sequence of states as the continuous time process, so properties depending on space but not time structure – for example, probabilities of absorption – can be obtained from the discrete jump chain.
Note, however, that the jump chain and the underlying continuous time chain will often have different stationary distributions.
ie could spend different amounts of time in different times
Example 34. Linear birth-death process
from example 25
so each population member reproduces at rate λ
and dies at rate µ
gi,i+1 = λi, gi,i−1 = µi and gi,i = −(λ + µ)i
(entries in G)
This tells us that the holding time in state i: the length of time remaining in state i is exp dist with parameter (λ + µ)i
probability that next event is a birth (jump chain up)
p¯i,i+1 =
gi,i+1/ −gi,i=λ/ (λ + µ)
probability that the next event is a death is p¯i,i−1 =
µ/ (λ + µ)
for any i ≥ 1. State 0 is a special case;
if state of chain is 0 then remains 0: corresponds to 3
BEHAVIOUR OF JUMP CHAIN
Example 34. Linear birth-death process
from example 25
so each population member reproduces at rate λ
and dies at rate µ
The behaviour of the jump chain is then similar to the Gambler’s Ruin discrete time chain
with the probability of jumping up being λ/(λ+µ)
and that of jumping down
being µ/(λ+µ), except at state 0 which cannot be left. The only difference is that in MAS275 there was an upper target N at which the gambler would stop, whereas here the population may increase indefinitely.
From MAS275, the probability of reaching N before 0 starting at state i is
q_{N,i} =
{[1−(µ/λ)^i[/] 1−(µ/λ)^N] λ ≠ µ
{i/N λ = µ.
Letting N → ∞, we see that q_{N,i} → 0 if µ ≥ λ,
indicating that the size of the population will not get indefinitely large and will eventually reach 0, so the population will become extinct
with probability 1.
(birth rate smaller than death rate- hard to reach large pop)
On the other hand, if µ < λ then there are two possibilities. We have that
q_{N,i} → 1 − (µ/λ)^i
, indicating that, if the chain starts in state i there is a probability 1 −(µ/λ)^i
that the population will become indefinitely large and not become extinct, but there is also a probability
(µ/λ)^i that the population will reach 0, and hence become extinct
(positive probability of getting indefinitely large and probability of becoming extinct)
How to fit a continuous-time Markov chain model to observation
LIKELIHOOD
GIven a complete record of the chain’s evolution over the period [0, t], {x(s) :
0 ≤ s ≤ t}
we know holding times and sequences of states it was in
We suppose the model is parametrized by its generator matrix G. Likelihood of G is the the probability (density) of the observation as a function of the parameters
*Exp continuous components and discrete jumps
*the chain’s path can be described in terms of the sequence of states
passed through and the lengths of holding time in each. Thus let
{X¯_k : k = 0, 1, . . . , Nt}
denote the sequence of states passed through by time t, Nt being the total number of jumps. The holding time in state X¯_k is denoted by H_k. (remains in state for random time H_k, exp dist with parameter entry of G)
Then conditionally on X¯_0=x_0 the likelihood is
**Holding time H_{N_t} is incomplete at time t, we don’t know how long stays in state X¯_n_t
observed values in lower case vs upper RVs
We observe state 0 to n_{t-1} X¯0 to X¯{t-1}, holding times are exp dist terms for each holding time (where gs are same) * each transition we observe to a different state
(last term corresponds to the last jump, we know its at least this value, at least this holding time up until t, the probability that this value is at least than this is used as the last term)
L(G : x(s), 0 ≤ s ≤ t) =
{∏ {k=0, n{t} − 1}
[gx¯k e{−gx¯k h_k}
p¯x¯_k x¯{k+1}) ]
e^{−gx¯{n_t} h_{n_t}
=
∏{k=0, n_t-1} e^{−gx¯k
h_k g{x¯k,x¯{k+1}})
e^{−gx¯{n_t} h_{n_t}}
= exp−(∑i [ g_ia_i]) ∏{i≠j} [g^{nij} _ij ]
where
a_i is the total time spent in state i during [0, t]
and
nij is the number of transitions observed from state i to state j during [0, t]
*therefore we only need n_t,a_i, n_ij to find likelihood
number of transitions observed, amount of time spent in each state and number of transitions between each pair of states
How to fit a continuous-time Markov chain model to observation
LOG LIKELIHOOD
the log likelihood is
l =
−∑i [g_ia_i] + ∑{ij}
[n_ij log g_{ij}]
which, because g_i = −g_{ii} =
∑_{i≠j} gij ,
is the same as
l = −∑i ∑{j≠i} [ g_ijai] +
∑_j≠i [nij log gij]
How to fit a continuous-time Markov chain model to observation
MAXIMUM LIKELIHOOD ESTIMATES
partial derivative of log likelihoodwrt to each g_ij each transition rate form i to j
∂l/∂g_{ij} = −a_i + nij/gij
and is zero at at g^ij = n_ij/a_i i ≠ j mle is number of transitions from i to j / time that it was in state i
(ie dividing by the time it had the opportunity to move from i)
DIAGONAL ENTRIES ESTIMATE
for gi = −gii follows from g^i = −g^_ii = ∑_{j ≠ i} [g^ij] = ∑_{j ≠ i} [n_ij/ai] = ni./ai
transitions out of state i/ amount of time in state i
ie rate of transitions out of state i
How to fit a continuous-time Markov chain model to observation
LIKELIHOOD NOTES
covariance matrix
variance matrix
**estimates g^ij = n_ij/a_i
i ≠ j have the form
no. transitions i to j per unit time spent in state i.
gij as the rate
at which transitions from i to j occur.
** generator elements gij are themselves functions of a lower dimension vector parameter, θ say In that case the maximum likelihood estimate θ^
would be found as the value at which l is maximum with respect to θ, and the generator elements would be estimated as gij (θ^)
** The asymptotic variance-covariance matrix of the estimates is given as usual by the inverse of the information matrix, minus 1 times the matrix of second derivatives of
the log-likelihood l with respect to the parameters. (This is the observed information matrix J whose expectation is I, the expected information matrix. Either J or I may be used; in large samples they lead to equivalent results.)
Example 35. Inference for a linear birth-death process.
Birth rate for pop n and death rate for pop n, where parameters are unknown
inference for parameters
we have the data: #transitions and holding times
λ_n = λ n and µ_n = µ n, for n = 0, 1, . . . , where λ and µ are unknown. The generator
elements are
g_ij=
{λi j = i + 1
{−(λ + µ)i j = i
{µ i j = i − 1
Suppose as above that the process has been observed over an interval [0, t], and the times ai and transition counts n_ij recorded. Then the log-likelihood for λ and µ is
l = −
∑i [g_ia_i] +
∑{i,j} nij log gij
=
−∑_i [(λ + µ)ia_i] +
∑i [n{i,i+1} log(λi)] +
∑i [n{i,i−1} log(µi)]
partial diff wrt to either parameter, births and deaths terms
∂l/∂λ = −∑_i [ia_i] +
(1/λ) ∑i [n{i,i+1}]
∂l/∂µ = −∑_i [iai] + (1/µ)
∑i [n{i,i−1}]
mles are:
λ^ = ∑i [n{i,i+1}]/∑_i [iai]
µ^ = ∑i [n{i,i−1}]/ ∑_i [iai]
b =∑[ n{i,i+1} and d =
∑i [n{i,i−1}] are the total number of births and deaths respectively,
and ∑[iai] is the total accumulated time, T say, (person-years say) during which birth or death could happen to one individual, so λ^ is the observed birth
rate b/T, (spending more time for each individual members to born die for bigger population)
and µ^ the observed death rate d/T, per individual per unit time.
Example 35. Inference for a linear birth-death process.
ESTIMATED STANDARD ERRORS
CIs
To find standard errors we calculate the observed information matrix J. From the above
∂^2l/∂λ^2
= −b/λ^2
∂^2l/∂µ∂λ = 0
∂^2l/∂µ^2
= −d/µ^2
so that
J =
[b/(λ^2) 0 ]
[ 0 d/µ^2]
and J−1 =
[(λ^2)/b 0 ]
[0 µ^2/d]
Estimated standard errors for λb and µb are obtained from J−1 by taking square roots of the appropriate elements and substituting estimates for unknown parameters. For example
ese (λ^) = λ^/√b
and an approximate 95% confidence interval for λ is
λ^ ± 2(λ^/√b)
Continuous-time Markov chain models and developments of them are used in many areas; for example in:
- population studies, where, for example, they have been developed to take account of age structure and spatial distributions of animal populations;
- manpower planning for large organizations;
- studies of social and occupational mobility; • diffusion of news, rumours and internet viruses;
- competition, in ecology and in animal populations;
- predator-prey phenomena;
- disease modelling, including cancer growth;
- epidemic modelling
The following sketch of epidemic modelling aims to indicate the subject-matter and some of the demands it makes for development of ideas and techniques.
EPIDEMICS: cont process
Compartmentalised population
SUSCEPTIBLE- one who may catch the relevant disease by contact with an infective individual
INFECTIVE- infected
REMOVED/IMMUNE- The removed individuals are those who, having caught the disease earlier, have now recovered and become immune or have been removed (by death or by being isolated, for example) and are no longer open to re-infection themselves or able to infect any one else
For a population of n, S(t),I(t) and R(t) thus S(t)+I(t)+r(t)=n.
(we only track S and I)
EPIDEMIC RATES
A susceptible person meeting infected can catch the disease At time t the rate of occurrence of such meetings and infections will depend on the numbers S(t) and I(t) of both susceptibles and infectives. proportional
*rate of change in the number of susceptibles is dS(t)/dt = −βS(t)I(t)
for some β bigger than 0
β= infection rate
*reduction if infected recovers, rate proportional to I(t)
dI(t)/dt
= βS(t)I(t)−γI(t) =
(βS(t)−γ)I(t)
where γ is a constant referred to as the removal rate.
- the rate of change of the number removed is
dR(t)/ dt = γI(t).
For epidemic begins from a small number I0 of infectives at time 0.
We approximate S by S_0 for beginning of chain; we only track I for linear birth/death process
solution of equations
dS(t)/dt = −βS(t)I(t)
dI(t)/dt = (βS_0)−γ)I(t)
then gives a deterministic prediction for the progress of the epidemic until no new infections are occurring
EPIDEMIC RATES giving MARKOV CHAIN
continuous time MC model
during (t, t+𝛿t)
(I,S) →
{(I+1,S-1) with prob βSI𝛿t+o(𝛿t)
{(I-1),S) with probability γI𝛿t+o(𝛿t)
A markov chain on the state of pairs (S,I)
*if we consider at the beginning of the epidemic #S close to n and I close to 0, reduction in S slow
thus the transition probabilities will be approx.
I →
{I+1 with prob βS_0I𝛿t+o(𝛿t)
{I-1 with probability γI𝛿t+o(𝛿t)
only changes in I need be tracked
Transition probabs of a simple linear birth-death process with
birth rate λ_i = βS_0i
death rate μ_i = γi
Follows that infectives die out if
βS_0⩽γ
(S_0 ⩽ ρ) and therefore there will be no major epidemic
(γ/β = ρ)
*If βS_0 bigger than γ there is still a probability ( γ/βS_0 )^l_0 of extinction so that a major epidemic MAY take place (not def will take place)
Different conclusions for stochastic approach vs deterministic