CHAPTER 5: Continuous time Markov chain models Flashcards

1
Q

A continuous time stochastic process

A

A continuous time stochastic process is then a set of random variables X_t for real numbers t; often we assume time starts at 0 so that t ∈ R+

given known state at time 0

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

A continuous time stochastic process is a Markov chain

A

A continuous time stochastic process is a Markov chain if it has discrete states

and satisfies the condition: for
t_n bigger than t_{n-1} bigger than … bigger than t_1 ≥ 0,

P{X_{t_n} = i_n | X_{t_{n−1}} = i_{n−1}, . . . X_{t_1} = i_1} =
P{X_{t_n} = i_n | X_{t_{n−1}} = i_{n−1}}
for each n ≥ 2

  • maintain current value for random length of time and then jump to another value

TIME HOMOGENEOUS
P(X_{t+τ} = j | X_t = i) = p_{ij} (τ )
not depending on t

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

For a continuous-time chain the Chapman-Kolmogorov equations are,

A

For a continuous-time chain the Chapman-Kolmogorov equations are,
pij (t + τ ) = SUM over k of
[p_{ik}(t)p_{kj}(τ )]

which may be written in matrix terms as
P(t + τ ) = P(t)P(τ )
where P(t) = (p_{ij}(t))

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

For a continuous-time chain :
matrix terms of chapman kolmogorov

P(t + τ ) = P(t)P(τ )

A

extending the chapman kolmogorov: we apply multiple times ie we only need to know how TM behaves for very small time intervals

because of this if we knew the value of P(t) for all t in a short
interval [0, δt] then we would be able to find it for any larger t too because the C-K equations would allow successive extensions to [0, 2δt], [0, 4δt] etc

we should be able to parametrize the chain in terms of the transition
probabilities over arbitrarily short times. As δt → 0 it is reasonable to insist that
P(δt) → I

(the unit matrix)

we focus on how pij → 0 (i ≠ j) and pii → 1.

*the zero step transition matrix should be identity- as 1 to 1 and any other state to itself with prob 1

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Matrices For a continuous-time chain :

A

in discrete time we had nth powers of TM but in this case

we cant use this fact, we cant describe the chain by the one step TM

we can only apply the chapman kolmogorov

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Suppose that for
δt very small the probability of a transition from i to j is
p_ij (δt) = g_{ij}δt + o(δt)

i ≠ j,

(infinitesimal transition scheme)

A

ie for small time step, if currently in state i transition prob tends to 0 as interval tends to 0 in linear way

ie constant (derivative) * δt

o(δt) (last term goes to 0 faster than δt goes)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Suppose that for
δt

pii(δt) = 1 + g_{ii}δt + o(δt)

(infinitesimal transition scheme

A

transition prob of i to itself

linear terms + term going to 0 faster than δt

(g_ii will be negative to ensure probs tend to 1)
where the gij are constants.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

DEF: generator matrix of the continuous Markov chain

A

Matrix G with elements g_ij is the (infinitesimal) generator of the Markov chain

ie tends to I

st 
p_ij (δt) = g_{ij}δt + o(δt) 
pii(δt) = 1 + g_{ii}δt + o(δt)

diagonal and off-diag elements

(When a continuous time Markov chain is used for modelling, it is usually not the transition probabilities pij (t) that are specified, but the generator.)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Properties of generator matrix of the continuous Markov chain

its possible to find chains which break

A

**Σ_j [p_ij(δt)] = 1 as rows of TM

we expect that
pii(δt) = 1 - Σ_{i ≠ j} [g_{ij}δt + o(δt)]

ie pii(δt) = 1 - sum of non-diagonal entries

= 1 - Σ_{i ≠ j} [g_{ij}]δt + o(δt)

so g_ii= -Σ_{i ≠ j} [g_{ij}]

(valid if first sum is finite)

will certainly be true for chains with finite state space. In general it is
not guaranteed. However, most chains used in modelling satisfy it. They are called
conservative chains

A natural way to think about gij for j ≠ i is as the instantaneous rate at which, given
that the chain is in state i, a transition to state j is expected to occur.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Conservative chains

A

Satisfy g_ii= -Σ_{i ≠ j} [g_{ij}]

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

EXAMPLE 22 A two state chain (continuous)

A

Let S = {1, 2} and set
G=
[−α α
[ β −β]

α, β bigger than 0

(no constraints on being less than one but off diagonals non negative and diagonals negative)

G is the generator matrix for a continuous time Markov chain with two
states. (As previously, we could think of this as a weather model, with the states being “wet”
and “dry”.)

ie its raining now will it continue to rain for some time

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

EXAMPLE 23 The Poisson process (continuous)

Consider N_t, # events in (0,t] in a Poisson process with rate λ.

A

Consider N_t, # events in (0,t] in a Poisson process with rate λ.

P(Nt+δt = n + 1|Nt = n)
= λδt + o(δt)

and that

P(Nt+δt > n + 1|Nt = n) = o(δt)

(rows sum to 0)

we can consider a Poisson process as a continuous time Markov chain on N_0 with generator
given by 
g_{n,n+1} = λ, 
g_{n,n} = −λ 
and other entries zero

The chain is not irreducible: values can only increase, so it is not possible to get from state i to state j if j less than i.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

EXAMPLE 24 Linear birth process

and d so each population member reproduces at rate λ

A

Consider pop where each member reproduces at rate λ

each member of pop has probability of a reproduction event between times t and t + δt is
λt + o(δt))

(no other ways pop can change)

Then letting N_t be the size of the population at time t,
P(N_{t+δt} = i + 1|N_t = i) = 
λ_iδ_t + o(δ_t)
 and
P(N_{t+δt} = i|N_t = i) = 
1 − λ_iδ_t + o(δ_t), with 
P(N_t+δt = j|N_t = i) being 
o(δt) if j is neither i
nor i + 1
continuous markov chain :
g_{ij} =
{λi    j=i+1
{-λi   j=1
{0   o/w

(not irreducible as cant leave state 0)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

poisson processes

A

are not irreducible

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

EXAMPLE 25: linear birth-death process

and d so each population member reproduces at rate λ

and dies at rate µ

A

Extend example 24: allow deaths so each population reproduces at rate λ:

P(Nt+δt = i + 1|Nt = i) =
λiδt + o(δt),

and dies at rate µ giving: P(N_{t+δt} = i − 1|N_t = i)
= µ_iδt + o(δt)

P(Nt+δt = i|Nt = i) = 1 − (λ + µ)iδt + o(δt)

(still not irreducible as cant leave state 0)

continuous MC:
g_ij =
{λ_i          , j = i + 1
{−(λ + µ)i   'j = i
{µi             ,j = i − 1
{0 otherwise.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

EXAMPLE 26: M/M/n queue

Imagine that we have a queue with n servers, and assume that:
• Arrivals of customers occur as a Poisson process of rate λ.
• There are n servers, who can each serve one customer at a time, and will automatically
move on to the next customer in the queue when their current customer leaves.
• Each customer currently being served completes their service at rate µ, so that the probability they leave in (t, t + δt] is µδt + o(δt).

A

Let N_t= number in the queue at time t.

*If f i ≤ n, then in (t + δt], we may have an
arrival, which is an event of a Poisson process, so has probability λδt + o(δt), or we may have a
service completion, which happens with probability µiδt + o(δt) as there are i customers being served, or we may have more than one event with probability o(δt)

gi,i+1 = λ,
gi,i−1 = µi 
gii = −(λ + µi). 

*if i is bigger than n thengi,i+1 = λ still but only n customers being served

g_i,i−1 = µn
and
g_ii = −(λ + µn)

This chain is irreducible.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

EXAMPLE 27: Generalized birth-death process

prev examples generalized

A
This has transitions
i 
→i + 1,        λi δt + o(δt);
→ i,            1 − (λi + µi) δt + o(δt);
→ i − 1,        µi δt + o(δt).
corresponding to
gij =
{λi    if j = i + 1
{−λi − µi   if j = i
{µi  if j = i − 1
{0 otherwise
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

continuous MC and transition probabilities

obtained from G

A

writing generator defs in matrix notation:

P(δt) − I = Gδt + o(δt),
δt → 0

(o(δt), goes to 0)

ie
limit as δt → 0 of
[ (P(δt)-I)/δt] = G

(one way of thinking about G is that its the derivative of t-step transition matrix at time 0)
S by S matrix or infinite state space

considering chain from starting state at time 0 to state at time t + δt

Chapman-Kolmogorov eq:
transitions over the period t + δt. expressed in terms of those between 0 and t and in final short time δt.

P(t + δt) = P(t) P(δt)
Then

[P(t + δt) − P(t)]/δt =
[P(t) (P(δt) − I)]/δt

Let δt → 0 and e limit of the left hand term is deriv P’(t) of P at time t.

ie P’(t) = P(t) G,
known as the forward differential equations.

backward differential equations from paths over 0, t+ into short (δt) and long (t)
sections rather than vice versa

P’(t) = G P(t).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

note: generator matrix

A

*All continuous chains with finite state space rows sum to 0 (we will only see in course st true)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

the forward differential equations.

A

P’(t) = P(t) G,

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

backward differential equations

A

P’(t) = G P(t).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

Solutions of forwards and backward diff eqs

A

for finite chains:
the equations are matrix generalizations of the simple DE

y’(t)=g y(t) with constant g

solution: y(t)=y(0)e^{gt}

With P(0) = I, solution for the forward and backward equations

P(t) = exp(Gt),
where, for a matrix G, the exponential function is defined as
exp(Gt) =
Σ_{i=0,∞} [(Gt)^i/i!]

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

Spectral decomposition of G:

A

Assuming G is diagonalizable, the spectral representation of G (cf (1)) is
G = T DT −1

where T is a matrix whose columns are right eigenvectors vector(t) of G, so that
Gt = dt, (t is bold)
and D is a diagonal matrix of the corresponding eigenvalues d

D=
[0    0     ... ...]
[0  d_2  0 ... ]
[0    0   d_3..]
[...   ...    ...   ...]

*conservatism of G guaranteeing that 0 is an eigenvalue (NOT 1)

G^2= TDT-1 TDT-1= TD^2T-1

G^n = T D^nT−1, 
n = 1, 2, . . .
where D^n=
[0    0     ... ...]
[0  d^n_2  0 ... ]
[0    0   d^n_3..]
[...   ...    ...   ...]
24
Q

Solutions of forwards and backward diff eqs

expressed in matrix

A

Spectral representation of G allows exp(Gt) to be found

P(t) = P(0) exp(Gt).

exp(Gt)= Σ_{n=0,∞} [(Gt)^n/n!]
= T ( Σ_{n=0,∞} [(Dt)^n/n!]T^-1

= T
[ 1  0  ...  ...]
[0  Σ[(d_2 t)^n/n!] 0 ...]
[0   0    Σ[(d_3 t)^n/n!] ...]
[...  ...  ...  ...]
T^-1
=T
[1  0 ... ...]
[0 e^{d_2 t} 0 ...]
[0  0  e^{d_3 t} ...]
[... ... ... ...]
T^-1
25
Q

EXAMPLE 28

Two state chain

G=
[-1 1]
[2 -2]

A
  • if currently in state 1 then prob of jumping to state 2 in δt is δt
  • in state 2 prob of jump to state 2 in δt is 2δt

tend to jump faster away from state 2 than state 1

(expect more time in state 1 as t increases)

G has eigenvalues 0 and -3, diagonalize
G=
[1 1 ][0 0][2/3 1/3]
[1 -2][0 -3][1/3 -2/3]

Hence P(t)=
[1 1 ][0 0 ][2/3 1/3]
[1 -2][0 e^{-3}][1/3 -2/3]
=
[ (2/3) + (1/3)e^{-3t} (1/3) -(1/3)e^{-3t}]
[(2/3) - (2/3)e^{-3t} (1/3) +(2/3)e^{-3t}]

(TM rows sum to 1)

so P(t) tends to
[2/3 1/3]
[2/3 1/3]

as t tends to infinity ( as e^{-3t} tends to 0)

*prob of being in state 1 a long time in the future is 2/3 and 2 is 1/3

entries converge to some kind of stat dist?

26
Q

Solution of forward equations for Poisson process

A

(missed calculations in lectures)

If we treat s as fixed, this is an ordinary differential equation with respect to t, and its solution
is
F_{N_t}(s) = C exp (λ(s − 1)t).

F_{N_t}(s) = exp (λ(s − 1)t).

A Poisson distribution with parameter µ has probability generating function given by
Σ{k=0,∞} [((µ^k e^−µ) /k!) s^k] = e^−µ * Σ{k=0,∞}
[(µs)^k /k!]
= e^−µ e^µs =
e^µ(s−1)
,
so we can recognise F_{N_t}(s) as the probability generating function of a Poisson distribution with
parameter λt. The probability generating function determines the distribution, so we do indeed have that Nt ∼ P o(λt).

27
Q

limiting distributions

A

Assume that as t → ∞ the transition probabilities pij (t) converge to some limit πj for each j, whatever initial state i

In matrix each row of P(t) tends to row vector bold(π ) of the π_j

Given pij (t) converge to constants expect their time derivatives to converge to zero

i.e. P’(t) → 0

Thus forward equations
P’(t) = P(t) G
as t → ∞ the first term tends to 0 and each row of second term tends to bold(π)G st limiting probabilities satisfy

bold(π)G=0

(limiting distribution satisfies this)

28
Q

Stationary dists

A

Say that a distribution bold(π) = (πj ) (a row vector) on the states is stationary if
πP(t) = π for all t > 0.

29
Q

Stationary dists and limiting dists

A

*If P(t) can be expressed as exp(Gt) then a distribution π is a stationary distribution if and only if π satisfies
bold(π)G=0 condition In particular this identifies stationary
and limiting distributions for finite chains. The identification holds for many infinite chains
too.

30
Q

skeleton chain

A

An interesting way to think about the limiting behaviour of a continuous time chain is to
imagine observing the chain only at discrete times 0, δt, 2δt, . . . . The resulting sequence is a discrete time Markov chain (called a skeleton chain

(obtain a discrete MC from a continuous one and deduce results from another )

*skeleton chains are automatically aperiodic so we will use existence of limiting dists

31
Q

if continuous chain is irreducible (all states communicate in the sense that p_ij (t) bigger than 0 and p_ij(s) bigger than 0 for some t and s), then either

if the chain is irreducible there are possible routes from I to j and j to I so positive transition probs

A

1) (STABLE CASE)the system is stable and the limiting distribution is limit as t tends to infinity of [ p_ij (t)] = π_j is the unique solution of bold(π)G = 0. (we have a limiting dist=stat dist)

2) (corres to null recccurence or transience for discrete) pij(t) → 0 for all states and
bold(π)G = 0
has no solution

(in discrete case we required aperiodicity but In continuous we assume this)

32
Q

EXAMPLE 30: two state chain,

find a stationary distribution for the chain in Example 22

Let S = {1, 2} and set
G=
[−α α
[ β −β]

α, β bigger than 0

A
We solve 
(π1 π2)*
[−α α]
[β −β]
= (0 0)

−απ1 + βπ2 = 0
απ1 −βπ2 = 0
so:
π2 = (α/β)π1

and constraint 
π1 + π2 = 1
Stationary dist is
bold(π) = 
( (β)/(α+β)   (α)/(α+β))
33
Q

Example 31. Generalized birth-death process

finding stat dist

This has transitions
i 
→i + 1,     λi δt + o(δt);
→ i,        1 − (λi + µi) δt + o(δt);
→ i − 1,    µi δt + o(δt).
corresponding to
gij =
{λi    if j = i + 1
{−λi − µi   if j = i
{µi  if j = i − 1
{0 otherwise
A
G=
[−λ_0    λ_0       0   0   0 …]
[ µ1    −(λ1 + µ1)    λ1 0 0 …] 
[0      µ2    −(λ2 + µ2)   λ2 0 …] 
[0 0 µ3 −(λ3 + µ3) λ3 …]
[ ... ... ... ... ... …]

The equations for the stationary distribution
(π_0 π1 …)G = 0

are therefore
λ_0π_0 = µ_1π_1

(λj + µj)πj = λ_{j−1}π_{j−1} + µ_{j+1}π_{j+1}, j ≥ 1

π1 = (λ0/µ1) π0,

π2 = (1/µ_2){−λ_0π_0 + (λ_1 + µ_1)(λ_0π_0)/ (µ1)) =
(λ0λ1/µ1µ2)π0

in general:
πj =
(λ0 …λ_{j−1})/( µ1 …µj)π0

which gives a probability distribution if and only if λ_0 ≠ 0 and

Σ_{j=1, ∞} [(λ0 …λ_{j−1})/( µ1 …µj)] less than ∞
ie sum is finite

Then the distribution has prob funct

π0 =
1/
[1 + Σ_{j=1, ∞} [(λ0 …λ_{j−1})/( µ1 …µj)]]

πj =
(λ0 …λ_{j−1})/( µ1 …µj)π0

j = 1,2,….

34
Q

Example 32. M/M/1 queue

For the M/M/1 queue, Example 26 with n = 1, we have λi = λ and µi = µ for all i.

A

So example 31 means:

*Σ_{j=1, ∞} [(λ^j)/( µ^j)] less than ∞

true iff λ less than µ.
(I.e. the service rate exceeds the arrival rate

summing the geometric series gives π_0 = (µ−λ)/µ , and so π_j =(µ−λ)/µ (λ/µ)^j

Hence the stationary distribution is a geometric distribution.

35
Q

Example 33. Immigration-death process

Consider a generalized birth-death process with λi = λ for all i, representing a constant rate of immigration, and µi = µi, as in the linear birth-death process.

A

The criterion in Example 31 now becomes

Σ_{j=1, ∞} [(λ^j)/( µ^j j!)] less than ∞

which is always satisfied because
Σ_{j=1, ∞} [(λ^j)/( µ^j j!)] =
e^{λ/µ} -1

Then
π_0 = 1/e^{λ/µ} and

πj =
λ^j/(µ^{j} j! e^{λ/µ})

= ((λ/µ)^{j} e^{−λ/µ})/ j!

showing that the stationary distribution is Poisson with parameter λ/µ.

36
Q

Evolution of the chain

A

How long does chain remain in current state for?

Holding times

Where does it go when it leaves that state?

First Jump Probabilities:

37
Q

Holding times

A

Suppose chain is initially in state u. denote H_i as length of time after which it first leaves I (HOLDING TIME in i)

We consider :
F(t) = P(Hi > t)

(1−F(t) =cumulative distribution function of Hi)

From the markov property
(probability that chain is still in state i at time t + δt:

F(t + δt) =
F(t)(1 + g_{ii}δt + o(δt))

(probability still in state i at time t)*(probability that it doesn’t move away between time t and t + δt)

(g_ii is negative as diag entry)

38
Q

Distribution of holding times

A

F(t + δt) =
F(t)(1 + g_{ii}δt + o(δt))

On subtracting F(t) from both sides, dividing by δt and taking the limit it follows that
F’(t) = g_{ii}F(t)

The solution is
F(t) = e^{g_ii t}
(we knew F_0 =1 so constant included in sol)
so that Hi has an exponential distribution with mean 1/(−gii) (recall that gii less than 0).

For notational convenience we can write −gkk, as gk. Then Hi is exponential with mean 1/gi.

39
Q

First Jump Probabilities:

A

P(i ↝ j during (t,t + δt) | leave i during (t,t + δt))

=
gijδt + o(δt) /(−giiδt + o(δt))
for j ≠ i
→gij/gi

(ij entry of generator matrix/ ?)

= ¯ pij,

say, as δt → 0.

The ¯ pij are called first jump probabilities. For a conservative chain (rows sum to 0)

Σ_{j ≠ i} [ ¯ pij] =1, for each i

so the first jump probabilities form a stochastic matrix( with zero diagonal )

Corresp discrete time chain is the first jump chain

also true that if J is dest. state on leaving i

P(J=j | H_i=t)
= ¯ pij , j ≠ i}

*calculations missed, since the chance of 2 jumps in δt is o(δt)
etc

40
Q

How the chain develops

A

The process remains in its initial state i for an exponentially distributed time with mean 1/gi

It then jumps to another state j ≠ i chosen according to the first jump probabilities ¯ pij

sequence of steps you pass through given by discrete jump chain

length of time/ holding times are exp dist and is independent of the jump chain

41
Q

Continuous markov chain is described by:

A

A continuous markov chain is described by generator matrix G

off diagonal entries are rates of moving to state j given state i (different states) g_ij

entries on diagonal are negative so g_ii is -g_i

rows sum to zero usually so g_i’s are found by adding the other entries in the same row

  • starts in a state and remains in that state for a time with exponential distribution with parameter g_i ie minus the diagonal entry in matrix
  • mean of exponential is 1/g_i

Eventually leaves- holding time ends and moves to other state j with probability
¯ pij=
gij/gi

we think of these as transition probabilities as we have they sum to 1, we call the jump chain

42
Q

To simulate a continuous markov chain

we can study the discrete time chain and dont need new methods for inference: we study the jump chain

A

We use an exponential dist for RV and discrete markov chain for jump chain

holding times with exponential distributions

To see how, note that the jump chain, the discrete time chain obtained
by observing the continuous time process only when it changes state, passes through the
same sequence of states as the continuous time process, so properties depending on space but not time structure – for example, probabilities of absorption – can be obtained from the discrete jump chain.

Note, however, that the jump chain and the underlying continuous time chain will often have different stationary distributions.
ie could spend different amounts of time in different times

43
Q

Example 34. Linear birth-death process

from example 25

so each population member reproduces at rate λ

and dies at rate µ

A
gi,i+1 = λi, 
gi,i−1 = µi and 
gi,i = −(λ + µ)i

(entries in G)

This tells us that the holding time in state i: the length of time remaining in state i is exp dist with parameter (λ + µ)i

probability that next event is a birth (jump chain up)
p¯i,i+1 =
gi,i+1/ −gi,i=λ/ (λ + µ)

probability that the next event is a death is p¯i,i−1 =
µ/ (λ + µ)

for any i ≥ 1. State 0 is a special case;

if state of chain is 0 then remains 0: corresponds to 3

44
Q

BEHAVIOUR OF JUMP CHAIN

Example 34. Linear birth-death process

from example 25

so each population member reproduces at rate λ

and dies at rate µ

A

The behaviour of the jump chain is then similar to the Gambler’s Ruin discrete time chain

with the probability of jumping up being λ/(λ+µ)
and that of jumping down
being µ/(λ+µ), except at state 0 which cannot be left. The only difference is that in MAS275 there was an upper target N at which the gambler would stop, whereas here the population may increase indefinitely.
From MAS275, the probability of reaching N before 0 starting at state i is
q_{N,i} =
{[1−(µ/λ)^i[/] 1−(µ/λ)^N] λ ≠ µ
{i/N λ = µ.

Letting N → ∞, we see that q_{N,i} → 0 if µ ≥ λ,
indicating that the size of the population will not get indefinitely large and will eventually reach 0, so the population will become extinct
with probability 1.
(birth rate smaller than death rate- hard to reach large pop)

On the other hand, if µ < λ then there are two possibilities. We have that
q_{N,i} → 1 − (µ/λ)^i

, indicating that, if the chain starts in state i there is a probability 1 −(µ/λ)^i
that the population will become indefinitely large and not become extinct, but there is also a probability
(µ/λ)^i that the population will reach 0, and hence become extinct

(positive probability of getting indefinitely large and probability of becoming extinct)

45
Q

How to fit a continuous-time Markov chain model to observation

LIKELIHOOD

A

GIven a complete record of the chain’s evolution over the period [0, t], {x(s) :
0 ≤ s ≤ t}

we know holding times and sequences of states it was in

We suppose the model is parametrized by its generator matrix G. Likelihood of G is the the probability (density) of the observation as a function of the parameters

*Exp continuous components and discrete jumps

*the chain’s path can be described in terms of the sequence of states
passed through and the lengths of holding time in each. Thus let
{X¯_k : k = 0, 1, . . . , Nt}

denote the sequence of states passed through by time t, Nt being the total number of jumps. The holding time in state X¯_k is denoted by H_k. (remains in state for random time H_k, exp dist with parameter entry of G)

Then conditionally on X¯_0=x_0 the likelihood is

**Holding time H_{N_t} is incomplete at time t, we don’t know how long stays in state X¯_n_t

observed values in lower case vs upper RVs
We observe state 0 to n_{t-1} X¯0 to X¯{t-1}, holding times are exp dist terms for each holding time (where gs are same) * each transition we observe to a different state

(last term corresponds to the last jump, we know its at least this value, at least this holding time up until t, the probability that this value is at least than this is used as the last term)

L(G : x(s), 0 ≤ s ≤ t) =

{∏ {k=0, n{t} − 1}
[gx¯k e{−gx¯k h_k}
p¯x¯_k x¯
{k+1}) ]
e^{−gx¯
{n_t} h_{n_t}

=
{k=0, n_t-1} e^{−gx¯k
h_k g
{x¯k,x¯{k+1}})
e^{−gx¯
{n_t} h_{n_t}}

= exp−(∑i [ g_ia_i]) ∏{i≠j} [g^{nij} _ij ]

where
a_i is the total time spent in state i during [0, t]

and
nij is the number of transitions observed from state i to state j during [0, t]

*therefore we only need n_t,a_i, n_ij to find likelihood
number of transitions observed, amount of time spent in each state and number of transitions between each pair of states

46
Q

How to fit a continuous-time Markov chain model to observation

LOG LIKELIHOOD

A

the log likelihood is
l =
−∑i [g_ia_i] + ∑{ij}
[n_ij log g_{ij}]

which, because g_i = −g_{ii} =
∑_{i≠j} gij ,

is the same as
l = −∑i ∑{j≠i} [ g_ijai] +
∑_j≠i [nij log gij]

47
Q

How to fit a continuous-time Markov chain model to observation

MAXIMUM LIKELIHOOD ESTIMATES

A

partial derivative of log likelihoodwrt to each g_ij each transition rate form i to j

∂l/∂g_{ij} = −a_i + nij/gij

and is zero at at
g^ij = n_ij/a_i
i ≠ j
mle is
number of transitions from i to j  / time that it was in state i 

(ie dividing by the time it had the opportunity to move from i)

DIAGONAL ENTRIES ESTIMATE

for gi = −gii follows from
g^i = −g^_ii =
∑_{j ≠ i} [g^ij] =
∑_{j ≠ i} [n_ij/ai] =
ni./ai

transitions out of state i/ amount of time in state i

ie rate of transitions out of state i

48
Q

How to fit a continuous-time Markov chain model to observation

LIKELIHOOD NOTES

covariance matrix
variance matrix

A

**estimates g^ij = n_ij/a_i
i ≠ j have the form
no. transitions i to j per unit time spent in state i.

gij as the rate
at which transitions from i to j occur.

** generator elements gij are themselves functions of a lower dimension vector parameter, θ say In that case the maximum likelihood estimate θ^
would be found as the value at which l is maximum with respect to θ, and the generator elements would be estimated as gij (θ^)

** The asymptotic variance-covariance matrix of the estimates is given as usual by the inverse of the information matrix, minus 1 times the matrix of second derivatives of
the log-likelihood l with respect to the parameters. (This is the observed information matrix J whose expectation is I, the expected information matrix. Either J or I may be used; in large samples they lead to equivalent results.)

49
Q

Example 35. Inference for a linear birth-death process.

Birth rate for pop n and death rate for pop n, where parameters are unknown

inference for parameters

we have the data: #transitions and holding times

A

λ_n = λ n and µ_n = µ n, for n = 0, 1, . . . , where λ and µ are unknown. The generator
elements are

g_ij=
{λi j = i + 1
{−(λ + µ)i j = i
{µ i j = i − 1

Suppose as above that the process has been observed over an interval [0, t], and the times ai and transition counts n_ij recorded. Then the log-likelihood for λ and µ is

l = −
i [g_ia_i] +
{i,j} nij log gij

=
−∑_i [(λ + µ)ia_i] +
i [n{i,i+1} log(λi)] +
i [n{i,i−1} log(µi)]

partial diff wrt to either parameter, births and deaths terms

∂l/∂λ = −∑_i [ia_i] +
(1/λ) ∑i [n{i,i+1}]

∂l/∂µ = −∑_i [iai] + (1/µ)
i [n{i,i−1}]

mles are:
λ^ = ∑i [n{i,i+1}]/∑_i [iai]

µ^ = ∑i [n{i,i−1}]/ ∑_i [iai]

b =∑[ n{i,i+1} and d =
i [n{i,i−1}] are the total number of births and deaths respectively,
and ∑[iai] is the total accumulated time, T say, (person-years say) during which birth or death could happen to one individual, so λ^ is the observed birth
rate b/T, (spending more time for each individual members to born die for bigger population)

and µ^ the observed death rate d/T, per individual per unit time.

50
Q

Example 35. Inference for a linear birth-death process.

ESTIMATED STANDARD ERRORS

CIs

A

To find standard errors we calculate the observed information matrix J. From the above
∂^2l/∂λ^2
= −b/λ^2

∂^2l/∂µ∂λ = 0
∂^2l/∂µ^2
= −d/µ^2

so that
J =
[b/(λ^2) 0 ]
[ 0 d/µ^2]

and J−1 =

[(λ^2)/b 0 ]
[0 µ^2/d]

Estimated standard errors for λb and µb are obtained from J−1 by taking square roots of the appropriate elements and substituting estimates for unknown parameters. For example
ese (λ^) = λ^/√b

and an approximate 95% confidence interval for λ is
λ^ ± 2(λ^/√b)

51
Q

Continuous-time Markov chain models and developments of them are used in many areas; for example in:

A
  • population studies, where, for example, they have been developed to take account of age structure and spatial distributions of animal populations;
  • manpower planning for large organizations;
  • studies of social and occupational mobility; • diffusion of news, rumours and internet viruses;
  • competition, in ecology and in animal populations;
  • predator-prey phenomena;
  • disease modelling, including cancer growth;
  • epidemic modelling

The following sketch of epidemic modelling aims to indicate the subject-matter and some of the demands it makes for development of ideas and techniques.

52
Q

EPIDEMICS: cont process

A

Compartmentalised population
SUSCEPTIBLE- one who may catch the relevant disease by contact with an infective individual

INFECTIVE- infected

REMOVED/IMMUNE- The removed individuals are those who, having caught the disease earlier, have now recovered and become immune or have been removed (by death or by being isolated, for example) and are no longer open to re-infection themselves or able to infect any one else

For a population of n, S(t),I(t) and R(t) thus S(t)+I(t)+r(t)=n.
(we only track S and I)

53
Q

EPIDEMIC RATES

A

A susceptible person meeting infected can catch the disease At time t the rate of occurrence of such meetings and infections will depend on the numbers S(t) and I(t) of both susceptibles and infectives. proportional

*rate of change in the number of susceptibles is dS(t)/dt = −βS(t)I(t)
for some β bigger than 0

β= infection rate

*reduction if infected recovers, rate proportional to I(t)
dI(t)/dt
= βS(t)I(t)−γI(t) =
(βS(t)−γ)I(t)

where γ is a constant referred to as the removal rate.

  • the rate of change of the number removed is
    dR(t)/ dt = γI(t).
54
Q

For epidemic begins from a small number I0 of infectives at time 0.

A

We approximate S by S_0 for beginning of chain; we only track I for linear birth/death process

solution of equations
dS(t)/dt = −βS(t)I(t)
dI(t)/dt = (βS_0)−γ)I(t)

then gives a deterministic prediction for the progress of the epidemic until no new infections are occurring

55
Q

EPIDEMIC RATES giving MARKOV CHAIN

continuous time MC model

A

during (t, t+𝛿t)

(I,S) →
{(I+1,S-1) with prob βSI𝛿t+o(𝛿t)
{(I-1),S) with probability γI𝛿t+o(𝛿t)

A markov chain on the state of pairs (S,I)

*if we consider at the beginning of the epidemic #S close to n and I close to 0, reduction in S slow
thus the transition probabilities will be approx.
I →
{I+1 with prob βS_0I𝛿t+o(𝛿t)
{I-1 with probability γI𝛿t+o(𝛿t)

only changes in I need be tracked

Transition probabs of a simple linear birth-death process with
birth rate λ_i = βS_0i
death rate μ_i = γi

Follows that infectives die out if
βS_0⩽γ

(S_0 ⩽ ρ) and therefore there will be no major epidemic

(γ/β = ρ)

*If βS_0 bigger than γ there is still a probability ( γ/βS_0 )^l_0 of extinction so that a major epidemic MAY take place (not def will take place)

Different conclusions for stochastic approach vs deterministic