CHAPTER 2: Review of probability models Flashcards

1
Q

THE BASIC POISSON PROCESS

one dimensional poisson PROCESS

A

A one-dimensional Poisson process is a model for the random times of occurrence of instantaneous events.

Assume we start counting at time t = 0.

[sketch: graph of N(t) against t — a step function starting at 0, jumping by 1 at the time of each event]

N(t) = random variable for # events in (0,t]

for 0 < u < v,
N(v) − N(u) = RV for # events in (u,v]

N is a random counting function:

  • N(t) is a non-negative integer for any t
  • N is a non-decreasing function
  • N(0)=0
2
Q

the poisson DISTRIBUTION

A

The POISSON DIST arises from instantaneous events occurring in an interval of time (0,t]: divide the interval into n equal subintervals and assume at most one event occurs in each.

So:

# events up to time t has dist equiv to a binomial with
n trials and success probability μ/n,
where μ = expected # occurrences over the whole interval.

Limit as n tends to infinity gives the POISSON DIST with parameter μ

we assume the expected #occurrences in (0,t] is proportional to t so N(t) has POISSON DIST with parameter λt

(λ constant of proportionality)
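A quick numerical check of this limit, as a minimal Python sketch (the values μ = 3 and n = 10000 are arbitrary assumptions, not from the notes):

```python
from math import comb, exp, factorial

mu, n = 3.0, 10_000   # expected # occurrences; # subintervals (both arbitrary)
p = mu / n            # success probability per subinterval

for k in range(5):
    binom = comb(n, k) * p**k * (1 - p)**(n - k)   # Bin(n, mu/n) pmf at k
    poisson = exp(-mu) * mu**k / factorial(k)      # Po(mu) pmf at k
    print(k, round(binom, 6), round(poisson, 6))   # agree to ~4 decimal places
```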

3
Q

N(v)-N(u)

A

(u,v]

N(v)-N(u)~Poisson(λ (v-u))

4
Q

POISSON PROCESS ASSUMPTIONS***

A

Homogeneous one dimensional poisson process:

1) For any 0 ≤u ≤v, distribution of N(v)-N(u) is poisson with parameter λ(v-u)
2) If (u_1,v_1], (u_2,v_2],…, (u_k,v_k] are DISJOINT time intervals then N(v_1)-N(u_1), N(v_2)-N(u_2),…, N(v_k)-N(u_k) are indep RVs

( disjoint periods of time have indep #occur)

5
Q

MODELLING AN EPIDEMIC

A
initial outbreak SIR model
pop size N at continuous time t
S_t- susceptible
I_t- infective
R_t- recovered/removed

Assume
S_0 = N−1
I_0 = 1
R_0 = 0

For a short interval between times t and t+δt, either:

  • one S catches the disease, OR
  • one I recovers/is removed (if this leaves I = 0, the epidemic is over), OR
  • nothing changes

FROM (S_t, I_t, R_t)
to (S_t −1, I_t +1, R_t) at time t+δt
with probability ≈ λS_tI_tδt

OR
to (S_t, I_t −1, R_t +1) at time t+δt
with probability ≈ μI_tδt
(S_t doesn't affect the rate of recovery; recovery is more probable when there are more infectives)
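A minimal discrete-step simulation sketch of these transitions (the function name, parameter values, and step size dt are illustrative assumptions):

```python
import random

def simulate_sir(N=100, lam=0.001, mu=0.05, dt=0.1, t_max=500.0):
    """Step the SIR chain forward in small increments of length dt."""
    S, I, R = N - 1, 1, 0          # S_0 = N-1, I_0 = 1, R_0 = 0
    t = 0.0
    while t < t_max and I > 0:     # epidemic over once I hits 0
        u = random.random()
        if u < lam * S * I * dt:               # infection: (S-1, I+1, R)
            S, I = S - 1, I + 1
        elif u < (lam * S * I + mu * I) * dt:  # recovery: (S, I-1, R+1)
            I, R = I - 1, R + 1
        t += dt                    # otherwise nothing changes this step
    return S, I, R

print(simulate_sir())
```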

6
Q

Endemic

A

The disease can become endemic if recovery is not faster than spreading.

This depends on λ and μ; the rates of transition depend on the present (i.e. current) state only, not on the earlier history.

7
Q

INTER-OCCURRENCE TIMES

A

Continuous random variables T_i. Let T_1 denote the length of time until the first occurrence, and T_2 the length of time between the first and second occurrences.

In general, T_n represents the time between occurrences n−1 and n.

8
Q

THEOREM 1: 2 occurrences in a Poisson process

A

The probability that in (0, t] two occurrences of a Poisson process with rate λ occur at exactly the same time is zero.

9
Q

THEOREM 1: 2 occurrences in a Poisson process
PROOF

The probability that in (0, t] two occurrences of a Poisson process with rate λ occur at exactly the same time is zero.

A

PROOF

Divide (0,t] up into n small intervals
({i−1}t/n, it/n] for i = 1,…,n

[diagram: (0,t] partitioned at 0, t/n, 2t/n, …, (n−1)t/n, t]

By our Poisson process assumptions

N(it/n) − N({i−1}t/n) ~ Poisson(λt/n)

so we find the probability that this is ≥ 2

P( N(it/n) − N({i−1}t/n) ≥ 2)
= 1 − [λt/n]exp{−λt/n} − exp{−λt/n}
(= 1 − (probability of 1) − (probability of 0) )

our small intervals are disjoint, so their numbers of occurrences are indep (by assumption). So, let Y_n be the number of small intervals with at least 2 occurrences.

Then, Y_n ~ Bin(n, 1 − exp(−λt/n)(1 + λt/n))

so P(Y_n = 0) = ( exp(−λt/n)(1 + λt/n) )^n
(by binomial, P(Y = 0) = (1−p)^n )

= exp(−λt)( 1 + λt/n )^n

Because the limit as n tends to infinity of (1 + λt/n)^n is exp(λt),
P(Y_n = 0) tends to 1 as n tends to infinity,
so there cannot be a positive probability of two occurrences at the same time

( If there were a probability p > 0 that there were two occurrences at exactly the same
time, we would have P(Y_n = 0) ≤ 1 − p for all n, so this cannot be the case. Hence the
probability of there being two occurrences at exactly the same time is zero. )

10
Q

THEOREM 2: inter-occurence times

A

Inter-occurrence times are independent of each other, and are exponentially distributed with parameter λ

11
Q

THEOREM 2
PROOF

Inter-occurrence times are independent of each other, and are exponentially distributed with parameter λ

A

PROOF
P( T_1 ≤ t) is the probability that the first occurrence has happened by time t
so is the probability that there is at least one occurrence in (0,t]

 Hence
P(T_1 ≤ t) = 
P(N(t) ≥ 1) 
= 1 − P(N(t) = 0) = 1 − e^{−λt}
 (the distribution function of an exponential dist with parameter λ)

Consider the probability that T_n is at most t, conditional on the values of T_1, T_2, . . . , T_{n−1},

P(T_n ≤ t|T_1 = t_1, T_2 = t_2, . . . , T_{n−1} = t_{n−1})

(There are some technical details to deal with the fact that we are conditioning on an event of probability zero, which is why this proof is a “sketch”.) Similarly to the above argument,
this is the probability that there is at least one occurrence between times t_1 + t_2 + … + t_{n−1}
and t_1 + t_2 + … + t_{n−1} + t, which again is 1 − e^{−λt}. So, regardless of the values taken by
T_1, T_2, …, T_{n−1}, the conditional distribution of T_n is exponential with parameter λ, which
implies the result.

12
Q

ANOTHER WAY OF DEFINING THE POISSON PROCESS

A

We can construct the Poisson process by considering a sequence of indep RVs T_1, T_2,…

each with exponential distribution with parameter λ. Discrete events occur at times T_1, T_1+T_2,…

The nth event occurs at time T_1 + T_2 + … + T_n
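A minimal sketch of this construction in Python (the function name and example values are illustrative assumptions):

```python
import random

def poisson_process_times(lam, t_max):
    """Event times on (0, t_max] built from i.i.d. Exp(lam) gaps T_1, T_2, ..."""
    times, t = [], 0.0
    while True:
        t += random.expovariate(lam)   # add the next inter-occurrence time
        if t > t_max:
            return times
        times.append(t)

print(poisson_process_times(2.0, 5.0))  # event times of a rate-2 process on (0, 5]
```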

13
Q

take care

A

take care to distinguish strict (<) from non-strict (≤) inequalities when writing probabilities of events

14
Q

EXAMPLE 1: disease modelling

infections from a rare disease arise from a Poisson process with rate 2 per year. Let N(t) = # infections between now, time 0, and time t years. Then:

Prob of no infections in next year

probability there are no infections in the next two years and then three in the year after

Expected time until next infection

A

• The probability there are no infections in the next year is P(N(1) = 0) = e^{−2} = 0.135 . . ..

• The probability there are no infections in the next two years and then three in the year after that is 
P(N(2) = 0, N(3) − N(2) = 3) 
= P(N(2) = 0)P(N(3) − N(2) = 3) 
= e^{−4} ·e^{−2}2^{3}/{3!} 
= 0.00331 . . ., 

as N(2) ∼ Po(4) and N(3) − N(2) ∼ Po(2), and they are independent
(disjoint intervals)
• The distribution of the time until the next infection occurs is exponential with parameter 2 (and so has mean 1/2).
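A minimal numerical check of these three answers (the helper name po_pmf is hypothetical; it just evaluates the Poisson pmf):

```python
from math import exp, factorial

def po_pmf(k, mean):
    """P(X = k) for X ~ Po(mean)."""
    return mean**k * exp(-mean) / factorial(k)

print(po_pmf(0, 2.0))                   # P(N(1) = 0) = e^{-2} ≈ 0.1353
print(po_pmf(0, 4.0) * po_pmf(3, 2.0))  # disjoint intervals are independent ≈ 0.00331
print(1 / 2.0)                          # mean of Exp(2): expected time to next infection
```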

15
Q

THM 3:

A conditional property of the Poisson process

A

Given the total number of points of a homogeneous Poisson process in an interval, the positions of the points are independently uniformly distributed over the interval.

16
Q

notation: N_{u,v}

A

N_{u,v} = N(v) − N(u)

= # occurrences in (u,v]

17
Q

THM 3:
Given the total number of points of a homogeneous Poisson process in an
interval, the positions of the points are independently uniformly distributed over the interval.
NOTE

A

Note that this is equivalent to the statement that, conditional on there being n occurrences
in (s, t], the number of occurrences in any interval (u, v] ⊆ (s, t] (so s ≤ u < v ≤ t) has
a Bin(n, (v − u)/(t − s)) distribution, as each of the n occurrences would have probability
(v − u)/(t − s) of being in (u, v], independently of the others. We will prove this latter
version of the statement, and will assume, without loss of generality, that the interval we
are conditioning on the number of points in is (0, t].

18
Q

THM 3:
Given the total number of points of a homogeneous Poisson process in an
interval, the positions of the points are independently uniformly distributed over the interval.
PROOF

A

Proof. For simplicity, write N_{u,v} = N(v) − N(u), the number of occurrences in
(u, v].

Note
that N_{0,t} ∼ Po(λt),
N_{u,v} ∼ Po(λ(v−u)) and N_{0,t} − N_{u,v} = N_{0,u} + N_{v,t} ∼ Po(λ(t−(v−u))),

and the latter two random variables are independent. Thus, for 0 ≤ a ≤ n,

P(N_{u,v} = a | N_{0,t} = n)
= P(N_{u,v} = a, N_{0,t} = n) / P(N_{0,t} = n)
= P(N_{u,v} = a, N_{0,t} − N_{u,v} = n − a) / P(N_{0,t} = n)
= P(N_{u,v} = a) P(N_{0,t} − N_{u,v} = n − a) / P(N_{0,t} = n)
= [ (λ(v−u))^a e^{−λ(v−u)} / a! ] · [ (λ(t−(v−u)))^{n−a} e^{−λ(t−(v−u))} / (n−a)! ] · [ n! / ((λt)^n e^{−λt}) ]
= [ e^{−λ(v−u)} e^{−λ(t−(v−u))} / e^{−λt} ] · [ n! / (a!(n−a)!) ] · [ λ^a (v−u)^a λ^{n−a} (t−(v−u))^{n−a} / (λt)^n ]
= (n choose a) ((v−u)/t)^a (1 − (v−u)/t)^{n−a},

which is the probability that a Bin(n, (v − u)/t) random variable takes the value a, as
required.
19
Q

Another way to simulate a Poisson process with rate λ on (0,t]

A

1) first generate the # points in (0,t] as an observation n of N(t) from the Poisson distribution Po(λt)

2) generate n independent uniform U(0,t) vars U_i,
i = 1,…,n, and put a point at each position U_i

then the successive positions of the points from left to right will be the ordered values W_1 ≤ … ≤ W_n of the U_i’s

***prev
We can construct the poisson process: by considering a sequence of indep RVs T_1, T_2,…. each with exponential distribution with parameter λ. Discrete events occur at T_1, T_1+T_2,…
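A minimal sketch of this two-step simulation (numpy is an assumed dependency; rng.poisson and rng.uniform are standard numpy Generator methods):

```python
import numpy as np

rng = np.random.default_rng()

def poisson_process_uniform(lam, t_max):
    """Step 1: draw the total count N(t_max) ~ Po(lam * t_max).
    Step 2: place that many U(0, t_max) points and sort them."""
    n = rng.poisson(lam * t_max)
    return np.sort(rng.uniform(0.0, t_max, n))   # ordered values W_1 <= ... <= W_n

print(poisson_process_uniform(2.0, 5.0))
```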

20
Q

DEF 4:

Events in a short period of time, and Landau O and o Notation

A

For sequences x_n and y_n,

x_n = o(y_n) as n → ∞ means lim as n→∞ of x_n/y_n = 0,
and

x_n = O(y_n) as n → ∞ means that for some B, |x_n/y_n| ≤ B for all n sufficiently large.

If instead f(h) and g(h) are (real) functions of a continuous variable h, then
f = o(g(h))
and f = O(g(h))
as h tends to a limit have analogous meanings.

f = o(g(h)) as h tends to l:
if f(h)/g(h) tends to 0 as h tends to the limit l (e.g. 0), then f(h) = o(g(h))

f = O(g(h)) as h tends to l:
if |f(h)/g(h)| ≤ B for some B and all h sufficiently close to l, then f(h) = O(g(h))
(f tends to 0 at least as fast as g)

21
Q

Landau O and o Notation

NOTES

A

We will use these continuous versions mainly in the case
h → 0, for example to look at the behaviour of probabilities over very short time intervals.

O:
x_n tends to 0 at least as fast as y_n
(possibly at the same rate)

o:

a stronger statement:
x_n tends to 0 strictly faster than y_n

22
Q

EXAMPLE: Binomial distribution asymptotics

A

Let S_n ∼ Bin(n, p) with fixed p ∈ (0, 1), and consider
x_n = P(S_n ≤ 1) = (1 − p)^n + np(1 − p)^{n−1}

As x_n = (1−p)^{n−1}(1−p+np), x_n tends to 0 as n tends to infinity.

So x_n = o(1) as n tends to infinity.

Rate of convergence to 0, by comparing with another funct/seq which converges to 0:

TRY (1−p)^n
x_n/(1−p)^n = 1 + n(p/(1−p))
tends to infinity
so we cannot write
x_n = O((1−p)^n), nor is it o((1−p)^n)

TRY n(1−p)^n

x_n/(n(1−p)^n)
= (1/n) + (p/(1−p)) ≤ 1 + p/(1−p)
and hence bounded

So x_n = O(n(1−p)^n) as n tends to infinity

but x_n ≠ o(n(1−p)^n)
as (1/n) + p/(1−p) tends to p/(1−p) ≠ 0 as n tends to infinity

however,
x_n/(n^2(1−p)^n) = (1/n^2) + p/(n(1−p)) tends to 0 as n tends to infinity

so x_n = o(n^2(1−p)^n)
so x_n tends to 0 faster than n^2(1−p)^n and at the same rate as
n(1−p)^n

23
Q

EXAMPLE 3:

A
Let p(t) be the probability that there is at least one event of a Poisson process with rate λ in the time interval (0,t]. By the assumption that the number of events in the interval has a Poisson
distribution with parameter λt, we have
p(t) = 1 − e^{−λt}.

We consider the asymptotics of this
as t ↓ 0.

By SERIES

p(t) = λt − [λ^2t^2/2] + [λ^3t^3/6] − [λ^4t^4/24] + . . .
and thus
p(t)/t
= λ −[λ^2t/2] + [λ^3t^2/6] − [λ^4t^3/24] + . . . .

The right hand side here tends to λ as t → 0, so is bounded close to 0, and hence we can write
p(t) = O(t).

To be more precise, we consider the difference
p(t) − λt, and observe
[p(t) − λt]/t
= −[λ^2t/2] + [λ^3t^2/6] − [λ^4t^3/24] + . . . → 0 as t ↓ 0.

So we can write
p(t) − λt = o(t), and we also write this as p(t) = λt + o(t).
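A quick numerical illustration of p(t)/t → λ and (p(t) − λt)/t → 0 (λ = 2 is an arbitrary assumption):

```python
from math import exp

lam = 2.0  # arbitrary illustrative rate

for t in [0.1, 0.01, 0.001, 0.0001]:
    p = 1 - exp(-lam * t)
    print(t, p / t, (p - lam * t) / t)  # p/t -> lam, (p - lam*t)/t -> 0 as t -> 0
```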

24
Q

POISSON PROCESS CHARACTERIZED

A
  1. If (u_1, v_1], (u_2, v_2], . . . , (u_k, v_k] are disjoint time intervals then
    N(v_1)− N(u_1), N(v_2)− N(u_2), . . . , N(v_k)− N(u_k) are independent random variables.
  2. N(u) takes non-negative integer values, and is increasing in u.
  3. For any time u, as t ↓ 0 we have that
    P(N(u + t) − N(u) ≥ 1) = λt + o(t)
    (probability of at least one occurrence; the o(t) term tends to 0 faster than t does,
    i.e. o(t)/t → 0 as t ↓ 0)

and

P(N(u + t) − N(u) ≥ 2) = o(t)
* so over a small time interval (u, u + t], the probability of 2 or more events is negligible compared with t

(The first of these properties was the second of the properties we assumed originally.)

25
Q

Discrete time Markov chains
Definition 5.
MARKOV PROCESS

A

Let {X_t : t ∈ T} be a set of random variables (a stochastic process), with an
infinite index set T.
If, for each real number x, each n ≥ 1 and t_1 < . . . < t_n < t with t_i ∈ T for all i, the
following condition (the Markov property) holds

P{X_t ≤ x | X_{t_1} = x_1, . . . , X_{t_n} = x_n} = P{X_t ≤ x | X_{t_n} = x_n}
we say X_t is a Markov process.

26
Q

POISSON PROCESS and markov

A

Poisson process is a continuous time Markov process

27
Q

DEF 6:

MARKOV CHAIN
STATE SPACE

A

If {X_t, t ∈ T} is a Markov process and there is a countable set S such that
P{X_t ∈ S} = 1 for all t ∈ T we say that {X_t, t ∈ T} is a Markov chain. S is the state
space of the chain and points i ∈ S are states of the chain.

If T is a set of integers, {X_t} is a DISCRETE time Markov chain.

28
Q

DEF 7:Homogeneous Markov chains

A

A Markov chain is said to be homogeneous if and only if X_{t+h} | X_t has distribution independent of t, so that the probability that you move from one state to another
in a fixed number h of moves is the same whenever the moves begin.

(probabilities depend on length of interval, not on starting time)

29
Q

DISCRETE TIME CHAIN

A

with t and n integers

P{X_{t+n} = j | X_t = i}
= P{X_n = j | X_0 = i}

30
Q

DEF 8:

TRANSITION PROBABILITIES

A

Let X_n be a HOMOGENEOUS DISCRETE time Markov chain. Then we call the
probabilities P{X_n = j | X_0 = i} the n-step TRANSITION PROBABILITIES and denote them by
p^(n)_{ij}

(at any 2 times n steps apart)

31
Q

DEF 8:

ONE-STEP TRANSITION PROBABILITIES

A

For n = 1 they are called the one-step transition probabilities or simply transition probabilities, and we just write
p_ij
instead of
p^(1)_{ij} .

p_{ij} = probability of being in state j at the next time step, given current state i

32
Q

Transition matrix

A

The n-step transition probabilities form an |S| × |S| matrix
(S finite or countably infinite),
P^(n) = (p^(n)_{ij}), which is called the
n-step transition probability matrix.

For n = 1 it is called the one-step transition matrix, or just the transition matrix,
and we write P instead of P^(1).

*row sums of a transition matrix P are all equal to 1:
Σ_j p_{ij} = 1 (entries non-negative, so P is a stochastic matrix)

33
Q

Starting distribution

A

row vector of probabilities which sums to 1, giving the distribution of the starting state

34
Q

EXAMPLE

Wet and Dry days

A

S={W,D}
X_n = W means that day n is wet

Assuming a Markov
chain model, let our transition matrix be

[1 − α α ]
[β 1 − β ]

Then, given that day n is dry, the probability that day n + 1 is wet is α, while

given that day n is wet, the probability that day n + 1 is dry is β.

α, β in [0,1]
rows sum to 1

35
Q

EXAMPLE

Random walk on triangle

A

Label the vertices of a triangle by A, B and C, and assume that a particle moves as a Markov chain from vertex to vertex, at each time step moving from its current vertex to one of its neighbours, moving clockwise with probability p and anti-clockwise with probability 1−p. Then
S = {A, B, C}, and the transition matrix is

[ 0 p 1 − p ]
[1−p 0 p ]
[ p 1 − p 0 ]

If p = 1/2 then the random walk is symmetric

36
Q

EXAMPLE Ehrenfest model for diffusion

A

Imagine that we have N molecules, each of which is in one of two containers, A or B. At each time step, one molecule (of the N) is selected at random (each equally likely to be chosen),
and moved to the other container.

Let Y_n be the number of particles in container A at time n. Then (Y_n) forms a Markov chain
with state space
{0, 1, 2, . . . , N} and transition probabilities given by
p_{k,k−1} =k/N

p_{k,k+1} =(N−k)/N

p_{k,j} = 0 if j ∉ {k − 1, k + 1}.
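A minimal sketch building this transition matrix (numpy assumed; the function name is illustrative):

```python
import numpy as np

def ehrenfest_matrix(N):
    """Transition matrix of the Ehrenfest chain on states {0, 1, ..., N}."""
    P = np.zeros((N + 1, N + 1))
    for k in range(N + 1):
        if k > 0:
            P[k, k - 1] = k / N        # one of the k molecules in A is chosen and leaves
        if k < N:
            P[k, k + 1] = (N - k) / N  # a molecule in B is chosen and moves to A
    return P

P = ehrenfest_matrix(4)
print(P.sum(axis=1))   # every row sums to 1: P is a stochastic matrix
```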

37
Q

The Chapman-Kolmogorov equations

A

Let X_n be a discrete time homogeneous Markov chain with state space S. It was shown in MAS275 that the transition probabilities for X_n satisfy the Chapman-Kolmogorov
equations

p^(m+n)_{ij} = Σ_{k∈S} p^(m)_{ik} p^(n)_{kj},

*and that therefore the n-step transition matrix
P^(n) equals P^n,
the n-th power of P.
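A small numerical check of the Chapman-Kolmogorov equations in matrix form (the 2×2 matrix entries are arbitrary assumptions):

```python
import numpy as np

P = np.array([[0.7, 0.3],
              [0.4, 0.6]])   # an arbitrary 2-state transition matrix

m, n = 2, 3
lhs = np.linalg.matrix_power(P, m + n)                             # P^(m+n)
rhs = np.linalg.matrix_power(P, m) @ np.linalg.matrix_power(P, n)  # P^m P^n
print(np.allclose(lhs, rhs))   # True: p^(m+n)_ij = sum_k p^(m)_ik p^(n)_kj
```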

38
Q

The Chapman-Kolmogorov equations NOTES

A

*with knowledge of the initial state, the Chapman-Kolmogorov equations allow us to calculate the probability that the
chain will be in any specified state at time n; that is
P{X_n = j} = π^(n)_j

  • any j, any time n
  • condition on state of chain at time 0:

π^(n)_j = P{X_n = j}
= Σ_i P{X_n = j | X_0 = i} P{X_0 = i}
= Σ_i π^(0)_i p^(n)_{ij}

using the Chapman-Kolmogorov equations

39
Q

RANDOM EVOLUTION OF THE CHAIN

A

the random evolution of the chain is determined by the transition matrix P and the initial probabilities π_i

In matrix-vector terms:
π^(n) = π^(0)P^n

where π^(n)
is the row vector with entries (π^(n)_i: i ∈ S).

(
π^(n) is the vector giving the dist of the state at time n

π^(0) is the vector giving the starting dist at time 0

P^n is the nth power of the TM)
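A one-line illustration of π^(n) = π^(0)P^n (the matrix and starting distribution are arbitrary assumptions):

```python
import numpy as np

P = np.array([[0.7, 0.3],
              [0.4, 0.6]])    # arbitrary transition matrix
pi0 = np.array([1.0, 0.0])    # start in state 0 with certainty

pi10 = pi0 @ np.linalg.matrix_power(P, 10)   # pi^(10) = pi^(0) P^10
print(pi10)                                  # distribution of the state at time 10
```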

40
Q

P^n matrix multiplication

A

spectral representation of P:

P = TDT^{−1}
where D is a diagonal matrix whose diagonal entries are the eigenvalues of P, and T is a non-singular matrix (this requires P to be diagonalizable, e.g. when its eigenvalues are distinct).
The columns t of T are right eigenvectors of P; that is, they satisfy Pt = dt for an eigenvalue d of P.

  • since P is stochastic, d = 1 is always an eigenvalue,
    corresponding to right eigenvector
    t = (1, . . . , 1)^T
  • eigenvalues of a stochastic matrix have modulus ≤ 1, so under our assumption about P, d = 1 is the largest eigenvalue
41
Q

P = T DT^{−1}

A

It is convenient
to reorder rows and columns so that the diagonal entries in D are in order of decreasing
modulus, so D is of the form

D =
[ 1    0    0   · · · ]
[ 0   d_2   0   · · · ]
[ 0    0   d_3  · · · ]
[ ...  ...  ...  ...  ]

where 1 > |d_2| > |d_3| > . . .
42
Q

P^n

A
P^n = T D^n T^{−1},
n = 1, 2, . . . .

(e.g. P^2 = T DT^{−1} T DT^{−1}
= T D^2 T^{−1})

D^n =
[ 1    0     0    · · · ]
[ 0   d^n_2  0    · · · ]
[ 0    0    d^n_3 · · · ]
[ ...  ...   ...   ...  ]
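A sketch computing P^n via the spectral representation and checking it against direct matrix powers (numpy assumed; np.linalg.eig returns right eigenvectors as the columns of T):

```python
import numpy as np

P = np.array([[0.7, 0.3],
              [0.4, 0.6]])      # an arbitrary diagonalizable stochastic matrix

d, T = np.linalg.eig(P)         # eigenvalues d, right eigenvectors as columns of T
order = np.argsort(-np.abs(d))  # reorder so the eigenvalue 1 comes first
d, T = d[order], T[:, order]

n = 8
Pn = T @ np.diag(d**n) @ np.linalg.inv(T)             # P^n = T D^n T^{-1}
print(np.allclose(Pn, np.linalg.matrix_power(P, n)))  # True
```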
43
Q

Stationary distributions

DEF 9:

A
Let X_n be a discrete time homogeneous Markov chain with one-step transition matrix P. 
Any distribution π such that

π_j = Σ_{i∈S} π_i p_{ij}

is called a stationary distribution of the chain.
In other words, 
bold(π) = {π_k} satisfies 
π = πP, 
so it is a left eigenvector of P with eigenvalue 1.

If a Markov Chain has a stationary distribution π and if the starting state is chosen according to π, then X_n ∼ π for all n, since
π^(1) = πP = π,
π^(2) = πP^2 = (πP)P = πP = π

etc

44
Q

STAT DIST NOTES

A

*A stationary distribution need not exist or be unique.

*However, Markov chains that do have a stationary distribution are of particular interest as models for processes whose overall properties remain stable over time even though the state of the process itself is
continually changing.

*For many Markov chains, π^(n)
converges to a stationary distribution, irrespective of the starting distribution, and it is these chains that will be particularly useful as models for stable systems.

45
Q

ERGODIC CHAINS

A

ERGODIC CHAINS are:
IRREDUCIBLE - can move from any state to any other in some number of steps
APERIODIC - possible return times to a state have greatest common factor 1
POSITIVE RECURRENT - starting at a particular state, return to it has probability 1 and the expected time until return is finite

For an ergodic chain,
p^(n)_{ij} → π_j > 0 as n → ∞
and
{π_j} is a stationary distribution

46
Q

irreducible and aperiodic

A

if every element of the TM is strictly positive then the chain is irreducible and aperiodic
(“the chain forgets where it started over time”)

47
Q

EXAMPLE: STATIONARY DISTRIBUTIONS

A

S={W,D}
π =( π_D π_W)
we find a stationary distribution by solving
π = πP

(1 − α)π_D + βπ_W = π_D
απ_D + (1 − β)π_W = π_W

as it's a distribution:
π_D + π_W = 1

π_D =(β/α)π_W

π =(β/(α+β), α/(α+β) )

the chain is irreducible and aperiodic, (and positive recurrent)

and hence ergodic

probability that day n is dry converges to β/(α+β)
as n → infinity
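A numerical check of this stationary distribution (α = 0.3, β = 0.4 are arbitrary assumptions):

```python
import numpy as np

alpha, beta = 0.3, 0.4
P = np.array([[1 - alpha, alpha],
              [beta, 1 - beta]])   # states ordered (D, W)

# pi is a left eigenvector of P with eigenvalue 1, normalised to sum to 1
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
pi = pi / pi.sum()

print(pi)                        # ( beta/(alpha+beta), alpha/(alpha+beta) ) ≈ (0.571, 0.429)
print(np.allclose(pi @ P, pi))   # pi = pi P
```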

48
Q

EXAMPLE: Ehrenfest model stat dist

A

Solving the stationary distribution equations gives a unique
stationary distribution with
π_j = (N choose j)/2^N.

However, this chain is not aperiodic

(odd and even states
alternate)

and so the convergence results do not apply

49
Q

Poisson process: number of occurrences

A

distributed as a Poisson dist with parameter λ

λ = mean and variance
(λ = 1/(expected to occur once every __))

we use λt = mean number of occurrences in an interval of length t

P(X = k) = (λt)^k e^{−λt} / k!

P(X ≤ k)
= Σ_{j ≤ k} (λt)^j e^{−λt} / j!
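The same pmf and cdf via a library (scipy is an assumed dependency; poisson.pmf/cdf are standard scipy.stats calls):

```python
from scipy.stats import poisson

lam, t = 2.0, 3.0        # illustrative rate and interval length
X = poisson(lam * t)     # X ~ Po(lambda * t)

print(X.pmf(4))          # P(X = 4) = (lam*t)^4 e^{-lam*t} / 4!
print(X.cdf(4))          # P(X <= 4)
print(X.mean(), X.var()) # both equal lam * t
```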

50
Q

Long run

A

Long run relative frequencies = limiting probabilities = stationary probabilities.

We find a stationary distribution by solving πP = π.