CHAPTER 4: Inference for discrete time Markov chains Flashcards
A weather problem
We focus on the problem of constructing, fitting and checking a model for daily rainfall at a specific location. As a first step, given observed rainfall amounts, r_1, …, r_n say, on a sequence of consecutive days, we suppose that daily rainfalls are the realised values of a sequence of random variables R = (R_1, …, R_n).
[Figure: observed rainfall at Snoqualmie Falls, Januaries 1948–1983]
The fact that on some days there is no rain and on others a positive amount suggests that it might be useful to consider first the succession of dry days and wet days, and later model the amount of rain falling on a day conditionally on the day being wet. Let
X_i = 1 if R_i > 0, and X_i = 0 otherwise,
so that X_i is an indicator variable for day i being wet.
We seek a simple model for the X_i. It is possible that precipitation patterns vary seasonally, so initially we concentrate on modelling only within a limited period during which conditions can be expected to remain reasonably stable. For this reason we concentrate on the sequence of X_i during Januaries.
Independence model:
Snoqualmie Falls
- The simplest model supposes that the X_i are independent and identically distributed; that is, they form a Bernoulli sequence.
- i.e. whether a day is wet or dry is independent of whether the previous day was wet (W) or dry (D).
- So there is one parameter: p for the probability of a wet day, 1 − p for a dry day.
* If p = P(X_i = 1), then the likelihood of p based on observations x_1, …, x_n is
L(p) = p^{Σ_{i=1}^n x_i} (1 − p)^{n − Σ_{i=1}^n x_i}.
(i.e. the probability of a particular sequence is p^{number of wet days} (1 − p)^{number of dry days})
Over the Januaries 1948-1983 there were 791 wet days out of 1116 days altogether.
Thus Σ_{i=1}^{1116} x_i = 791,
and the log-likelihood is
l(p) = 791 log(p) + 325 log(1 − p).
[Figure: the log-likelihood l(p) plotted against p. The curve is skewed, with its peak at p ≈ 0.71 (the proportion of wet days); a horizontal line intersecting it at log-likelihood just above −675 gives a suitable likelihood region / approximate 95% confidence interval.]
The MLE
p̂ = Σ_{i=1}^n x_i / n = 791/1116 = 0.71
and an approximate 95% confidence interval for p obtained from the log-likelihood is (0.68, 0.74)
Independence model:
Snoqualmie Falls
NOTES
In fact in this case we can find a confidence interval directly, since under our assumptions Σ_{i=1}^n X_i has a Binomial Bin(1116, p) distribution, so that the estimated standard error of p̂ is √(p̂(1 − p̂)/n), and an approximate 95% confidence interval based on the Normal approximation is p̂ ± 2 × e.s.e.
(i.e. plus or minus two estimated standard errors)
Here the two intervals, likelihood-based and Binomial-based, are numerically the same to 2 dp.
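As a quick check, a minimal R sketch of the MLE and of both intervals (variable names here are our own):

  # Independence (Bernoulli) model: 791 wet days out of n = 1116.
  n_wet <- 791; n <- 1116
  p_hat <- n_wet / n                  # MLE, ~0.709

  # Binomial / Normal-approximation interval: p_hat +/- 2 e.s.e.
  ese <- sqrt(p_hat * (1 - p_hat) / n)
  p_hat + c(-2, 2) * ese              # ~ (0.68, 0.74)

  # Likelihood-based interval: p with l(p) within qchisq(0.95, 1)/2 ~ 1.92
  # of the maximum log-likelihood.
  l <- function(p) n_wet * log(p) + (n - n_wet) * log(1 - p)
  p_grid <- seq(0.6, 0.8, by = 1e-4)
  range(p_grid[l(p_grid) >= l(p_hat) - qchisq(0.95, 1) / 2])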
Independence model:
Snoqualmie Falls
Model adequacy: is the independence assumption adequate?
- We might suspect not, because large weather systems may take several days to pass a point, and whilst they are doing so the weather conditions at the point will tend to persist, undermining the independence assumption.
- For a rough check, consider the number of dry days followed by wet days. Under independence, the expected number of such pairs in the data is
E(# dry–wet pairs) = 36 × 30 × (1 − p)p,
since there are 36 Januaries in the data-set and 30 overlapping pairs of consecutive days in each. Similarly for the other combinations of wet and dry. To estimate these expected numbers we replace p by p̂.
* Given the table of the numbers of pairs of days (Table 1), we have more D–D and W–W pairs than expected:

Table 1: Observed (expected under independence) counts of overlapping pairs of days.

                         Today
                  Wet            Dry            Total
Yesterday  Wet    643 (542.6)    128 (223.0)     771
           Dry    123 (223.0)    186 ( 91.6)     309
           Total  766            314            1080

(Counts are actual; values in parentheses are estimates from the model.)
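A minimal R sketch reproducing the expected counts in Table 1 under independence (names are our own):

  p <- 791/1116                            # estimated P(wet day)
  n_pairs <- 36 * 30                       # 36 Januaries x 30 pairs = 1080
  probs <- c(wet = p, dry = 1 - p)
  round(n_pairs * outer(probs, probs), 1)  # e.g. ~542.6 wet-wet, ~91.6 dry-dry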
* We might consider carrying out a goodness of fit test. The Pearson X² statistic
X² = Σ (observed − expected)²/expected
is a natural choice of test statistic for such a test, since large values of X² will indicate discrepancy from our expectations under the independence assumption. For a standard 2×2 contingency table, the observed value of X² would be compared with a χ²_1 distribution, which is the approximate distribution of X² under the hypothesis of independence in a contingency table.
For such a table the underlying model is a Multinomial distribution with four categories, and the basis for use of this distribution is a sampling scheme in which each of the items whose frequencies are recorded in the table can fall independently into the four cells with prescribed probabilities.
Is that appropriate for Table 1?
In Table 1 the items counted in the four cells are pairs of days. From the way the data have been collected it is clear that consecutive pairs of days overlap; the second day of one is the first day of the next (within each January). Thus our null hypothesis of independence and identical distribution of days does not translate immediately into the standard Multinomial basis for an X² goodness of fit test. We need an extension of the standard theory to the case when items are collected in such a way that each is dependent on its predecessor.
(We don't have independence between overlapping pairs, so the Multinomial basis is not applicable.)
If instead we consider non-overlapping consecutive pairs, then the individual items being classified would be independent under our hypothesis and the χ² theory would apply. Frequencies from non-overlapping pairs are given in Table 2:
Table 2: Observed (expected under independence, using p̂) counts of non-overlapping pairs of days.

                         Today
                  Wet            Dry            Total
Yesterday  Wet    325 (271.3)     55 (111.5)     380
           Dry     66 (111.5)     94 ( 45.8)     160
           Total  391            149             540
A standard X² test applied to Table 2 (with expected values calculated from the marginal totals in the table rather than from our estimate p̂ = 0.709) gives X² = 108. For comparison, the 95% quantile of χ²_1 is 3.84. There is overwhelming evidence against the independence assumption even with this partial view of the data: wetness or dryness has a tendency to persist from one day to another to a greater extent than the independence theory would predict.
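For reference, a minimal R sketch of this test on Table 2; chisq.test computes the expected values from the marginal totals and, for a 2×2 table, applies Yates' continuity correction by default, which gives a value close to the 108 quoted above:

  tab <- matrix(c(325, 55,
                   66, 94),
                nrow = 2, byrow = TRUE,
                dimnames = list(yesterday = c("Wet", "Dry"),
                                today     = c("Wet", "Dry")))
  chisq.test(tab)   # X-squared ~ 108 on 1 df; the p-value is tiny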
This points to the need for the analysis of dependent variables: we need models that allow a variable's value today to affect its value tomorrow. A discrete time Markov chain is such a model.
Estimation of transition probabilities
for n states
- We now assume that we have observations of a process which we think might be well modelled by a Markov chain on a known state space S, but with unknown transition matrix P.
- We estimate P by using the likelihood:
Suppose we have observations x_0, x_1, …, x_n modelled as successive states of a Markov chain X_0, X_1, …, X_n with transition probabilities p_ij.
So the probability of observing a given sequence of states is given by the transition probabilities:
P(X_0 = x_0, …, X_n = x_n)
= P(X_n = x_n | X_{n−1} = x_{n−1}, …, X_0 = x_0) P(X_{n−1} = x_{n−1}, …, X_0 = x_0)
= P(X_n = x_n | X_{n−1} = x_{n−1}) P(X_{n−1} = x_{n−1} | X_{n−2} = x_{n−2}, …, X_0 = x_0) × P(X_{n−2} = x_{n−2}, …, X_0 = x_0)
= P(X_n = x_n | X_{n−1} = x_{n−1}) P(X_{n−1} = x_{n−1} | X_{n−2} = x_{n−2}) … × P(X_1 = x_1 | X_0 = x_0) × P(X_0 = x_0)
= P(X_0 = x_0) p_{x_0 x_1} p_{x_1 x_2} … p_{x_{n−1} x_n}
by the Markov property. This is the likelihood L for the parameters P = (p_ij) given the data.
Let n_ij = the number of transitions observed from state i to state j. So:
L = L(P) = P(X_0 = x_0) ∏_{i,j} p_ij^{n_ij}
log-likelihood for P:
l(P) = log P(X_0 = x_0) + Σ_{i,j} n_ij log(p_ij)
As usual, values of the p_ij giving large l are plausible in the light of the data.
Estimation of transition probabilities
L = L(P) = P(X_0 = x_0) ∏_{i,j} p_ij^{n_ij}
log-likelihood for P:
l(P) = log P(X_0 = x_0) + Σ_{i,j} n_ij log(p_ij)
The maximum likelihood estimates of the transition probabilities are obtained by maximizing l(P) subject to P being a transition matrix. This means that we will need to apply the constraint that the sum of the entries in each row is 1; we can do this either using Lagrange multipliers (see MAS211) or by, for example,
writing the last term in each row in terms of the others as we did in Example 9: if S = {1, 2, …, N} then p_{i,N} = 1 − (p_{i,1} + p_{i,2} + … + p_{i,N−1}).
In some cases we may assume a particular form for the transition matrix in which the constraint that the row sums are 1 is already built in (see the examples of parametrized matrices below).
- One issue: how should we treat the factor P(X_0 = x_0), which need not depend on the parameters p_ij? There are several options:
- Argue conditionally: define the (conditional) likelihood as P(X_n = x_n, … | X_0 = x_0); the argument above then gives a log-likelihood without the log P(X_0 = x_0) term.
- Assume the chain is in equilibrium, so that we can take P(X_0 = x_0) to be the probability of finding the chain in state x_0 under its stationary distribution.
- When n is large, l is likely to be dominated by its second term, so the precise treatment of the first term is unimportant and we simply ignore it.
Estimation of transition probabilities
ARGUING CONDITIONALLY
L = L(P) = P(X_0 = x_0) ∏_{i,j} p_ij^{n_ij}
log-likelihood for P:
l(P) = log P(X_0 = x_0) + Σ_{i,j} n_ij log(p_ij)
Suppose the state space S is finite. Then, arguing conditionally on X_0 = x_0, we wish to maximize
Σ_{i,j} n_ij log(p_ij)
subject to Σ_j p_ij = 1,
for i = 1, …, |S|.
Form the Lagrangian function
L(P, λ) = Σ_{i,j} n_ij log(p_ij) + Σ_i λ_i (Σ_j p_ij − 1).
Then
∂L/∂p_ij = n_ij/p_ij + λ_i,   i, j = 1, …, |S|.   (23)
Setting these derivatives to zero gives
p̂_ij = n_ij/(−λ_i),
and thus, using Σ_j p̂_ij = 1, we get λ_i = −n_i. and hence
p̂_ij = n_ij/n_i.   (24)
Here we use n_i. to mean Σ_j n_ij
= the total number of transitions out of state i.
Similarly, we will later use n_.j to mean Σ_i n_ij (the total number of transitions into state j), and n_.. to mean Σ_{i,j} n_ij
= the total number of transitions observed.
e.g. if we observe 25 transitions out of state i and 6 of them are to state j, the estimate for the transition probability is p̂_ij = 6/25.
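A minimal R sketch of (24) (the function name and the toy sequence below are our own):

  # Estimate a transition matrix from an observed state sequence x:
  # count the transitions n_ij, then divide each row by its total n_i.
  estimate_P <- function(x, states = sort(unique(x))) {
    n <- table(factor(x[-length(x)], levels = states),   # from-states
               factor(x[-1],         levels = states))   # to-states
    n / rowSums(n)                                       # p_ij-hat = n_ij / n_i.
  }
  estimate_P(c("D", "D", "W", "W", "W", "D", "W", "W"))  # toy data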
Example 13. Snoqualmie Falls
Given the transition count data, we fit a 2-state Markov chain model to the sequences of wet (W) and dry (D) days. Under this model the log-likelihood has the form
l(P) = log P(X_0 = x_0) + Σ_{i,j} n_ij log(p_ij)
for each January; for the full likelihood we must consider the relationship between the years. Transition probabilities are given for the month of January, and we assume the underlying behaviour is the same in each year: we can suppose the years are independent and that the transition probabilities {p_ij} are the same over the years (e.g. 1948, 1949, …).
As a first model we therefore:
1) assume independence from one year to another, and
2) assume stationarity over the years.
Let n_ij^(k) = the observed transition count from i to j in year k.
Then the likelihood of P based on the observed sequences of W/D days, conditionally on the first observation in January of each year, is
L(P) = ∏_k ∏_{i,j} p_ij^{n_ij^(k)},
giving log-likelihood
l(P) = Σ_k Σ_{i,j} n_ij^(k) log(p_ij)
= Σ_{i,j} n_ij log(p_ij),
where n_ij = Σ_k n_ij^(k).
The maximization used above therefore applies directly, and from Table 1 gives the estimated transition matrix shown in Table 3 below. The estimated probabilities are quite different from those based on the independence model, which would be (0.29, 0.71) in each row.
Examples: parametrized transition matrices (with the row-sum constraint built in)

2×2:
[ 1−α    α  ]
[  β    1−β ]

3×3, general (six free parameters):
[ α   β   1−α−β ]
[ γ   δ   1−γ−δ ]
[ ε   ζ   1−ε−ζ ]

3×3, one parameter p:
[  0     p    1−p ]
[ 1−p    0     p  ]
[  p    1−p    0  ]

General N×N:
[ p_11  p_12  …  p_1N ]
[ p_21  p_22  …  p_2N ]
[  …     …        …   ]
[ p_N1  p_N2  …  p_NN ]
Uncertainty in these transition probability estimates:
we consider the conditional likelihood function, and to allow for the constraint on the parameters in a given row we can reparametrize the transition matrix.
Table 3: Estimated transition matrix; Januaries 1948–1983 (MLEs p̂_ij = n_ij/n_i.)

                        Today
                  Dry               Wet
Yesterday  Dry    0.602 (186/309)   0.398 (123/309)
           Wet    0.166 (128/771)   0.834 (643/771)
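For instance, a minimal R sketch reproducing Table 3 from the Table 1 counts (names our own):

  n <- matrix(c(186, 123,
                128, 643), nrow = 2, byrow = TRUE,
              dimnames = list(yesterday = c("Dry", "Wet"),
                              today     = c("Dry", "Wet")))
  round(n / rowSums(n), 3)   # rows: (0.602, 0.398) and (0.166, 0.834)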
For the uncertainty we argue conditionally on the value observed on 1 January of each year.
* To allow for the constraint: reparametrize the transition matrix in terms of the first |S| − 1 probabilities in each row and look for a likelihood region;
e.g. for |S| = 2 this is easy.
Alternatively: use an asymptotic approximation.
Testing the model: so far we have considered only pairs of days, so we check the fit against features of the data we haven't used to fit the model.
Example 14. Snoqualmie Falls: persistence of spells
The Markov chain model in the example has been fitted by considering pairs of days. A natural question to ask is how well it represents features of the data other than those used in its fitting.
Lengths of wet/dry spells, January 1983 (1 = wet, 0 = dry):
1111111111 0 111111111 0 111111 0 1 00
There were 3 dry spells whose beginnings and ends fell within this month, each lasting one day (the final dry spell overlaps 31 January).
Over the 36 years, the average length of dry spells that did not overlap 1 January or 31 January is 2.21 days, and the standard deviation of the lengths is 1.64 days.
Using the Markov property we can calculate the expected duration of dry spells under the model and compare the observed values with the calculated ones, giving a rough check of the model:
- Let T denote the duration of a dry spell.
* To observe a dry spell there must be a wet day followed by a dry day, so we start numbering time from the first dry day following a wet day. Then, by the Markov property, the spell continues with probability p_00 on each subsequent day, so
P(T = t) = p_00^{t−1}(1 − p_00),   t = 1, 2, …,
i.e. T is Geometric with parameter 1 − p_00. With p̂_00 = 0.602 this gives
E(T) = 1/(1 − p_00) = 1/0.398 ≈ 2.51 days and sd(T) = √p_00/(1 − p_00) ≈ 1.95 days,
to be compared with the observed mean 2.21 days and standard deviation 1.64 days.
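A quick numerical check of these model values in R (names our own):

  p00 <- 186/309                  # estimated P(dry -> dry) from Table 3
  c(mean = 1/(1 - p00),           # E(T) for T ~ Geometric(1 - p00), ~2.51
    sd   = sqrt(p00)/(1 - p00))   # sd(T), ~1.95
  # compare with the observed mean 2.21 and sd 1.64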
TEST FOR A SPECIFIED P
Suppose the data are modelled by a Markov chain with unknown transition matrix P, and let P* be a specified transition matrix.
Null hypothesis:
H_0: P = P*
alternative H_1: P is unrestricted
Use the log-likelihood ratio test statistic
W = −2(l(P*) − l(P̂)),
i.e. 2 × (maximum log-likelihood − log-likelihood at the assumed null hypothesis value).
Note: large values of W discredit H_0.
If the hypothesis is true, the distribution of W is approximately χ² with d − s degrees of freedom, where
d = the number of entries in the transition matrix that are not fixed at zero (e.g. s² if all entries are free), and
s = the size of the state space, i.e. the number of rows in the transition matrix (since each row sums to 1, only all but one element of each row are free).
** Unless we are assuming a particular form for the transition matrix, d here will be the number of entries in the matrix, i.e. s², so the number of degrees of freedom will be s² − s.
The test statistic W is given by
W = −2(l(P*) − l(P̂)) = 2 Σ_{i,j} n_ij log[n_ij/(n_i. p*_ij)],
where p*_ij is the (i, j) entry of P*. In words, W is 2 × the sum over all pairs of states of
(# transitions from i to j) × log[(# transitions from i to j)/((# transitions out of i) × p*_ij)].
This follows because the unrestricted MLE of p_ij is p̂_ij = n_ij/n_i. (# transitions from i to j divided by the total # out of i), so that l(P̂) − l(P*) = Σ_{i,j} n_ij log(p̂_ij/p*_ij).
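A minimal R sketch of W for a given matrix of transition counts and a specified P* (the function name is our own; 0 log 0 is treated as 0, as noted in the example below):

  W_stat <- function(n, Pstar) {
    e <- rowSums(n) * Pstar                    # expected counts n_i. * p*_ij
    terms <- ifelse(n > 0, n * log(n / e), 0)  # convention: 0 log 0 = 0
    2 * sum(terms)
  }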
Example: Pseudo-random numbers
Sequence of random digits produced by a computer’s random number generator: for example
5,3,1,8,7,9,0,6,4,5,…
Are they independent, or is there a tendency for certain digits to follow others?
Given data for 200 random integers between 1 and 10, generated in R:
Under H_0 (successive digits independent, each digit equally likely to come next), every entry of the specified transition matrix P* is 1/10.
We count the numbers of transitions between all possible pairs of states in the given data set.
The MLE for each entry of the unrestricted transition matrix is found as before, e.g.
p̂_{1,1} = (# observed 1→1 transitions)/(sum of all entries in the top row of the count table) = 1/15 for these data.
(Find the MLE of the transition matrix as the number of transitions from i to j divided by the total number of transitions out of i.)
We test
H_0: the 10×10 transition matrix has every entry equal to 1/10,
against H_1: the transition matrix does not have all entries equal to 1/10.
With these data we obtain W = 98.308.
The degrees of freedom are s² − s = 100 entries − 10 determined by the row-sum constraints = 90, so under H_0 the expected value of W would be about 90.
1 - pchisq(98.308, 90) in R gives 0.25.
So there is no evidence against H_0.
* Some entries of the MLE transition matrix were 0; treat 0 log 0 as 0.
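A sketch of the whole test in R on a freshly generated sequence (W_stat is the sketch above; the particular sequence behind W = 98.308 is not reproduced here, so the value of W will differ):

  set.seed(1)                            # hypothetical data; any seed will do
  x <- sample(1:10, 200, replace = TRUE)
  n <- unclass(table(factor(head(x, -1), levels = 1:10),
                     factor(tail(x, -1), levels = 1:10)))  # 10x10 counts n_ij
  W <- W_stat(n, matrix(1/10, 10, 10))
  1 - pchisq(W, df = 10^2 - 10)          # p-value on 90 degrees of freedom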
- In the example above, if we were given the total number, n_i., of digits i seen in the data then, under H_0, we would expect to see n_i. p*_ij = n_i./10 transitions i → j. The observed number is n_ij, so W is a comparison of observed and expected frequencies.
Writing n_i. p*_ij = e_ij, we can express W as
W = 2 Σ_{i,j} n_ij log(n_ij/e_ij) = −2 Σ_{i,j} n_ij log(1 − (n_ij − e_ij)/n_ij).
This gives numerical values close to those from the Pearson X² statistic for comparing observed and expected frequencies,
X² = Σ_{i,j} (n_ij − e_ij)²/e_ij.
In fact, expanding the log term, log(n_ij/e_ij) = log(1 + (n_ij − e_ij)/e_ij) ≈ (n_ij − e_ij)/e_ij − (n_ij − e_ij)²/(2e_ij²), and summing (using Σ n_ij = Σ e_ij) shows that W ≈ X², so the two statistics are asymptotically equivalent in large samples.
The example above uses a W test to check for independence against a fully specified P*, but it can be used in the same way to test any other simple hypothesis about the transition probabilities P.
Goodness of fit: testing independence
COMPOSITE HYPOTHESIS
Earlier, we wished to test independence against the alternative of Markov dependence in a model for the Snoqualmie Wet/Dry data.
(we wanted a Pearson X² test, but the pairs of days overlapped)
We now have a general approach.
The difference from the last example is that H_0, the hypothesis of independence, is now composite: it does not specify the value of P completely, only that the rows of P are identical.
The null hypothesis is: p_ij = q_j for each i, where {q_j} is a probability distribution on the state space S;
i.e. each day is wet or dry without reference to the previous day;
i.e. the rows of P are identical, so each entry depends only on its column.
The alternative hypothesis is that P is an unrestricted transition matrix, e.g. for two states
P = [ α   1−α ]
    [ β   1−β ]
Under H_0, P is determined by q = s − 1 parameters.
Without restriction, P is determined by s(s − 1) parameters.
TEST STATISTIC for testing composite hypotheses:
W = −2(l(P̃) − l(P̂))
(we replace P*, the specific H_0 value, by P̃, the restricted MLE assuming H_0 is true).
Under H_0, W has approximately a χ² distribution on (d − s) − q = s(s − 1) − (s − 1) = (s − 1)² degrees of freedom (the difference between the numbers of free parameters under the two hypotheses).
RESTRICTED MLEs p̃_ij (under H_0):
the log-likelihood is
l = Σ_{i,j} n_ij log(p_ij) = Σ_{i,j} n_ij log(q_j) = Σ_j n_.j log(q_j),
so that p̃_ij = q̂_j = n_.j/n_..
TEST STATISTIC:
W = 2[ Σ_{i,j} n_ij log(n_ij/n_i.) − Σ_j n_.j log(n_.j/n_..) ]
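A minimal R sketch applying this to the overlapping-pair counts from Table 1 (names our own; with (s − 1)² = 1 degree of freedom the 95% cutoff is qchisq(0.95, 1) = 3.84):

  n  <- matrix(c(643, 128,
                 123, 186), nrow = 2, byrow = TRUE)  # Table 1 counts
  ni <- rowSums(n); nj <- colSums(n); nn <- sum(n)   # n_i., n_.j, n_..
  W  <- 2 * (sum(n * log(n / ni)) - sum(nj * log(nj / nn)))
  W                      # far above qchisq(0.95, 1) = 3.84
  1 - pchisq(W, df = 1)  # effectively zero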