CHAPTER 6: Modelling sets of points Flashcards
EXAMPLE 36
Floods in Burbage Brook
days and numbers of floods plotted
In the plot, ‘flood’ is taken to be a flow over 4 cumecs. Initial interest might be in modelling the times of floods as a point process. Times and magnitudes together can be regarded as a point process in two dimensions.
A one-dimensional line with points scattered along it; the intervals between the points are of interest:
–xx—x–x–x–x–x-xx—
The number of points occurring in a time interval has a Poisson distribution with parameter λ × (length of interval), where λ is a constant.
The points are moments in time, i.e. points in 1D space.
Example 37. Insurance claims
Figure 12 shows major fire insurance claims in Denmark from 1980 to 1990, from Embrechts, Kluppelberg & Mikosch 1997.
points thin out higher up, time of claim on x axis and size (log scale) on y
Example 38. Japanese pine saplings
random looking points on 2D
Figure 13 shows the locations of saplings of Japanese black pines, collected by Numata (1961). Models for spatial patterns like this are of interest; questions could include whether there is any evidence for clustering or repulsion.
Example 39. East Yorkshire leukaemia cases
Figure 14 gives the locations of cases of leukaemia in children in East Yorkshire from 1974 to 1986, and locations of a second set of children without leukaemia but otherwise matched to the cases. Model the two patterns. Is there evidence for differences?
Different clusters; different population densities in different areas.
Fitting a Poisson process model to data:
Given a point process over interval [0,t], estimate parameter λ
We estimate λ:
2 possibilities:
1. from the observed number of events N(t) = n;
2. by the methods developed for continuous time Markov chains in section 5.5.
Fitting a Poisson process model to data:
1) Fitting using N(t) ∼ Po(λt): the log-likelihood, having observed N(t) = n, is
l = −λt + n log(λt) + constant
= −λt + n log(λ) + n log(t) + constant = −λt + n log(λ) + constant
(n log(t) does not involve λ, so it is absorbed into the constant)
So
∂l/∂λ = −t + n/λ
so that
λ^ = n/t and, since −∂²l/∂λ² = n/λ², the
estimated standard error is
ese(λ^) = λ^/√n.
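As a rough illustration of approach 1, the two formulas can be wrapped in a small function. This is only a sketch: the function name fit_poisson_rate and the numbers n = 10, t = 5 are made up for illustration and do not come from the notes.

```python
import math

def fit_poisson_rate(n, t):
    """Return (lambda_hat, ese) for n events observed over [0, t]."""
    if n <= 0 or t <= 0:
        raise ValueError("need n > 0 events and t > 0")
    lam_hat = n / t                # MLE: solves -t + n/lambda = 0
    ese = lam_hat / math.sqrt(n)   # from the observed information n / lambda^2
    return lam_hat, ese

# Illustrative numbers only (not data from the notes)
lam_hat, ese = fit_poisson_rate(n=10, t=5.0)
print(lam_hat, ese)
```

Note that the estimated standard error shrinks like 1/√n relative to the estimate, so the relative precision depends only on the number of events seen.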
Fitting a Poisson process model to data:
2) Fitting by the methods developed for continuous time Markov chains in section 5.5.
The durations in states
i = 0, …, n are
a_0 = T_1, …, a_{n−1} = T_n,
(the inter-occurrence times; a_0 is the time spent in state 0, until the 1st event)
a_n = t − Σ_{i=1}^{n} T_i.
(the time spent so far in the last state: an incomplete holding time, which always arises when working with the MLE for a continuous time Markov chain)
Also, the transition rates in the chain are simply
g_{i,i+1} = λ = −g_{ii} = g_i
so that the log-likelihood is
l = −Σ_i g_i a_i + Σ_{i≠j} n_{ij} log g_{ij}
= −λ Σ_i a_i + log λ Σ_i n_{i,i+1}
= −λt + n log λ
(the n_{ij} are 0 or 1)
(Σ_i a_i is the total time observed, t)
(Σ_i n_{i,i+1} is the total number of points in [0,t], n)
This is the same as the log-likelihood from approach 1 (up to a constant), so inferences from the 2 approaches are the same.
The values of the individual T_i do not actually make any difference.
We only need the total number of events.
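The agreement between the two approaches can be checked numerically. The sketch below uses hypothetical function names and made-up inter-occurrence times; it evaluates the Markov chain form of the log-likelihood from the holding times and compares it with the form based on the count alone.

```python
import math

def loglik_from_count(lam, n, t):
    # Approach 1, dropping the terms that do not involve lambda
    return -lam * t + n * math.log(lam)

def loglik_from_durations(lam, inter_times, t):
    # Approach 2: holding times a_0..a_{n-1} are the inter-occurrence times,
    # and a_n is the incomplete holding time in the last state.
    n = len(inter_times)
    a_last = t - sum(inter_times)
    if a_last < 0:
        raise ValueError("inter-occurrence times exceed the observation period")
    durations = list(inter_times) + [a_last]
    return -lam * sum(durations) + n * math.log(lam)

# Two made-up records with the same n = 3 and t = 10 but different T_i
print(loglik_from_durations(0.4, [1.0, 2.0, 3.0], 10.0))
print(loglik_from_durations(0.4, [0.5, 0.5, 6.0], 10.0))
print(loglik_from_count(0.4, 3, 10.0))
```

All three printed values should agree up to floating-point rounding, since only n and t enter the log-likelihood.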
Possibilities for checks on model adequacy:
Any means of checking the properties of a Poisson process can be used. For example, the interval [0,t] could be divided into a number of equal sub-intervals, the number of points noted in each, and the resulting data used in a χ² goodness-of-fit test for a Poisson distribution. Similarly, a check could be based on the distribution of the intervals between points, which should be exponential.
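A minimal sketch of the sub-interval check follows, assuming the counts per sub-interval are already available. The function name, the pooling of large counts into a single '≥ max_cat' category, and the example counts are all assumptions made for illustration. The code returns the Pearson statistic and its degrees of freedom, to be compared with χ² tables; it does not compute a p-value.

```python
import math

def poisson_gof_statistic(counts, max_cat):
    """Pearson chi-square statistic for a Poisson fit to sub-interval counts.

    Categories are 0, 1, ..., max_cat - 1 plus a pooled '>= max_cat' category.
    """
    k = len(counts)
    m = sum(counts) / k                  # fitted Poisson mean per sub-interval
    if m <= 0:
        raise ValueError("need at least one point")
    probs = [math.exp(-m) * m ** j / math.factorial(j) for j in range(max_cat)]
    probs.append(1.0 - sum(probs))       # pooled upper tail
    observed = [sum(1 for c in counts if c == j) for j in range(max_cat)]
    observed.append(sum(1 for c in counts if c >= max_cat))
    expected = [k * p for p in probs]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(probs) - 2                  # categories - 1 - one fitted parameter
    return stat, df, observed, expected

# Made-up counts in 10 equal sub-intervals (not data from the notes)
stat, df, obs, exp = poisson_gof_statistic([0, 1, 1, 2, 0, 1, 3, 1, 0, 1], max_cat=3)
print(stat, df)
```

In practice the categories would be chosen so that the expected frequencies are not too small, which is why the upper tail is pooled here.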
EXAMPLE 40:
Burbage Brook floods
λ^
estimated standard error
Between 1925 and 1983 (inclusive) there were 48 flood events (flows ≥ 4 cumecs) in Burbage Brook.
Since the observation period is t = 59 years, from λ^ = n/t and the estimated standard error ese(λ^) = λ^/√n, the maximum likelihood estimate of the rate of occurrence is
λ^ = 48/59 = 0.81 events/year, with ese = 0.81/√48 = 0.12 events/year.
EXAMPLE 40:
Burbage Brook floods
Is there evidence that the rate is changing?
λ may not be constant, e.g. floods might be more likely in winter.
If we don’t have a constant rate, λ varies with time.
INHOMOGENEOUS POISSON PROCESS of rate λ(t)
The rate λ governs the probability that a point occurs within a short interval.
Here λ is not constant: it varies with time, so it can differ between intervals of the same length.
We allow λ = λ(t) to depend on time, assuming as before that the corresponding counting process N(t) satisfies
P(N(t + δt) = i + 1 | N(t) = i) = λ(t)δt + o(δt)
(the probability of a transition from i to i + 1 now depends on t)
and
P(N(t + δt) = i | N(t) = i) = 1 − λ(t)δt + o(δt).
(the probability of no transition is 1 minus the previous probability, up to o(δt))
INHOMOGENEOUS POISSON PROCESS of rate λ(t):
TIME CHANGE
Slow time down where there are lots of occurrences (events happen fast) and speed time up where there are not many occurrences,
i.e. there are busy times and slow times.
In an inhomogeneous Poisson process we allow the value of λ to depend on time.
Let N(t) be an inhomogeneous Poisson process of intensity λ(t), and define Λ(t) = ∫_0^t λ(u) du.
Suppose we change the time-scale and define a new counting process
M(s) = N(t) where s = Λ(t).
Let t(s) = Λ^−1(s) be the inverse transformation of the time scale, taking the new time s back to t.
…[obtain the transition probabilities of M(s) over (s, s + δs), using the fact that a small change s → s + δs on the s-scale corresponds to a change δt ≈ δs/λ(t) on the t-scale]…
Thus M(s) is a homogeneous Poisson process with intensity 1.
Note that a small change s → s + δs on the s-scale corresponds to a change δt = t(s + δs) − t(s) = (dt/ds) δs + o(δs) = (1/λ(t)) δs + o(δs) ≈ δs/λ(t), so that P(M(s + δs) = i + 1 | M(s) = i) = λ(t)δt + o(δt) = δs + o(δs).
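The time change can also be run in reverse to simulate an inhomogeneous process: generate a unit-rate Poisson process on the s-scale and map each point back with t(s) = Λ^−1(s). The sketch below is one way to do this, not a method given in the notes. The function name and the example rate λ(u) = 2u (so Λ(t) = t² and Λ^−1(s) = √s) are assumptions chosen because the inverse is available in closed form.

```python
import math
import random

def simulate_inhomogeneous(Lambda, Lambda_inv, t, rng):
    """Simulate the points of an inhomogeneous Poisson process on [0, t].

    Lambda is the integrated rate and Lambda_inv its inverse (both assumed given).
    """
    s_end = Lambda(t)
    points, s = [], 0.0
    while True:
        s += rng.expovariate(1.0)      # next point of the unit-rate process M
        if s > s_end:
            break
        points.append(Lambda_inv(s))   # map the new time s back to original time
    return points

# Assumed example: lambda(u) = 2u, so Lambda(t) = t^2 and Lambda_inv(s) = sqrt(s)
rng = random.Random(1)
pts = simulate_inhomogeneous(lambda t: t * t, math.sqrt, 3.0, rng)
print(len(pts), pts[:3])
```

With this rate the expected number of points in [0, 3] is Λ(3) = 9, and the points should become denser towards the end of the interval, where λ is larger.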
EXAMPLE 41
Figure 17 shows a histogram of the dates of the Burbage flood events, and Figure 18 shows the
corresponding QQ plot based on the uniform distribution. These plots might raise some doubts
about constancy of rate.
lambda is low when
lambda is high when
Slow time down in busy periods
and speed it up in slow periods;
the process then turns into a standard Poisson process.
TRANSFERRING PROPERTIES TO INHOMOGENEOUS POISSON PROCESSES
Proposition 1.
N(t) has a Poisson distribution with mean Λ(t) = ∫_0^t λ(u) du.
This is because N(t) = M(s) ∼ Po(s) = Po(Λ(t)).
TRANSFERRING PROPERTIES TO INHOMOGENEOUS POISSON PROCESSES
Proposition 2.
The numbers of points in disjoint intervals I_1, …, I_k are independent and Poisson distributed with means ∫_{I_i} λ(u) du, i = 1, …, k.
Reason: independence follows from translation of the corresponding property of the basic Poisson process, and the distributions follow as in Proposition 1 above.
TRANSFERRING PROPERTIES TO INHOMOGENEOUS POISSON PROCESSES
Proposition 3.
Given that the total number of points in [0, t] is N(t) = n, the positions of the points are independently distributed with pdf λ(v)/Λ(t), 0 ≤ v ≤ t.
(For a standard process, the positions are independent and uniformly distributed on the interval.)
Reason: conditional independence and identical distribution of positions in the N process follow from the same properties of those in the M process, given that the total number of points in [0, t] is N(t) = n.
Let V denote the position of a point in the N process. Then the corresponding position for the M process is s(V), where s(v) = Λ(v), and in the M process positions are uniformly distributed over [0, s(t)].
Thus the distribution function of V is
P(V ≤ v) = P(s(V) ≤ s(v)) = s(v)/s(t) = Λ(v)/Λ(t)
and so the pdf is
dP(V ≤ v)/dv = λ(v)/Λ(t), 0 ≤ v ≤ t.
TRANSFERRING PROPERTIES TO INHOMOGENEOUS POISSON PROCESSES
Fitting to data
Finding the likelihood for a particular rate function:
the last property enables us to write down the likelihood for λ(·) based on observations over [0, t].
Suppose that we have observed N(t) = n points and that their positions are v1, . . . , vn.
Then
L = P(N(t) = n) × P(positions of the points | N(t) = n)
= e^{−Λ(t)} [Λ(t)^n/n!] × ∏_{i=1}^{n} [λ(v_i)/Λ(t)]
= (e^{−Λ(t)}/n!) ∏_{i=1}^{n} λ(v_i)
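Taking logs gives log L = −Λ(t) + Σ_i log λ(v_i) − log n!, which is straightforward to evaluate for any candidate rate function. The sketch below assumes the caller supplies the rate function and the value Λ(t); the function name and the example values are hypothetical.

```python
import math

def inhom_loglik(rate, Lambda_t, points):
    """log L = -Lambda(t) + sum_i log rate(v_i) - log n!

    rate: the candidate rate function lambda(.)
    Lambda_t: its integral over [0, t], supplied by the caller
    points: observed positions v_1, ..., v_n
    """
    n = len(points)
    log_rates = sum(math.log(rate(v)) for v in points)
    return -Lambda_t + log_rates - math.lgamma(n + 1)   # lgamma(n + 1) = log n!

# Sanity check with a constant rate lambda = 2 on [0, 5] and made-up positions
value = inhom_loglik(lambda v: 2.0, Lambda_t=10.0, points=[0.7, 2.1, 4.4])
print(value)
```

For a constant rate this reduces to −λt + n log λ − log n!, matching the homogeneous log-likelihood up to a constant, so the positions only matter once λ varies with time.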