CHAPTER 6:Modelling sets of points Flashcards
EXAMPLE 36
Floods in Burbage Brook
days and numbers of floods plotted
In the plot ‘flood’ is taken to be flow over 4 cumecs. 4 Initial interest might be in modelling the times of floods as a point process. Times and magnitudes together can be regarded as a point process in two dimensions.
marked on 1D the intervals between
one dimensional line with points scattered
–xx—x–x–x–x–x-xx—
the number of them occurring in the time interval has a poisson dist with parameter lambda = constant *length of interval
points at moments in time ie 1d space
Example 37. Insurance claims
Figure 12 shows major fire insurance claims in Denmark from 1980 to 1990, from Embrechts, Kluppelberg & Mikosch 1997.
points thin out higher up, time of claim on x axis and size (log scale) on y
Example 38. Japanese pine saplings
random looking points on 2D
Figure 13 shows the locations of saplings of Japanese black pines, collected by Numata (1961). Models for spatial patterns like this are of interest; questions could include whether there is any evidence for clustering or repulsion.
Example 39. East Yorkshire leukaemia cases
Figure 14 gives the locations of cases of leukaemia in children in East Yorkshire from 1974 to 1986, and locations of a second set of children without leukaemia but otherwise matched to the cases. Model the two patterns. Is there evidence for differences?
different clusters . different population densities for areas
Fitting a Poisson process model to data:
Given a point process over interval [0,t], estimate parameter λ
We estimate λ:
2 possibilities:
1. from the observed number of events N(t) = n;
- by the methods developed for continuous time Markov chains in section 5.5.
Fitting a Poisson process model to data:
1)Fitting using N(t) ∼ Po(λt), the log-likelihood having observed N(t) = n
l = −λt + nlog(λt) + constant
= −λt + nlog(λ) + nlog(t) + constant = −λt + nlog(λ) + constant
So
∂l/∂λ = −t + n/λ
so that
λ^=n/t and
estimated standard error
ese(λ^)= λ^/√n.
Fitting a Poisson process model to data:
- by the methods developed for continuous time Markov chains in section 5.5.
The durations in states
i = 0,…,n are
a_0 = T1,…,an−1 = Tn,
(inter occurrence times, a_0 is time spent in state 0, until 1st event)
a_n = t−Σ_{1,n} T_i.
(so far In last state, incomplete holding time-always when working with MLE for cont MC)
Also, the transition rates in the chain are simply
g_{i,i+1} = λ = −g_{ii} = gi
so that the log-likelihood is
l = −Σgiai +Σ_ i≠j [nij loggij]
= −λΣ_iai + logλ Σ_i n_{I,i+1}
= −λt + nlogλ
(n_ijs are 0 or 1 )
(Σ_iai is total time observed)
(Σ_i n_{I,i+1} total number of points in [0,t]
This is the same as the log-likelihood from approach 1, inferences from the 2 approaches are the same.
the values of individual Ti, doesn’t actually make any difference
We only need total numbers of events
Possibilities for checks on model adequacy:
Any means of checking the properties of a Poisson process can be used. For example, the interval [0,t] could be divided into a number of equal sub-intervals, the number of points noted in each, and the resulting data used in a χ2 goodness-of-fit test for a Poisson distribution. Similarly a check could be based on the distribution of the intervals between points, which should be exponential.
EXAMPLE 40:
Burbage Brook floods
λ^
estimated standard error
Between 1925 and 1983 (inclusive) there were 48 flood events (flows ≥ 4 cumecs) in Burbage Brook.
Since the observation period is t = 59 years, from
λ^=n/t and
estimated standard error
ese(λ^)= λ^/√n.
the maximum likelihood estimator of the rate of occurrence is
λ^ = 48/59 = 0.81, with ese = 0.12 events/year.
EXAMPLE 40:
Burbage Brook floods
is there evidence that our estimated rate is changing
λ is not constant, ie floods might be more likely in winter
don’t have constant rate, λ varies with time
INHOMOGENOUS POISSON PROCESS of rate λ (t)
The rate λ governs the probability that a point occurs within short interval:
λ is not constant and varies with time, varies for intervals of the same length
We allow λ = λ (t) to depend on time, assuming as before that the corresponding counting process N(t) satisfies
P(N( t +𝛿t) = i + 1| N(t)=i) = λ(t)𝛿t +o(𝛿t)
(probability of transition i to i+1 now dep on t)
and
P(N(t + δt) = i|N(t) = i) = 1−λ(t)δt + o(δt),
(probability of not having transition is 1- prev prob)
INHOMOGENOUS POISSON PROCESS of rate λ (t):
TIME CHANGE
slow down where loads of ocurences (events happen fast) and speed up time when not many occurrences
ie busy times and slow times
we allow value lambda to depend on time in inhomogen poisson process
Let N(t) be an inhomogeneous Poisson process of intensity λ(t), and define Λ(t) =∫ _{0,t} λ(u)du
Suppose we change the time-scale and define a new counting process
M(s) = N(t) where s = Λ(t).
Let t(s) = Λ^−1(s) be the inverse transformation of the time scale, taking the new time s back to t.
…[obtaining probabilities of M(s+δs) transition probabilities assmall change s → s + δs on the s-scale corresponds to a change δt = t(s + δs)−t(s) = δs dt ds + o(δs) = (1/λ(t)) δs + o(δs) ≈ δs/λ(t)
]…
Thus M(s) is a homogeneous Poisson process with intensity 1
Note that a small change s → s + δs on the s-scale corresponds to a change δt = t(s + δs)−t(s) = δs dt ds + o(δs) = 1 λ(t) δs + o(δs) ≈ δs λ(t)
EXAMPLE 41
Figure 17 shows a histogram of the dates of the Burbage flood events, and Figure 18 shows the
corresponding QQ plot based on the uniform distribution. These plots might raise some doubts
about constancy of rate.
lambda is low when
lambda is high when
slow down time in busy periods
faster when slow periods
turns into a standard poisson process
TRANSFERRING PROPERTIES TO INHOMOGENEOUS POISSON PROCESSES
Proposition 1.
N(t) has a poisson distribution with mean Λ(t) = integral_(0,t) [λ(u)du]
This is because N(t) = M(s) ∼ Po(s) = Po(Λ(t)).
TRANSFERRING PROPERTIES TO INHOMOGENEOUS POISSON PROCESSES
Proposition 2.
The numbers of points in disjoint intervals I1, . . . , Ik are independent and
Poisson distributed with means integral_(I_i) [λ(u)du]
, i = 1, . . . , k.
Reason: independence follows from translation of the corresponding property of the basic Poisson process, and the distributions follow as in Proposition 1 above.
TRANSFERRING PROPERTIES TO INHOMOGENEOUS POISSON PROCESSES
Proposition 3.
Given that the total number of points in [0, t] is N(t) = n, the positions of the points are independently distributed with pdf λ(v)/Λ(t), 0 ≤ v ≤ t.
For a standard process: indep and uniformly distributed on the interval
Reason: conditional independence and identical distribution of positions in the N process follows from the same properties of those in the M process.
total number of points in [0,t] is N(t)=n
Let V denote the position of a point in the N process. Then the corresponding position for the M process is s(V ) and in the M process positions are uniformly distributed over [0, s(t)].
Thus the distribution function of V is
P(V ≤ v) = P(s(V ) ≤ s(v)) = s(v)
s(t)
=Λ(v)/Λ(t)
and so the pdf is
dP(V ≤ v)/dv = λ(v)/Λ(t)
TRANSFERRING PROPERTIES TO INHOMOGENEOUS POISSON PROCESSES
Fitting to data
finding likelihood for particular rate function
the last property enables us to write down likelihood for λ(·) based on observations over [0,t].
Suppose that we have observed N(t) = n points and that their positions are v1, . . . , vn.
Then
L=P(N(t) = n) × P(positions of the points |N(t) = n)
= e^{−Λ(t)}[ Λ(t)^n/n!]×∏{1,n}[λ(v_i)/Λ(t)]
= e^{−Λ(t)} ∏{1,n} [λ(vi)/n!]
TRANSFERRING PROPERTIES TO INHOMOGENEOUS POISSON PROCESSES
LOG LIKELIHOOD
The log-likelihood is
l = −Λ(t) +∑_{1,n} [log λ(vi)] + const.
If λ(t) is a constant λ, then Λ(t) reduces to Λ(t) = λt and this
becomes becomes
l = −λt + n log λ + const
as before.
TRANSFERRING PROPERTIES TO INHOMOGENEOUS POISSON PROCESSES
NOTES
In most applications of inhomogeneous Poisson models the function λ(·) is specified in terms of a finite number of parameters.
Their maximum likelihood estimates are then the values maximizing l.
(in terms of those parameters)
The asymptotic properties of likelihood inference carry over to such inhomogeneous Poisson processes under reasonable conditions.
Approximate standard errors may then be found from the information matrix and tests may be based on twice the difference of log-likelihoods.
Example 42: Freezes of Lake Constance 1300–1974
Figure 19 shows the years between 1300AD and 1974AD when major freezes of Lake Constance occurred, a major freeze being defined as one in which the upper lake, which is 7–14km wide, could be crossed by vehicle or foot.
the histogram and uniform QQ plot for the dates of freezes in Figures 20, 21 do not
appear to support a Poisson process model with a constant intensity. Consider therefore a non-homogeneous Poisson process model. irregular invents in time
*given a concern about climate change are the number of freezes changing over time we allow for rate to change over time, The plots do not appear to support a Poisson process model with a constant intensity.
*To represent a changing rate in a simple form, we assume
λ(t) = α + βt
where α and β are constants, and t, for convenience in this problem, measures years post 1300 in units of 100 years.
*The cumulative intensity function Λ(t) is
Λ(t) = ∫ _{0,t} [λ(u)du]
= αt + (1/2)βt^2
*log-likelihood is then
l = −αt_o −(1/2)βt^2o + ∑{1,n} log(α + βv_i)
where t_o denotes the length of the observation period, 6.75 centuries, n denotes the number of events, and the v_i denote the times of occurrence of the events.
Note that this log likelihood only makes sense for values of α and β such that α + βvi > 0 for
all freeze dates vi
Example 42: Freezes of Lake Constance 1300–1974
FIND THE MAXIMUM LIKELIHOOD ESTIMATORS
To find the maximum likelihood estimators α^ and β^, differentiate:
∂l/∂α = −t_o + ∑_{1,n} [ 1/(α^ + β^v_i)]
∂l/∂β = −(1/2)t^2o + ∑{1,n} [v_i/(α^ + β^v_i]
Hence we solve
Hence we need to solve
−to + ∑_{1,n} [1/(α^+ β^v_i)] = 0
−(1/2)t^2o + ∑{1,n} (v_i/(α^ + β^v_i)) = 0.
Multiplying first by α^ and seconds by β^ and adding them together shows that α^t_o + (1/2)β^t^2_o = n,
which allows elimination of one variable. However, there is still no explicit solution, so numerical maximization is required. It shows that
a^ = 7.015 (1.76) β^ = −0.81 (0.38)
*where the values in brackets are estimated standard errors obtained by inversion of the observed information matrix
[Var(α^) Cov(α^, β^)]
[Cov(α^, β^) Var(β^) ]
= J^−1
where
J = −
[∂^2l/∂α^2 ∂^2l/∂α∂β]
[∂^2l/∂β∂α ∂^2l/∂β^2]
We could calculate J by differentiating 1st and 2nd and substituting α^ and β^. However, if the numerical maximization of l is done in R, numerical differentiation is available as a by-product.
Example 42: Freezes of Lake Constance 1300–1974
RATIO TEST?
The estimated rate λb(t) = 7.0 − 0.81t is decreasing, in agreement with the indications from Figures . To assess the objective strength of evidence for a decreasing rate we can
carry out a test of the hypothesis H0 : β = 0, against the alternative H1 : that there is no restriction on the value of β. The generalized likelihood ratio test gives a general method for constructing such tests; see section
Example 42: Freezes of Lake Constance 1300–1974
Generalized Likelihood Ratio Test
no change over time*
The test statistic is
W = −2(l( α~, β˜) − l(α^,β^)),
where l(α~,β~) is the maximum log-likelihood under H1, and l(α^, β^) the maximum under the restriction imposed by H0. (that beta is 0)
Under H0, the distribution of W is approximately χ^2 with
degrees of freedom = no.parameters under H1 − no.parameters under H0,
and under H1 it tends to be larger.
Thus large values of W discredit H0 and the p-value corresponding to an observed value of W may be found from the χ^2 distribution.
Example 42: Freezes of Lake Constance 1300–1974
Extra calculations
Adding in the years to 2018 (with no more freezes) changes the estimates to
α^ = 7.437 (1.74)
β^ = −0.95 (0.34)
Slightly smaller standard errors; estimates give a slightly steeper line.
Example 43. Lake Constance revisited
Example 42: Freezes of Lake Constance 1300–1974
Generalized Likelihood Ratio Test
For the Lake Constance data the maximum likelihood estimators α^ and β^ above give the values of α and β that maximize l under H1, so we only need to find the maximizing values under H0, the restricted maximum likelihood estimators α˜ and β˜. When H0 is true, λ(t) = α so the Poisson process is time-homogeneous. From section 6.2 therefore the maximum likelihood estimator of α is
α˜ =n/t_o=29/6.75
= 4.3 events/century
(and necessarily β˜ = 0).
Substitution into l (71) and use of (74) now gives
w = −2(13.275 − 15.249) = 3.948.
By comparison with χ^2_1 the p-value is slightly less than 0.05.
Thus there is some evidence of
a change in the rate of occurrence of freezing events – which from the sign of β^ must be a reduction – but the strength of the evidence is not overwhelming.
Thus there is some evidence of a change in the rate of occurrence of freezing events, which from the sign of β^ must be a reduction. But the strength of the evidence is not overwhelming.
Notes
brief
- Other forms for λ(t) may be preferable. For example λ(t) = exp(α + βt) specifies a rate that could be increasing or decreasing (according to the sign of β) but can never give a negative value, unlike the linear λ(t) used in the example. If there is a possibility of periodicity in the rate of occurrence of points (as for floods, for example, which may be more likely in the winter), then a λ(t) incorporating sines and cosines
might be useful. - To check adequacy of a non-homogeneous model we could transform the time-scale from t to s by the s = Λ(t) transformation. On the new scale the non-homogeneous process becomes homogeneous, and therefore all the checks for that case (section 6.2) become available.
Spatial Poisson processes
points randomly scattered in space
one dimension:
Given an intensity function λ(·) ≥ 0, a general Poisson process on the line has the following properties:
*for any interval I, the number of points N(I) of the process in I
has a Poisson distribution with mean integral_I λ(u) du
- for disjoint intervals I1, I2, . . . , Ik , the random variables N(I1),N(I2), . . . ,N(Ik ) are independent.
- N is a counting process: the number of points it counts in an interval is the sum of the numbers in subintervals.
Spatial Poisson processes
points randomly scattered in space
higher dimensions
A Poisson process in the plane or in space (or indeed in d dimensions for any positive integer d) is defined as a counting process with these same properties, where the properties are merely re-phrased to make sense in higher dimensions.
DEF 17
higher dimensions
a spatial Poisson process (on the plane), or a planar Poisson process, with intensity λ(·)
Suppose that λ(·) is a real-valued non-negative function on R2 and that for each set B in the plane, N(B) is a random variable taking non-negative integer values (interpreted as the number of points of the process in B). If
- N(B) has a Poisson distribution with mean ∫ _B λ(u) du;
- when B1, B2, . . . , Bk are disjoint, the random variables N(B1), N(B2), . . . , N(Bk) are independent;
• N has the additive property N(∪{i=1,k} Bi) = ∑{i=1,k} N(B_i)
for disjoint Bi;
then N is called a spatial Poisson process (on the plane), or a planar Poisson process, with intensity λ(·).
DEF 18
homogeneous spatial poisson process
A homogeneous spatial Poisson process is the special case of Definition 17 when λ(·) is constant.
NOTATION
Notation: To save writing let us define, for sets B in the space of the points (R^d, d =2, 3, . . .),
Λ(B) = ∫ _B λ(u) du
PROPOSITION 4
probability density function of R
homogeneous planar Poisson process
Let R denote the distance from the origin to the nearest point in a homogeneous planar Poisson process with intensity λ. Then the probability density function of R is h_R(r) = 2λπre^{−λπr^2}, r > 0, (75)
(the density of a Rayleigh distribution).
Reason: The number of points in a circle of radius r centred at the origin has a Poisson distribution with mean λπr^2 . If there are no points in this circle then R > r, and conversely. Thus P(R > r) = exp(−λπr2 ) and (75) follows by differentiation.
PROPOSITION 5
THINNING
taking points at random and deleting them
Thinning of a Poisson process refers to the random deletion of some of the points. A simple form of thinning is to remove or retain each point independently with fixed probabilities 1 − p and p, say. If the original process has intensity function λ(·), then the point process resulting from such independent thinning is a Poisson process with intensity pλ(·).
Reason: Let N denote the original process and N* the thinned process. Independence and additivity of N* follow immediately from independence and additivity of N and the fact that thinning is carried out independently. Thus the only thing to show is that, for each set B, N*(B) has a Poisson distribution with mean pΛ(B). For each r ≥ 0,
P(N*(B) = r) = X∞ k=r P(N ∗ (B) = r | N(B) = k)P(N(B) = k) = X∞ k=r
k
r
p r (1 − p) k−r e −Λ(B)Λ k (B) k! since, given k points, the number of points retained has a Binomial Bi(k, p) distribution = e −Λ(B) (pΛ(B))r r! X∞ k=r {(1 − p) Λ(B)} k−r (k − r)! = e −Λ(B) (pΛ(B))r r! e (1−p) Λ(B) = e −p Λ(B) (pΛ(B))r r! .
PROPOSITION 6
Conditional property
Conditional property (cf section 2.1.3 and section 6.3.1).
Given the total number of points of a spatial Poisson process in a region B, the positions V of the points are independently distributed over B with probability density function
f_V (v) = λ(v)/Λ(B)
v ∈ B.
SIMULATE
POISSON PROCESS
if we know the intensity function, the conditional distribution property gives a way to simulate a Poisson process over any region.
First generate a number n from the Poisson distribution Po(Λ(B)), then simulate n independent values from the density fV .
With this ability we could implement a simulation test for a Poisson process using the comparison of distributions method suggested by the first contact distribution.
The approach is feasible even for processes on sets B with irregular shapes.
LIKELIHOOD
The conditional distribution property also gives the likelihood for λ based on an observed pattern of points.
If n points are observed in a region B and their positions are
vi
, i = 1, . . . , n, then the likelihood function is
L = e^{−Λ(B})Λ^n(B)/n! ×
∏{1,n} λ(v_i)/Λ(B)
=
(1/n!) e^{−Λ(B)} ∏{1,n} λ(vi),
and the log-likelihood
l = −Λ(B) + sum from i=1 to n of log λ(vi) + constant.
PARAMETRIZTION
The λ function is often specified in terms of a small number of parameters.
In that case fitting of the model by maximization of l and subsequent inference go ahead along the same lines as before.
MARKED POISSON PROCESSES
original poisson process, at each point of process we have a random variable Y_i which are the marks
Several examples can be thought of as a sequence of time points at each of which another variable is observed.
A marked Poisson process is a simple model for this.
Given a Poisson process N – on the line, plane or in higher dimensions – with intensity λ(·), associate with each point Xi of the process a random variable Yi, called the mark at Xi
Then the new process {N, Y1, . . . } is called a marked Poisson
process.
consists of the mark values and also the poisson process points
MARKED POISSON PROCESSES
DEPENDENCE
For some modelling problems it might be appropriate to take the Yi to be independent and identically distributed.
In others there may be interest in possible dependence between marks at different points, and in possible changes in the distribution of marks with position of the point.
Example: Insurance risk
The arrival of claims at an insurance company and the sizes of the claims might be modelled as a marked Poisson process.
An initial assumption, to be checked, might be that marks (claim amounts) are independent and identically distributed.
The difference between premium income and claim payouts is the key to financial viability of the company.
If the ith claim is made at time Xi and is of size Yi, and premiums
bring income at a steady rate ρ net of running costs, then the assets A(t) of the insurance company at time t are
A(t) = A(0) + ρt −∑_{1,N(t)} Y_i
where N(t) is the number of claims up to time t.
The probability that A(t) remains positive for a long time, and of
how large the reserves A(0) need to be to make this probability large, are of great interest.
COMPOUND POISSON PROCESS
Sums of the form
∑_{1,N(t)} Y_i
for a Poisson process N with marks Yi arise in many contexts.
They are called compound Poisson processes.
EXAMPLE: Earthquakes
A sequence of earthquakes could be modelled by attaching marks representing earthquake magnitude to the times of occurrence.
Questions about dependence between magnitudes close in time are highly relevant to predictability and the possibility of warning systems.
The same question arises too about the times themselves and motivates further development of the Poisson models we have considered in this course.
Example: Floods
Floods could be modelled by a marked Poisson process, the mark for a flood occurrence being the magnitude of the flood.
Marks could include more information too, becoming
multi-dimensional.
If further data were available, for example about weather conditions at the times of floods, or environmental conditions such as dryness/wetness of the ground in the period before the flood, then it
too could be modelled as part of a multi-dimensional mark
Example: Rainfall
A point process model for rainfall attempts to mimic the occurrence and heaviness of rain at a place in terms of the passage of rain cells over the place.
The arrivals of rain cells are modelled by a Poisson process and the time a rain cell takes to pass over the place and the intensity of the rain it brings are attached as random marks.
Marks in this case are two-dimensional.
Example: Burbage floods
An initial model for the times and severities of the Burbage Brook flood events is based on a marked point process.
Dates of floods are assumed to come from a Poisson process, and the excess flood flows over 4 cumecs are modelled as conditionally independent marks with exponential distributions whose means 1/µ(t) may depend on time.
scattering of Y_i and X_i
2d
if indep and indep of locations of the points then the marks give a second dimension to the poisson process
If the original Poisson process has intensity λ(x) and the mark
probability density is k(y) then the intensity of the two-dimensional Poisson process (Xi , Yi) is µ(x, y) = λ(x) k(y).