Point Pattern Analysis Flashcards
Methods of PPA
- Quadrat Analysis & Poisson Distribution
- Nearest Neighbour
- Ripley’s K-function
- Spatial Autocorrelations (LISA)
PPA
- Only concerned with location
- Indication of underlying spatial process
- Exploratory
- May lead to spatial regression and geostatistical analysis
- Ex. Crime data (burglaries), Biology (bird nests), Epidemiology (disease incidence)
PPA is only concerned with what?
- Location
- establish pattern of occurrence and evaluate possible causes
- Don’t need the attribute value until the stats stage
John Snow
- Not GoT
- Modern epidemiology and spatial pattern analysis
- Cholera in London, deaths at address and pattern
- Linked deaths to contaminated wells, not air transmission
What is point pattern analysis needed for?
Objective, quantitative measures of spatial pattern because visual interpretation is not enough
- Ex. Crime analysts cannot necessarily pick out true clusters of crime just by looking at a map
What is the Null hypothesis?
- No pattern present
- Random
What is the Alternative hypothesis?
- Pattern present
- Some stats can tell if clustered or dispersed
What are the must haves for PPA?
- Proper coordinates (Location, not centroid or polygon, or areal units, or ‘representative over area’)
- Proper projection (preferably preserve distances)
- Study area ‘objectively determined’ (recall edge effects, MAUP)
What can projections distort/preserve?
Angles, Area, Shape, or Distance
Shape, examples for Physical Geography and Human Geography
- Physical: Woodlands rectangles
- Human: Administrative boundary polygons
Study Area: Why is it better to have geometrically regular shapes?
- Easier calibration and model fit
What are some possible subsequent analyses for PPA?
- Trend Surface
- Issue -> Edge effects
Why must distance be preserved in projections?
- Result can be skewed if distance not correct
Same area, different phenomena
Attribute
Different area, same phenomenon
Location
PPA Exploration prereqs?
- Scatter plot
- Visual (location, binary data)
- Outliers (measurement error)
Poisson Distribution
- Compare observations to Poisson (observed vs. expected if random)
- Probability of an event happening rarely, if at all, and if it does occur, time and place of occurrence are independent and random
- No spatial or temporal autocorrelation
Poisson eqn
Mu = sigma^2
CSR
Complete Spatial Randomness
- Poisson density function
Poisson Density Function
p(x) = (lambda^x * e^(-lambda)) / x!
- Lambda = density = mean number of occurrences per unit (time or area)
- x! (factorial) = number of permutations of x items
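A minimal sketch of the density function above, assuming lambda is the mean count per quadrat (the function name and the example value lambda = 2 are illustrative only):

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """Probability of exactly x events when the mean count per unit is lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

# Hypothetical example: on average 2 points per quadrat.
# Expected share of quadrats holding 0, 1, 2, 3 or 4 points under CSR.
expected_shares = [poisson_pmf(x, 2.0) for x in range(5)]
```

Multiplying each share by the number of quadrats gives the expected frequencies to compare against the observed ones.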
Spatial Poisson Distribution ‘Poisson Process’
- No interactions btwn subareas, whether inhibitory or attraction
- No possibility of multiple groupings of individuals w/in each subarea (no point clusters)
- No tendency for neighbouring areas to display similar traits
Chi-squared test, X^2
- Accepts or rejects null hypothesis
- = VMR (m - 1) or Sum of (observed - expected)^2/expected
- where m is number of quadrats
- Use table to get critical chi value
- Compare the test statistic with the critical value at the significance level used
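A rough sketch showing that the two forms of the chi-squared statistic agree, assuming the expected count in every quadrat is the mean count and the variance uses m - 1 in the denominator (the function name and counts are made up for illustration):

```python
def quadrat_chi_square(counts):
    """Return (VMR, chi-square via VMR, chi-square via summation) for quadrat counts."""
    m = len(counts)                      # number of quadrats
    mean = sum(counts) / m               # expected count per quadrat
    variance = sum((c - mean) ** 2 for c in counts) / (m - 1)
    vmr = variance / mean
    chi2_from_vmr = vmr * (m - 1)
    chi2_from_sum = sum((c - mean) ** 2 / mean for c in counts)
    return vmr, chi2_from_vmr, chi2_from_sum

# Hypothetical counts for 9 quadrats
vmr, chi2_a, chi2_b = quadrat_chi_square([0, 1, 3, 0, 2, 5, 1, 0, 4])
```

The critical value for m - 1 degrees of freedom still has to come from a chi-squared table (or a statistics library).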
What is a possible problem with too small quadrats?
- Leave blank (empty) areas or create clustering
- Size Matters!
PPA analysis /correlation of x, y data
- Locational, z is not important at this point
- More about spatial relationships on data distribution
- Look at z for more robust analysis as to why that spatial pattern may exist
Steps of Quadrat Analysis
- The scoring of points that fall into a quadrat
- Divide study area into quadrats
- Count points in each quadrat
- Sum the results
- Compare with CSR (Poisson)
- Test against chi-square
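The divide-and-count steps above can be sketched as follows, assuming a rectangular study area split into an nx by ny grid of equal quadrats and points that all lie inside it (the function and parameter names are hypothetical):

```python
def quadrat_counts(points, xmin, ymin, xmax, ymax, nx, ny):
    """Count the points falling in each cell of an nx-by-ny grid (row-major order)."""
    counts = [0] * (nx * ny)
    width = (xmax - xmin) / nx
    height = (ymax - ymin) / ny
    for x, y in points:
        # Points exactly on the upper boundary are placed in the last cell.
        col = min(int((x - xmin) / width), nx - 1)
        row = min(int((y - ymin) / height), ny - 1)
        counts[row * nx + col] += 1
    return counts

# Hypothetical points on a unit square divided into 2 x 2 quadrats
counts = quadrat_counts([(0.1, 0.1), (0.2, 0.2), (0.9, 0.9), (1.0, 1.0)],
                        0.0, 0.0, 1.0, 1.0, 2, 2)
```

The resulting list of counts is what gets compared with the Poisson expectation and tested with chi-square.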
Optimal quadrat size
= (2 x Area)/n
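A quick worked example of the 2A/n rule, assuming square quadrats so that the side length is the square root of the quadrat area (the 100 km^2 area and 50 points are invented numbers):

```python
import math

def optimal_quadrat_size(area, n):
    """Return (quadrat area, side of a square quadrat) using quadrat area = 2A/n."""
    quadrat_area = 2 * area / n
    return quadrat_area, math.sqrt(quadrat_area)

# Hypothetical study area of 100 km^2 containing 50 points
q_area, q_side = optimal_quadrat_size(100.0, 50)
```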
Quadrat analysis: Property
Mean = Variance
VMR
Variance Mean Ratio
= Variance / mean cell frequency
- Mean cell frequency = n (number of points/obs) divided by m (number of cells/quadrats)
VAR
- Number of points per cell
- Variance of the frequency
QA: Index of dispersion
VAR = [(Sum of fi xi^2) - ((Sum of fi xi)^2 / m)] / (m - 1)
- fi is frequency of cells with i number of points
- xi is the number of points per cell
- m is number of quadrats
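A small sketch of the variance computed from a frequency table, taking VAR = [Sum(fi xi^2) - (Sum(fi xi))^2 / m] / (m - 1); the dictionary layout and the example frequencies are assumptions for illustration:

```python
def variance_from_frequencies(freq):
    """freq maps xi (points per cell) to fi (number of cells with that many points)."""
    m = sum(freq.values())                          # total number of quadrats
    sum_fx = sum(f * x for x, f in freq.items())    # equals n, the number of points
    sum_fx2 = sum(f * x * x for x, f in freq.items())
    return (sum_fx2 - sum_fx ** 2 / m) / (m - 1)

# Hypothetical table: 3 empty cells, 2 cells with 1 point, 1 cell each with 2, 3, 4, 5 points
var = variance_from_frequencies({0: 3, 1: 2, 2: 1, 3: 1, 4: 1, 5: 1})
```

Dividing this variance by the mean cell frequency (n / m) gives the VMR.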
QA, Null hypothesis
VMR = 1
- Point pattern is random
QA, VMR does not = 1
Point pattern is not random
QA, Alternative hypothesis
- VMR > 1, point pattern is more clustered than random
- VMR < 1, Point pattern is more dispersed than random
NNA
- Nearest Neighbour Analysis
- Measurement of distance btwn one point and its neighbours
- Next step from PPA
- Mean of observed distances is compared to an expected avg distance based on random poisson distribution
- Null is random, alt is more or less dispersed than random
- Test statistic to see if result is significant
NNA eqn
Mean NND = Sum of Nearest Neighbour Distances (NND) / number of obs (points)
- NND is the distance from a point to its nearest neighbouring point
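A brute-force sketch of the mean nearest neighbour distance, assuming planar (projected) coordinates and at least two points; the function name is hypothetical and no edge correction is applied:

```python
import math

def mean_nearest_neighbour_distance(points):
    """Average, over all points, of the distance to the closest other point."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        nearest = min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        total += nearest
    return total / len(points)

# Hypothetical three-point pattern
mean_nnd = mean_nearest_neighbour_distance([(0, 0), (1, 0), (0, 3)])
```

This checks every pair of points, which is fine for small data sets; large ones would normally use a spatial index.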
NND = 0
Perfect clustering
- The closer to 0 NND is, the more clustered it becomes
NNDr = 1/(2 x sq. root of density)
NNDr, expected NND for a random pattern
NND = 1/sq. root of density
Regular square lattice
NNDd= 1.07453/sq. root of density
Regular hexagonal lattice
NNA, adjustment for edge effects
- Expected NND = 1/2 x sq. root of (A/n) + (0.0514 + 0.041/sq. root of n) x p/n
- and Sigma^2 = 0.070 x (A/n^2) + 0.035 x p x (sq. root of A)/n^(5/2)
- where A is study area, n is number of points, p is perimeter of the study area
NNA, null hypothesis
Ho: NND = NNDr
- Point pattern is random
NNA, Alternative hypotheses
- NND does not = NNDr, point pattern is not random
- NND > NNDr, point pattern is more dispersed
- NND < NNDr, point pattern is more clustered
NNA Test statistic
Zn = (NND - NNDr)/ sigma of NND
- Sigma of NND = 0.26136/sq. root of (n x density)
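A minimal sketch of the test statistic, assuming density = n / area, an expected random NND of 1/(2 x sq. root of density), and no edge correction (the function and argument names are illustrative):

```python
import math

def nna_z_score(mean_nnd, n, area):
    """Z statistic comparing the observed mean NND with the value expected under CSR."""
    density = n / area
    expected = 1 / (2 * math.sqrt(density))          # NNDr
    std_error = 0.26136 / math.sqrt(n * density)     # sigma of NND
    return (mean_nnd - expected) / std_error

# Hypothetical pattern: 100 points in 100 area units, observed mean NND of 0.45
z = nna_z_score(0.45, 100, 100.0)
```

A negative z points toward clustering and a positive z toward dispersion; significance is judged against the standard normal distribution.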
NNA, standard nearest neighbour index
R = NND/NNDr
- R = 2.149, perfect dispersion
- R = 1.5, more dispersed than random
- R = 1, random
- R = 0.5, more clustered than random
- R = 0, perfectly clustered
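A short sketch of the index, assuming NNDr = 1/(2 x sq. root of density) with density = n / area; it also shows where the 2.149 upper limit comes from, since a hexagonal lattice has NND = 1.07453/sq. root of density (names are hypothetical):

```python
import math

def nearest_neighbour_index(mean_nnd, n, area):
    """R = observed mean NND divided by the NND expected for a random pattern."""
    density = n / area
    expected_random = 1 / (2 * math.sqrt(density))
    return mean_nnd / expected_random

# At density 1: a square lattice has NND = 1, a hexagonal lattice NND = 1.07453
r_square = nearest_neighbour_index(1.0, 1, 1.0)
r_hexagonal = nearest_neighbour_index(1.07453, 1, 1.0)
```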
Ripley’s K-Function
- One step further than NNA, shows at which distances (scales) clustering or dispersion occurs
- For Distance, d
- Avg number of events found in circle of radius d around event, divided by mean intensity of the process
- Mean intensity is number of events divided by study area
Ripley’s K-Function Formula
- K (d) = …complex: (Area / number of points^2) x sum of the number of events in C(si, d)
- C (si, d) is a circle of radius d centered at si
- Dividing by lambda = n/A is the same as multiplying by A/n, so lambda drops out of the equation
- Uses search radius bands for number of points that fall in each ring
- Averages number of points w/in distance
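A naive sketch of the estimate for a single distance d, assuming K(d) = (A / n^2) x (number of ordered point pairs no more than d apart) with no edge correction; this is a simplified illustration, not what ArcGIS computes internally:

```python
import math

def ripley_k(points, area, d):
    """Naive Ripley's K estimate at distance d, without edge correction."""
    n = len(points)
    pairs_within = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            # Count every other event inside the circle of radius d around event i
            if i != j and math.hypot(xi - xj, yi - yj) <= d:
                pairs_within += 1
    return area / n ** 2 * pairs_within

# Hypothetical pattern: four corners of a unit square inside a study area of 4 units
corners = [(0, 0), (1, 0), (0, 1), (1, 1)]
k_values = [ripley_k(corners, 4.0, d) for d in (0.5, 1.0, 1.5)]
```

Under CSR the expected value is roughly pi x d^2, so observed values above that suggest clustering at distance d and values below it suggest dispersion.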
What happens if K-Function distance is too small?
Clustering
K-Function results
- ArcGIS outputs a graph, distance vs. K(d)
- Expected looks like straight line
- Clustered when observed greater than expected
- Dispersed when observed less than expected
- Cluster size and separation distance (flat line on graph?)
What do you do after analyzing point patterns?
- Test for spatial autocorrelation
- (After Quadrat, NNA, and K-function)