Lecture 8 - Spatial Regression Flashcards
Spatial considerations in the analysis
- often spatial relationships are ignored
- weakens the ability to create meaningful inferences about studied processes
- spatial regression models include relationships between variables and their neighbours’ values
- include as explanatory variables: values of error terms, x or y values in surrounding regions
- allows us to examine the impact one observation has on nearby observations
Using regression without spatial considerations
- estimated regression coefficients are biased/inconsistent
- r2 statistic is exaggerated
- incorrect inferences
- model isn’t accurate
OLS diagnostics to find spatial autocorrelation
If there is spatial autocorrelation (SA) in the data, a regression assumption is violated:
- OLS regression is a global linear regression model
- residuals should be independent of each other
- run a test with residuals
A. if diagnostics indicate no SA or other violations of regression assumptions, OLS model fine
B. if diagnostics indicate SA present, need to consider ways to measure & incorporate spatial structure
How to find out if spatial weights are needed for regression?
- include spatial weights in the regression (a table of spatial relationships that records which polygons are neighbours)
- execute diagnostic tests (Moran’s I)
- strength of autocorrelation in residuals
- if Moran’s I is close to 0 = no spatial autocorrelation in the residuals, so spatial regression is not needed
- determine whether to proceed with spatial regression - Lagrange Multiplier test (LM)
- which spatial regression model?
- spatial error or spatial lag
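A minimal sketch of the Moran’s I diagnostic on regression residuals, assuming simple binary weights (1 for neighbours, 0 otherwise). The function name morans_i, the neighbour table and the residual values are all made up for illustration; GeoDa computes this for you and may standardise the weights differently.

```python
def morans_i(values, neighbours):
    """Global Moran's I with binary weights.

    values     -- list of residuals, one per area
    neighbours -- dict mapping area index -> list of neighbouring indices
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Sum of all weights (each neighbour link counts as 1)
    s0 = sum(len(nbrs) for nbrs in neighbours.values())
    # Cross-products of deviations over all neighbouring pairs
    cross = sum(dev[i] * dev[j] for i, nbrs in neighbours.items() for j in nbrs)
    denom = sum(d * d for d in dev)
    return (n / s0) * (cross / denom)


# Hypothetical example: five areas in a row, each touching the next one
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

clustered = [-2.0, -1.0, 0.0, 1.0, 2.0]    # similar residuals sit together
alternating = [1.0, -1.0, 1.0, -1.0, 1.0]  # neighbours have opposite signs

i_clustered = morans_i(clustered, chain)
i_alternating = morans_i(alternating, chain)
print(i_clustered, i_alternating)
```

Clustered residuals give a clearly positive value and alternating residuals a negative one; a value near 0 would suggest no spatial pattern in the residuals.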
GeoDa tools for regression
- OLS
- Spatial lag regression
- Spatial error regression
ArcGIS tools for regression
- OLS
- Geographically weighted regression (GWR)
Lagrange Multiplier Test
- tells us which model to choose
- is there spatial autocorrelation in the residuals? –> use spatial error model
- is there spatial autocorrelation in the dependent variable? –> use spatial lag model
- are both LM tests significant? –> compare the robust LM tests
Spatial Error
- examines spatial autocorrelation between residuals of adjacent areas
- spatial effects incorporated via an error term
- error will be similar for adjacent areas because they are likely related
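For reference, the spatial error model is often written as below, with the spatial lag model shown for contrast. The symbols are assumptions in a common textbook notation, not taken from the lecture: W is the spatial weights matrix, lambda and rho are the spatial coefficients, and epsilon is the remaining independent error.

```latex
% Spatial error model: spatial dependence enters through the error term
y = X\beta + u, \qquad u = \lambda W u + \varepsilon

% Spatial lag model (for contrast): neighbours' y values act as an explanatory variable
y = \rho W y + X\beta + \varepsilon
```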
Spatial regression in GeoDa
- get data to include areas with attributes
- create a spatial weights table (topology); choose rook or queen contiguity and the contiguity order
- select regression and choose the dependent and independent variables
- run classic regression model
- check r2, adjusted r2 and the Akaike information criterion (AIC), then look at the variables and their p-values and at the multicollinearity condition number
- compare Moran’s I and Lagrange for spatial error and spatial lag models to find the best one
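The rook/queen choice in the weights step can be sketched on a regular grid: rook neighbours share an edge, queen neighbours share an edge or a corner. This is only an illustration of first-order contiguity; the function grid_neighbours and the 3 x 3 grid are hypothetical, and GeoDa builds the table from real polygon geometry.

```python
def grid_neighbours(rows, cols, rule="rook"):
    """Return {cell: [neighbouring cells]} for a regular grid of polygons."""
    # Rook: shared edge only. Queen: shared edge or shared corner.
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if rule == "queen":
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    table = {}
    for r in range(rows):
        for c in range(cols):
            table[(r, c)] = [
                (r + dr, c + dc)
                for dr, dc in steps
                if 0 <= r + dr < rows and 0 <= c + dc < cols
            ]
    return table


rook = grid_neighbours(3, 3, "rook")
queen = grid_neighbours(3, 3, "queen")
print(len(rook[(1, 1)]), len(queen[(1, 1)]))  # centre cell: 4 vs 8 neighbours
```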
r2
> 0.8 is good: the model explains more than 80% of the variation in the dependent variable
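A small sketch of how r2 and adjusted r2 are computed, using made-up observed and predicted values. The function names are illustrative; k is assumed to be the number of explanatory variables, not counting the intercept.

```python
def r_squared(observed, predicted):
    """Share of the variation in the observed values explained by the model."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n, k):
    """Penalise r2 for the number of explanatory variables k (n observations)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical data: five areas, one explanatory variable
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]

r2 = r_squared(observed, predicted)
adj_r2 = adjusted_r_squared(r2, n=5, k=1)
print(round(r2, 3), round(adj_r2, 3))
```

Adjusted r2 is never higher than r2 when k is at least 1, which is why it is the fairer number when comparing models with different numbers of variables.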
Akaike information criterion (AIC)
used to check the quality of a model for making predictions
lower is better; the number estimates relative prediction error (amount of information lost), so it is only useful for comparing models fitted to the same data
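As a sketch, AIC can be computed from a model's log-likelihood and its number of estimated parameters. The log-likelihood values and parameter counts below are invented for illustration and do not come from the lecture.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical comparison of two models fitted to the same data
aic_ols = aic(log_likelihood=-100.0, n_params=3)
aic_spatial = aic(log_likelihood=-95.0, n_params=4)

# The model with the lower AIC is preferred
best = "spatial" if aic_spatial < aic_ols else "OLS"
print(aic_ols, aic_spatial, best)
```

The extra parameter costs the second model 2 points, but its better fit more than makes up for it, so it has the lower AIC.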
p-values of variables
p-value is the probability of seeing a relationship at least this strong by chance if the variable had no real effect; a low value means the relationship is unlikely to be random; need < 0.05
Interpreting Moran’s I values
< 0.01: very close to 0, random
0.05: slight correlation but close to random
> 0.1: there is spatial autocorrelation
Interpreting Lagrange Multiplier values (p-values)
LM-lag 0.0000035
LM-error 0.00000001
both close to 0 = both significant, so check the robust LM (RLM) values for both
RLM-lag 0.7354991
RLM-error 0.0111464
only RLM-error is significant (< 0.05), so use the spatial error model
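The decision logic on this card can be sketched as a small function applied to the p-values above. The function name choose_model and the 0.05 threshold are illustrative. The tie-break used when both (or neither) robust tests are significant is an assumption, not something stated in the lecture.

```python
def choose_model(lm_lag, lm_error, rlm_lag, rlm_error, alpha=0.05):
    """Pick a model from LM test p-values (smaller p = more significant)."""
    lag_sig = lm_lag < alpha
    error_sig = lm_error < alpha
    if not lag_sig and not error_sig:
        return "OLS"            # no spatial dependence detected
    if lag_sig and not error_sig:
        return "spatial lag"
    if error_sig and not lag_sig:
        return "spatial error"
    # Both LM tests significant: fall back on the robust versions
    rlag_sig = rlm_lag < alpha
    rerror_sig = rlm_error < alpha
    if rerror_sig and not rlag_sig:
        return "spatial error"
    if rlag_sig and not rerror_sig:
        return "spatial lag"
    # Assumption: if the robust tests do not separate the models,
    # prefer the one with the smaller robust p-value
    return "spatial lag" if rlm_lag < rlm_error else "spatial error"


choice = choose_model(
    lm_lag=0.0000035, lm_error=0.00000001,
    rlm_lag=0.7354991, rlm_error=0.0111464,
)
print(choice)  # spatial error
```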
Spatial Regression in ArcMap
- run OLS
- plot residuals and look for over and under-predictions
- test with Moran’s I
- then run geographically weighted regression
- check r2, AIC, p-values, etc.
- plot residuals again and test with Moran’s I
- can also plot each variable that explains the phenomenon
- produce final map with predictions