GEOG364 Final Flashcards
runs count
a one-dimensional autocorrelation measure
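A minimal sketch of the idea, assuming a sequence of categories laid out along a line. The function name `count_runs` and the sample sequences are illustrative, not course material. Few runs suggest clustering; many runs suggest alternation.

```python
def count_runs(sequence):
    """Count runs: maximal stretches of identical adjacent values."""
    if not sequence:
        return 0
    runs = 1
    for previous, current in zip(sequence, sequence[1:]):
        # A new run starts whenever the value changes.
        if current != previous:
            runs += 1
    return runs


clustered = list("AAAABBBB")    # like values sit together
alternating = list("ABABABAB")  # like values avoid each other

print(count_runs(clustered))    # 2
print(count_runs(alternating))  # 8
```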
joins count
a two-dimensional autocorrelation measure
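A hedged sketch of a joins count on a small grid of black (B) and white (W) cells. It assumes rook adjacency (cells share an edge), which is only one possible definition of a join; the function name `join_counts` and both grids are made up for illustration.

```python
def join_counts(grid):
    """Count BB, WW and BW joins between edge-sharing cells of a B/W grid."""
    counts = {"BB": 0, "WW": 0, "BW": 0}
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            # Look only right and down so each join is counted once.
            for dr, dc in ((0, 1), (1, 0)):
                nr, nc = r + dr, c + dc
                if nr < rows and nc < cols:
                    a, b = grid[r][c], grid[nr][nc]
                    counts[a + b if a == b else "BW"] += 1
    return counts


checkerboard = ["BWB", "WBW", "BWB"]  # unlike neighbors everywhere
blocks = ["BBB", "BBB", "WWW"]        # like neighbors grouped together

print(join_counts(checkerboard))  # {'BB': 0, 'WW': 0, 'BW': 12}
print(join_counts(blocks))        # {'BB': 7, 'WW': 2, 'BW': 3}
```

Many BW joins relative to BB and WW joins point toward dispersion; few BW joins point toward clustering.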
spatial autocorrelation generally explained
the correlation of a variable to itself through space
similarity in position vs similarity in attributes
free sampling and example
each outcome is independent of previous results, as in sampling with replacement
example: flipping a coin
non-free sampling and example
when the outcome is affected by previous results, as in sampling without replacement
example: drawing cards from a deck without putting them back. each card taken changes the probability of the next draw
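The card example can be worked through with exact fractions. This is a small sketch; the function name `p_second_ace` and the choice of aces are assumptions made for illustration.

```python
from fractions import Fraction


def p_second_ace(first_was_ace, replace):
    """Probability that the second card drawn is an ace."""
    aces, cards = 4, 52
    if not replace:
        # Non-free sampling: the first card stays out of the deck.
        cards -= 1
        if first_was_ace:
            aces -= 1
    return Fraction(aces, cards)


# Free sampling: the first draw has no effect on the second.
print(p_second_ace(True, replace=True))    # 1/13
print(p_second_ace(False, replace=True))   # 1/13

# Non-free sampling: the probability depends on the first draw.
print(p_second_ace(True, replace=False))   # 1/17
print(p_second_ace(False, replace=False))  # 4/51
```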
4 factors that can dramatically influence spatial autocorrelation results
a sample size smaller than 30
one category of values occurs in less than 20% of the data
the region is elongated and has few joins
there are a couple of features with many joins and some with very few
name a limitation of the joins count
it does not work for numeric data
numbers can be reclassed as “high/low,” but this throws away much information
what are the two alternatives to the joins count for measuring spatial autocorrelation
moran’s i
geary’s c
in general, what do moran’s i and geary’s c measure?
they compare the differences between neighboring values with the differences among values across the entire study area
in moran’s i or geary’s c what does it mean if the difference between neighboring features is less than between all other features
it would mean that the neighboring features could be considered clustered
which spatial autocorrelation measure uses squared differences between adjacent cases
geary’s c
which spatial autocorrelation measure uses a covariance term
moran’s i
name two similarities between the geary’s c and moran’s i formulas
they both divide by total “w” to account for the number of pairs of cases
they both divide by a variance term in order to account for range of data
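Both statistics can be sketched from their standard textbook formulas. The code below is an illustration under assumptions: binary adjacency weights for features arranged in a row, and made-up values. The names `moran_geary` and `line_weights` are hypothetical helpers, not part of any library.

```python
def moran_geary(values, weights):
    """Return (Moran's I, Geary's C) for values and a square weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    ss = sum(d * d for d in dev)                # variance term (both formulas)
    w_total = sum(sum(row) for row in weights)  # total "w" (both formulas)
    cross = 0.0    # covariance term used by Moran's I
    sq_diff = 0.0  # squared neighbor differences used by Geary's C
    for i in range(n):
        for j in range(n):
            w = weights[i][j]
            cross += w * dev[i] * dev[j]
            sq_diff += w * (values[i] - values[j]) ** 2
    morans_i = n * cross / (w_total * ss)
    gearys_c = (n - 1) * sq_diff / (2 * w_total * ss)
    return morans_i, gearys_c


def line_weights(n):
    """Binary weights for n features in a row: 1 if adjacent, else 0."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


w = line_weights(4)
print(moran_geary([1, 1, 5, 5], w))  # clustered values: I > 0 and C < 1
print(moran_geary([1, 5, 1, 5], w))  # alternating values: I < 0 and C > 1
```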
explain what -1, 0, and 1 would mean in a spatial autocorrelation analysis
it would mean you are using moran’s i
-1 means negative autocorrelation and the data is dispersed
0 means there is no autocorrelation and pattern is random
1 would mean positive autocorrelation and attributes are clustered
explain what 0, 1, and 2 would mean in a spatial autocorrelation analysis
it would mean you are using geary’s c
0 means positive autocorrelation and values are clustered
1 means no autocorrelation with random values and no apparent pattern
2 means negative spatial autocorrelation with dispersed values (high-low)
match the numbers of moran’s i to geary’s c
-1 = 2 = negative spatial autocorrelation
0 = 1 = no autocorrelation
1 = 0 = positive autocorrelation
what does the w represent in a spatial autocorrelation analysis?
the weight assigned to a pair of features to define adjacency
for example, what distance/time/cost would make two features neighbors?
what is the alternative method for testing significance when using geary’s c or moran’s i?
the monte carlo simulation
what does monte carlo simulation do?
it generates a sampling distribution for a given test statistic by repeatedly randomizing the data. the observed statistic is then compared with this distribution to assess significance
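A rough sketch of how such a simulation could look for moran’s i, assuming a permutation approach: values are shuffled across locations many times and the statistic is recomputed each time. The helper names, the 999 simulations, the fixed seed and the one-sided pseudo p-value are all choices made here for illustration.

```python
import random


def morans_i(values, weights):
    """Moran's I for values and a square weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    ss = sum(d * d for d in dev)
    w_total = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return n * cross / (w_total * ss)


def monte_carlo_p(values, weights, n_sims=999, seed=0):
    """Return (observed I, simulated I values, one-sided pseudo p-value)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = morans_i(values, weights)
    shuffled = list(values)
    simulated = []
    for _ in range(n_sims):
        rng.shuffle(shuffled)  # break any spatial arrangement
        simulated.append(morans_i(shuffled, weights))
    # Share of simulations at least as clustered as the observed pattern.
    as_extreme = sum(1 for s in simulated if s >= observed)
    p_value = (as_extreme + 1) / (n_sims + 1)
    return observed, simulated, p_value


values = [1, 1, 1, 1, 1, 9, 9, 9, 9, 9]  # strongly clustered along a line
weights = [[1 if abs(i - j) == 1 else 0 for j in range(10)] for i in range(10)]
observed, simulated, p_value = monte_carlo_p(values, weights)
print(round(observed, 3), p_value)
```

A small pseudo p-value indicates that random arrangements rarely produce clustering as strong as the observed pattern.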
global statistics
a single value that summarizes a characteristic of the entire study region
why is it important to use local measures of autocorrelation in a region?
spatial homogeneity cannot be assumed across the entire study area
what do you call it when autocorrelation is low in one area of a region and high in another
spatial heterogeneity
LISA
local indicators of spatial autocorrelation
local versions of geary’s c and moran’s i
what does LISA measure that is different than geary’s c or moran’s i?
LISA measures levels of particular clusters, not overall clustering
what is the preferred test for local clustering measures
moran’s i
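One common formulation of local moran’s i (Anselin's) scores each feature by multiplying its own deviation from the mean by the weighted deviations of its neighbors. The sketch below assumes that formulation, binary weights along a line, and made-up values; `local_morans_i` is a hypothetical helper name.

```python
def local_morans_i(values, weights):
    """Return one local Moran's I value per feature."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n  # variance term
    return [
        z[i] / m2 * sum(weights[i][j] * z[j] for j in range(n))
        for i in range(n)
    ]


values = [1, 1, 5, 5]
weights = [[1 if abs(i - j) == 1 else 0 for j in range(4)] for i in range(4)]
local = local_morans_i(values, weights)

# End features sit next to similar values and score positive; the two
# middle features have one similar and one dissimilar neighbor, which cancel.
print(local)

# With this formulation the local values sum to the global statistic
# once divided by the total weight.
w_total = sum(sum(row) for row in weights)
print(sum(local) / w_total)
```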
name 4 objectives of a regression analysis
to determine whether a relationship exists
to describe the nature of the relationship mathematically
to assess the degree of accuracy with which the model represents the relationship
in the case of multiple regression, to understand the relative importance of individual independent variables
regression VS correlation
correlation provides us with the extent of a relationship between two variables
a regression analysis provides us with the nature of that relationship
y in regression
the dependent variable
x in regression
the independent variable
a and b in regression
the regression coefficients: a is the intercept and b is the slope
e in regression
the random error or residual that the model does not account for
what is the line and what does it show in regression
the line is a statistical model that shows the expected mean value of y for each value x
how do we create a regression line
by applying a least squares criterion
what does the least squares criterion do
it chooses the line that minimizes the sum of squared vertical differences (residuals) between the line and the observed data points
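For simple regression the least squares line has a closed-form solution. A minimal sketch with made-up data follows; `fit_line` and `sum_squared_residuals` are illustrative names.

```python
def fit_line(x, y):
    """Return (a, b) for the least squares line y = a + b * x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b = sxy / sxx            # slope
    a = y_mean - b * x_mean  # intercept
    return a, b


def sum_squared_residuals(x, y, a, b):
    """Sum of squared vertical distances between the points and a line."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = fit_line(x, y)
print(round(a, 3), round(b, 3))  # 2.2 0.6

# Any other line, for example one with a slightly steeper slope,
# leaves a larger sum of squared residuals.
print(sum_squared_residuals(x, y, a, b) < sum_squared_residuals(x, y, a, b + 0.1))
```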
what are the 4 steps to a regression analysis
- specify independent and dependent variables
- use sample data to estimate a and b in the model
- estimate model error and check assumptions
- evaluate the statistical usefulness of the model
can regression describe causality?
NO, it only helps describe the nature of the relationship
what are the 4 assumptions made in a regression analysis
- mean error is 0
- variance of the error is constant across x values
- error is normally distributed
- the errors are independent: no relationship exists between the residual/error and x, or among the errors themselves
is regression an extension of correlation or is correlation an extension of regression?
regression is an extension of correlation
what does ANOVA measure
it measures the variance and overall significance of a regression model
how does the size of residuals affect a regression model
smaller residuals mean that the line is a good fit and the model is accurate
what is the range for r squared values
0-1
what does an r squared value of 0 or 1 mean
0 means the line explains none of the variation in y and the fit is poor
1 means the line fits perfectly and there is no difference between observed and predicted values
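The two extremes can be checked numerically. This sketch computes r squared as 1 minus the ratio of residual variation to total variation, using made-up data; `r_squared` is an illustrative helper.

```python
def r_squared(x, y):
    """r squared of the least squares line through (x, y)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b = sxy / sxx
    a = y_mean - b * x_mean
    # Variation left over after fitting the line.
    residual_ss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    # Total variation of y around its mean.
    total_ss = sum((yi - y_mean) ** 2 for yi in y)
    return 1 - residual_ss / total_ss


x = [1, 2, 3, 4, 5]
perfect = [3, 5, 7, 9, 11]  # exactly y = 1 + 2x
unrelated = [2, 1, 0, 1, 2]  # no linear trend in x
typical = [2, 4, 5, 4, 5]

print(r_squared(x, perfect))    # 1, within rounding
print(r_squared(x, unrelated))  # 0, within rounding
print(r_squared(x, typical))    # about 0.6
```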
what may r squared look like in a software output
ESS
what does the standard error of the estimate show
it estimates the standard deviation of the errors/residuals
it shows how close the observed values are to the line
roughly 95% of observed values should fall within two standard errors of the fitted line
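A small sketch of the calculation, assuming simple regression (so the residual sum of squares is divided by n - 2) and made-up data. The two-standard-error band is the usual rule of thumb under normally distributed errors, not an exact guarantee.

```python
from math import sqrt


def fit_line(x, y):
    """Return (a, b) for the least squares line y = a + b * x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b = sxy / sxx
    return y_mean - b * x_mean, b


def standard_error_of_estimate(x, y):
    """Return (standard error of the estimate, list of residuals)."""
    a, b = fit_line(x, y)
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    rss = sum(r * r for r in residuals)
    # Two parameters (a and b) were estimated, hence n - 2.
    return sqrt(rss / (len(x) - 2)), residuals


x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
see, residuals = standard_error_of_estimate(x, y)
within_band = sum(1 for r in residuals if abs(r) <= 2 * see)

print(round(see, 3))              # 0.894
print(within_band, "of", len(x))  # 5 of 5
```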
what is a regression model not good for?
estimating a value outside the range of observed values (extrapolation)
what is the difference between multiple regression and simple regression
multiple regression uses multiple independent variables
name an example of a multiple regression
a linear trend surface
multicollinearity
high correlation among the independent variables in a multiple regression
multiple regression assumes it is absent: independent variables should not be highly correlated with each other
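One simple screening approach is to compute pairwise correlations among the independent variables and flag high ones. In the sketch below the 0.8 cutoff, the variable names and the data are all assumptions for illustration; there is no single agreed threshold.

```python
from math import sqrt


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    a_mean, b_mean = sum(a) / n, sum(b) / n
    cov = sum((ai - a_mean) * (bi - b_mean) for ai, bi in zip(a, b))
    ss_a = sum((ai - a_mean) ** 2 for ai in a)
    ss_b = sum((bi - b_mean) ** 2 for bi in b)
    return cov / sqrt(ss_a * ss_b)


def collinear_pairs(predictors, threshold=0.8):
    """Return pairs of predictor names whose |r| exceeds the threshold."""
    names = sorted(predictors)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson_r(predictors[first], predictors[second])
            if abs(r) > threshold:
                flagged.append((first, second))
    return flagged


# Hypothetical independent variables for five sites.
predictors = {
    "rainfall_mm": [10, 20, 30, 40, 50],
    "humidity_pct": [31, 39, 52, 60, 69],
    "elevation_m": [5, 1, 4, 2, 3],
}
print(collinear_pairs(predictors))  # [('humidity_pct', 'rainfall_mm')]
```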
what is trend surface analysis an example of
how regression analysis can be applied to spatial problems
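A linear trend surface treats the x and y coordinates as the two independent variables: z = a + b1 * x + b2 * y. The sketch below fits one by solving the least squares normal equations with a small hand-written solver. The sample points, which lie exactly on a known plane, and both function names are assumptions for illustration.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [[float(v) for v in row] + [float(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        known = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - known) / aug[r][r]
    return solution


def fit_trend_surface(xs, ys, zs):
    """Return [a, b1, b2] for the least squares plane z = a + b1*x + b2*y."""
    rows = [[1.0, x, y] for x, y in zip(xs, ys)]
    # Normal equations: (X'X) coefficients = X'z
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xtz = [sum(r[i] * z for r, z in zip(rows, zs)) for i in range(3)]
    return solve(xtx, xtz)


# Sample locations whose z values follow z = 1 + 2x + 3y exactly.
xs = [0, 1, 0, 1, 2]
ys = [0, 0, 1, 1, 1]
zs = [1, 3, 4, 6, 8]
coefficients = fit_trend_surface(xs, ys, zs)
print([round(c, 3) for c in coefficients])  # [1.0, 2.0, 3.0]
```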
what does ANOVA stand for
analysis of variance
what is a synonym of ANOVA
statistical analysis, but ANOVA goes over the top
what does ANOVA address?
it separates different sources of variance and relates them to the overall variance
how could you apply ANOVA to the following regression
predicting plant growth by fertilizer application
you could additionally ask whether different types of fertilizer have varying effects on plant growth
what is a name for two or more categorical predictor variables in ANOVA
factors
in terms of columns and rows what does ANOVA compare?
the variation among values within each column compared with the variation between the different columns
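The comparison can be sketched as a one-way ANOVA F statistic, where each column holds the growth measurements for one fertilizer type. The data and the function name `one_way_anova_f` are made up for illustration.

```python
def one_way_anova_f(groups):
    """Return the F statistic: between-group over within-group variance."""
    all_values = [v for group in groups for v in group]
    grand_mean = sum(all_values) / len(all_values)
    ss_between = 0.0
    ss_within = 0.0
    for group in groups:
        mean = sum(group) / len(group)
        # Variation of this column's mean around the overall mean.
        ss_between += len(group) * (mean - grand_mean) ** 2
        # Variation of values inside this column around its own mean.
        ss_within += sum((v - mean) ** 2 for v in group)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical plant growth (cm) under three fertilizer types.
fertilizer_a = [1, 2, 3]
fertilizer_b = [2, 3, 4]
fertilizer_c = [5, 6, 7]
f_stat = one_way_anova_f([fertilizer_a, fertilizer_b, fertilizer_c])
print(round(f_stat, 3))  # 13.0
```

A large F value means the columns differ from each other by more than the spread within each column would suggest; its significance is judged against the f distribution.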
name the 4 probability distributions
normal
z
t
f