Module 10 Flashcards
interpolation
interpolation is filling in data points between the data you already have
• eg, regression analysis and trendlines only apply to the data set (from xmin to xmax); temperature is measured only at weather stations, so can we estimate temperature between the weather stations?
extrapolation
extrapolation is filling in data points beyond the data that you have
• eg, using regression analysis to predict values beyond the scale of the observations; estimating temperature beyond the network of weather stations
• extrapolation methods assume that the world outside the data behaves the same as, or similarly to, the world inside the data
IDW
• inverse distance weighting estimates a value at each location by taking the distance-weighted average of the values of known points in its neighbourhood
• the closer a known point is to the location being estimated, the more influence or weight it has in the averaging process (ie, each known point has a local influence that diminishes with distance)
Tobler’s First Law of Geography
The First Law of Geography, according to Waldo Tobler, is “everything is related to everything else, but near things are more related than distant things.”
IDW importance of the Power
• the power parameter determines the significance of the known
points on the interpolated value
• a higher power (eg, > 2) puts more emphasis on the nearby points
and produces a more varying and less smooth surface
• a lower power (eg, < 2) gives more influence to the distant points,
resulting in a smoother surface
• neighbourhood size can be defined by the radius of a circle, or by the number of known points – in general, the ______ the neighbourhood the smoother the interpolated surface since the averaging procedure incorporates more of the actual data
larger
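The weighted-average estimator described above can be sketched in a few lines. This is a minimal illustration, assuming a default power of 2 and a neighbourhood of all known points; the function name and sample values are made up for the example:

```python
import math

def idw(x, y, known, power=2.0):
    """Estimate the value at (x, y) as the inverse-distance-weighted
    average of the known (xi, yi, zi) points."""
    num = 0.0
    den = 0.0
    for xi, yi, zi in known:
        d = math.hypot(x - xi, y - yi)
        if d == 0:                  # the surface passes through the sample points
            return zi
        w = 1.0 / d ** power        # weight diminishes with distance
        num += w * zi
        den += w
    return num / den

pts = [(0, 0, 10.0), (1, 0, 20.0), (0, 1, 30.0)]
print(idw(0.5, 0.5, pts))           # equidistant from all three points -> 20.0
```

Raising `power` shrinks the influence of the distant points (a more varying surface); lowering it spreads influence outward (a smoother surface), matching the power card above.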
Primary Features of an IDW result
• the surface passes through the sample points
• the interpolated values are always within the range of the measured values of
known points and will never be beyond the maximum and minimum values of the
known points
Natural Neighbor
the natural neighbour interpolation method estimates the value of an unknown location
by finding the closest subset of known points to the location being estimated, then
applying weights to them based on proportionate areas
• each polygon contains 1 known point, and any unknown point within a given polygon is closer to that polygon's known point than to any other known point contained in other polygons
• this technique originated as a method to generate rainfall estimates, and has since spread throughout spatial science
• a new polygon is created around the given unknown point, which also adjusts the surrounding polygons but maintains the basic proximity rules
• only the known points belonging to polygons that have been adjusted will be included in the subset of points for interpolation, and the weight applied to each known point is proportional to the amount of overlap between the new polygon and the original polygons
Trend Surface Interpolation: 3 types
- a trend surface interpolation fits a smooth surface defined by a polynomial function to a set of known points, then uses the polynomial function to estimate the values of unknown locations
- the trend surface is analogous to a least-squares regression equation – use a subset of points to define the relationship, then predict the z value of each point in the sample area
- like regression analysis, there is a prediction error (the residual) at each point – for trend surface interpolation, the residual is the difference between the observed z value at a known point and the value predicted by the polynomial surface
1st Order Polynomial: Planar Surface (flat)
2nd Order Polynomial: Quadratic Surface (some degree of curve)
3rd Order Polynomial: Cubic Surface (very curvy)
trend surfaces are also an effective tool for smoothing the data – much like a filter, the trend surface removes high and low values and reveals the underlying spatial trend of the dataset
• orders 1 – 4 are most commonly used (ArcGIS allows up to 12th-order) – it is difficult to justify that some natural phenomenon behaves as an 8th-order polynomial, so it is best to avoid these cases
• trend surface interpolation is highly susceptible to extreme outliers (just like regression analysis), so examining the dataset beforehand and objectively removing the outliers is important
______ order polynomial equations need many data points to produce the surface, so a bigger dataset is needed for trend surface interpolation
higher
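A 1st-order (planar) trend surface is just a least-squares fit of z = b0 + b1·x + b2·y over the known points. A minimal numpy sketch (the point values are made-up illustration data that happen to lie exactly on a plane):

```python
import numpy as np

# Known points: x, y coordinates and measured z values
x = np.array([0.0, 1.0, 2.0, 0.0, 1.0, 2.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
z = 5.0 + 2.0 * x - 1.0 * y               # values lying exactly on a plane

# 1st-order (planar) trend surface: z = b0 + b1*x + b2*y
A = np.column_stack([np.ones_like(x), x, y])
coef, *_ = np.linalg.lstsq(A, z, rcond=None)

def trend(xp, yp):
    """Predict the z value of an unknown location from the fitted polynomial."""
    return coef[0] + coef[1] * xp + coef[2] * yp

print(trend(1.5, 0.5))                     # 5 + 2*1.5 - 0.5 = 7.5
```

Higher orders simply add columns (x², xy, y², …) to the design matrix, which is why they need many more data points to be well determined.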
Spline
Estimates values at unknown locations using a mathematical function that minimizes overall surface curvature
-while there are several different types of spline functions, the most commonly used in GIS are thin-plate splines, which produce a surface that passes exactly through the known points while ensuring the surface is as smooth as possible
- both regularized splines and splines with tension create smooth, gradually changing surfaces with estimated values that may lie outside the range of the maximum and minimum values for the known points
- regularized splines run into significant problems by estimating steep gradients in data-poor regions – these are known as overshoots; in general, when t > 0.5 there are a greater number of overshoots
- splines with tension allow the user to control the tension to be applied at the edges of the surface as a method of reducing overshoots
while there are several different types of spline functions, the most commonly used in GIS are _____ splines, which produce a surface that passes exactly through the known points while ensuring the surface is as smooth as possible
thin-plate
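The thin-plate spline named above can be sketched directly: solve one linear system whose radial-basis part uses the kernel r²·log r plus an affine (flat) term. This is a minimal numpy illustration of the technique, not the ArcGIS implementation; the function names and sample values are assumptions:

```python
import numpy as np

def tps_kernel(r):
    # Thin-plate radial basis r^2 * log(r), taking the limit 0 at r = 0
    out = np.zeros_like(r)
    nz = r > 0
    out[nz] = r[nz] ** 2 * np.log(r[nz])
    return out

def fit_tps(xy, z):
    """Fit a 2-D thin-plate spline that passes exactly through (xy, z)."""
    n = len(xy)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    K = tps_kernel(d)
    P = np.column_stack([np.ones(n), xy])        # affine part: 1, x, y
    A = np.zeros((n + 3, n + 3))
    A[:n, :n] = K
    A[:n, n:] = P
    A[n:, :n] = P.T
    rhs = np.concatenate([z, np.zeros(3)])
    sol = np.linalg.solve(A, rhs)
    w, a = sol[:n], sol[n:]

    def evaluate(q):
        r = np.linalg.norm(q[None, :] - xy, axis=-1)
        return tps_kernel(r) @ w + a[0] + a[1] * q[0] + a[2] * q[1]
    return evaluate

xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.2]])
z = np.array([1.0, 2.0, 3.0, 4.0, 1.5])
f = fit_tps(xy, z)
print(f(np.array([0.0, 0.0])))   # the surface passes exactly through the known points
```

Minimizing curvature is what lets the fitted surface overshoot: estimated values between points may lie outside the min/max of the known points, as the cards above note.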
Kriging
• kriging is a geostatistical method for spatial interpolation that is similar to IDW in that it estimates the value of a variable at a location by computing a weighted average of the known z values in its neighbourhood; however, the weights in kriging are dependent on the spatial variability in the values of the known points
• kriging assumes that in most cases spatial variations observed in environmental
phenomena (eg, variations in soil qualities, changes in the grade of ores) are random
but spatially correlated, and the data values characterizing such phenomena conform
to Tobler’s first law of geography – ie, spatial autocorrelation
• the exact nature of spatial autocorrelation varies from dataset to dataset, and each
set of data has its own unique function of variability and distance between known
points, which can ultimately be represented by the semivariogram
Semivariogram
a semivariogram is a graph of the semivariance on the y-axis and the distance between known points (the lag) on the x-axis
in order to estimate the semivariance at any given distance, the data points are fitted with a continuous curve called a semivariogram model; there are several different models, each designed to fit different types of phenomena and having different effects on the estimation of the unknown values, especially for nearby points
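The empirical semivariogram plots the average semivariance 0.5·(zi − zj)² of point pairs against their separation distance (the lag). A minimal numpy sketch using equal-width lag bins (the binning scheme, function name, and synthetic data are all assumptions for illustration):

```python
import numpy as np

def empirical_semivariogram(xy, z, n_bins=5):
    """Bin point pairs by separation distance (the lag) and return the
    mean semivariance 0.5 * (zi - zj)^2 in each bin."""
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    sv = 0.5 * (z[:, None] - z[None, :]) ** 2
    i, j = np.triu_indices(len(z), k=1)          # count each pair once
    lags, gammas = d[i, j], sv[i, j]
    edges = np.linspace(0, lags.max(), n_bins + 1)
    centers, means = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (lags > lo) & (lags <= hi)
        if mask.any():
            centers.append(0.5 * (lo + hi))
            means.append(gammas[mask].mean())
    return np.array(centers), np.array(means)

# Synthetic spatially correlated data: z follows the x coordinate plus noise
rng = np.random.default_rng(0)
xy = rng.uniform(0, 10, size=(50, 2))
z = xy[:, 0] + rng.normal(0, 0.5, size=50)
h, g = empirical_semivariogram(xy, z)
```

Fitting a continuous model curve (spherical, exponential, Gaussian, …) to the (h, g) points is the step that yields the range, sill, and nugget discussed in the next cards.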
Kriging: range
the range represents the maximum distance between points where spatial autocorrelation occurs; small ranges indicate that data values change more rapidly over space
- the range is used in kriging for defining the size of the neighbourhood so that spatially correlated known points are selected for interpolation
Kriging: the sill
the sill represents the semivariance at the range value, and is typically the same as the variance of the whole dataset
- theoretically, at lag = 0, semivariance = 0, but most natural phenomena exhibit a nugget effect, where semivariance > 0 at lag = 0
- the nugget value represents a degree of randomness attributed to measurement error and/or spatial variations that occur at scales smaller than the sampling scale
2 main forms of kriging used by ArcGIS
• Ordinary Kriging (for random data): assumes that there is no trend in the data and that the mean of the dataset is unknown – the weights are derived by solving a system of linear equations
which minimize the expected variance of the data values
• Universal Kriging (for trending data): assumes that there is an overriding trend in the data in addition to spatial autocorrelation among the known points, and this trend can be modeled by a
polynomial function
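A minimal sketch of the ordinary-kriging estimate for one location, assuming a spherical semivariogram model with a chosen nugget, sill, and range (the model, parameter values, and sample data are assumptions; in practice they are fitted from the empirical semivariogram). The weights come from a linear system built from semivariances, with a Lagrange multiplier forcing the weights to sum to 1:

```python
import numpy as np

def gamma(h, nugget=0.0, sill=1.0, rng_=5.0):
    """Spherical semivariogram model (an assumed model choice)."""
    h = np.asarray(h, dtype=float)
    g = np.where(h < rng_,
                 nugget + (sill - nugget) * (1.5 * h / rng_ - 0.5 * (h / rng_) ** 3),
                 sill)
    return np.where(h == 0, 0.0, g)

def ordinary_kriging(xy, z, q):
    """Solve the ordinary-kriging system for the weights at location q."""
    n = len(z)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gamma(d)             # semivariances between known points
    A[n, n] = 0.0                    # Lagrange multiplier row/column
    b = np.ones(n + 1)
    b[:n] = gamma(np.linalg.norm(xy - q, axis=-1))
    sol = np.linalg.solve(A, b)
    w = sol[:n]                      # weights sum to 1 by construction
    return w @ z

xy = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0], [2.0, 2.0]])
z = np.array([1.0, 2.0, 3.0, 2.5])
print(ordinary_kriging(xy, z, np.array([1.0, 1.0])))
```

With a zero nugget this estimator is exact: asking for a value at a known point returns that point's z value. Universal kriging adds a polynomial trend term to the same system.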