Interpolation Flashcards

1
Q

Spatial Sampling

A
  • Location of sample points can be critical for subsequent analysis
  • For mapping, samples should ideally be located evenly over the area. Regular sampling can be biased (e.g., if the grid coincides with a periodic pattern in the landscape), and completely random locations also have drawbacks (uneven coverage).

Examples:
- Regular sampling
- Random sampling
- Stratified random sampling (good compromise between random and regular –> individual points are located randomly within regular blocks or strata)
- Cluster sampling –> can be used to examine spatial variation at several different scales
- Transect sampling
- Contour sampling –> e.g., for making a DEM

2
Q

Definition: Spatial interpolation

A

*General definition: the procedure of predicting the value of attributes at unsampled sites from measurements made at point locations within the same area.

  • Predicting the value of an attribute at sites outside the area covered by existing observations is called extrapolation.

Point interpolation is used to convert point observations into continuous fields so that the spatial patterns sampled by these measurements can be compared with the spatial patterns of other spatial entities.

Necessary when:
- the data do not cover the domain of interest completely.
- the discretized surface has a different level of resolution from that required.
- the data model differs from the one required, e.g., point data where a continuous surface is needed.

Examples: elevation, thickness of soil, perimeters of trees, soil organic carbon content, depth to groundwater, precipitation, heavy metal levels in soil or plants.

Tasks to be fulfilled:
- Capture the important features of the data
- Estimate the average value over a large area
- Estimate unknown values at unsampled locations
- Estimate average values over small areas
- Check the performance of the estimation methodology

3
Q

Definition
- Exact interpolation
- Support

A

Exact interpolation predicts a value of an attribute at a sample point that is identical to the measured value.

Support is the volume (or area, or length) of the physical sample on which a measurement is made (it makes a difference whether a value represents 1 g or 1 kg); there is less variation as the support volume increases. Important in mining.

4
Q

Spatial interpolation Methods

A

- Global methods (3):
  - Classification using external information
  - Trend surfaces on geometric coordinates
  - Regression models on surrogate attributes

- Local deterministic methods (3):
  - Thiessen polygons
  - Inverse distance weighting
  - Splines

- Geostatistical methods:
  - Kriging (ordinary, universal, indicator)
  - Co-kriging
  - Conditional simulation

5
Q

Global Method –> Classification
- Definition
- Classification method

A
  1. GLOBAL INTERPOLATORS:
    - Use all available data to provide predictions for the whole area of interest.
    - Classification methods use easily available information to divide the area into regions that can be characterized by the statistical moments (mean, variance) of attributes measured at locations within those regions.

a) Global prediction using global classification methods:
- In some cases it is convenient to assume that the observations are taken from a statistically stationary population, meaning that mean and variance are independent of both location and support.
- If the mapped spatial units are taken to capture the spatial change, we can select a classificatory approach based on those units and perform a standard analysis of variance (ANOVA). The simplest statistical model is the ANOVA model (see formula; in standard form Z(x) = μ + αj + ε, i.e., the overall mean plus the deviation of class j plus within-class noise).
Assumptions:
- Variations of Z within the map units are random and not spatially contiguous (sharp class boundaries, as in the flood-class example, are assumed but often unrealistic).
- All mapping units have the same within-class variance (noise, i.e., the same error around the means).
- All attributes are normally distributed.
- Spatial changes take place at the class boundaries; they are sharp, not gradual.
- The original data can be transformed (e.g., log-transformed) in order to achieve a normal distribution.

  • The analysis of variance compares the between-class variance to the within-class (error) variance.
    - The ratio of between to within variance is called the F-value (F = MSB/MSW).
    - It is compared against a tabulated critical value that gives the maximum F-value expected at the selected probability level under pure chance. If F > Fcrit, the difference between classes is non-random (at least one class differs).
    - To check which classes differ from each other, another test (t-test) must be applied.
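
A minimal sketch of this F-test with SciPy; the map units and measured values below are invented for illustration:

```python
# One-way ANOVA: does the mean of Z differ between map units?
# Class labels and measurements are made up for illustration.
from scipy import stats

unit_a = [4.1, 3.8, 4.5, 4.0, 4.3]   # Z measured in map unit A
unit_b = [5.2, 5.6, 4.9, 5.4, 5.1]   # Z measured in map unit B
unit_c = [4.0, 4.2, 3.9, 4.4, 4.1]   # Z measured in map unit C

f_value, p_value = stats.f_oneway(unit_a, unit_b, unit_c)  # F = MSB/MSW
print(f"F = {f_value:.2f}, p = {p_value:.4f}")
# If p < 0.05 (equivalently F > Fcrit), at least one class mean differs;
# pairwise t-tests would then show which classes differ from each other.
```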
6
Q

Global Method –> Trend surfaces on geometrical coordinates

A

Global interpolation using trend surfaces:
- When variation of an attribute occurs continuously over a landscape, it may be possible to model it by a smooth mathematical surface.
- The simplest way to model this is by a multiple regression of attribute values against geographic coordinates.
- The idea is to fit a polynomial line or surface that minimizes the sum of squares of Ẑ(Xi) − Z(Xi). X and Y are independent, Z is normally distributed, and the regression errors are independent of location.

  • Z(X) = b0 + b1·X + ε
  • Linear trend: Z(X, Y) = β0 + β1·X + β2·Y + ε (an inclined plane)
  • Quadratic trend: Z(X, Y) = β0 + β1·X + β2·Y + β3·X² + β4·Y² + β5·XY + ε
  • The significance of a trend surface can be tested by analysis of variance, partitioning the variance between the trend and the residuals from the trend. The goodness-of-fit (R²) values show that even higher-order surfaces do not fully represent all the variation in the data. Even if significantly better fits can be obtained with higher-order polynomials, choosing them is not sensible when they have no physical explanation.
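
A minimal sketch of fitting the quadratic trend surface above by ordinary least squares; coordinates and attribute values are invented:

```python
# Quadratic trend surface Z(X,Y) = b0 + b1*X + b2*Y + b3*X^2 + b4*Y^2 + b5*XY,
# fitted by least squares (minimizes the sum of squared residuals).
import numpy as np

x = np.array([0.0, 1.0, 2.0, 0.5, 1.5, 2.5, 0.2, 1.8])
y = np.array([0.0, 0.5, 1.0, 2.0, 1.5, 0.3, 1.2, 2.2])
z = np.array([3.1, 3.9, 5.2, 4.0, 4.8, 5.5, 3.5, 5.0])   # observed attribute

# Design matrix: one column per polynomial term.
A = np.column_stack([np.ones_like(x), x, y, x**2, y**2, x*y])
coef, *_ = np.linalg.lstsq(A, z, rcond=None)              # b0 .. b5

def trend(xq, yq):
    """Evaluate the fitted surface at query coordinates."""
    return (coef[0] + coef[1]*xq + coef[2]*yq
            + coef[3]*xq**2 + coef[4]*yq**2 + coef[5]*xq*yq)

print(trend(1.0, 1.0))   # predicted Z at an unsampled location
```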
7
Q

Global Method –> Regression models on surrogate attributes (cheap-to-measure attributes)

A
  • Surrogate attributes are, e.g., "distance to river" or "elevation".
  • The regression model has the form:
    Z(X) = b0 + b1·P1 + b2·P2 + ε
  • b0, b1, ... are regression coefficients; P1, P2, ... are independent properties (the most important properties that influence the interpolated attribute).
  • The most important point is that the regression model makes physical sense; note that these regressions are inexact interpolators (the fitted surface need not pass through the data points).
  • The result is a continuous surface with a confidence interval (e.g., 95%).
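
A minimal sketch of such a regression; the two surrogate attributes (distance to river, elevation) and all values are invented. The mechanics parallel the trend-surface fit, but the predictors are covariates rather than coordinates:

```python
# Regression on surrogate attributes: Z(X) = b0 + b1*P1 + b2*P2 + error.
# P1 (distance to river) and P2 (elevation) values are made up.
import numpy as np

p1 = np.array([10.0, 250.0, 40.0, 500.0, 120.0, 300.0])   # distance to river (m)
p2 = np.array([95.0, 140.0, 100.0, 180.0, 110.0, 150.0])  # elevation (m)
z  = np.array([8.2, 4.1, 7.5, 2.3, 6.0, 3.8])             # attribute at samples

A = np.column_stack([np.ones_like(p1), p1, p2])
b, *_ = np.linalg.lstsq(A, z, rcond=None)                 # b0, b1, b2

# Predict Z anywhere the surrogates are known (e.g., from a DEM and river map):
print(b[0] + b[1] * 60.0 + b[2] * 105.0)
```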
8
Q

Local Deterministic Methods
- Definition
a) Nearest neighbors: triangulation and tessellation

A
  • Local methods of interpolation use the information from the nearest data points directly.
  • Interpolation involves defining a search area or neighborhood around the point to be predicted, finding the data points within this neighborhood, choosing a mathematical function to represent the variation over this limited number of points, and evaluating it for a point on a regular grid.
  • The procedure is repeated until all points on the grid have been computed.

a) Nearest neighbors: triangulation and tessellation
- When we talk about Voronoi polygons and spatial prediction, it basically means that we predict attributes for locations where we have no data by looking at the nearest available data points and using their attributes. It's a rough approximation, but sometimes that's the best method we have.
- Thiessen polygons divide the region in a way that is totally determined by the configuration of the data points, with one observation per cell.
- Thiessen polygons are often used in GIS as a quick method for relating point data to space, e.g., for assigning meteorological data to a given site.
+ Can easily be used with qualitative data like vegetation class or land use; all that is needed is a choropleth map.
+ They are exact predictors because all predictions equal the values at the data points.
- Cons: sharp borders (the discontinuities are undesirable and have little to do with reality).

  • The lines joining the data points show the Delaunay triangulation, which has the same topology as a TIN.
  • Triangulation overcomes the problem of the polygonal method, removing possible discontinuities between adjacent points by fitting a plane through the three samples that surround the point being estimated.
  • Pycnophylactic method: mass-preserving reallocation from the primary data.
  • Ensures that the volume of the attribute in a spatial entity remains the same, irrespective of whether the global variation of the attribute is represented by homogeneous, crisp polygons or by a continuous surface.
    - The total volume of the attribute per polygon is invariant.
  • The constraining surface is assumed to vary smoothly, so that neighboring locations have similar values; the data are converted to a density function.
  • The resulting pattern is similar to that of smooth interpolators, but it is not an exact interpolator.
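
A minimal sketch of Thiessen-polygon prediction with made-up sample points: assigning each location the value of its nearest data point is equivalent to interpolation with Voronoi/Thiessen polygons.

```python
# Thiessen-polygon prediction: each location takes the value of its
# nearest data point (one observation per Voronoi cell). Data are made up.
import numpy as np
from scipy.spatial import cKDTree

pts = np.array([[0.0, 0.0], [2.0, 1.0], [1.0, 3.0], [3.0, 2.5]])  # sample x, y
vals = np.array([10.0, 14.0, 9.0, 12.0])                          # measured Z

tree = cKDTree(pts)

# Regular grid of prediction locations.
gx, gy = np.meshgrid(np.linspace(0, 3, 4), np.linspace(0, 3, 4))
grid = np.column_stack([gx.ravel(), gy.ravel()])

_, nearest = tree.query(grid)   # index of the closest sample point
z_pred = vals[nearest]          # exact at data points, sharp borders between cells
print(z_pred.reshape(gx.shape))
```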
9
Q

Local Deterministic Methods
b) Inverse Distance Interpolation - Linear Interpolation

A
  • Combines the proximity idea of Thiessen polygons with the gradual change of the trend surface.
    - The assumption is that the value of an attribute Z at an unsampled point is a distance-weighted average of the data points occurring within a neighborhood or window.
    - It is used to create a raster surface from point data.
  • Exact interpolator.
  • Formula (standard form): Ẑ(X0) = Σ wi·Z(Xi) / Σ wi, with weights wi = 1/d(X0, Xi)^r (see the example below).
  • The further away a data point is, the lower its influence on the estimated point.
  • The exponent r controls the smoothness of the result (commonly r = 2).
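
A minimal IDW sketch using the formula above; sample locations and values are invented:

```python
# Inverse distance weighting: Z0 = sum(w_i * Z_i) / sum(w_i), w_i = 1/d_i^r.
import numpy as np

def idw(x0, y0, xs, ys, zs, r=2.0):
    """Predict Z at (x0, y0) from samples (xs, ys, zs); exact at data points."""
    d = np.hypot(xs - x0, ys - y0)
    if np.any(d == 0):            # exact interpolator: return the measured value
        return zs[np.argmin(d)]
    w = 1.0 / d**r                # influence decays with distance
    return np.sum(w * zs) / np.sum(w)

xs = np.array([0.0, 2.0, 1.0, 3.0])
ys = np.array([0.0, 1.0, 3.0, 2.5])
zs = np.array([10.0, 14.0, 9.0, 12.0])

print(idw(1.5, 1.5, xs, ys, zs))   # estimate at an unsampled location
print(idw(2.0, 1.0, xs, ys, zs))   # returns 14.0 (exact at a data point)
```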
10
Q

Local Deterministic Methods
c) Spline interpolation

A
  • A form of interpolation where the interpolant is a special type of piecewise polynomial called a spline.
    - The interpolation error can be made small even when using low-degree polynomials.
  • Gives a smoother result.
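
A minimal sketch of spline interpolation in 1-D (e.g., along a transect), with invented positions and values:

```python
# Cubic spline through 1-D sample points: piecewise cubic polynomials,
# exact at the data, smooth in between. Data are made up.
import numpy as np
from scipy.interpolate import CubicSpline

x = np.array([0.0, 1.0, 2.5, 4.0, 6.0])   # distance along transect
z = np.array([3.0, 4.2, 3.8, 5.1, 4.6])   # measured attribute

spline = CubicSpline(x, z)
xq = np.linspace(0, 6, 61)
zq = spline(xq)                            # smooth curve through all samples
print(np.round(zq[:5], 3))
```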
11
Q

Geostatistical Methods
- definition

A
  • Geostatistics assumes that the data values represent a sample from some underlying true population. By analyzing this sample, it is often possible to derive a general model that describes how the sample values vary with distance (and direction).
  • Core concepts in geostatistics:

Frequency tables and histograms:
  • A frequency table records how often observed values fall within certain intervals or classes (shown graphically as a histogram).
  • It is common to use a constant class width for the histogram, so that the height of each bar is proportional to the number of values within that class.
  • Cumulative frequency tables can also be used, and histograms may be prepared after ranking the data in descending order.

Probability plots:
- A normal probability plot is a type of cumulative frequency plot that helps to see whether the distribution is close to a Gaussian distribution.
- On a normal probability plot, the Y-axis is scaled in such a way that the cumulative frequencies plot as a straight line if the distribution is Gaussian.
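
A minimal sketch of a normal probability plot with SciPy and Matplotlib, using randomly generated sample values:

```python
# Normal probability plot: if the points fall on a straight line,
# the sample is close to Gaussian. Data are randomly generated.
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
sample = rng.normal(loc=5.0, scale=1.5, size=200)

stats.probplot(sample, dist="norm", plot=plt)  # ordered values vs. normal quantiles
plt.show()
```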

Statistical parameters from experimental data:
Mean: mX = (1/n) · Σ Xi
Variance: SX² = (1/n) · Σ (Xi − mX)²
Standard deviation: SX = square root of the variance
Covariance: CXY = (1/n) · Σ (Xi − mX)(Yi − mY)
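
The same four statistics computed with NumPy (using the 1/n convention, i.e., ddof=0, to match the formulas); X and Y values are invented:

```python
import numpy as np

X = np.array([2.0, 3.5, 4.1, 5.0, 3.9])
Y = np.array([1.0, 1.8, 2.2, 2.9, 2.0])

mean_x = X.mean()                                 # mX
var_x = X.var(ddof=0)                             # SX^2, divides by n
std_x = X.std(ddof=0)                             # SX = sqrt(variance)
cov_xy = np.mean((X - mean_x) * (Y - Y.mean()))   # CXY

print(mean_x, var_x, std_x, cov_xy)
```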

Scatterplot:
- The most common display of bivariate data: an X-Y graph on which the X coordinate corresponds to the value of one variable and the Y coordinate to the value of the other variable.
- There is some scatter in the cloud: larger values of variable V tend to be associated with larger values of variable U, and smaller values of V with smaller values of U.
- The shape of a cloud of points on an h-scatterplot tells us how continuous the data values are over a certain distance in a particular direction. If the data values at locations separated by h are very similar, the pairs will plot close to the line X = Y, a 45-degree line passing through the origin. As the data values become less similar, the cloud of points on the h-scatterplot becomes fatter and more diffuse.

Correlogram, covariance function and variogram:
- The relationship between the correlation coefficient of an h-scatterplot and h is called the correlation function or correlogram. The correlation coefficient depends on h, which has both a magnitude and a direction.
* Correlogram = change of the correlation coefficient with distance (and direction).

12
Q

a) Experimental Variogram
Explain nugget, range and sill

A

Experimental Variogram
- The variogram is a function describing the degree of spatial dependence of a spatial random field or stochastic process.
- It is defined as the variance of the difference between field values at two locations (X, Y) across realizations of the field.
- The main goal of a variogram analysis is to construct a variogram that best estimates the autocorrelation structure of the underlying stochastic process.
- It provides useful information for interpolation, optimizing sampling, and determining spatial patterns.

  • Parameters:
    - Nugget (small-scale variation): micro-scale variation plus measurement error; it is estimated as the intercept of the variogram at h = 0.
    - Sill: the variance of the random field; the plateau beyond which the variogram no longer increases.
    - Range: the distance (if any) at which data are no longer autocorrelated. It describes how inter-site differences are spatially dependent: within the range, the closer together the sites are, the more similar they are likely to be. If the distance from a data point to an unvisited site exceeds the range, the data point makes no useful contribution to the interpolation.

Computation steps:
1. Form all possible data pairs.
2. Group the data pairs into distance classes.
3. Compute the difference between the two values of each data pair.
4. Square all the differences.
5. Compute the average value of the squared differences for each distance class.
6. Divide by two (by definition of the semivariance).

  • Variograms can be nested
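
A minimal sketch of the six computation steps above, with invented coordinates and values:

```python
# Experimental semivariogram: pair the data, bin pairs by separation
# distance, average the squared differences per bin, divide by two.
import numpy as np

def experimental_variogram(coords, values, n_bins=6):
    """Return (lag-class centers, semivariances gamma(h))."""
    i, j = np.triu_indices(len(values), k=1)         # step 1: all data pairs
    d = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2            # steps 3-4
    edges = np.linspace(0.0, d.max(), n_bins + 1)    # step 2: distance classes
    bin_idx = np.minimum(np.digitize(d, edges[1:]), n_bins - 1)
    gamma = np.array([0.5 * sqdiff[bin_idx == k].mean()   # steps 5-6
                      if np.any(bin_idx == k) else np.nan
                      for k in range(n_bins)])
    return 0.5 * (edges[:-1] + edges[1:]), gamma

rng = np.random.default_rng(1)
coords = rng.uniform(0, 10, size=(50, 2))
values = np.sin(coords[:, 0]) + 0.1 * rng.standard_normal(50)  # spatially structured
h, g = experimental_variogram(coords, values)
print(np.round(h, 2))
print(np.round(g, 3))
```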
13
Q

Kriging interpolation

A

Def.: Kriging is based on the variogram and yields the Best Linear Unbiased Estimator (BLUE). It produces not only estimates but also their error variances. It is linear because it estimates linear combinations of the data, unbiased because it attempts to have a mean residual error of zero, and best because it minimizes the error variance.

  • types of kriging:
    Ordinary kriging
    Block
    Stratified
    Indicator
    Co-kriging
14
Q

steps in kriging

A
  1. Examine the data for normality and spatial trends, and carry out appropriate transformations. If using indicator kriging, transform the data to binary (0/1) values.
  2. Compute the experimental variogram and fit a suitable model to it. If the spatial variation is pure nugget, interpolation is not sensible. (*Pure nugget: no spatial structure, no spatial autocorrelation. Example: soil pH.)
  3. Check the variogram model by cross-validation (jack-knifing).
  4. Use the variogram model to interpolate sites on a regular grid, where the sites are either equal in size to the original samples (point kriging) or larger blocks of land (block kriging).
  5. Display the results as grid-cell maps or as threaded contours, alone or draped over other data layers.
  6. Input the results to the GIS and use them in combination with other data.
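
A minimal ordinary-kriging sketch covering steps 2 and 4: a spherical variogram model with assumed nugget, sill and range, and invented data points.

```python
# Ordinary kriging: solve the kriging system built from a fitted
# variogram model; weights sum to 1 (unbiased), variance is returned too.
import numpy as np

def spherical(h, nugget=0.1, sill=1.0, a=5.0):
    """Spherical variogram model with nugget, sill and range a (assumed values)."""
    h = np.asarray(h, dtype=float)
    g = nugget + (sill - nugget) * (1.5 * h / a - 0.5 * (h / a) ** 3)
    g = np.where(h >= a, sill, g)
    return np.where(h == 0, 0.0, g)       # gamma(0) = 0 by definition

def ordinary_kriging(coords, z, x0):
    """BLUE estimate and kriging variance at location x0."""
    n = len(z)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    K = np.ones((n + 1, n + 1))
    K[:n, :n] = spherical(d)              # semivariances between data points
    K[n, n] = 0.0                         # Lagrange row/column for unbiasedness
    rhs = np.ones(n + 1)
    rhs[:n] = spherical(np.linalg.norm(coords - x0, axis=1))
    w = np.linalg.solve(K, rhs)           # n weights plus Lagrange multiplier
    return w[:n] @ z, w @ rhs             # estimate, kriging (error) variance

coords = np.array([[0.0, 0.0], [2.0, 1.0], [1.0, 3.0], [3.0, 2.5]])
z = np.array([10.0, 14.0, 9.0, 12.0])
est, var = ordinary_kriging(coords, z, np.array([1.5, 1.5]))
print(f"estimate = {est:.2f}, kriging variance = {var:.2f}")
```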
15
Q

kriging and conditional simulation:

A
  • Kriging yields the best (linear, unbiased) estimates at unsampled locations, but the interpolated surface is smoother than the data.
  • Conditional simulation creates a random field with the same variance-covariance structure as the data; the created surface passes through the data points. At unsampled locations, conditional simulation does not yield the best estimate.
16
Q

types of kriging

A

Prediction by ordinary kriging:
- The procedure is similar to that used in distance-weighted interpolation, except that the weights are derived from a geostatistical analysis of the data rather than from an empirical model.
- The weights are chosen so that the Z0 estimator is unbiased.
- Ordinary kriging is an exact interpolator in the sense that the interpolated values, or best local averages, coincide with the values at the data points.
- The result is smoother than inverse-distance maps and avoids the large values linked to isolated data points. The prediction variance is linked to the density of data points.
- It takes account of gradual spatial change better than other methods.

Block kriging:
- Point or punctual kriging implies that all interpolated values relate to the same support (the area or volume of an original sample), and the result may show many sharp spikes.
- To overcome this, kriging can be modified to estimate Z0 over a block of land B.
- Kriging estimates and standard deviations can incorporate anisotropy: this is a matter of modifying the conversion of the distance matrix into a matrix of semivariances, taking the variation of semivariance with direction into consideration; the results are plausible.

Stratified kriging:
- It is used when there is enough soft information to classify the area into meaningful subareas, and when there are enough data to compute variograms for each domain.
- Avoids class boundaries; reduces interpolation uncertainty.
- The greater the information from data points and external sources, the smaller the prediction standard deviation.
- Geostatistical prediction is effective at reducing interpolation errors where spatial variation is continuous, and its combination with stratification gives lower errors. Better results require testing with an independent dataset. Geostatistics and soft data together greatly improve the predictive power of GIS.
* Non-linear kriging: log-normal kriging is the interpolation of log-normally distributed data. The data are first transformed to natural or base-10 logarithms. Useful for physical data with positively skewed distributions.

Cokriging:
- Uses information about the spatial variation of a "U" dataset (measured frequently) to help predict a "V" dataset (measured less often). Cokriging can produce predictions for both points and blocks, in analogy to ordinary kriging. It needs information about the joint spatial covariation of the two variables (the cross-variogram). An improvement can include cross-validation (jack-knifing):
1. Interpolate the value at each sampled location without using the respective measured value.
2. Plot the estimated values against the measured values at the sampled locations, and calculate appropriate statistics such as the root mean square error (RMSE).
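
A minimal sketch of leave-one-out cross-validation (jack-knifing), reusing the hypothetical ordinary_kriging() function from the "steps in kriging" card:

```python
# Jack-knifing: predict each sampled location without its own
# measurement, then compare estimates with measurements via the RMSE.
import numpy as np

def loo_rmse(coords, z, predict):
    """predict(coords, z, x0) -> (estimate, variance); returns jack-knife RMSE."""
    errors = []
    for i in range(len(z)):
        keep = np.arange(len(z)) != i            # drop the i-th observation
        est, _ = predict(coords[keep], z[keep], coords[i])
        errors.append(est - z[i])
    return np.sqrt(np.mean(np.square(errors)))

# Example (with coords, z and ordinary_kriging as in the kriging sketch):
# print(loo_rmse(coords, z, ordinary_kriging))
```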

Indicator kriging:
A non-linear form of ordinary kriging, in which the original data are transformed from a continuous scale to a binary scale. The resulting maps display continuous values in the range 0-1, indicating the probability that a threshold Tj has (or has not) been exceeded. It can be combined with soft information.
1. Select (one or more) thresholds.
2. Convert the original data to binary data.
3. Calculate and model the indicator variograms.
4. Apply the ordinary kriging procedure.
5. Plot the interpolated values. The map gives the probability that a given location exceeds the threshold.
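
A minimal sketch of the indicator transform (steps 1 and 2), with an invented threshold Tj and made-up values:

```python
# Indicator transform: binary-code the data against a threshold before
# variogram modelling and ordinary kriging. Values and Tj are made up.
import numpy as np

z = np.array([0.3, 1.2, 0.8, 2.5, 0.1, 1.9])   # e.g., heavy-metal content
threshold = 1.0                                 # hypothetical Tj

indicator = (z > threshold).astype(float)       # 1 if exceeded, else 0
print(indicator)                                # kriging these gives P(Z > Tj)
```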

17
Q

Isotropic and anisotropic variations:

A
  • If directional effects are so small that they can be ignored, the spatial structure of the variable is isotropic and the variogram is the same in all directions.
  • If the structure is anisotropic, we compute variograms for specific directions β. Anisotropy can occur in the ranges, the sills, or both. This may be due, e.g., to sedimentation of river sediments.
  • Geometric anisotropy: the range changes with direction while the sill remains constant. Zonal anisotropy: the sill changes with direction while the range remains constant. To deal with changes of range and sill with direction, we need to identify the anisotropy axes, using variogram surface maps or knowledge of the phenomenon. For more than one variable, the point-to-point correlation can be used; cross-variograms are used to describe the cross-continuity between two variables.
  • Example: within 150 m there is a spatial relationship between clay content and drag force.
18
Q

Comparing Kriging, inverse distance weighting and global polynomial interpolation:

A
19
Q

Indicator kriging and how it can be used:

A

A non-linear form of ordinary kriging, in which the original data are transformed from a continuous scale to a binary scale. The resulting maps display continuous values in the range 0-1, indicating the probability that a threshold Tj has (or has not) been exceeded. It can be combined with soft information.
1. Select (one or more) thresholds.
2. Convert the original data to binary data.
3. Calculate and model the indicator variograms.
4. Apply the ordinary kriging procedure.
5. Plot the interpolated values. The map gives the probability that a given location exceeds the threshold.

20
Q

How can the robustness of a fitted variogram model be tested?

A

By cross-validation (jack-knifing): interpolate the value at each sampled location without using its own measurement, then compare estimates with measurements (e.g., via the RMSE). Fitted models may also be built as nested structures, γ(h) = γ1(h) + γ2(h) + ….

21
Q

Please explain the differences between exact and inexact interpolation methods (Spatial data analysis and interpolation).

A

*Exact interpolation methods aim to reproduce the original data values at the known data points exactly without introducing any errors or deviations.
These methods ensure that the interpolated surface passes exactly through the data points, preserving the values and characteristics of the original dataset.

*Inexact interpolation methods do not necessarily reproduce the original data values exactly at the known data points. Instead, they provide estimates or approximations of the data values between the points.
These methods may introduce some degree of error or deviation from the true data values, especially when extrapolating beyond the range of the known data.

22
Q

Explain and compare the techniques of “Thiessen polygons”, “inverse distance weighting” and “kriging”.

A
23
Q

What sets kriging apart from other interpolation methods? In which way and why does the kriging procedure make use of the variogram?

A

Essentially, kriging takes into account not only the distance between data points but also the direction and strength of the spatial correlation, resulting in more accurate and reliable predictions. The kriging weights are derived from the fitted variogram model, which quantifies this spatial autocorrelation as a function of separation distance. Additionally, kriging provides estimates of the prediction uncertainty (the kriging variance), which is valuable for decision-making and risk assessment in spatial analysis.