Vocab! Flashcards
Difference between a node and a point
Both represent locations in a spatial dataset, but their roles and significance differ.
- Points are basic geometric entities representing a specific location with a set of coordinates. They exist independently and do not inherently define relationships with other features.
- Nodes, however, are specific points that play a crucial role in defining topological relationships. They occur at the intersection of two or more lines (arcs) or mark the beginning and end points of a line. Nodes in a topological structure serve as connection points, defining how lines connect to form a network or how they enclose areas to create polygons.
- All nodes are points, but not all points are nodes. A point becomes a node when it participates in defining the connectivity and structure of a topological network.
Visual hierarchy
Visual Hierarchy: Just like variations in symbol size, colour, or shape, variations in typographic elements (font, size, style, weight) can be used to establish a visual hierarchy on the map. This hierarchy guides the viewer’s attention, highlighting the most important elements and differentiating between different types of information. Example:
- Larger or bolder fonts can emphasize major cities or features.
- Smaller fonts can be used for less prominent locations or details.
- Italic fonts can be used to differentiate water features from land features.
Sliver polygon
A sliver polygon is a narrow, elongated polygon that often arises as an undesirable artifact during data digitization or overlay operations.
Sliver polygons typically occur when:
- There are slight discrepancies in the digitization of common boundaries between adjacent polygons.
- Data from different sources are overlaid, and the boundaries don’t align perfectly.
Sequential-sequential color scheme
These schemes combine two sequential schemes, resulting in a mix of colours based on two hues. The hue mixtures can create a third hue. For instance, magenta and cyan sequences can produce various purple-blue hues.
Sequential-Sequential-Sequential Color Scheme
Use this scheme when three numerical variables need to be presented at the same time. It is based on three hues and works like a three-axis graph whose colours are all the combinations of the three sequential schemes.
Face complexity
Face complexity (CF) is a measure of map complexity that can be used in optimal classification methods to determine the best number of classes for a map.
The CF is calculated as the ratio of actual polygons to potential polygons. Higher CF values indicate that the map is more complex.
CF = actual polygons / potential polygons
This formula suggests that a map with many small polygons will have a higher face complexity than a map with fewer, larger polygons. The sources do not elaborate on the definition of “potential polygons.”
Oriented line topology
Oriented line topology is a way of structuring spatial data to define the relationships between lines and points in a network. Lines are digitized as arcs with two nodes and an identifier. In an oriented network, each arc has a begin node and an end node, indicating the direction of flow or movement.
This type of data structure is used to represent networks like roads, rivers, or utility lines. It facilitates network analysis, such as determining the shortest distance between points, identifying connectivity, and analyzing neighbourhood effects.
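A minimal sketch of how such an arc table might be stored and queried. The arc and node identifiers (a1, N1, ...) and the helper names are made up for illustration, and the traversal follows arcs only from begin node to end node.

```python
from collections import deque

# Hypothetical arc table: (arc identifier, begin node, end node).
arcs = [
    ("a1", "N1", "N2"),
    ("a2", "N2", "N3"),
    ("a3", "N2", "N4"),
    ("a4", "N4", "N3"),
]

def build_adjacency(arcs):
    """Map each node to the nodes reachable along one arc in its direction."""
    adjacency = {}
    for _arc_id, begin, end in arcs:
        adjacency.setdefault(begin, []).append(end)
        adjacency.setdefault(end, [])  # keep nodes that have no outgoing arcs
    return adjacency

def reachable(adjacency, start):
    """Breadth-first search that respects the begin -> end orientation."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

adjacency = build_adjacency(arcs)
from_n2 = reachable(adjacency, "N2")  # nodes downstream of N2
from_n3 = reachable(adjacency, "N3")  # N3 has no outgoing arcs
```

Because the arcs are oriented, N3 can be reached from N2, but nothing can be reached from N3 except N3 itself.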
Cumulative frequency diagram and its components - Value, Frequency, CF, CF%
A cumulative frequency diagram is a graph that shows the total number of observations that are less than or equal to a particular value in a dataset. It is useful for understanding the distribution of data and identifying potential break points for data classification.
The components of a cumulative frequency diagram include:
- Value: The values in the dataset, typically plotted along the horizontal axis.
- Frequency: The number of times each value appears in the dataset.
- CF (Cumulative Frequency): The running total of frequencies, indicating the number of observations less than or equal to a specific value.
- CF% (Cumulative Frequency Percentage): The CF expressed as a percentage of the total number of observations.
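The four components can be tabulated with a few lines of Python. This is only a sketch: the sample values and the function name are invented for illustration.

```python
from collections import Counter

def cumulative_frequency_table(values):
    """Return rows of (value, frequency, CF, CF%) sorted by value."""
    counts = Counter(values)
    total = len(values)
    rows = []
    running = 0
    for value in sorted(counts):
        running += counts[value]  # running total of frequencies
        rows.append((value, counts[value], running, 100.0 * running / total))
    return rows

sample = [2, 3, 3, 5, 5, 5, 8, 9]  # made-up data
table = cumulative_frequency_table(sample)
for value, freq, cf, cf_pct in table:
    print(f"{value:>5} {freq:>5} {cf:>5} {cf_pct:>7.1f}")
```

Plotting the value column against CF (or CF%) gives the cumulative frequency diagram. Steep jumps or flat stretches in that curve are the places one might look for class breaks.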
Douglas-Peucker data reduction algorithm
The Douglas-Peucker algorithm is used to simplify lines by reducing the number of points required to represent them. This simplification process is important for reducing data storage requirements and improving the efficiency of spatial analysis.
The algorithm works by:
1. Selecting the first and last points of a line as anchor points.
2. Creating a buffer zone around the line segment connecting the anchor points. The width of the buffer is defined by a tolerance distance (e).
3. Identifying the intermediate point that lies farthest from the line segment.
4. If that farthest point lies outside the buffer, it is retained as a new anchor point. The line is split at this point, and the process repeats for the two resulting line segments.
5. If all points lie within the buffer zone, the points between the anchor points are eliminated.
This process continues until the line is simplified to the desired level of detail.
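The steps above can be sketched recursively as follows. This is an illustrative version, not a reference implementation: it measures the perpendicular distance to the straight line through the two anchor points and treats the tolerance as the maximum allowed distance from that line. The function names and sample coordinates are assumptions.

```python
import math

def perpendicular_distance(point, start, end):
    """Distance from point to the straight line through start and end."""
    (x, y), (x1, y1), (x2, y2) = point, start, end
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0.0:  # anchors coincide: fall back to point-to-point distance
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / length

def douglas_peucker(points, tolerance):
    """Simplify a polyline given as a list of (x, y) tuples."""
    if len(points) < 3:
        return list(points)
    start, end = points[0], points[-1]
    # Find the intermediate point farthest from the anchor line.
    max_dist, max_index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], start, end)
        if d > max_dist:
            max_dist, max_index = d, i
    if max_dist > tolerance:
        # Farthest point is outside the buffer: keep it and split there.
        left = douglas_peucker(points[:max_index + 1], tolerance)
        right = douglas_peucker(points[max_index:], tolerance)
        return left[:-1] + right  # drop the duplicated split point
    # Every intermediate point is inside the buffer: keep only the anchors.
    return [start, end]

line = [(0, 0), (1, 0.1), (2, 0), (3, 2), (4, 0)]  # made-up coordinates
simplified = douglas_peucker(line, 0.5)
```

With a tolerance of 0.5 only the small wiggle at (1, 0.1) is dropped. With a much larger tolerance, such as 3, the whole line collapses to its two end points.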
Diverging color scheme
A diverging colour scheme emphasizes the progression of data values away from a critical midpoint in the data range. It combines two sequential colour schemes that share a light colour for the critical midpoint and progress to darker colours of different hues at each extreme.
This type of colour scheme is well-suited for representing data that deviates above and below an average or median value, like deviations in death rates from a disease.
The sources suggest using two Munsell colours to create a diverging scheme, as this system ensures that equal steps in the colour model represent equal perceptual steps.
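To illustrate only the structure of a diverging scheme (two sequential ramps sharing a light midpoint), here is a rough sketch. It assumes plain linear interpolation in RGB, which is not perceptually uniform in the way the Munsell system is, and the three colours are arbitrary placeholders.

```python
def lerp_color(c0, c1, t):
    """Linear interpolation between two RGB tuples, with t in [0, 1]."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c0, c1))

def diverging_scheme(low_dark, mid_light, high_dark, steps_per_side):
    """Dark low colour -> shared light midpoint -> dark high colour."""
    low_side = [lerp_color(mid_light, low_dark, i / steps_per_side)
                for i in range(1, steps_per_side + 1)]
    high_side = [lerp_color(mid_light, high_dark, i / steps_per_side)
                 for i in range(1, steps_per_side + 1)]
    # Reverse the low side so the list runs from one extreme to the other.
    return low_side[::-1] + [mid_light] + high_side

# Hypothetical colours: dark blue, light grey midpoint, dark red.
scheme = diverging_scheme((0, 0, 200), (240, 240, 240), (200, 0, 0), 2)
```

The result is a five-class scheme whose middle class carries the light colour for the critical midpoint.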
Cartogram
A cartogram is a map that distorts the size and shape of geographic areas to represent data values. For example, in an area-proportional cartogram, the size of each enumeration area is adjusted to be proportional to the value of a secondary dataset, such as population.
This type of map can be used to visually represent phenomena that are not tied to land area, like population or election results, in a way that highlights the distribution of the data rather than the actual size of the areas.
Typographic variables
Typographic variables are the characteristics of text that can be manipulated to enhance the visual hierarchy and communication effectiveness of a map. These variables include:
- Font (Typeface): The design of the characters (e.g., Helvetica, Times New Roman).
- Size: The height of the characters, typically measured in points.
- Weight: The thickness of the character strokes, often described as light, regular, or bold.
- Form: Whether the text is roman (upright) or italic (slanted).
- Colour: The hue, saturation, and value of the text.
- Spacing: The distance between characters, words, and lines.
- Case: Whether the text is uppercase, lowercase, or a combination.
GADF
The Goodness of Absolute Deviation Fit (GADF) is a measure used to evaluate the effectiveness of different data classification schemes, particularly in the context of creating choropleth maps. It helps determine the optimal number of classes for a given dataset.
GADF = (ADAM - ADCM) / ADAM
- The GADF value ranges from 0 to 1.
- A higher GADF value indicates a better classification, meaning the classes effectively group similar values.
- A GADF value of 1 would mean there’s no variation within classes, which is ideal but rarely achievable.
ADAM
ADAM (Sum of Absolute Deviations Around the Median for the Entire Dataset)
ADAM measures the spread or dispersion of the entire dataset around its median. It tells you how much individual data points deviate from the central tendency of the dataset as a whole.
To calculate ADAM:
1. Calculate the median of the entire dataset. The median is the middle value when the data is sorted in ascending order. If the dataset has an even number of values, the median is the average of the two middle values.
2. Calculate the absolute difference between each data point in the dataset and the median. The absolute difference is the value of the difference, ignoring its sign. It is calculated as: |data point – median|.
3. Sum all the absolute differences calculated in step 2. This sum is your ADAM value.
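A short sketch of the three steps, using Python's standard `statistics.median`. The dataset is made up for illustration.

```python
from statistics import median

def adam(values):
    """Sum of absolute deviations around the median of the whole dataset."""
    m = median(values)                       # step 1
    return sum(abs(v - m) for v in values)   # steps 2 and 3

data = [1, 2, 3, 10, 11, 12, 20, 22]  # made-up values
adam_value = adam(data)
```

Here the median is 10.5 (the average of 10 and 11), and the absolute deviations add up to 49.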
ADCM
ADCM (Sum of Absolute Deviations Around the Class Median)
ADCM measures the spread or dispersion of data within each class around that class’s median. It tells you how much individual data points within a class deviate from the central tendency of that class.
To calculate ADCM:
1. Divide your data into classes. You can use any classification method for this (equal interval, quantile, natural breaks, etc.).
2. Calculate the median for each class.
3. Calculate the absolute difference between each data point in the class and the median of that class. Again, this is: |data point – class median|.
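4. Sum all the absolute differences calculated in step 3 for each class. These sums are the ADCM values for the individual classes. Adding them together across all classes gives the single ADCM total used in the GADF formula.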
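The sketch below puts ADCM together with GADF = (ADAM - ADCM) / ADAM. It assumes the per-class sums are added into one ADCM total before being used in the formula, and the three classes are a made-up classification of a small dataset.

```python
from statistics import median

def sum_abs_dev(values):
    """Sum of absolute deviations of values around their own median."""
    m = median(values)
    return sum(abs(v - m) for v in values)

def gadf(classes):
    """Return (GADF, per-class ADCM sums) for a list of classes."""
    all_values = [v for cls in classes for v in cls]
    adam = sum_abs_dev(all_values)                   # whole-dataset spread
    adcm_per_class = [sum_abs_dev(cls) for cls in classes]
    adcm = sum(adcm_per_class)                       # total within-class spread
    # Note: adam is 0 when all values are identical; this sketch does not guard it.
    return (adam - adcm) / adam, adcm_per_class

# Hypothetical three-class split of [1, 2, 3, 10, 11, 12, 20, 22].
classes = [[1, 2, 3], [10, 11, 12], [20, 22]]
score, per_class = gadf(classes)
```

For this split each class contributes 2, so ADCM is 6 against an ADAM of 49, and the GADF is 43/49, roughly 0.88. Repeating the calculation for different numbers of classes or different break points lets the classifications be compared.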
Raster Data
A spatial data model that represents geographic features as a grid of equally sized cells (pixels). Each cell contains a value that represents the attribute of the feature at that location.
Vector Data
A spatial data model that represents geographic features as points, lines, and polygons defined by their coordinates.
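To contrast the raster and vector models, here is a toy sketch of how each might be held in memory. The grid values, cell size, origin, and coordinates are all invented for illustration.

```python
# Raster: a grid of equally sized cells, each holding one attribute value.
raster = [
    [1, 1, 2, 2],
    [1, 3, 3, 2],
    [1, 3, 3, 2],
]
cell_size = 10.0                # assumed ground units per cell side
origin_x, origin_y = 0.0, 30.0  # assumed upper-left corner of the grid

def cell_value_at(x, y):
    """Look up the cell containing (x, y); no bounds checking in this sketch."""
    col = int((x - origin_x) // cell_size)
    row = int((origin_y - y) // cell_size)
    return raster[row][col]

# Vector: features stored as coordinates.
point = (15.0, 15.0)
line = [(0.0, 0.0), (10.0, 5.0), (20.0, 5.0)]
polygon = [(10.0, 0.0), (30.0, 0.0), (30.0, 20.0), (10.0, 20.0), (10.0, 0.0)]

def polygon_area(ring):
    """Shoelace formula for a closed ring (first vertex repeated at the end)."""
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2.0

value = cell_value_at(*point)
area = polygon_area(polygon)
```

In the raster, the attribute at a location is found by converting coordinates to a row and column. In the vector model, geometry such as area is computed directly from the stored coordinates.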