Lecture 7 - Raster Data Flashcards
What is a raster data model?
Divides geographic area into regular grid of cells in specific sequence (identified by row and column)
- each cell contains a single attribute value
- space-filling: every location corresponds to a cell in the raster (regular tessellation that can be conceptualized as a matrix-like series of cells)
- file sizes tend to be large (a value is stored for every cell)
How are feature coordinates expressed in raster?
They are implicit
- store grid origin (cell in upper left corner)
- store grid resolution: minimum linear dimension of smallest unit of geographic space sampled
- find coordinates indirectly
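A minimal sketch of how coordinates might be recovered indirectly from the stored origin and resolution. It assumes the origin is the outer corner of the upper-left cell, that rows are counted downward from 0, and that cells are square. The function name and the sample numbers are illustrative, not from the lecture.

```python
def cell_center(origin_x, origin_y, cell_size, row, col):
    """Return the (x, y) map coordinates of the centre of cell (row, col).

    Assumes (origin_x, origin_y) is the upper-left corner of the grid,
    rows increase downward and columns increase to the right.
    """
    x = origin_x + (col + 0.5) * cell_size
    y = origin_y - (row + 0.5) * cell_size
    return x, y


# Hypothetical grid: upper-left corner at (500000, 4200000), 30 m cells.
center = cell_center(500000.0, 4200000.0, 30.0, row=2, col=3)
print(center)  # (500105.0, 4199925.0)
```

Only the origin and the cell size are stored; every other coordinate is computed from the row and column position.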
How are attributes expressed in raster?
They are explicit
- often only a single attribute assigned to a cell
- otherwise, key identifier in each cell links to related database files containing multiple attributes for each grid cell
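The key-identifier approach can be sketched as a lookup from the cell value into a separate table. The grid values, field names, and attribute values below are made up for illustration; a real GIS would keep the table in a database file rather than a dictionary.

```python
# Each cell stores only a key; the attributes live in a separate table.
grid = [
    [101, 101, 102],
    [101, 103, 102],
]

# Hypothetical attribute table keyed by the cell identifier.
attributes = {
    101: {"soil": "clay", "ph": 6.5},
    102: {"soil": "sand", "ph": 7.1},
    103: {"soil": "loam", "ph": 6.8},
}


def lookup(grid, table, row, col, field):
    """Return one attribute of the cell at (row, col) via its key."""
    return table[grid[row][col]][field]


print(lookup(grid, attributes, 1, 1, "soil"))  # loam
```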
How do you create a raster file?
Although this is rarely done, here are the conceptual steps:
- choose grid resolution (rule of thumb: cell size no larger than 1/2 the length of the smallest feature on the map)
- set data type (integer, real)
- overlay grid over study area
- assign attribute code to each grid cell
- repeat process for each map layer
What are raster input methods?
- manual raster coding
- raster scanning
- existing digital raster data
- RS imagery
- vector to raster conversion
What is the manual raster coding input method?
Uses spreadsheet, text editor, or digitizer (not popular)
- overlay transparent grid on existing map
- record attribute for each cell (decision rules for mixed cells)
- some digitizer software allows coding of attributes from digitizer puck
What is raster scanning input method?
Scans maps or aerial photos
- several scanner types: drum scanners for large docs and desktop for small docs
- problems: resampling, editing, raster to vector conversion
What is the existing digital raster data input method?
- elevation data commonly available in raster form (govt agencies)
- much raster data already in digital form as images
- resampling likely needed so pixels in images coincide with cells in other data layers
What is the remote sensing data input method?
- airborne imagery: air photos
- satellite imagery: Landsat, RADARSAT
What is the vector to raster conversion input method?
- coded polygons
- grid overlay with appropriate cell sizes
- Each cell is assigned the attribute code of the polygon which it belongs to
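A rough sketch of the conversion steps above. The lecture does not say how to decide which polygon a cell belongs to, so this example assumes the cell-centre rule: a cell takes the code of the polygon containing its centre point. The ray-casting test, the function names, and the sample polygons are illustrative assumptions.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(polygons, n_rows, n_cols, cell_size, nodata=0):
    """Overlay a grid and give each cell the code of the polygon at its centre.

    polygons is a list of (code, vertices). The grid's lower-left corner is
    assumed to sit at (0, 0), with row 0 at the top.
    """
    top = n_rows * cell_size
    grid = [[nodata] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            cx = (c + 0.5) * cell_size
            cy = top - (r + 0.5) * cell_size
            for code, vertices in polygons:
                if point_in_polygon(cx, cy, vertices):
                    grid[r][c] = code
                    break
    return grid


# Hypothetical coded polygons on a 4 x 4 map.
polygons = [
    (1, [(0, 0), (2, 0), (2, 4), (0, 4)]),  # left half
    (2, [(2, 0), (4, 0), (4, 2), (2, 2)]),  # lower-right quarter
]
grid = rasterize(polygons, n_rows=4, n_cols=4, cell_size=1.0)
for row in grid:
    print(row)
```

Cells whose centre falls in no polygon keep the nodata code, here 0.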
How would you convert back to vector from raster?
- each raster cell is assigned an attribute value
- boundaries set up b/w different attribute classes
- polygon is created by storing x and y coordinates for the points adjacent to boundaries
What conversion errors can occur?
- polygons turn blocky if converted back and forth
- may depict incorrect info
What does the type of cell value being used depend on?
- depends on feature being coded and the GIS used
What kind of data is grid-cell representation often used for?
- categorical data (nominal/ordinal scale)
- quantitative data (interval/ratio scale)
What can raster cell data be coded as?
- whole numbers (integers)
- real values (decimal)
- alphabetic values (text)
What are the different cell measurement values?
- nominal
- ordinal
- interval
- ratio
Explain nominal cell measurement values
- categories with no order
- identifiers with no relation to a fixed point or linear scale
- legend or linked table provides meaning
- ex. postal codes, soil types
- ex 2. 0 = no data, 1 = residential, 2 = commercial, etc.
Explain ordinal cell measurement values
- lists of discrete classes with inherent order, but without magnitude or relative proportions
- cell value has meaning
- ex. primary, secondary, undergrad, graduate
Explain interval cell measurement values
- classes not only with natural sequence, but also with meanings attached to distance b/w sequential values
- cell values have meaning
- ex. time of day, temperatures
Explain ratio cell measurement values
- same characteristics as interval variables, but have a natural zero as a starting point (can’t be negative)
- cell values have meaning
- ex. age, distance, income
What is spatial resolution?
The area within a grid cell (cell size), defining how much area the pixel covers
- the smaller the cell, the greater the resolution and accuracy
- always trade-off b/w resolution and cost of storage and processing
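The storage side of the trade-off can be illustrated with simple arithmetic: halving the cell size quadruples the number of cells. The sketch below assumes an uncompressed raster and 4 bytes per cell; the function name and the 10 km by 10 km study area are made up for illustration.

```python
import math


def raster_size_bytes(width_m, height_m, cell_size_m, bytes_per_cell=4, layers=1):
    """Estimate the uncompressed size of a raster covering a rectangular area."""
    cols = math.ceil(width_m / cell_size_m)
    rows = math.ceil(height_m / cell_size_m)
    return rows * cols * bytes_per_cell * layers


# Hypothetical 10 km x 10 km study area at three resolutions.
for cell in (100, 50, 25):
    print(cell, "m cells:", raster_size_bytes(10_000, 10_000, cell), "bytes")
```

Under these assumptions the three resolutions need 40,000, 160,000, and 640,000 bytes per layer.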
How do coded grid cells work?
- a line number and column number define cell’s position in raster data
- data stored in table giving number and attribute value of each cell
What are the methods for encoding cells?
- presence/absence
- centroid of cell
- dominant type
- percent occurrence
Explain the presence/absence encoding method
- single feature like a well (point) or river (line) is identified as occurring in a cell, no matter how much space it occupies
- for polygons, the polygon which covers highest proportion of cell is recorded
Explain the centroid-of-cell encoding method
- presence of entity is recorded only if portion of it occurs directly at central point of each cell
- only good for areal (manmade areas) data
- better for continuously variable quantities, like elevation
- constraining, so not typically used
Explain the dominant type encoding method
- encodes presence of entity if it occupies more than 50% of cell
- most used method for polygons
Explain the percent occurrence encoding method
- if 3 types land use occur in a cell, then each type would be represented as a percentage of cell it occupies
- only good for areal data
- can only show percentage for one variable, so how do we know what the other percentages are?
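A small sketch contrasting the percent occurrence and dominant type rules for a single cell. It assumes the cell has been sub-sampled into equal-area pieces, each labelled with a land use; the sample values and function names are illustrative only.

```python
from collections import Counter


def percent_occurrence(samples):
    """Return {category: percent of the cell} from equal-area sub-cell samples."""
    counts = Counter(samples)
    total = len(samples)
    return {cat: 100.0 * n / total for cat, n in counts.items()}


def dominant_type(samples):
    """Return the category covering more than 50% of the cell, else None."""
    for cat, pct in percent_occurrence(samples).items():
        if pct > 50.0:
            return cat
    return None


# Hypothetical cell sampled at 10 equal-area points.
cell = ["forest"] * 6 + ["urban"] * 3 + ["water"] * 1
print(percent_occurrence(cell))  # {'forest': 60.0, 'urban': 30.0, 'water': 10.0}
print(dominant_type(cell))       # forest
```

Percent occurrence keeps a value per land-use type, while the dominant type rule collapses the cell to a single code (or to none if no type exceeds 50%).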
Describe raster map layers
- layer comprised of one set of cells and associated values (multiple items of info require multiple layers)
- raster data can represent a multiplicity of things (visual images, discrete value, continuous value, null values)
- ex. elevation, counties, roads all would be 3 layers in raster GIS
What are the data organization methods and why do we want to organize data?
Useful to organize a raster into a 1D data stream for computer file storage and processing.
- band interleaved by line (BIL)
- band interleaved by pixel (BIP)
- band sequential (BSQ)
- can convert b/w them all
Explain the band interleaved by line (BIL) data organization method
- for each row, the values of every characteristic (band) are stored one after another before moving to the next row
Explain the band interleaved by pixel (BIP) data organization method
- all values for a pixel grouped together
- good if focus on multiple area characteristics
- bad if you want to remove or add a layer
Explain the band sequential (BSQ) data organization method
- stores each characteristic as its own complete block, one band after another, sometimes as separate files (ex. elevation file, temp file, etc.)
- good for compression and if focus is on one characteristic
- bad if focusing on one area
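The three orderings can be compared by flattening the same two layers into a 1D stream each way. This is a sketch only: the layers are tiny made-up grids held as nested lists, and the function names are not from any particular GIS package.

```python
def to_bsq(bands):
    """Band sequential: all of band 1, then all of band 2, and so on."""
    return [v for band in bands for row in band for v in row]


def to_bil(bands):
    """Band interleaved by line: row 0 of every band, then row 1 of every band."""
    n_rows = len(bands[0])
    return [v for r in range(n_rows) for band in bands for v in band[r]]


def to_bip(bands):
    """Band interleaved by pixel: every band's value for a pixel, pixel by pixel."""
    n_rows = len(bands[0])
    n_cols = len(bands[0][0])
    return [band[r][c] for r in range(n_rows) for c in range(n_cols) for band in bands]


# Two hypothetical 2 x 2 layers.
elev = [[1, 2], [3, 4]]
temp = [[10, 20], [30, 40]]

print(to_bsq([elev, temp]))  # [1, 2, 3, 4, 10, 20, 30, 40]
print(to_bil([elev, temp]))  # [1, 2, 10, 20, 3, 4, 30, 40]
print(to_bip([elev, temp]))  # [1, 10, 2, 20, 3, 30, 4, 40]
```

All three streams hold the same eight values, which is why conversion between them is always possible.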
What is the main issue with raster data storage?
- databases consist of many separate grid layers, thereby compounding the file size issue (large storage requirements)
What are the common methods of raster data storage?
- run length encoding
- chain encoding
- block encoding
- quadtrees
Explain run length encoding
- code sequence of cells with same attribute value
- save data storage space by counting runs of equal values in cells and storing counts
- limited b/c file is read left to right, one row at a time
- does not work for column redundancy
- basically useless if there is no redundancy at all
- ex. XXXXWWRRRX turns into 4X 2W 3R 1X
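The example above can be reproduced with a short sketch that encodes one row at a time, left to right. The function names are illustrative; `itertools.groupby` is used here only as a convenient way to find runs of equal values.

```python
from itertools import groupby


def rle_encode_row(row):
    """Return a list of (count, value) runs for one raster row."""
    return [(len(list(group)), value) for value, group in groupby(row)]


def rle_decode_row(runs):
    """Expand (count, value) runs back into the original row."""
    return [value for count, value in runs for _ in range(count)]


row = "XXXXWWRRRX"
runs = rle_encode_row(row)
print(" ".join(f"{n}{v}" for n, v in runs))  # 4X 2W 3R 1X
print("".join(rle_decode_row(runs)))         # XXXXWWRRRX
```

A row with no repeated neighbours, such as "XWRX", encodes to four runs of length 1, so nothing is saved.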
Explain chain encoding
- similar to run length, but scans rows and columns to define 2D regions with same cell values
- defines region boundary by giving starting point (origin) and cardinal direction to follow as we progress around boundary
- steps: assign number 1-4 to cardinal directions; assign how many grid cells to move in each direction; assign grid cell value for entire area
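A sketch of how a stored chain might be read back into a region boundary. The lecture only says the directions are numbered 1-4, so the mapping used here (1 = north, 2 = east, 3 = south, 4 = west) is an assumption, as are the function name and the sample region.

```python
# Assumed direction numbering, as (dx, dy) steps: 1 = N, 2 = E, 3 = S, 4 = W.
MOVES = {1: (0, 1), 2: (1, 0), 3: (0, -1), 4: (-1, 0)}


def trace_chain(origin, chain):
    """Follow (direction, steps) pairs from the origin; return boundary vertices."""
    x, y = origin
    vertices = [(x, y)]
    for direction, steps in chain:
        dx, dy = MOVES[direction]
        x, y = x + dx * steps, y + dy * steps
        vertices.append((x, y))
    return vertices


# Hypothetical region: 3 cells wide, 2 cells tall, every cell coded 7.
region = {"value": 7, "origin": (0, 0), "chain": [(1, 2), (2, 3), (3, 2), (4, 3)]}
boundary = trace_chain(region["origin"], region["chain"])
print(boundary)  # [(0, 0), (0, 2), (3, 2), (3, 0), (0, 0)]
```

The whole region is stored as one origin, four moves, and a single cell value, rather than six separate cell values.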
Explain block encoding
- 2D run length encoding where areas of common cell values are represented with a single value
- square blocks used to tile the area to be represented (array - series of square blocks of largest size possible)
- store origin (centre or bottom left) and radius of each square
Explain quadtrees
- recursive subdivision of raster cells into quadrants with the same cell value until a square is homogeneous
- uses variable cell resolution to reduce data storage requirements
- hierarchical tree where each level has four-way branching
- efficient for relatively homogeneous areas (highly dependent on embedding of spatial objects in image space)
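The recursive subdivision can be sketched as follows. This minimal version assumes a square grid whose side is a power of two, and it represents the tree as nested lists in the order NW, NE, SW, SE; both choices are illustrative and not from the lecture.

```python
def build_quadtree(grid, row=0, col=0, size=None):
    """Return a cell value if the block is homogeneous, else four subtrees.

    Subtrees are listed in the order NW, NE, SW, SE.
    Assumes a square grid whose side length is a power of two.
    """
    if size is None:
        size = len(grid)
    first = grid[row][col]
    if all(grid[r][c] == first
           for r in range(row, row + size)
           for c in range(col, col + size)):
        return first
    half = size // 2
    return [
        build_quadtree(grid, row, col, half),
        build_quadtree(grid, row, col + half, half),
        build_quadtree(grid, row + half, col, half),
        build_quadtree(grid, row + half, col + half, half),
    ]


def count_leaves(node):
    """Count the stored values (leaves) in the tree."""
    if isinstance(node, list):
        return sum(count_leaves(child) for child in node)
    return 1


# Hypothetical 4 x 4 raster with one odd cell.
grid = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [1, 1, 3, 2],
    [1, 1, 2, 2],
]
tree = build_quadtree(grid)
print(tree)                # [1, 2, 1, [3, 2, 2, 2]]
print(count_leaves(tree))  # 7
```

Here 7 values are stored instead of 16, because three of the four quadrants are homogeneous and need no further subdivision.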
What are the disadvantages of raster data structure?
- reduced spatial accuracy of discrete objects (decreases reliability of area and distance measures)
- need for large storage capacity
- highly complex analyses can be slow
- some spatial relationships (ex. contiguity and connectivity) may be altered or lost (linear features may be discontinuous, blocky, or merged with others; some features may not be represented)
What are the advantages of raster data structure?
- abundant data sources (RS, aerial photos, scanning)
- easy to conceptualize as a method of representing space
- raster programs (algorithms) often computationally simpler and faster than vector
- sampling is done uniformly across space
- better for modelling continuous features than vector
- better for analyses that involve spread, flow, or diffusion processes (ex. surface modelling, overlay)