Spatial data Flashcards
Fundamental aspects/views
Geodatabase (GIS data = features, images and attributes):
- This is a spatial database containing datasets that represent geographic information in terms of GIS data models (features, rasters, attributes, topologies, networks).
- GIS datasets are often organized in layers.
- Information is georeferenced, either directly or indirectly.
Geovisualization (maps, scenes and globes):
- A GIS is a set of intelligent maps and other views that show features and feature relationships on the earth’s surface.
- Maps of the geographic information can be constructed and used as windows into the geographic database to support query, analysis and editing of geographic information.
- 2D or 3D map applications provide rich tools for working with geographic information through these views.
Geoprocessing (Data processing models and scripts):
- A GIS contains a set of information tools that derive new information from existing datasets.
- Geoprocessing takes information from existing datasets, applies analytic functions, and writes results into derived datasets.
- It involves the ability to string together a series of operations that users can perform. The functions depend on the capabilities of the GIS.
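The idea of stringing operations together can be sketched in plain Python. This is only an illustration: the feature layout (dictionaries with an `attributes` entry) and the names `select_by_attribute`, `add_size_class` and `run_model` are assumptions made for this example, not the API of any particular GIS.

```python
from functools import reduce

def select_by_attribute(features, key, value):
    """Keep only the features whose attribute `key` equals `value`."""
    return [f for f in features if f["attributes"].get(key) == value]

def add_size_class(features, threshold):
    """Derive a new attribute (size_class) from an existing one (area_ha)."""
    result = []
    for f in features:
        size = "large" if f["attributes"]["area_ha"] >= threshold else "small"
        result.append({**f, "attributes": {**f["attributes"], "size_class": size}})
    return result

def run_model(dataset, steps):
    """Apply each step in order; every step returns a new, derived dataset."""
    return reduce(lambda data, step: step(data), steps, dataset)

# Hypothetical input dataset
parcels = [
    {"id": 1, "attributes": {"zoning": "residential", "area_ha": 1.2}},
    {"id": 2, "attributes": {"zoning": "commercial", "area_ha": 4.0}},
    {"id": 3, "attributes": {"zoning": "residential", "area_ha": 6.5}},
]

derived = run_model(parcels, [
    lambda d: select_by_attribute(d, "zoning", "residential"),
    lambda d: add_size_class(d, threshold=5.0),
])
```

The input list `parcels` is left unchanged; the results are written to the derived dataset `derived`.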
Elements of geographic information
1) Features:
- Points, lines and polygons.
- representations of things located on or near the surface of the earth.
- Points: describe the discrete locations of geographic features too small to be depicted as lines or areas, such as addresses or GPS coordinates.
- Lines: represent the shape and location of geographic objects too narrow to be depicted as areas (streets, streams), or of features such as contour lines and administrative boundaries.
- Polygons: enclosed areas that represent shape and location of homogeneous features.
2) Attributes: Descriptive information managed in tables, which are based on a series of simple, essential relational database concepts (e.g. ownership).
3) Imagery:
- Raster data obtained from various sensors carried on satellites and aircraft.
- Images are managed as a raster data type composed of cells organized in a grid of rows and columns.
- In addition to the map projection and coordinate system, the raster includes the cell size and reference coordinates.
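How cell size and reference coordinates tie a grid to map coordinates can be sketched as follows. The class `Raster` and its field names are hypothetical, and the sketch assumes square cells, a north-up grid, and reference coordinates given for the upper-left corner; real raster formats may use other conventions.

```python
from dataclasses import dataclass

@dataclass
class Raster:
    cells: list        # rows of values; row 0 is the northernmost row
    x_origin: float    # x coordinate of the upper-left corner
    y_origin: float    # y coordinate of the upper-left corner
    cell_size: float   # edge length of a square cell in map units

    def cell_center(self, row, col):
        """Map coordinates of the center of cell (row, col)."""
        x = self.x_origin + (col + 0.5) * self.cell_size
        y = self.y_origin - (row + 0.5) * self.cell_size
        return x, y

    def value_at(self, x, y):
        """Cell value at map position (x, y), or None outside the grid."""
        col = int((x - self.x_origin) // self.cell_size)
        row = int((self.y_origin - y) // self.cell_size)
        if 0 <= row < len(self.cells) and 0 <= col < len(self.cells[0]):
            return self.cells[row][col]
        return None

# Hypothetical 2 x 2 raster with 30 m cells
raster = Raster(cells=[[10, 20], [30, 40]],
                x_origin=500000.0, y_origin=4000000.0, cell_size=30.0)
center = raster.cell_center(1, 0)
value = raster.value_at(500045.0, 3999990.0)
```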
4) Surface:
- Describes an occurrence that has a value for every point on Earth. Because such a dataset is continuous, it is not possible to store values for all locations on Earth.
- Alternatives exist for representing surfaces using features or rasters.
- Examples:
- Contour lines (isolines such as elevation contours)
- Contour bands (areas where the surface value is within a specified range, such as bands of average annual rainfall)
- Raster datasets (e.g. a DEM to represent surface elevation)
- TIN layer (data structure for representing surfaces as a connected network of triangles)
- Exact entities: discrete or delimited objects or features that are clearly defined and have a distinct identity. Examples of exact entities are buildings, roads, water bodies, boundaries of countries or municipalities, and other clearly delimited geographic features. These entities have fixed boundaries and can be identified and represented individually.
- Continuous field: no clear boundaries; the properties vary continuously, for example elevation or temperature. Represented by a regular tessellation with square cells (raster), with a separate layer for each attribute, or by an irregular tessellation with triangular elements (TIN).
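The relation between a raster surface and contour bands can be illustrated with a small sketch that assigns each cell of a tiny elevation grid to a band. The function name `contour_bands`, the grid values and the break values are made up for this example.

```python
import bisect

def contour_bands(dem, breaks):
    """Replace each elevation by the index of the band it falls into.

    With breaks [100, 200]: band 0 is z < 100, band 1 is 100 <= z < 200,
    band 2 is z >= 200.
    """
    return [[bisect.bisect_right(breaks, z) for z in row] for row in dem]

# Hypothetical 3 x 3 DEM (elevations in meters)
dem = [
    [95, 120, 180],
    [110, 160, 240],
    [130, 210, 260],
]
bands = contour_bands(dem, breaks=[100, 200])
```

Each band index marks an area where the surface value lies within a specified range, which is the same information a contour-band polygon layer would carry.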
4 shells of data modeling
- Data modeling can be subdivided into four different shells:
1) Spatial representation of objects (features, coordinate systems)
2) Conceptual aspects (which real-world objects are of interest)
3) Logical aspects (relationships between entities/objects)
4) Physical aspects (how data are physically stored, e.g., on a hard disk)
Properties of spatial objects
They may be:
- Geometric (line, polygon)
- Topological (relations in space, neighborhood relationships)
- Semantic (attributes)
- Dynamic (objects may change over time)
Types of attribute data
Categorical data:
- Nominal data (soil type)
- Ordinal data (low-moderate-high soil erodibility)
Numerical data:
- Interval data (temperature)
- Ratio data (population density)
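One practical consequence of these four types is that they support different summary statistics: the mode for nominal data, a median for ordinal data, and the mean for interval and ratio data. The helper `summarize` below is a hypothetical sketch of that rule, not part of any library.

```python
from statistics import mean, mode

def summarize(values, scale, order=None):
    """Return a summary statistic that is meaningful for the given scale."""
    if scale == "nominal":
        return mode(values)                      # most frequent category
    if scale == "ordinal":
        if order is None:
            raise ValueError("ordinal data needs the category order")
        ranked = sorted(values, key=order.index)
        return ranked[len(ranked) // 2]          # middle value (upper median for even counts)
    if scale in ("interval", "ratio"):
        return mean(values)
    raise ValueError(f"unknown scale: {scale}")

soil_type = summarize(["clay", "sand", "clay"], "nominal")
erodibility = summarize(["low", "high", "moderate", "low", "moderate"],
                        "ordinal", order=["low", "moderate", "high"])
temperature = summarize([10.0, 20.0], "interval")
```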
Explain the term Entity-Relationship-Model. What are the common relationships between tables and relational databases?
- Mathematical representation of the logical relationships between entities (objects) to be represented in a database. It describes how entities are related to each other and which attributes they have (attributes are the properties or characteristics that describe the data associated with each entity).
- Relationships can be one-to-one, one-to-many, many-to-one, or many-to-many, depending on the cardinality of the relationship between the entities:
1) ONE TO ONE: One entity from set A can be associated with at most one entity from set B, and vice versa.
2) ONE TO MANY: One entity from set A can be associated with more than one entity from set B. An entity from set B can be associated with at most one entity from set A.
3) MANY TO ONE: One entity from set A can be associated with at most one entity from set B. An entity from set B can be associated with more than one entity from set A.
4) MANY TO MANY: One entity from set A can be associated with more than one entity from set B, and vice versa.
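The four cases can be told apart by counting partners on each side. The function `cardinality` below is a hypothetical helper that classifies a list of observed (A, B) associations; note that it describes the data at hand, whereas a schema states which cardinality is permitted.

```python
def cardinality(pairs):
    """Classify observed (a, b) associations between set A and set B."""
    b_per_a, a_per_b = {}, {}
    for a, b in pairs:
        b_per_a.setdefault(a, set()).add(b)
        a_per_b.setdefault(b, set()).add(a)
    # Does some entity of A have several partners in B, and the reverse?
    a_has_many = any(len(partners) > 1 for partners in b_per_a.values())
    b_has_many = any(len(partners) > 1 for partners in a_per_b.values())
    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"

# Hypothetical associations
one_to_many = cardinality([("owner1", "parcel1"), ("owner1", "parcel2")])
many_to_one = cardinality([("parcel1", "zone1"), ("parcel2", "zone1")])
many_to_many = cardinality([("a1", "b1"), ("a1", "b2"), ("a2", "b1")])
one_to_one = cardinality([("a1", "b1"), ("a2", "b2")])
```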
Logical Data models
= Describe the data in as much detail as possible without regard to how they will be physically implemented in the database. The illustrative example deals with property relationships in two (commercial or residential) areas and is taken from Chang (2006). PIN: Parcel Identification Number.
Flat table database:
- Data are stored in the system as an ordinary flat file (like a normal spreadsheet).
- Simple database system in which each database is represented as a single table, with each record stored as a single row of data.
- Problems: maintenance and redundancy.
Hierarchical database:
- Data model in which the data is organized into a tree-like structure
- The data records are connected to one another through links
- One-to-many relationships can be stored. Redundancy can be a problem (see “Smith”).
Network database:
- Data model representing objects and their relationships in a flexible way.
- The structure supports many-to-many relationships.
- Difficult to program and visualize.
Relational database:
- The most widely used database type.
- Based on the relational data model.
- Organizes data in many tables and uses keys to identify records (rows): a primary key in the parent table, a foreign key in the child table, or a common field in both tables.
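A minimal sketch of two related tables, using Python's built-in `sqlite3` module. The table names, column names and rows are invented for illustration and are not taken from the Chang (2006) example; only the use of a PIN as parcel identifier follows the notes above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE parcel (
    pin   TEXT PRIMARY KEY,              -- primary key of the parent table
    acres REAL,
    zone  TEXT
);
CREATE TABLE owner (
    owner_id INTEGER PRIMARY KEY,
    name     TEXT,
    pin      TEXT REFERENCES parcel(pin) -- foreign key in the child table
);
""")
conn.executemany("INSERT INTO parcel VALUES (?, ?, ?)",
                 [("P101", 1.0, "residential"), ("P102", 3.0, "commercial")])
conn.executemany("INSERT INTO owner (name, pin) VALUES (?, ?)",
                 [("Smith", "P101"), ("Jones", "P101"), ("Costello", "P102")])

# Join the two tables through the common field pin
rows = conn.execute("""
    SELECT owner.name, parcel.zone
    FROM owner JOIN parcel ON owner.pin = parcel.pin
    ORDER BY owner.name
""").fetchall()
```

Each parcel is stored once, however many owners refer to it, which avoids the redundancy of the flat and hierarchical layouts.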
Keys in relational databases
- Superkey: set of attributes that can be used to identify the tuples in a relation (meaning that, in combination, their values are different in all rows of the table)
- Candidate key: minimal superkey; it uses only as many attributes as necessary to keep the superkey (identifying) property and may consist of only 1 attribute, e.g., an ID number
- Every relation has at least 1 candidate key
- One of the candidate keys is selected as the primary key; primary keys are used to reference the table
- The referencing (child) table uses a foreign key to link up with (the primary key of) the referenced (parent) table
- Primary key and foreign key can be equal (common field)
- A table can have more than 1 foreign key, and these need not have superkey (identifying) properties
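The difference between a superkey and a candidate key can be checked on sample rows with a brute-force search. This is a sketch with hypothetical function names and data; strictly speaking, keys are a property of the schema, so a test on a few rows can only rule keys out, not prove them.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if the combined values of attrs are different in every row."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def candidate_keys(rows, attributes):
    """Return all minimal superkeys, trying smaller attribute sets first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(k) <= set(attrs) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys

# Hypothetical ownership table
rows = [
    {"pin": "P101", "owner": "Smith", "zone": "residential"},
    {"pin": "P101", "owner": "Jones", "zone": "residential"},
    {"pin": "P102", "owner": "Smith", "zone": "commercial"},
]
keys = candidate_keys(rows, ("pin", "owner", "zone"))
```

Here no single attribute identifies a row, while all three attributes together form a superkey that is not minimal.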
Normalization of data tables
= Set of procedures to eliminate non-simple domains and redundant data (redundancy wastes space and causes integrity problems)
Steps:
1. Start from the unnormalized table.
2. First normal form: fill in missing values and identify each set of related data with a primary key.
3. Second normal form: split the table such that all non-key attributes are fully functionally dependent on the primary key.
4. Third normal form: break the tables down further to remove transitive dependencies.
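The steps can be followed on a toy table using plain Python structures. All names and values are invented for this sketch, and it assumes that the zone depends only on the parcel and that the zone name depends only on the zone code.

```python
# Unnormalized: the owners cell holds several values (non-simple domain)
unnormalized = [
    {"pin": "P101", "owners": ["Smith", "Jones"], "zone_code": 1, "zone_name": "residential"},
    {"pin": "P102", "owners": ["Smith"], "zone_code": 2, "zone_name": "commercial"},
]

# 1NF: one value per cell; (pin, owner) identifies each row
first_nf = [
    {"pin": r["pin"], "owner": owner,
     "zone_code": r["zone_code"], "zone_name": r["zone_name"]}
    for r in unnormalized for owner in r["owners"]
]

# 2NF: the zone columns depend on pin alone (only part of the key),
# so they move to a parcel table keyed by pin
ownership = sorted({(r["pin"], r["owner"]) for r in first_nf})
parcel_2nf = {r["pin"]: (r["zone_code"], r["zone_name"]) for r in first_nf}

# 3NF: zone_name depends on zone_code, not directly on pin (transitive),
# so it moves to its own zone table
parcel = {pin: code for pin, (code, _name) in parcel_2nf.items()}
zone = {code: name for code, name in parcel_2nf.values()}

# Joining the three tables reproduces the first-normal-form rows
rejoined = [
    {"pin": pin, "owner": owner,
     "zone_code": parcel[pin], "zone_name": zone[parcel[pin]]}
    for pin, owner in ownership
]
```

After the split, each zone name is stored once, so renaming a zone means changing a single row.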