W06 - Database Concepts and Data Sources Flashcards
how are spatial and attribute data used in GIS?
spatial data relate to the geometries of spatial features
attribute data describe the characteristics of the spatial features
how does the georelational data model (e.g. a coverage) store spatial and attribute data?
stores them separately and links the two by the feature ID. the 2 datasets are synchronized so they can be queried, analyzed, and displayed in unison
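A minimal sketch of the georelational idea, assuming two hypothetical in-memory stores (`geometries` and `attributes`) that share nothing but the feature ID. The names and values are illustrative only, not the actual coverage file format.

```python
# Hypothetical georelational layout: geometries and attributes live in
# separate stores and are linked only by the feature ID.
geometries = {
    1: [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)],
    2: [(5.0, 0.0), (7.0, 0.0), (7.0, 2.0), (5.0, 2.0)],
}
attributes = {
    1: {"land_use": "residential"},
    2: {"land_use": "commercial"},
}


def get_feature(feature_id):
    """Combine both stores for one feature by looking up the shared ID."""
    return {
        "id": feature_id,
        "geometry": geometries[feature_id],
        **attributes[feature_id],
    }


feature = get_feature(2)
print(feature["land_use"], len(feature["geometry"]))
```

Both stores have to stay synchronized: an ID present in one but missing from the other would raise a `KeyError` in this sketch.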
how does the object-based data model (e.g. a geodatabase) store spatial and attribute data?
combines both geometries and attributes in a single system. each spatial feature has a unique object ID and an attribute to store its geometry
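A rough illustration of the object-based idea, assuming a hypothetical `Feature` record in which the geometry is simply one more field next to the object ID. This is a sketch of the concept, not the geodatabase storage format.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    object_id: int   # unique object ID
    land_use: str    # ordinary attribute
    shape: list      # geometry stored as just another attribute


parcels = [
    Feature(1, "residential", [(0, 0), (4, 0), (4, 3), (0, 3)]),
    Feature(2, "commercial", [(5, 0), (7, 0), (7, 2), (5, 2)]),
]

# An attribute query returns records that already carry their geometry.
commercial = [p for p in parcels if p.land_use == "commercial"]
print(commercial[0].object_id, commercial[0].shape)
```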
how does the raster data model work?
cell value corresponds to the value of a continuous feature at the cell location
the value attribute table summarizes cell values and their frequencies in the raster.
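As a small sketch, a value attribute table can be built by counting how often each cell value occurs. The 3 x 3 integer `raster` below is made-up example data.

```python
from collections import Counter

# Made-up 3 x 3 integer raster (e.g. land-cover codes).
raster = [
    [1, 1, 2],
    [1, 3, 2],
    [3, 3, 3],
]

# One entry per distinct cell value, with its frequency in the raster.
value_attribute_table = Counter(cell for row in raster for cell in row)

for value, count in sorted(value_attribute_table.items()):
    print(value, count)
```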
how is attribute data stored?
in tables, organized by rows (records) and columns (fields).
what are the 2 types of attribute tables in GIS?
feature attribute tables and tables of nonspatial data
what is a feature attribute table?
an attribute table that has access to the geometries of features.
every vector data set must have a feature attribute table
in the georelational data model, the feature attribute table uses the feature ID to link to the feature’s geometry
in the object-based data model, the feature attribute table has a field that stores the feature’s geometry
feature attribute tables have default fields that summarize the feature geometries (ex. length for line features and area & perimeter for polygon features)
what are tables of non-spatial data?
these tables do not have direct access to the feature geometry but have a field linking the table to the feature attribute table.
ex. delimited text files, dBASE files, Excel files, Access files, other database files from SQL, Oracle, etc.
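A minimal sketch of linking a nonspatial table to a feature attribute table through a shared field. The `PARCEL_ID` field, the owner names, and the in-memory delimited text are all hypothetical.

```python
import csv
import io

# Hypothetical feature attribute table, keyed by a parcel ID field.
feature_table = {
    "P01": {"area_m2": 1200.0},
    "P02": {"area_m2": 850.0},
}

# Nonspatial data as delimited text; PARCEL_ID is the linking field.
owners_csv = io.StringIO("PARCEL_ID,OWNER\nP01,Lee\nP02,Garcia\n")

joined = {}
for row in csv.DictReader(owners_csv):
    key = row["PARCEL_ID"]
    # Attach the nonspatial attributes to the matching feature record.
    joined[key] = {**feature_table[key], "owner": row["OWNER"]}

print(joined["P02"])
```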
what is a database management system (DBMS)?
software package that lets us build and manipulate a database.
provides tools for data input, search, retrieval, manipulation and output
ArcGIS for Desktop uses Access for managing personal geodatabases
how is the geodatabase implemented?
implemented in a relational database management system and stores both geometries and attributes in a single database
what is a client-server distributed database system?
a client sends a request to the server, retrieves data from the server, and processes the data on the local computer
what are methods of classifying attribute data?
by data type, by measurement scale
what are the different data types?
the data type determines how an attribute is stored; it is typically included in the metadata of geospatial data
ex. number, text (string), date, binary large object (BLOB)
how can numbers (data type) be stored?
integers (no decimal digits), float/floating point
integers can be short or long.
float can be single precision or double precision
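To make the storage difference concrete, here is a sketch using Python's `struct` module. The byte widths shown (2-byte short, 4-byte long, 4-byte single, 8-byte double) are an assumption based on common conventions; the cards above do not state them, and a given GIS package may differ.

```python
import struct

# Standard sizes in bytes (the "<" prefix disables platform-specific padding).
sizes = {
    "short integer": struct.calcsize("<h"),
    "long integer": struct.calcsize("<i"),
    "single-precision float": struct.calcsize("<f"),
    "double-precision float": struct.calcsize("<d"),
}

# Round-trip 0.1 through each float width to compare precision.
as_single = struct.unpack("<f", struct.pack("<f", 0.1))[0]
as_double = struct.unpack("<d", struct.pack("<d", 0.1))[0]

print(sizes)
print(as_single == 0.1, as_double == 0.1)
```

The single-precision round trip does not return exactly 0.1, which is why double precision is the safer choice for values that need many significant digits.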
what do BLOBs store?
store images, multimedia and feature geometries as long sequences of binary numbers
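A rough sketch of a geometry stored as a BLOB, using an in-memory SQLite table. The encoding (a flat run of little-endian doubles) and the `feature` table are hypothetical choices for illustration, not the format any particular geodatabase uses.

```python
import sqlite3
import struct

coords = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0)]

# Hypothetical encoding: x, y pairs flattened into little-endian doubles.
flat = [v for xy in coords for v in xy]
blob = struct.pack(f"<{len(flat)}d", *flat)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE feature (object_id INTEGER PRIMARY KEY, shape BLOB)")
conn.execute("INSERT INTO feature VALUES (?, ?)", (1, blob))

# Read the raw bytes back and decode them into coordinate pairs again.
stored = conn.execute(
    "SELECT shape FROM feature WHERE object_id = 1"
).fetchone()[0]
values = struct.unpack(f"<{len(stored) // 8}d", stored)
decoded = list(zip(values[0::2], values[1::2]))

print(len(stored), decoded)
```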
what are the ways to classify data by measurement scale?
nominal, ordinal, interval, and ratio data
what is nominal data?
different kinds / categories of data, such as land-use types or soil types
what is ordinal data?
differentiates data by a ranking relationship
what is interval data?
has known intervals between values (ex. 60°F and 70°F differ by 10°F)
what is ratio data?
same as interval data, but based on a meaningful (absolute) zero value (ex. population densities)
categorical data
includes nominal and ordinal scales
numerical data
includes interval and ratio scales
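One way to remember the four scales is by which operations make sense on each. The sketch below uses made-up values; the erosion ranking in particular is a hypothetical example of an ordinal attribute.

```python
import statistics

land_use = ["forest", "urban", "forest", "crop"]        # nominal
erosion = ["slight", "severe", "moderate", "moderate"]  # ordinal
temp_f = [60.0, 70.0, 65.0]                             # interval
density = [10.0, 20.0, 40.0]                            # ratio

# Nominal: only counting categories is meaningful.
most_common = statistics.mode(land_use)

# Ordinal: ordering is meaningful, but distances between ranks are not.
rank = {"slight": 1, "moderate": 2, "severe": 3}
worst = max(erosion, key=rank.get)

# Interval: differences are meaningful (70F is 10F warmer than 60F).
temp_diff = temp_f[1] - temp_f[0]

# Ratio: a true zero makes ratios meaningful (four times as dense).
density_ratio = density[2] / density[0]

print(most_common, worst, temp_diff, density_ratio)
```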
what are the types of database designs?
- flat file
- hierarchical
- network
- relational
what is a flat file?
stores all data in a large table (ex. spreadsheet)
what is a hierarchical database?
organizes its data at different levels and uses only one-to-many associations between levels (ex. zoning > parcel > owner)
what is a network database?
builds connections across tables
what is a common problem with hierarchical and network databases?
the linkages between tables must be known in advance and built into the database at design time. this can make the database complicated and inflexible
what is a relational database?
collection of tables (or relations) that can be connected to each other by keys
what is a primary key?
represents one or more attributes whose values can uniquely identify a record in a table
cannot be null and should never change
what is a foreign key?
one or more attributes that refer to a primary key in another table
common field
primary and foreign key with the same name
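A minimal sketch of keys in practice, using an in-memory SQLite database. The `zoning` and `parcel` tables and their values are hypothetical; `zone_code` plays the role of the common field (primary key in `zoning`, foreign key in `parcel`). It assumes a SQLite build with foreign key support, which is the usual default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.execute(
    "CREATE TABLE zoning (zone_code INTEGER PRIMARY KEY, zone_name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE parcel ("
    " pin TEXT PRIMARY KEY,"
    " zone_code INTEGER NOT NULL REFERENCES zoning(zone_code))"
)
conn.executemany(
    "INSERT INTO zoning VALUES (?, ?)", [(1, "residential"), (2, "commercial")]
)
conn.executemany(
    "INSERT INTO parcel VALUES (?, ?)", [("P101", 1), ("P102", 2), ("P103", 1)]
)

# Connect the two tables through the common field.
rows = conn.execute(
    "SELECT parcel.pin, zoning.zone_name"
    " FROM parcel JOIN zoning ON parcel.zone_code = zoning.zone_code"
    " ORDER BY parcel.pin"
).fetchall()

# A foreign key that matches no primary key is rejected.
try:
    conn.execute("INSERT INTO parcel VALUES ('P104', 9)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rows, rejected)
```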
what are the benefits of a relational database?
simple and flexible
each table in the database can be prepared, maintained and edited separately from the other tables
tables can remain separate until a query or analysis requires attribute data from different tables to be linked together (efficient for data management and data processing)
what is the SSURGO and who produces it?
the Soil Survey Geographic database, produced by the Natural Resources Conservation Service (NRCS)
SSURGO data are collected through field mapping, archived in 7.5-minute quadrangle units, and organized by soil survey area, which may consist of a county, multiple counties, or parts of multiple counties
database consists of spatial and tabular data
for each soil survey area, the spatial data contain a detailed soil map, made of soil map units (each of which may be made of one or more noncontiguous polygons).
a soil map unit represents a set of geographic areas for which a common land-use management strategy is suitable.
what is normalization?
process of decomposition, taking a table with all the attribute data and breaking it down into small tables while maintaining the links between them
what are the objectives of normalization?
- avoid redundant data in tables that waste space and can cause data integrity problems
- ensure attribute data in separate tables can be maintained and updated separately and linked when necessary
- facilitate a distributed database
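The decomposition can be sketched with plain Python data structures. The flat parcel table below is made up; `zone_name` depends only on `zone_code`, so it is moved to its own table and linked back through the key when needed.

```python
# Made-up flat table: zone_name is repeated for every parcel in the same zone.
flat = [
    {"pin": "P101", "owner": "Lee", "zone_code": 1, "zone_name": "residential"},
    {"pin": "P102", "owner": "Garcia", "zone_code": 2, "zone_name": "commercial"},
    {"pin": "P103", "owner": "Chen", "zone_code": 1, "zone_name": "residential"},
]

# Decompose: each zone name is now stored once, keyed by zone_code.
zoning = {row["zone_code"]: row["zone_name"] for row in flat}
parcels = [
    {"pin": r["pin"], "owner": r["owner"], "zone_code": r["zone_code"]}
    for r in flat
]

# Re-link on demand through the zone_code key.
rebuilt = [{**p, "zone_name": zoning[p["zone_code"]]} for p in parcels]

print(zoning)
print(rebuilt == flat)
```

Renaming a zone now means editing one entry in `zoning` instead of every matching parcel row, which is the data integrity benefit listed above.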
normalization performance issues
normal forms higher than the third can slow down data access and create higher maintenance costs.