W06 - Database Concepts and Data Sources Flashcards

1
Q

how are spatial and attribute data used in GIS?

A

spatial data relate to the geometries of spatial features

attribute data describe the characteristics of the spatial features

2
Q

how does the georelational data model (eg. a coverage) store spatial and attribute data?

A

stores them separately and links the two by the feature ID. the 2 data sets are synchronized so they can be queried, analyzed, and displayed in unison

3
Q

how does the object-based data model (eg. a geodatabase) store spatial and attribute data?

A

combines both geometries and attributes in a single system. each spatial feature has a unique object ID and an attribute to store its geometry
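
A minimal Python sketch of the contrast between the georelational and object-based models. The feature IDs, street names, and coordinates are made-up illustrations, not from any real data set, and the dictionaries only stand in for what a GIS stores internally.

```python
# Georelational model: geometry and attributes live in separate stores,
# linked only by a shared feature ID (hypothetical data).
geometries = {101: [(0, 0), (5, 0)], 102: [(5, 0), (5, 4)]}
attributes = {101: {"name": "Main St"}, 102: {"name": "Oak Ave"}}

def georelational_lookup(feature_id):
    """Fetch both parts of a feature by following the feature ID link."""
    return geometries[feature_id], attributes[feature_id]

# Object-based model: one record holds the object ID, the attributes,
# and the geometry as just another field.
features = [
    {"object_id": 1, "name": "Main St", "shape": [(0, 0), (5, 0)]},
    {"object_id": 2, "name": "Oak Ave", "shape": [(5, 0), (5, 4)]},
]

geom, attrs = georelational_lookup(101)
print(attrs["name"], geom)
print(features[0]["name"], features[0]["shape"])
```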

4
Q

how does the raster data model work?

A

cell value corresponds to the value of a continuous feature at the cell location

the value attribute table summarizes cell values and their frequencies in the raster.

5
Q

how is attribute data stored?

A

in tables, organized by rows (record) and columns (field).

6
Q

what are the 2 types of attribute tables in GIS?

A

feature attribute tables and tables of nonspatial data

7
Q

what is a feature attribute table?

A

an attribute table that has access to the geometries of features.

every vector data set must have a feature attribute table

in the georelational data model, the feature attribute table uses the feature ID to link to the feature’s geometry

in the object-based data model, the feature attribute table has a field that stores the feature’s geometry

feature attribute tables have default fields that summarize the feature geometries (ex. length for line features, and area and perimeter for polygon features)

8
Q

what are tables of non-spatial data?

A

these tables do not have direct access to the feature geometry but have a field linking them to the feature attribute table.

ex. delimited text files, dBASE files, Excel files, Access files, and files from other databases (SQL, Oracle, etc.)

9
Q

what is a database management system (DBMS)?

A

software package that lets us build and manipulate a database.

provides tools for data input, search, retrieval, manipulation and output

ArcGIS for Desktop uses Access for managing personal geodatabases

10
Q

how is the geodatabase implemented?

A

implemented in a relational database management system and stores both geometries and attributes in a single database

11
Q

what is a client-server distributed database system?

A

a client sends a request to the server, retrieves data from the server, and processes the data on the local computer

12
Q

what are methods of classifying attribute data?

A

by data type, by measurement scale

13
Q

what are the different data types?

A

the data type determines how an attribute is stored. it is typically included in the metadata of geospatial data

ex. number, text (string), date, binary large object (BLOB)

14
Q

how can numbers (data type) be stored?

A

integers (no decimal digits), float/floating point

integers can be short or long.

float can be single precision or double precision

15
Q

what do BLOBs store?

A

store images, multimedia, and feature geometries as long sequences of binary numbers

16
Q

what are the ways to classify data by measurement scale?

A

nominal, ordinal, interval, and ratio data

17
Q

what is nominal data

A

different kinds / categories of data, such as land-use types or soil types

18
Q

what is ordinal data

A

differentiates data by a ranking relationship

19
Q

what is interval data

A

have known intervals between values (ex. 60F vs 70F differ by 10F)

20
Q

what is ratio data

A

same as interval data but ratio data are based on a meaningful zero value (ex. population densities)

21
Q

categorical data

A

includes nominal and ordinal scales

22
Q

numerical data

A

includes interval and ratio scales

23
Q

what are the types of database designs?

A
  • flat file
  • hierarchical
  • network
  • relational
24
Q

what is a flat file?

A

stores all data in a large table (ex. spreadsheet)

25
Q

what is a hierarchical database?

A

organizes its data at different levels and uses only one-to-many associations between levels (ex. zoning > parcel > owner)

26
Q

what is a network database?

A

builds connections across tables

27
Q

what is a common problem with hierarchical and network databases?

A

the linkages between tables must be known in advance and built into the database at design time. could make the database complicated and inflexible

28
Q

what is a relational database?

A

collection of tables (or relations) that can be connected to each other by keys

29
Q

what is a primary key?

A

represents one or more attributes whose values can uniquely identify a record in a table

cannot be null and should never change

30
Q

what is a foreign key?

A

one or more attributes that refer to a primary key in another table

31
Q

common field

A

primary and foreign key with the same name

32
Q

what are the benefits of a relational database?

A

simple and flexible

each table in the database can be prepared, maintained and edited separately from the other tables

tables can remain separate until a query or analysis requires attribute data from different tables to be linked together (efficient for data management and data processing)

33
Q

what is the SSURGO and who produces it?

A

the Soil Survey Geographic database, produced by the Natural Resources Conservation Service (NRCS)

SSURGO data are collected through field mapping, archived in 7.5-minute quadrangle units, and organized by soil survey area, which may consist of a county, multiple counties, or parts of multiple counties

database consists of spatial and tabular data

for each soil survey area, the spatial data contain a detailed soil map made up of soil map units (each of which may consist of one or more noncontiguous polygons).

a soil map unit represents a set of geographic areas for which a common land-use management strategy is suitable.

34
Q

what is normalization?

A

process of decomposition, taking a table with all the attribute data and breaking it down into small tables while maintaining the links between them
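
A rough sketch of this decomposition in Python, assuming a hypothetical flat parcel table in which owner details repeat on every row. The field names and values are invented for illustration.

```python
# A flat table that repeats owner details for every parcel (hypothetical).
flat = [
    {"pin": "P101", "owner": "Wang", "owner_address": "101 Oak St", "acres": 1.0},
    {"pin": "P102", "owner": "Wang", "owner_address": "101 Oak St", "acres": 3.0},
    {"pin": "P103", "owner": "Chang", "owner_address": "200 Maple St", "acres": 2.5},
]

# Decompose: owner details move to their own table, keyed by owner name.
# The parcel table keeps only the owner name as the link (a foreign key).
owners = {}
parcels = []
for row in flat:
    owners[row["owner"]] = {"owner_address": row["owner_address"]}
    parcels.append({"pin": row["pin"], "owner": row["owner"], "acres": row["acres"]})

print(len(owners))  # each owner's address is now stored once
print(parcels[0])
```

Using the owner name as the key is a simplification. A real design would more likely use a dedicated owner ID, since names can change or collide.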

35
Q

what are the objectives of normalization?

A
  • avoid redundant data in tables that waste space and can cause data integrity problems
  • ensure attribute data in separate tables can be maintained and updated separately and linked when necessary
  • facilitate a distributed database
36
Q

normalization performance issues

A

normal forms higher than the third can slow down data access and raise maintenance costs.

37
Q

what are the different types of relationships between records in tables?

A

one to one
one to many
many to one
many to many

origin and destination

38
Q

one to one

A

one record in a table is related to only one record in another table

39
Q

one to many

A

one record in a table may be related to many records in another table

40
Q

many to one relationship

A

many records in a table may be related to one record in another table (ex. several households may share the same street address)

41
Q

many to many

A

many records in a table may be related to many records in another table

42
Q

what is a join

A

brings together 2 tables by using a common field or a primary key + foreign key

ex. joining attribute data from a nonspatial data table to a feature attribute table

recommended for one to one or many to one relationships

doesn’t work for one to many or many to many because only the first matching record from the destination will be assigned to the origin record
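
A minimal sketch of a many-to-one join in plain Python, assuming a hypothetical feature attribute table of soil polygons and a nonspatial table keyed by a shared map unit symbol. The field names (`fid`, `musym`, `drainage`) and values are illustrative only.

```python
# Feature attribute table: many soil polygons share one map unit symbol.
polygons = [
    {"fid": 1, "musym": "Aa"},
    {"fid": 2, "musym": "Bb"},
    {"fid": 3, "musym": "Aa"},
]
# Nonspatial table keyed by the same symbol (hypothetical values).
map_units = {"Aa": {"drainage": "well"}, "Bb": {"drainage": "poor"}}

def join(left_rows, lookup, key):
    """Append the matching lookup attributes to every left-hand record."""
    joined = []
    for row in left_rows:
        match = lookup.get(row[key], {})
        joined.append({**row, **match})
    return joined

joined = join(polygons, map_units, "musym")
for row in joined:
    print(row)
```

Because each polygon matches at most one map unit record, every polygon gets exactly one set of joined values, which is why the join suits one to one and many to one relationships.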

43
Q

what is a relate?

A

operation that temporarily connects 2 tables but keeps the tables physically separate

works for all types of relationships, but slows down data access

44
Q

what is a relationship class?

A

relationships between objects, predefined and stored in a geodatabase. for the object-based data model

can be one to one, many to one, one to many and many to many

for the first 3, records in the origin are directly linked to records in the destination

for many to many, an intermediate table sorts out the associations between records

45
Q

field definition

A

define each field in the table, usually include

  • field name
  • data width (# of spaces reserved for a field)
  • data type
  • number of decimal digits (part of the definition for the float type)

field definition becomes a property of the field so it is important to consider how the field will be used before defining it

46
Q

methods of data entry

A

import existing attribute files. if none exist, type the data in.

for map unit symbols or feature IDs, it is best to enter them directly in a GIS. for nonspatial data, it is better to use word processing or spreadsheet packages (ex. Notepad, Excel)

47
Q

what are the 2 steps to attribute data verification?

A

1) make sure that attribute data are properly linked to spatial data (feature ID should be unique and contain no null values)
2) verify the accuracy of attribute data

48
Q

what is an effective method for preventing data entry errors?

A

use attribute domains in the geodatabase

attribute domains allow the user to define a valid range of values or a valid set of values for an attribute
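
The idea behind the two kinds of domain can be sketched in Python. This is not the geodatabase API; the domain values and function names below are assumptions made up for the example.

```python
# Hypothetical range domain for an elevation field, in meters.
RANGE_DOMAIN = (0, 9000)
# Hypothetical coded-value domain for a land-use field.
CODED_DOMAIN = {"residential", "commercial", "industrial"}

def in_range_domain(value, domain=RANGE_DOMAIN):
    """True if value falls inside the valid range (inclusive)."""
    low, high = domain
    return low <= value <= high

def in_coded_domain(value, domain=CODED_DOMAIN):
    """True if value is one of the valid set of values."""
    return value in domain

print(in_range_domain(1250))         # a plausible elevation
print(in_range_domain(-40))          # rejected: below the range
print(in_coded_domain("comercial"))  # rejected: typo not in the valid set
```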

49
Q

what does field management entail?

A

adding or deleting fields and creating new attributes through classification and computation of existing attribute data

50
Q

why is it good to delete unnecessary fields after downloading data from the internet?

A

reduces confusion in using the data set and also saves computer time for data processing

51
Q

creating new attribute data by classification

A

data classification reduces a data set to a small number of classes (ex. reclassifying elevations into groups)

1) define a new field for saving the classification result
2) select a data subset using a query
3) assign a value to the selected data subset
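
The three steps can be sketched in Python, assuming a hypothetical table of elevation records and made-up class breaks:

```python
records = [
    {"id": 1, "elev": 350},
    {"id": 2, "elev": 1200},
    {"id": 3, "elev": 2750},
]

# Step 1: define a new field for saving the classification result.
for rec in records:
    rec["elev_class"] = None

# Steps 2 and 3: select a subset with a query, then assign it a value.
class_breaks = [(0, 1000, 1), (1000, 2000, 2), (2000, 3000, 3)]  # assumed breaks
for low, high, label in class_breaks:
    subset = [rec for rec in records if low <= rec["elev"] < high]
    for rec in subset:
        rec["elev_class"] = label

print([(rec["id"], rec["elev_class"]) for rec in records])
```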

52
Q

creating new attribute data by computation

A

1) define a new field

2) compute the new field values from the values of existing attributes
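
A short Python sketch of the same two steps, assuming a hypothetical county table where population density is computed from existing population and area fields:

```python
counties = [
    {"name": "A", "population": 12000, "area_km2": 300.0},
    {"name": "B", "population": 4500, "area_km2": 150.0},
]

for rec in counties:
    # Step 1: define the new field. Step 2: compute it from existing fields.
    rec["pop_density"] = rec["population"] / rec["area_km2"]

print([(rec["name"], rec["pop_density"]) for rec in counties])
```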

53
Q

what is the purpose of data exploration?

A

allows you to examine the general trends in the data, take a look at subsets, focus on possible relationships between data sets

purpose is to better understand the data and provide a starting point for formulating research questions and hypotheses

54
Q

data visualization

A

discipline that uses a variety of exploratory techniques and graphics to understand and gain insight into data

55
Q

how does data exploration in GIS differ from data exploration in statistics?

A

1) data exploration in GIS involves both spatial and attribute data
2) includes map and map features

besides descriptive statistics and graphics, data exploration in GIS must also cover map-based data manipulation, attribute data query, and spatial data query

56
Q

range

A

difference between the minimum and the maximum

57
Q

median

A

the midpoint value (50th percentile)

58
Q

first quartile

A

the 25th percentile

59
Q

third quartile

A

the 75th percentile

60
Q

mean

A

average of data values

61
Q

variance

A

measure of the spread of the data about the mean

sum of (value - mean) ^2 divided by # of values

62
Q

standard deviation

A

square root of the variance

63
Q

z score

A

standardized score

(x - mean) / standard deviation
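
The descriptive statistics on the cards above can be worked through in Python. The data values are made up, and the variance divides by the number of values, as the variance card states. A sample variance would divide by n - 1 instead.

```python
import math

values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up data

n = len(values)
data_range = max(values) - min(values)
mean = sum(values) / n
# Sum of squared deviations from the mean, divided by the number of values.
variance = sum((v - mean) ** 2 for v in values) / n
std_dev = math.sqrt(variance)
# Standardized score for each value.
z_scores = [(v - mean) / std_dev for v in values]

print(data_range, mean, variance, std_dev)
print(z_scores)
```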

64
Q

cumulative distribution graph

A

line graph that plots the ordered data values against the cumulative distribution values

the cumulative distribution value for the ith ordered value is (i - 0.5)/n, where n is the number of values

the values fall between 0 and 1
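
A small Python sketch of how the plotted pairs could be computed, taking i as the 1-based rank of each value after sorting. The function name is hypothetical.

```python
def cumulative_distribution(values):
    """Pair each ordered value with its cumulative value (i - 0.5) / n."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i - 0.5) / n) for i, v in enumerate(ordered, start=1)]

points = cumulative_distribution([30, 10, 20, 40])
print(points)
```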

65
Q

bubble plots

A

a variation of scatterplots that uses varying-sized bubbles that represent a third variable

66
Q

boxplots

A

show min, first quartile, median, third quartile, max

used to tell if the distribution is symmetric or skewed or if there are any outliers

67
Q

QQ plots

A

quantile-quantile plots

compare the cumulative distribution of a data set with some theoretical distribution (ex. a normal distribution)

points in a QQ plot fall in a straight line if the data set follows the theoretical distribution

68
Q

dynamic graphs

A

graphics displayed in multiple and dynamically linked windows where we can directly manipulate data points

69
Q

brushing

A

allows the user to graphically select a subset of points from one chart and view related data points in other graphics

70
Q

geovisualization

A

data visualization that focuses on geospatial data and the integration of cartography, GIS, image analysis, and exploratory data analysis

71
Q

what are the different types of map-based data manipulations?

A

data classification, spatial aggregation, and map comparison

72
Q

what are the different methods of doing map comparisons?

A

1) superimpose layers on top of one another and have them be represented on the map differently, or turn the layers on and off, or use transparency

2) use map symbols that can show two data sets
ex. bivariate choropleth map
ex. cartogram, where the unit areas are sized proportional to a variable (ex state population) and the area symbols are used to represent the second variable

3) temporal animation can be used if there is time-dependent data

73
Q

attribute data query

A

process of retrieving data by working with attributes (ex. SQL commands)

74
Q

SQL

A

data query language designed for manipulating relational databases, used in the GIS to communicate with a database

select
from
where

ex. select Parcel.Sale_date
from Parcel
where Parcel.PIN = 'P101'

ex. select Parcel.Sale_date
from Parcel, Owner
where Parcel.PIN = Owner.PIN AND Owner_name = 'Costello'

the second query first joins the two tables on PIN and then applies the condition on the owner name
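
The two example queries can be tried against an in-memory SQLite database from Python. The table and field names follow the card. The rows, including the sale dates and the owner 'Wang', are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Parcel (PIN TEXT PRIMARY KEY, Sale_date TEXT);
    CREATE TABLE Owner (PIN TEXT, Owner_name TEXT);
    INSERT INTO Parcel VALUES ('P101', '1998-01-10'), ('P102', '2004-11-06');
    INSERT INTO Owner VALUES ('P101', 'Wang'), ('P102', 'Costello');
""")

# Single-table query: sale date of parcel P101.
single = conn.execute(
    "SELECT Parcel.Sale_date FROM Parcel WHERE Parcel.PIN = 'P101'"
).fetchall()

# Two-table query: join on PIN, then filter by owner name.
joined = conn.execute(
    "SELECT Parcel.Sale_date FROM Parcel, Owner "
    "WHERE Parcel.PIN = Owner.PIN AND Owner_name = 'Costello'"
).fetchall()

print(single)
print(joined)
```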

75
Q

procedural differences when querying a local database in a GIS package

A

1) only have to enter WHERE in the query expression box because typically the field and table have already been selected
2) an attribute query dialog is typically designed for a single table, so if the query involves attributes from two tables, they have to be joined first.

76
Q

query expressions

A

the where conditions with Boolean expressions and connectors

77
Q

Boolean expression

A

contains 2 operands and a logical operator

operands can be a field, number, or text

logical operators can be =, >, <, >=, <=, or <> (not equal to)

can also contain arithmetic operators

78
Q

boolean connectors

A

AND, OR, XOR, NOT

XOR (exclusive OR) selects only records that satisfy one and only one of the expressions, so records that satisfy both are excluded
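
A Python sketch of the four connectors applied to a made-up parcel table. The two Boolean expressions (`acres > 5` and `zoning == 2`) are hypothetical.

```python
parcels = [
    {"id": 1, "acres": 3, "zoning": 2},
    {"id": 2, "acres": 8, "zoning": 2},
    {"id": 3, "acres": 8, "zoning": 1},
    {"id": 4, "acres": 3, "zoning": 1},
]

def big(rec):
    return rec["acres"] > 5

def zone2(rec):
    return rec["zoning"] == 2

and_ids = [r["id"] for r in parcels if big(r) and zone2(r)]
or_ids = [r["id"] for r in parcels if big(r) or zone2(r)]
# XOR: exactly one of the two expressions is true.
xor_ids = [r["id"] for r in parcels if big(r) != zone2(r)]
not_ids = [r["id"] for r in parcels if not big(r)]

print(and_ids, or_ids, xor_ids, not_ids)
```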

79
Q

what are the types of operations that can act on a data set?

A

add more records to a subset
remove records from a subset
select a smaller subset

80
Q

relational database query

A

works with a relational database, selects a data subset in the table and also selects records related to the subset in other tables

81
Q

what is the difference between join operation and relate operation?

A

a join operation combines the attribute data from 2 or more tables into a single table. a relate dynamically links the tables but keeps them separate

82
Q

spatial data query

A

process of retrieving a data subset from a layer by working directly with feature geometries. the results can be simultaneously inspected in the map, linked to the records in the table and displayed in charts

can select features spatially using a cursor, a graphic or the spatial relationship between features

83
Q

feature selection by graphic

A

draw a shape (graphic) to select objects of interest (ex. restaurants within a 1 mile radius of a hotel)

84
Q

feature selection by spatial relationship

A

selects features based on their spatial or topological relationships to other features

ex. roadside rest areas within 50 mile radius of selected rest area; rest areas within each county

spatial relationships used for querying include containment, intersect, and proximity

85
Q

containment (spatial query)

A

selects features that fall completely within features for selection

86
Q

intersect (spatial query)

A

selects features that intersect features for selection

87
Q

proximity (spatial query)

A

selects features that are within a specified distance of features for selection
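
A minimal sketch of a proximity query on point features, assuming planar coordinates in miles and straight-line distance. The rest area names and locations are made up, and a real GIS would also handle lines, polygons, and map projections.

```python
import math

# Hypothetical rest areas with planar (x, y) coordinates in miles.
rest_areas = {"A": (0.0, 0.0), "B": (30.0, 40.0), "C": (60.0, 80.0)}

def within_distance(points, center, radius):
    """Names of the points whose distance to center is at most radius."""
    return [name for name, xy in points.items() if math.dist(xy, center) <= radius]

# Rest areas within 55 miles of rest area A (A itself is included).
selected = within_distance(rest_areas, rest_areas["A"], 55.0)
print(selected)
```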

88
Q

spatial adjacency

A

features to be selected and features for selection share common boundaries and the specified distance is 0

89
Q

raster data query - query by cell value

A

the raster itself, rather than a field, is used as the operand in the query expression, because each cell value represents the feature at that cell location

can query multiple rasters, which may be integer, floating point, or a mix of both. querying multiple rasters directly is unique to raster data
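
A sketch of a query on two rasters at once, using nested lists as stand-ins for raster grids. The slope and aspect values and the query condition are assumptions for illustration.

```python
# Two small hypothetical rasters with the same shape (2 rows x 3 columns).
slope = [
    [1, 2, 2],
    [2, 3, 2],
]
aspect = [
    [1, 1, 4],
    [2, 1, 1],
]

def query_cells(raster_a, raster_b, condition):
    """Apply condition cell by cell and return a raster of True/False."""
    return [
        [condition(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(raster_a, raster_b)
    ]

# Query: slope = 2 AND aspect = 1, evaluated directly on the cell values.
result = query_cells(slope, aspect, lambda s, a: s == 2 and a == 1)
print(result)
```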

90
Q

raster data query - query by select features

A

features can be used to query a raster and it returns an output raster with values for cells that correspond to the query and no data in the other cells