W3 Flashcards
What is geospatial querying?
The process of selecting information (features and their associated attribute data in tables) through some form of query operation that includes a spatial constraint
- helps you do analysis later
What is a non-spatial query?
Select by attribute
What is a spatial query?
Select by location
- a feature is selected based on its spatial relationship to another feature
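Example (a minimal sketch of select by location): the sample points, the `station` target, and the helper name `select_within` are all hypothetical, and coordinates are assumed to be in a projected system with units in metres.

```python
from math import hypot

# Hypothetical sample data: (name, x, y) in projected coordinates, metres.
schools = [("A", 0.0, 0.0), ("B", 300.0, 400.0), ("C", 2000.0, 0.0)]
station = (0.0, 100.0)  # the feature we select against

def select_within(features, target, max_dist):
    """Return the features whose straight-line distance to target is at most max_dist."""
    tx, ty = target
    return [f for f in features if hypot(f[1] - tx, f[2] - ty) <= max_dist]

selected = select_within(schools, station, 500.0)
print([name for name, _, _ in selected])  # ['A', 'B']
```

Here the spatial constraint is "within 500 m of the station"; no attribute value is involved in the selection.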
What is Structured Query Language (SQL)?
The language used to write attribute queries; it is used by geographic information systems and geographic databases
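Example (a sketch of an attribute query written in SQL): this uses Python's built-in `sqlite3` module with an in-memory database. The `parcels` table and its columns are made up for illustration; a real GIS would run a similar `WHERE` clause against a layer's attribute table.

```python
import sqlite3

# In-memory database standing in for an attribute table (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parcels (id INTEGER PRIMARY KEY, land_use TEXT, area_ha REAL)"
)
conn.executemany(
    "INSERT INTO parcels VALUES (?, ?, ?)",
    [(1, "residential", 0.4), (2, "commercial", 1.2), (3, "residential", 2.5)],
)

# Select by attribute: residential parcels larger than 1 hectare.
rows = conn.execute(
    "SELECT id FROM parcels "
    "WHERE land_use = 'residential' AND area_ha > 1.0 "
    "ORDER BY id"
).fetchall()
print(rows)  # [(3,)]
conn.close()
```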
What are the two types of joins?
- attribute (join by attribute)
- spatial (join by location)
Common use of join by attribute
Connecting explicit geographic data (such as census tracts) with descriptive data that may have geographic content but lacks explicit geographic information (such as road names)
How to execute a join by attribute
- One must find a common attribute between the two datasets. This is called a key and is the basis for the join (ex. ID)
- key values can be duplicated (ex. one-to-many, many-to-one, many-to-many)
- if the join isn’t a perfect match you can decide between keeping all records and keeping only matching records
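Example (a minimal sketch of a join by attribute in plain Python): the two tables, the `tract_id` key, and the function name `join_by_attribute` are hypothetical. The `keep_all` flag stands in for the choice between keeping all records and keeping only matching records.

```python
# Hypothetical tables: a tract layer's attributes and a descriptive table.
tracts = [
    {"tract_id": "001", "name": "Downtown"},
    {"tract_id": "002", "name": "Riverside"},
    {"tract_id": "003", "name": "Hillcrest"},
]
income = [
    {"tract_id": "001", "median_income": 52000},
    {"tract_id": "002", "median_income": 61000},
    {"tract_id": "004", "median_income": 48000},
]

def join_by_attribute(left, right, key, keep_all=True):
    """Append columns of `right` to rows of `left` wherever the key values match."""
    # Index the right table by key; a list per key allows duplicate key values.
    lookup = {}
    for row in right:
        lookup.setdefault(row[key], []).append(row)

    joined = []
    for row in left:
        matches = lookup.get(row[key], [])
        if matches:
            # One output row per match (covers one-to-many).
            for match in matches:
                joined.append({**row, **match})
        elif keep_all:
            # No match: keep the left record without the extra columns.
            joined.append(dict(row))
    return joined

all_records = join_by_attribute(tracts, income, "tract_id", keep_all=True)
matching_only = join_by_attribute(tracts, income, "tract_id", keep_all=False)
print(len(all_records), len(matching_only))  # 3 2
```

Tract 003 has no income row, so it survives only when all records are kept; income row 004 has no tract and is never added.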
How does a spatial join work?
It involves appending the columns from one feature class to another based on location or proximity
- a common case is a join between a point layer and a polygon layer, where you want to retain the point geometries and grab the attributes of the intersecting polygons
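Example (a sketch of the point-in-polygon case): each point keeps its geometry and picks up the `zone` attribute of the polygon that contains it. The data and function names are hypothetical, and the containment test is a basic ray-casting check that assumes simple polygons without holes; GIS software handles many more cases.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a list of (x, y) vertices (first vertex not repeated)."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does a horizontal ray going right from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical polygon layer (with a "zone" attribute) and point layer.
polygons = [
    {"zone": "North", "ring": [(0, 5), (10, 5), (10, 10), (0, 10)]},
    {"zone": "South", "ring": [(0, 0), (10, 0), (10, 5), (0, 5)]},
]
points = [
    {"id": 1, "x": 2.0, "y": 7.0},
    {"id": 2, "x": 8.0, "y": 1.0},
    {"id": 3, "x": 20.0, "y": 20.0},
]

def spatial_join(points, polygons):
    """Copy the zone of the first containing polygon onto each point (None if no match)."""
    joined = []
    for p in points:
        zone = None
        for poly in polygons:
            if point_in_polygon(p["x"], p["y"], poly["ring"]):
                zone = poly["zone"]
                break
        joined.append({**p, "zone": zone})
    return joined

result = spatial_join(points, polygons)
print([r["zone"] for r in result])  # ['North', 'South', None]
```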
What is data classification?
The process of ordering your data so it is easier to interpret visually and statistically
- it allows you to spot patterns in the data more easily
How do you perform a classification?
By grouping similar features into classes and assigning the same cartographic symbol to each member of a class
What are the four classification techniques commonly applied to geographic phenomena?
- equal interval
- quantile
- natural breaks
- standard deviation
Describe equal interval
- divides the range of attribute values into equally sized classes
- the number of classes is determined by the user
- this method is best used for continuous datasets (ex. temperature)
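Example (a minimal sketch of equal interval classification): the temperature values and the helper names are hypothetical. Each class spans the same width, (max - min) / number of classes, and a value is assigned to the first class whose upper bound it does not exceed.

```python
def equal_interval_breaks(values, n_classes):
    """Return the upper bound of each class; every class has the same width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * i for i in range(1, n_classes + 1)]

def classify(value, breaks):
    """Return the index of the first class whose upper bound is >= value."""
    for i, upper in enumerate(breaks):
        if value <= upper:
            return i
    return len(breaks) - 1  # guard against floating-point round-off at the maximum

temps = [2.0, 5.5, 9.0, 12.5, 18.0, 22.0]  # hypothetical temperatures
breaks = equal_interval_breaks(temps, 4)
classes = [classify(t, breaks) for t in temps]
print(breaks)   # [7.0, 12.0, 17.0, 22.0]
print(classes)  # [0, 0, 1, 2, 3, 3]
```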
Describe quantile
- places equal numbers of observations into each class
- the number of classes is determined by the user
- best for data that is evenly distributed across its range
- disadvantage: features in the same class can have wildly differing values, especially if the data isn’t evenly distributed across its range (the opposite can also happen, where values with little difference are placed in different classes, creating the illusion of a wider difference than actually exists)
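Example (a minimal sketch of quantile classification): values are ranked and the ranks are split into equally sized groups. The income figures and the function name are hypothetical, and this simple rank-based version may split tied values across classes.

```python
def quantile_classes(values, n_classes):
    """Assign each value a class index so that every class gets (about) the same count."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    classes = [0] * len(values)
    for rank, i in enumerate(order):
        classes[i] = rank * n_classes // len(values)
    return classes

incomes = [20, 22, 23, 25, 26, 90, 95, 400]  # hypothetical, unevenly distributed
classes = quantile_classes(incomes, 4)
print(classes)  # [0, 0, 1, 1, 2, 2, 3, 3]
```

With this uneven data, 26 and 90 land in the same class while 22 and 23 are separated, which is the disadvantage described above.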
Describe natural breaks (Jenks)
- uses an algorithm to group values into classes that are separated by distinct break points
- works best when data is unevenly distributed but not skewed toward either end
- disadvantage: can create classes that contain widely varying number ranges
- additionally, it is hard to compare to other maps because the class ranges are very specific to each dataset
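Example (a sketch of the idea behind natural breaks, not the algorithm GIS software actually runs): Jenks-style optimization looks for the grouping with the smallest total within-class squared deviation from each class mean. For illustration only, this version brute-forces every possible set of cut points, which is practical only for very small datasets. The values and function names are hypothetical.

```python
from itertools import combinations

def sum_sq_dev(group):
    """Sum of squared deviations of a group's values from the group mean."""
    mean = sum(group) / len(group)
    return sum((v - mean) ** 2 for v in group)

def natural_breaks_bruteforce(values, n_classes):
    """Try every split of the sorted values and keep the one with the lowest cost."""
    data = sorted(values)
    best_cost, best_groups = None, None
    # Pick n_classes - 1 cut positions between consecutive sorted values.
    for cuts in combinations(range(1, len(data)), n_classes - 1):
        bounds = (0, *cuts, len(data))
        groups = [data[a:b] for a, b in zip(bounds, bounds[1:])]
        cost = sum(sum_sq_dev(g) for g in groups)
        if best_cost is None or cost < best_cost:
            best_cost, best_groups = cost, groups
    return best_groups

values = [1, 2, 3, 10, 11, 12, 50, 52]  # hypothetical, with visible gaps
groups = natural_breaks_bruteforce(values, 3)
print(groups)  # [[1, 2, 3], [10, 11, 12], [50, 52]]
```

The breaks fall in the gaps of this particular dataset, which is why the resulting class ranges are hard to reuse on another map.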
Describe standard deviation classification
- forms each class by adding and subtracting the standard deviation from the mean of the dataset
- best suited to data that conforms to a normal (Gaussian) distribution
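Example (a minimal sketch of standard deviation class breaks): break points are placed at the mean plus or minus whole multiples of the standard deviation. The one-standard-deviation step, the use of the population standard deviation, and the sample values are assumptions made for illustration; GIS packages may offer other interval sizes.

```python
from statistics import mean, pstdev

def std_dev_breaks(values, n_sd=2):
    """Return break points at mean + k * sd for k from -n_sd to n_sd."""
    m = mean(values)
    sd = pstdev(values)  # population standard deviation (an assumption)
    return [m + k * sd for k in range(-n_sd, n_sd + 1)]

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data: mean 5, standard deviation 2
breaks = std_dev_breaks(values)
print(breaks)  # [1.0, 3.0, 5.0, 7.0, 9.0]
```

Each class then reads as "between one and two standard deviations below the mean", and so on, which is easiest to interpret when the data are roughly normal.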