Data exploration and classification Flashcards
lecture 14
What is data exploration?
The process of examining data before formal, structured data analysis.
What is data classification in ArcGIS?
The data classification tool is used to explore spatial data and is based on descriptive statistics.
What does data exploration include in GIS?
- In GIS it involves both spatial and attribute data (how and where?)
- Media used in GIS include maps (spatial), graphs, and tables.
What is the crime rate like in Gauteng and where is the highest crime found in this province?
Answered by displaying the crime data on a map together with summary statistics.
What does data visualisation involve?
- Rendering – what to show in a graphic plot and what type of plot to make
- Manipulation – how to operate on individual plots and how to organise multiple plots
What are the fundamental tasks for data exploration?
- Finding patterns
- Posing queries, i.e. exploring data characteristics and data subsets
- Making comparisons, i.e. between variables or data subsets
Q – Which portion of my field produces the highest / lowest yield?
Q2 – Why do certain portions of my land produce higher yields?
Q – Which areas of Tanzania are most suitable for growing Pinotage?
Q – How does wildfire susceptibility vary across a nature reserve?
Q – What is the groundwater recharge potential of the Winelands municipality?
Q – How do deforestation rates vary across the Peruvian Amazon?
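Example (a minimal sketch, not from the lecture): the "posing queries" and "making comparisons" tasks can be tried on a small attribute table. The plot names, soil types, yield values and the 4 t/ha cut-off are all hypothetical.

```python
import statistics

# Hypothetical attribute table: one record per field plot.
plots = [
    {"plot": "A", "soil": "clay", "yield_t_ha": 3.2},
    {"plot": "B", "soil": "loam", "yield_t_ha": 5.1},
    {"plot": "C", "soil": "loam", "yield_t_ha": 4.7},
    {"plot": "D", "soil": "clay", "yield_t_ha": 2.9},
    {"plot": "E", "soil": "sand", "yield_t_ha": 3.8},
]

# Query: which plots have the highest and lowest yield?
highest = max(plots, key=lambda p: p["yield_t_ha"])
lowest = min(plots, key=lambda p: p["yield_t_ha"])

# Query a data subset: plots yielding more than 4 t/ha (assumed cut-off).
high_yield = [p["plot"] for p in plots if p["yield_t_ha"] > 4.0]

# Comparison between subsets: mean yield per soil type.
by_soil = {}
for p in plots:
    by_soil.setdefault(p["soil"], []).append(p["yield_t_ha"])
mean_by_soil = {soil: statistics.fmean(vals) for soil, vals in by_soil.items()}

print(highest["plot"], lowest["plot"], high_yield)
print(mean_by_soil)
```

The "where" part of each question would still need the plot geometries on a map; this sketch covers only the attribute side.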
What types of statistics are used in spatial data exploration?
They can be:
- Descriptive
- Inferential
What are descriptive statistics?
Statistics that summarise a dataset (summary statistics). They include:
1. Measures of central tendency - describe data by identifying its central position.
2. Measures of dispersion
3. Skewness
4. Kurtosis
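Example (a minimal sketch, not from the lecture): skewness and kurtosis are not defined on these cards, so the code below assumes the common moment-based definitions, with skewness as the third central moment divided by the variance to the power 1.5 and kurtosis as the fourth central moment divided by the squared variance. The readings and the helper name `central_moment` are hypothetical.

```python
import statistics

# Hypothetical readings; the single large value (12.0) creates a long right tail.
values = [2.0, 3.0, 3.0, 4.0, 5.0, 6.0, 12.0]


def central_moment(data, order):
    """Return the population central moment of the given order."""
    mean = statistics.fmean(data)
    return sum((x - mean) ** order for x in data) / len(data)


m2 = central_moment(values, 2)  # second central moment = population variance
skewness = central_moment(values, 3) / m2 ** 1.5
kurtosis = central_moment(values, 4) / m2 ** 2

print(f"skewness = {skewness:.3f}")  # positive: longer tail on the right
print(f"kurtosis = {kurtosis:.3f}")  # a normal distribution has kurtosis 3
```

Statistical packages often report "excess" kurtosis (kurtosis minus 3) and apply small-sample corrections, so their output may differ slightly from this sketch.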
What are inferential statistics?
Statistics used to generalise from a sample to a population with a calculated degree of certainty, i.e. to draw conclusions about the population.
What are measures of central tendency?
Median, mode, mean
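Example (a minimal sketch using Python's standard `statistics` module): the three measures computed for a small dataset. The yield values are hypothetical.

```python
import statistics

# Hypothetical crop yields (tonnes per hectare) for seven field plots.
yields = [2.0, 3.0, 3.0, 4.0, 5.0, 6.0, 12.0]

mean_yield = statistics.mean(yields)      # arithmetic average
median_yield = statistics.median(yields)  # middle value of the sorted data
mode_yield = statistics.mode(yields)      # most frequent value

print(mean_yield, median_yield, mode_yield)  # prints: 5.0 4.0 3.0
```

The one high value (12.0) pulls the mean above the median, which is why the three measures are usually read together.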
What are measures of dispersion?
Measures that describe the statistical spread or distribution of a dataset.
Include:
1. Standard deviation
2. Variance
3. Standardised score (z score)
They show the spread of, or trends in, the data and can be used to identify outliers.
What is the standard deviation?
A measure of how much variation, or dispersion, exists around the mean (average). It is the square root of the variance.
What is the variance?
A measure of how far a set of numbers is spread out: the average of the squared deviations from the mean.
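Example (a minimal sketch, not from the lecture): variance and standard deviation computed step by step for a hypothetical dataset. It uses the population form (divide by n); the sample form (divide by n - 1) is shown for comparison through `statistics.variance`.

```python
import math
import statistics

# Hypothetical readings.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = statistics.fmean(values)                       # 5.0
squared_deviations = [(x - mean) ** 2 for x in values]

# Population variance: average squared deviation from the mean.
variance = sum(squared_deviations) / len(values)
# Standard deviation: square root of the variance, in the data's own units.
std_dev = math.sqrt(variance)

# Sample variance divides by n - 1 instead of n.
sample_variance = statistics.variance(values)

print(variance, std_dev)  # prints: 4.0 2.0
```

The standard deviation is usually easier to interpret than the variance because it is expressed in the same units as the data.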
What is the standard score (z score)?
The standardised score, or z score, indicates how many standard deviations a reading lies above or below the mean: z = (reading - mean) / standard deviation.
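Example (a minimal sketch, not from the lecture): z scores for a hypothetical dataset, used to flag possible outliers. The cut-off of 2 standard deviations is an assumed rule of thumb, not a value from these notes.

```python
import statistics

# Hypothetical readings (e.g. rainfall in mm at eight stations).
readings = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = statistics.fmean(readings)
std_dev = statistics.pstdev(readings)  # population standard deviation

# z = (reading - mean) / standard deviation
z_scores = [(x - mean) / std_dev for x in readings]

# Assumed rule of thumb for this sketch: flag readings with |z| >= 2.
THRESHOLD = 2.0
outliers = [x for x, z in zip(readings, z_scores) if abs(z) >= THRESHOLD]

for x, z in zip(readings, z_scores):
    print(f"{x:5.1f}  z = {z:+.2f}")
print("possible outliers:", outliers)
```

A z score of 0 means the reading equals the mean; negative values lie below it and positive values above it.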
What is classification?
The process of reducing a large number of individual quantitative values to a smaller number of ordered categories, each of which covers a portion of the original data value range.
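Example (a minimal sketch, not from the lecture): one simple way to do this is equal-interval classification, where the data range is divided into classes of equal width. The scheme is chosen here only for illustration and is not claimed to be what any particular GIS tool does; the function names and values are hypothetical.

```python
def equal_interval_breaks(values, n_classes):
    """Return the upper bound of each class for equal-width classes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * i for i in range(1, n_classes + 1)]


def classify(value, breaks):
    """Return the index of the first class whose upper bound covers value."""
    for index, upper in enumerate(breaks):
        if value <= upper:
            return index
    return len(breaks) - 1  # guard against floating-point rounding at the top


# Hypothetical attribute values (e.g. incidents per district).
values = [3, 8, 12, 15, 21, 27, 33, 40, 48, 63]

breaks = equal_interval_breaks(values, 4)
classes = [classify(v, breaks) for v in values]

print(breaks)   # prints: [18.0, 33.0, 48.0, 63.0]
print(classes)  # prints: [0, 0, 0, 0, 1, 1, 1, 2, 2, 3]
```

Ten individual values are reduced to four ordered categories, each covering one quarter of the original range from 3 to 63.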