14 - Data exploration and classification Flashcards

1
Q

The power of GIS resides in

A

its ability to process spatial data
and their associated attributes, providing answers and solutions to
real-life spatial issues

2
Q

Data exploration def

A

the process of examining data prior to formal structured data analysis

3
Q

Spatial data exploration

A

the process of examining the
attribute data of spatial features prior to a formal structured
data analysis

4
Q

The data classification tool is based on what?

A

descriptive stats

5
Q

How is data exploration different to statistics?

A
  1. Both spatial and attribute data are involved
  2. The media used in GIS are maps, graphs and tables
6
Q

Data visualization involves?

A
  1. Rendering
  2. Manipulation
7
Q

Rendering def

A

what to show in a graphic plot & what type of plot to make

8
Q

Manipulation def

A

how to operate on individual plots and how to organise multiple plots.
Organise graphs so that they are easy to interpret

9
Q

3 NB tasks when exploring data

A
  1. Find patterns
  2. Pose queries (characteristics and subsets)
  3. Make comparisons
10
Q

Inferential stats def

A

generalizing from a sample to a population with calculated degree of certainty
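
One common way to attach a "calculated degree of certainty" to a generalization is a confidence interval for the mean. The sketch below is illustrative only: the sample values are hypothetical, and it assumes a normal approximation (for a sample this small a t-distribution would usually be preferred).

```python
from math import sqrt
from statistics import NormalDist, fmean, stdev

# Hypothetical sample of 10 elevation readings (metres) from a larger population
sample = [102.0, 98.5, 101.2, 99.8, 100.4, 97.9, 103.1, 100.0, 99.3, 101.6]

n = len(sample)
sample_mean = fmean(sample)
standard_error = stdev(sample) / sqrt(n)  # stdev() is the sample standard deviation

# z value that leaves 2.5% in each tail of the standard normal (about 1.96)
z = NormalDist().inv_cdf(0.975)

# Approximate 95% confidence interval for the population mean
confidence_interval = (sample_mean - z * standard_error,
                       sample_mean + z * standard_error)
print(confidence_interval)
```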

11
Q

Descriptive stats def

A

statistics that provide a summary of a dataset, such as measures of central tendency and other summary stats

12
Q

Descriptive stats (4)

A
  1. Measures of central tendency
  2. Measures of dispersion
  3. Skewness
  4. Kurtosis
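
Skewness and kurtosis from the list above can be computed by hand as a quick check. This is a minimal sketch that assumes the moment-based (population) definitions; the function names and sample lists are illustrative, and statistical packages may apply small-sample corrections that give slightly different numbers.

```python
from statistics import fmean, pstdev

def skewness(values):
    """Third standardised moment: > 0 means a longer right tail, < 0 a longer left tail."""
    m, s = fmean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def excess_kurtosis(values):
    """Fourth standardised moment minus 3, so a normal distribution scores about 0."""
    m, s = fmean(values), pstdev(values)
    return sum(((v - m) / s) ** 4 for v in values) / len(values) - 3.0

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]

print(skewness(symmetric), skewness(right_skewed))
print(excess_kurtosis(symmetric))
```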
13
Q

Measures of dispersion

A
  1. Standard deviation
  2. Variance
  3. Z score
  4. Range
  5. Standard difference
14
Q

A low standard deviation indicates

A

the data points tend to be very close to the mean

15
Q

A high standard deviation indicates

A

the data points are spread out over a large range of values

16
Q

Standard dev def

A

Shows how much variation or dispersion exists from the average (mean)

17
Q

Variance

A

Measure of how far a set of numbers is spread out (Standard
deviation squared)

18
Q

Standard score (z score) def

A

The standardized or z score informs how many standard deviations a
reading is above or below the mean
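
The three definitions above (standard deviation, variance, z score) fit together as in this short sketch. The values are hypothetical, and the population formulas (`pstdev`, `pvariance`) are an assumption; use `stdev` and `variance` instead if the data are a sample.

```python
import statistics

# Hypothetical attribute values for eight features
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = statistics.fmean(values)
std_dev = statistics.pstdev(values)      # population standard deviation
variance = statistics.pvariance(values)  # equals std_dev squared

# z score: how many standard deviations each reading lies above (+) or below (-) the mean
z_scores = [(v - mean) / std_dev for v in values]

print(mean, std_dev, variance)
print(z_scores)
```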

19
Q

Classification def

A

the process of reducing a large number of individual quantitative values to a smaller number of ordered categories, each of which comprises a portion of the original data value range

20
Q

Classification types

A
  1. Natural breaks
  2. Equal interval classes
  3. Geometric interval
  4. Mean and standard deviation
  5. Quantile
  6. User defined
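
Two of the methods listed above, equal interval and quantile, are simple enough to sketch directly. The function names are illustrative, the output is assumed to be a list of upper class limits, and the sketch assumes at least as many values as classes; GIS packages may place quantile breaks slightly differently.

```python
def equal_interval_breaks(values, k):
    """Upper class limits for k classes that each span the same width of the data range."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k + 1)]

def quantile_breaks(values, k):
    """Upper class limits chosen so each class holds roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * i) // k - 1] for i in range(1, k + 1)]

# Hypothetical data with one large outlier
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(equal_interval_breaks(data, 3))  # equal widths; most values land in the first class
print(quantile_breaks(data, 5))        # two values per class
```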
21
Q

Fundamental principle of classification

A

– Classes must ALWAYS be mutually exclusive AND exhaustive
– Monochrome: 5-7 classes
– Multi-hue: 9 classes or fewer
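
"Mutually exclusive and exhaustive" means every value falls into exactly one class. A minimal sketch of one way to enforce that, assuming classes are described by their upper limits and each limit belongs to the class below it (the function name and limits are hypothetical):

```python
from bisect import bisect_left

def classify(value, upper_limits):
    """Return the index of the one class that contains value.

    Class i covers values above upper_limits[i - 1] up to and including
    upper_limits[i]; the first class takes everything up to its limit.
    """
    if value > upper_limits[-1]:
        raise ValueError("value lies above the last class limit")
    return bisect_left(upper_limits, value)

limits = [34.0, 67.0, 100.0]
print(classify(34.0, limits))  # a value on a boundary belongs to the lower class only
print(classify(34.5, limits))
```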

22
Q

Considerations for classification

A
  1. Available symbols
  2. Communication goal
  3. Complexity of spatial pattern
23
Q

Quantitative precision

A
  • larger no. of classes
  • each class represents a small range
    – too much info
    – indistinct symbols
24
Q

Immediate graphic impact

A
  • small no. of classes
  • graphically clear, quantitatively imprecise
    – oversimplification
    – a class may contain widely varying data values
25
Q

Natural breaks def

A

method seeks to reduce the variance within classes and
maximize the variance between classes
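
The idea can be illustrated with an exhaustive search over a small dataset. This is a sketch of the principle only, not the optimised algorithm GIS software uses; the function names and data are hypothetical. Because the total variation is fixed, the split with the smallest within-class spread is also the one with the largest between-class spread.

```python
from itertools import combinations
from statistics import fmean

def sum_sq_dev(group):
    """Sum of squared deviations of a class from its own mean."""
    m = fmean(group)
    return sum((v - m) ** 2 for v in group)

def natural_breaks(values, k):
    """Try every split of the sorted values into k contiguous classes and
    keep the split with the smallest total within-class squared deviation."""
    ordered = sorted(values)
    n = len(ordered)
    best_classes, best_cost = None, float("inf")
    for cuts in combinations(range(1, n), k - 1):
        edges = (0, *cuts, n)
        classes = [ordered[a:b] for a, b in zip(edges, edges[1:])]
        cost = sum(sum_sq_dev(c) for c in classes)
        if cost < best_cost:
            best_classes, best_cost = classes, cost
    return best_classes

print(natural_breaks([1, 2, 3, 10, 11, 12, 50, 51], 3))
# [[1, 2, 3], [10, 11, 12], [50, 51]]
```

The search grows very quickly with the number of values, so it is only practical for small examples like this one.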

26
Q

Why are equal intervals not good for a rectangular (uniform) data distribution?

A
  • Each bin has an equal width, and with a uniform distribution each bin also holds approximately the same number of data points, so the classes show little contrast between them.
  • Under a uniform distribution every value is equally likely, so equal interval binning does not highlight any particular areas of interest within the data.
27
Q

Why are user-defined intervals not good for skewed data?

A

Many classes will be empty and not
mapped

28
Q

Why are mean & std dev classes not good for skewed data?

A

For skewed data, the mean is pulled toward the long tail, and the standard deviation may not accurately reflect the dispersion around the central values, resulting in misleading class boundaries.
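
A small worked example shows the effect. The data are hypothetical, and the class boundaries are assumed to sit at the mean and one standard deviation either side of it.

```python
import statistics

# Hypothetical right-skewed data: nine small values and one very large one
skewed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 60]

mean = statistics.fmean(skewed)
median = statistics.median(skewed)
sd = statistics.pstdev(skewed)

# Class boundaries at mean - 1 sd, mean, mean + 1 sd
breaks = [mean - sd, mean, mean + sd]

below_mean = sum(v < mean for v in skewed)

print(mean, median)   # the mean sits well above the median
print(breaks)         # the lowest boundary falls below the smallest value
print(below_mean)     # almost every value ends up in one class
```

Here the single large value drags the mean upward and inflates the standard deviation, so the class below `mean - sd` is empty and nine of the ten values share one class.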
