8 Image Classification II Flashcards

1
Q

Principle of image classification

Supervised classification 1/2

A

–Using training data
–Interaction with the operator
–A priori knowledge of the image is required
–Training data (Knowledge of the image or field observation)
–Classes are defined by the operator during the training process
–Pixels in the image are assigned to the class to which they fit best

*Notes on training data:
The operator either goes to the field, collects some polygons in different classes and defines them as training data, or interprets the image visually ("this is forest, bare soil, urban, …") and defines the training data from that.

2
Q

Principle of image classification

Supervised classification 2/2

A

–Supervised classification needs a high level of interaction with the operator (time-consuming and expensive)

–A successful supervised classification strongly depends on the correct selection of the spectral classes by the operator

–Often, it is assumed that the classes follow a normal distribution.

–If a class contains different spectral classes (a multi-modal distribution), the classification does not work properly

–The operator usually defines thematic classes and not spectral classes and rarely has knowledge about the number of spectral classes in a thematic class

3
Q

Principle of image classification

Unsupervised classification (clustering)

A

–No training data needed

–Clusters are automatically defined by the algorithm

–No prior knowledge assumed about data

–Classifies image into different unknown classes

–The classes are spectrally homogeneous

–Unsupervised classifications are usually done iteratively

–The algorithm is iterated until a criterion is satisfied

–Operator usually has control over
•Number of classes
•Number of iterations
•Convergence thresholds

–The operator assigns labels to the clusters (transform them into thematic classes)

–Main unsupervised classification algorithms
•K-means
•ISODATA

4
Q

Unsupervised classification algorithms

K-means

A
  1. A number of cluster centers are positioned randomly or evenly in feature space
  2. Pixels are assigned to their nearest cluster
  3. The mean location is recalculated for each cluster
  4. Steps 2 and 3 are repeated until cluster centers are stable or move less than a threshold.
  5. Class types are assigned to spectral clusters.
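The steps above can be sketched in plain Python. This is a minimal 2-band illustration, not a production classifier; the pixel tuples, threshold and function name are invented for the example:

```python
import random

def kmeans(pixels, k, max_iter=100, threshold=1e-6, seed=0):
    """Minimal k-means for pixels given as tuples of band values."""
    rng = random.Random(seed)
    # Step 1: position k cluster centers randomly among the pixels
    centers = [tuple(p) for p in rng.sample(pixels, k)]
    for _ in range(max_iter):
        # Step 2: assign each pixel to its nearest center (squared Euclidean distance)
        clusters = [[] for _ in range(k)]
        for p in pixels:
            nearest = min(range(k),
                          key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[nearest].append(p)
        # Step 3: recalculate the mean location of each cluster
        new_centers = []
        for c, members in zip(centers, clusters):
            if members:
                n = len(members)
                new_centers.append(tuple(sum(m[d] for m in members) / n
                                         for d in range(len(members[0]))))
            else:
                new_centers.append(c)  # an empty cluster keeps its old center
        # Step 4: stop once every center moves less than the threshold
        shift = max(sum((a - b) ** 2 for a, b in zip(old, new)) ** 0.5
                    for old, new in zip(centers, new_centers))
        centers = new_centers
        if shift < threshold:
            break
    # Step 5 (assigning class labels to the spectral clusters) is left to the operator.
    return centers, clusters
```

On two well-separated groups of pixels with k = 2, the centers converge to the group means after a few iterations.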
5
Q

Unsupervised classification algorithms

K-means notes

A

–Number of classes is defined by the operator

–It is important to select a reasonable number of classes

6
Q

Unsupervised classification algorithms

ISODATA (Iterative Self Organizing Data Analysis Technique)

A

–Extension of k-means

–No need to know the number of clusters

–The algorithm splits and merges clusters if needed

–Considers standard deviation of clusters

–The algorithm is iterated until a threshold is reached

7
Q

Unsupervised classification algorithms

Operator controls in ISODATA (Iterative Self Organizing Data Analysis Technique)

A
  • Initial number of clusters
  • Thresholds for termination
  • Minimum number of members in a cluster
  • Minimum distance between clusters
  • Maximum standard deviation of each cluster
8
Q

Unsupervised classification algorithms

ISODATA (Iterative Self Organizing Data Analysis Technique) steps

A

(Steps 1 and 2 are the same as in k-means: position the initial cluster centers and assign each pixel to its nearest cluster.)

Step 3
•Clusters that are too small are deleted
•Distances between clusters are calculated
•Clusters are combined if their distance is smaller than a threshold
•The standard deviation of each cluster is calculated
•A cluster is split if its standard deviation in any dimension is larger than a threshold

Step 4
•Steps 2 and 3 are repeated until:
–The convergence threshold is reached
–The maximum number of iterations is reached

Step 5
•Class types are assigned to spectral clusters.
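As a toy illustration of the delete/merge/split logic in Step 3, using 1-D feature values only (real ISODATA implementations work in full feature space, split along the dimension of largest standard deviation, and iterate such passes; the thresholds here are invented for the example):

```python
def isodata_pass(clusters, min_members=2, min_distance=1.0, max_std=1.0):
    """One delete/merge/split pass over clusters of 1-D feature values."""
    mean = lambda c: sum(c) / len(c)
    std = lambda c: (sum((x - mean(c)) ** 2 for x in c) / len(c)) ** 0.5

    # Delete clusters that are too small
    clusters = [c for c in clusters if len(c) >= min_members]

    # Combine clusters whose means are closer than the distance threshold
    merged, absorbed = [], set()
    for i, ci in enumerate(clusters):
        if i in absorbed:
            continue
        for j in range(i + 1, len(clusters)):
            if j not in absorbed and abs(mean(ci) - mean(clusters[j])) < min_distance:
                ci = ci + clusters[j]
                absorbed.add(j)
        merged.append(ci)

    # Split a cluster at its mean if its standard deviation exceeds the threshold
    result = []
    for c in merged:
        if std(c) > max_std and len(c) >= 2 * min_members:
            m = mean(c)
            result.append([x for x in c if x < m])
            result.append([x for x in c if x >= m])
        else:
            result.append(c)
    return result
```

For instance, `[[0.0, 0.1], [0.2, 0.3], [10.0]]` deletes the single-member cluster and merges the two near ones, while a wide cluster such as `[0.0, 0.0, 10.0, 10.0]` is split in two.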

9
Q

Unsupervised classification algorithms

ISODATA (Iterative Self Organizing Data Analysis Technique) advantages

A
  • No prior information about the data is needed
  • Interaction with the operator is minimal
  • Flexible about the number of clusters
  • Very effective at identifying spectral clusters in the data
10
Q

Unsupervised classification algorithms

ISODATA (Iterative Self Organizing Data Analysis Technique) disadvantages

A
  • Might be time-consuming for unstructured data

* Sometimes the algorithm leaves an unreasonable number of classes

11
Q

Unsupervised classification facts

A

–Useful in areas where no ground truth is available or difficult to obtain

–Can often produce information that is not obvious by visual inspection

–Results may not coincide with desired land cover classes

–Often used to trigger a subsequent supervised classification

12
Q

Classification validation

A
  • The result of classification is a raster file in which each pixel is labelled
  • Is the classification map appropriate to use?
  • The quality of the output should be checked.
  • It is not possible to check the whole classification map.

•Quality checking is usually done by sampling a number of elements from the classification results and comparing them with ground truth.

•In general it is not a good idea to also use the training sites (which were already used for the classification) for accuracy assessment -> we need independent samples in the area for accuracy assessment

13
Q

Classification validation

Sampling ground truth data

A

–Principles of statistical sampling must be employed

–Any sampling scheme should:
•Have a low probability of accepting a map of low accuracy
•Have a high probability of accepting a map of high accuracy
•Require a minimum number of ground truth samples

–Sampling methods
•Random sampling
•Stratified random sampling

14
Q

Sampling methods

A

Random sampling
+Statistically sound
−Small or rare classes may receive few (or no) samples

Stratified random sampling
+Representative for all classes/areas
−Requires a priori knowledge (the strata, e.g. a preliminary classification)
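The difference can be sketched with a small stratified sampler, where `labels` stands for the flattened classified raster (the function name and class names are invented for the example):

```python
import random

def stratified_sample(labels, n_per_class, seed=0):
    """Draw up to n_per_class random pixel indices from each map class."""
    rng = random.Random(seed)
    by_class = {}
    for idx, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(idx)
    # Plain random sampling could miss a rare class entirely;
    # stratifying guarantees every class is represented.
    return {lab: sorted(rng.sample(idxs, min(n_per_class, len(idxs))))
            for lab, idxs in by_class.items()}
```

Note that the stratification itself requires a priori knowledge of the classes, which is exactly the disadvantage listed above.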

15
Q

Error matrix (confusion matrix)

A

An error matrix cross-tabulates classified labels against reference (ground truth) labels; from it, different accuracy measures can be calculated.

16
Q

Overall accuracy (total accuracy) derived from Error matrix (confusion matrix)

A

–Proportion Correctly Classified (PCC)

–Number of correctly classified samples / the total number of samples
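For a hypothetical 3-class error matrix (rows = classified map, columns = reference data; the counts are made up for the example):

```python
# Rows: class as mapped; columns: class in the reference (ground truth) data.
error_matrix = [
    [35,  2,  2],   # mapped as forest
    [ 3, 37,  1],   # mapped as water
    [ 2,  1, 41],   # mapped as urban
]

correct = sum(error_matrix[i][i] for i in range(len(error_matrix)))  # diagonal = 113
total = sum(sum(row) for row in error_matrix)                        # all samples = 124
overall_accuracy = correct / total  # 113 / 124 ≈ 0.91
```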

17
Q

Overall accuracy (total accuracy) facts

A

–Overall accuracy provides one figure for the classification results as a whole.

–Does not provide any information on individual classes

–No information on whether the error is evenly distributed between classes or whether some classes were more accurate than others

–Often a certain class might have a high accuracy while others may have poor accuracy

–In general, the more classes there are, the more confusion –> lower overall accuracy

–Separating crop/no crop yields much higher accuracy values than the distinction of 10
crop types.

–Including large areas of easily identifiable features (e.g. sea/lake) yields high overall accuracy -> one class has high accuracy while the others may not, which leads to an overestimated overall accuracy

18
Q

There are also measures derived from the error matrix calculated per class

A

–Errors of omission
–Errors of commission
–User’s accuracy
–Producer’s accuracy

19
Q

Errors of omission (Type I error) - column

omission = leaving out

A

–Refers to reference samples that are omitted from the correct class in the classification map

–Real-world (reference) samples are reviewed for incorrect classification

–Incorrectly classified reference samples in a class / total number of reference samples in that class

20
Q

Errors of commission (Type II error) - row

A

–Refers to pixels that are incorrectly assigned to a class in the classified map

–Pixels in classified map are reviewed for incorrect classification

–Incorrectly classified pixels in class / total number of pixels in the class

21
Q

Producer’s accuracy 1/2

A

–Provides a measure of accuracy from the perspective of the operator of the
classification

–How well can the situation on the ground be mapped?

–How often are real features on the ground correctly shown in the classified map?

–Complement of the omission error (producer’s accuracy = 100% − omission error)

22
Q

Producer’s accuracy 2/2

A

–Number of reference samples correctly identified in a given class / number of reference samples in that class

–Refers to the columns in error matrix

23
Q

User’s accuracy 1/2

A

–Provides a measure of accuracy from the perspective of the user of the classification map

–How reliable is the map?

–For a given class on the map, how often will that class actually be present on the ground?

–Complement of the commission error (user’s accuracy = 100% − commission error)

24
Q

User’s accuracy 2/2

A

–Number of pixels correctly identified in a given class / total number of pixels classified into that class

–Refers to the rows in error matrix
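Both per-class measures come from the same error matrix, assuming the convention rows = classified map, columns = reference data (counts invented for the example):

```python
error_matrix = [   # rows: mapped class; columns: reference class
    [35,  2,  2],
    [ 3, 37,  1],
    [ 2,  1, 41],
]

def producers_accuracy(matrix, j):
    """Diagonal / column total: correctly mapped share of reference class j."""
    return matrix[j][j] / sum(row[j] for row in matrix)

def users_accuracy(matrix, i):
    """Diagonal / row total: share of pixels mapped as class i that are correct."""
    return matrix[i][i] / sum(matrix[i])

# Omission error = 1 - producer's accuracy; commission error = 1 - user's accuracy.
```

For class 0 above, producer’s accuracy is 35/40 and user’s accuracy is 35/39.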

25
Q

Kappa coefficient 1/2

A

–Even assigning labels randomly results in a certain degree of accuracy.

–The Kappa coefficient evaluates how much better the classification performs than a random label assignment.

–Kappa also allows comparison of two datasets to check whether they have different
accuracies

26
Q

Kappa coefficient 2/2

A
–The Kappa coefficient can range between −1 and 1.
•0: no better than random classification
•<0: significantly worse than random
•>0: significantly better than random
•1: perfect classification

K = (P0 - Pc) / (1 - Pc)
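Here P0 is the observed agreement (the overall accuracy) and Pc is the agreement expected by chance, computed from the row and column totals of the error matrix. A sketch, with an error matrix invented for the example (rows = map, columns = reference):

```python
def kappa(matrix):
    """Kappa from an error matrix: k = (p0 - pc) / (1 - pc)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    p0 = sum(matrix[i][i] for i in range(n)) / total  # observed agreement
    # Chance agreement: for each class, (row total * column total) / total^2
    pc = sum(sum(matrix[i]) * sum(row[i] for row in matrix)
             for i in range(n)) / total ** 2
    return (p0 - pc) / (1 - pc)

error_matrix = [
    [35,  2,  2],
    [ 3, 37,  1],
    [ 2,  1, 41],
]
```

For this matrix the overall accuracy is about 0.91, while kappa is about 0.87, slightly lower because part of the agreement would occur by chance.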

27
Q

Kappa coefficient notes

A
  • When the accuracy of some classes is more important than others, weights 𝑤ij might be assigned to the 𝑝0 and 𝑝c terms.
  • A positive Kappa coefficient means the classification is better than random assignment; if it is not close to 1, the classification can still be improved (e.g. with better training data or a more realistic number of spectral classes).