8 Image Classification II Flashcards
Principle of image classification
Supervised classification 1/2
–Using training data
–Interaction with the operator
–A priori knowledge of the image is required
–Training data (Knowledge of the image or field observation)
–Classes are defined by operator during the training process
–Pixels in the image are assigned to the class to which they fit best
*notes for training data:
the operator either goes to the field, delineates polygons for the different classes and defines them as training data, or interprets the image directly (e.g. "this is forest, this is bare soil, this is urban") and defines the training data from the image itself
Principle of image classification
Supervised classification 2/2
–Supervised classification needs a high level of interaction with the operator (time-consuming and expensive)
–A successful supervised classification strongly depends on the correct selection of the spectral classes by the operator
–Often, it is assumed that the classes follow a normal distribution (the basis of maximum-likelihood classification; see the sketch after this card)
–If a class contains several spectral classes (a multi-modal distribution), the classification does not work properly
–The operator usually defines thematic classes and not spectral classes and rarely has knowledge about the number of spectral classes in a thematic class
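A minimal sketch of how the normal-distribution assumption is typically exploited is a Gaussian maximum-likelihood classifier, as below. This is illustrative only: the function name, array shapes and the dictionary of training samples are assumptions, not part of the course material.

```python
import numpy as np

def gaussian_ml_classify(pixels, train_samples):
    """Assign each pixel to the class with the highest Gaussian log-likelihood.

    pixels        : (N, B) array of N pixels with B spectral bands
    train_samples : dict mapping class name -> (n_i, B) array of training pixels
                    (assumes each class is unimodal and has enough samples for a
                    well-conditioned covariance matrix)
    """
    names, log_likes = list(train_samples), []
    for name in names:
        x = train_samples[name]
        mean = x.mean(axis=0)
        cov = np.cov(x, rowvar=False)
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        d = pixels - mean
        # log multivariate normal density; the constant term is identical for
        # all classes and can be dropped
        log_likes.append(-0.5 * (np.einsum('ij,jk,ik->i', d, inv, d) + logdet))
    best = np.argmax(np.stack(log_likes, axis=1), axis=1)
    return np.array(names)[best]
```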
Principle of image classification
Unsupervised classification (clustering)
–No training data needed
–Clusters are automatically defined by the algorithm
–No prior knowledge assumed about data
–Classifies image into different unknown classes
–The classes are spectrally homogenous
–Unsupervised classifications are usually done iteratively
–The algorithm is iterated until a criterion is satisfied
–Operator usually has control over
•Number of classes
•Number of iterations
•Convergence thresholds
–The operator assigns labels to the clusters (transform them into thematic classes)
–Main unsupervised classification algorithms
•K-means
•ISODATA
Unsupervised classification algorithms
K-means
- A number of cluster centers are positioned randomly or evenly in feature space
- Pixels are assigned to their nearest cluster
- The mean location is recalculated for each cluster
- Steps 2 and 3 are repeated until cluster centers are stable or move less than a threshold (see the sketch below)
- Class types are assigned to spectral clusters.
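A minimal NumPy sketch of these K-means steps, assuming the image has already been reshaped to an (N, B) array of pixel vectors in feature space; the function name and parameters are illustrative, not a specific library API.

```python
import numpy as np

def kmeans(pixels, k, max_iter=50, tol=1e-4, rng=None):
    """Cluster (N, B) pixel vectors into k spectral clusters (steps 1-4 above)."""
    rng = np.random.default_rng(rng)
    # Step 1: position k cluster centres randomly in feature space
    centres = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(max_iter):
        # Step 2: assign every pixel to its nearest cluster centre
        dists = np.linalg.norm(pixels[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: recalculate the mean location of each cluster
        new_centres = np.array([pixels[labels == i].mean(axis=0) if np.any(labels == i)
                                else centres[i] for i in range(k)])
        # Step 4: stop when the centres move less than the threshold
        if np.linalg.norm(new_centres - centres) < tol:
            break
        centres = new_centres
    return labels, centres
```

For an image array of shape (rows, cols, bands), this could be called as `labels, centres = kmeans(image.reshape(-1, image.shape[-1]), k=6)` and the labels reshaped back to the image grid; the class types (step 5) are then assigned by the operator.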
Unsupervised classification algorithms
K-means notes
–Number of classes is defined by the operator
–It is important to select a reasonable number of classes
Unsupervised classification algorithms
ISODATA (Iterative Self Organizing Data Analysis Technique)
–Extension of k-means
–No need to know the number of clusters
–The algorithm splits and merges clusters if needed
–Considers standard deviation of clusters
–The algorithm is iterated until a threshold is reached
Unsupervised classification algorithms
Operator controls in ISODATA (Iterative Self Organizing Data Analysis Technique)
- Initial number of clusters
- Thresholds for termination
- Minimum number of members in a cluster
- Minimum distance between clusters
- Maximum standard deviation of each cluster (these controls are grouped into a parameter set in the sketch below)
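These operator controls can be collected into a single parameter set. The sketch below is a hypothetical container (the field names and default values are assumptions), reused by the ISODATA loop sketched after the steps card.

```python
from dataclasses import dataclass

@dataclass
class IsodataParams:
    n_clusters_init: int = 10      # initial number of clusters
    max_iter: int = 20             # termination threshold (maximum iterations)
    min_members: int = 30          # minimum number of members in a cluster
    min_cluster_dist: float = 5.0  # merge clusters whose centres are closer than this
    max_std: float = 15.0          # split a cluster whose std in any band exceeds this
```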
Unsupervised classification algorithms
ISODATA (Iterative Self Organizing Data Analysis Technique) steps
Step 3
•Clusters that are too small are deleted
•Distances between clusters are calculated
•Clusters are merged if their distance is smaller than a threshold
•The standard deviation of each cluster is calculated
•A cluster is split if its standard deviation in any dimension is larger than a threshold
Step 4
•Steps 2 and 3 are repeated until:
–The convergence threshold is reached, or
–The maximum number of iterations is reached
Step 5
•Class types are assigned to spectral clusters (a simplified sketch of the loop follows).
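A simplified, illustrative ISODATA loop using the hypothetical IsodataParams container from above; real implementations differ in details (e.g. how splits are performed and how convergence is tested), so this is a sketch of the delete/merge/split idea, not a reference implementation.

```python
import numpy as np

def isodata(pixels, p: IsodataParams, rng=None):
    """Cluster (N, B) pixel vectors with delete/merge/split steps (Steps 3-5 above)."""
    rng = np.random.default_rng(rng)
    centres = pixels[rng.choice(len(pixels), size=p.n_clusters_init, replace=False)]
    for _ in range(p.max_iter):                      # Step 4: iterate up to max_iter
        # assign every pixel to its nearest centre and form the clusters
        labels = np.linalg.norm(pixels[:, None, :] - centres[None, :, :], axis=2).argmin(axis=1)
        clusters = [pixels[labels == i] for i in range(len(centres))]
        # Step 3: delete clusters that are too small (assumes at least one survives)
        clusters = [c for c in clusters if len(c) >= p.min_members]
        centres = np.array([c.mean(axis=0) for c in clusters])
        # Step 3: merge clusters whose centres are closer than the threshold
        merged, used = [], set()
        for i in range(len(clusters)):
            if i in used:
                continue
            group = clusters[i]
            for j in range(i + 1, len(clusters)):
                if j not in used and np.linalg.norm(centres[i] - centres[j]) < p.min_cluster_dist:
                    group = np.vstack([group, clusters[j]])
                    used.add(j)
            merged.append(group)
        # Step 3: split a cluster if its std in any band exceeds the threshold
        split = []
        for c in merged:
            std = c.std(axis=0)
            if std.max() > p.max_std and len(c) >= 2 * p.min_members:
                band = std.argmax()
                cut = np.median(c[:, band])
                lo, hi = c[c[:, band] <= cut], c[c[:, band] > cut]
                split += [lo, hi] if len(lo) and len(hi) else [c]
            else:
                split.append(c)
        centres = np.array([c.mean(axis=0) for c in split])
    # Step 5: final assignment; the operator then labels these spectral clusters
    labels = np.linalg.norm(pixels[:, None, :] - centres[None, :, :], axis=2).argmin(axis=1)
    return labels, centres
```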
Unsupervised classification algorithms
ISODATA (Iterative Self Organizing Data Analysis Technique) advantages
- No prior information about the data needed
- Interaction with the operator is minimal
- Flexible about the number of clusters
- Very effective at identifying spectral clusters in data
Unsupervised classification algorithms
ISODATA (Iterative Self Organizing Data Analysis Technique) disadvantages
- Might be time-consuming for unstructured data
- Sometimes the algorithm produces an unreasonable number of classes
Unsupervised classification facts
–Useful in areas where no ground truth is available or difficult to obtain
–Can often produce information that is not obvious by visual inspection
–Results may not coincide with desired land cover classes
–Often used to trigger a subsequent supervised classification
Classification validation
- The result of a classification is a raster file in which each pixel is labelled
- Is the classification map appropriate to use?
- The quality of output should be checked.
- It is not possible to check the whole classification map.
•Quality checking is usually done by sampling a number of elements from the classification results and comparing them with ground truth.
•In general it is not a good idea to also use the training sites (which were already used for the classification) for accuracy assessment -> we need independent samples in the area for accuracy assessment
Classification validation
Sampling ground truth data
–principles of statistical sampling must be employed
–Any sampling scheme should:
•Have a low probability of accepting a map of low accuracy
•Have a high probability of accepting a map of high accuracy
•Require a minimum number of ground truth samples
–Sampling methods
•Random sampling
•Stratified random sampling
Sampling methods
Random sampling
+ statistically sound
- small classes/areas may be under-represented
Stratified random sampling
+ representative for all areas/classes
- requires a priori knowledge of the classes (see the sampling sketch below)
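A minimal sketch of drawing stratified random validation samples from a classified raster; the per-class sample size and function name are assumptions.

```python
import numpy as np

def stratified_random_sample(class_map, n_per_class=50, rng=None):
    """Pick up to n_per_class random pixel locations from every class of a classified raster."""
    rng = np.random.default_rng(rng)
    samples = {}
    for cls in np.unique(class_map):
        rows, cols = np.nonzero(class_map == cls)
        n = min(n_per_class, len(rows))                 # a small class may have fewer pixels
        idx = rng.choice(len(rows), size=n, replace=False)
        samples[cls] = list(zip(rows[idx], cols[idx]))  # locations to compare with ground truth
    return samples
```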
Error matrix (confusion matrix)
An error matrix (confusion matrix) is formed by cross-tabulating the ground truth samples against the classification result; from it, different accuracy measures (e.g. overall, producer's and user's accuracy) can be calculated, as sketched below.
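A minimal sketch of building the error matrix from paired ground-truth/classified labels and deriving common accuracy measures; the function name and the row/column convention (rows = reference, columns = classification) are choices made for this illustration.

```python
import numpy as np

def error_matrix(reference, classified, classes):
    """Cross-tabulate reference labels against classified labels and derive accuracies."""
    m = np.zeros((len(classes), len(classes)), dtype=int)
    index = {c: i for i, c in enumerate(classes)}
    for ref, cls in zip(reference, classified):
        m[index[ref], index[cls]] += 1       # rows = ground truth, columns = classified map
    overall = np.trace(m) / m.sum()
    producers = np.diag(m) / m.sum(axis=1)   # producer's accuracy, per reference class
    users = np.diag(m) / m.sum(axis=0)       # user's accuracy, per mapped class
    return m, overall, producers, users
```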