8 Image Classification II Flashcards
Principle of image classification
Supervised classification 1/2
–Using training data
–Interaction with the operator
–A priori knowledge of the image is required
–Training data (Knowledge of the image or field observation)
–Classes are defined by operator during the training process
–Pixels in the image are assigned to the class to which they fit best
*notes for training data:
the operator either goes to the field, delineates polygons for the different classes and defines them as training data, or interprets the image directly (e.g. "this is forest, this is bare soil, this is urban") and defines the training data from the image itself
Principle of image classification
Supervised classification 2/2
–Supervised classification needs a high level of interaction with the operator (time-consuming and expensive)
–A successful supervised classification strongly depends on the correct selection of the spectral classes by the operator
–Often, it is assumed that the classes follow a normal distribution (the basis of maximum-likelihood classification; see the sketch after this card)
–If a class contains several spectral classes (a multi-modal distribution), the classification does not work properly
–The operator usually defines thematic classes and not spectral classes and rarely has knowledge about the number of spectral classes in a thematic class
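A minimal sketch of how the normal-distribution assumption is typically exploited is a Gaussian maximum-likelihood classifier, as below. This is illustrative only: the function name, array shapes and the dictionary of training samples are assumptions, not part of the course material.

```python
import numpy as np

def gaussian_ml_classify(pixels, train_samples):
    """Assign each pixel to the class with the highest Gaussian log-likelihood.

    pixels        : (N, B) array of N pixels with B spectral bands
    train_samples : dict mapping class name -> (n_i, B) array of training pixels
                    (assumes each class is unimodal and has enough samples for a
                    well-conditioned covariance matrix)
    """
    names, log_likes = list(train_samples), []
    for name in names:
        x = train_samples[name]
        mean = x.mean(axis=0)
        cov = np.cov(x, rowvar=False)
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        d = pixels - mean
        # log multivariate normal density; the constant term is identical for
        # all classes and can be dropped
        log_likes.append(-0.5 * (np.einsum('ij,jk,ik->i', d, inv, d) + logdet))
    best = np.argmax(np.stack(log_likes, axis=1), axis=1)
    return np.array(names)[best]
```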
Principle of image classification
Unsupervised classification (clustering)
–No training data needed
–Clusters are automatically defined by the algorithm
–No prior knowledge assumed about data
–Classifies image into different unknown classes
–The classes are spectrally homogenous
–Unsupervised classifications are usually done iteratively
–The algorithm is iterated until a criterion is satisfied
–Operator usually has control over
•Number of classes
•Number of iterations
•Convergence thresholds
–The operator assigns labels to the clusters (transform them into thematic classes)
–Main unsupervised classification algorithms
•K-means
•ISODATA
Unsupervised classification algorithms
K-means
- A number of cluster centers are positioned randomly or evenly in feature space
- Pixels are assigned to their nearest cluster
- The mean location is recalculated for each cluster
- Steps 2 and 3 are repeated until cluster centers are stable or move less than a threshold (see the sketch below)
- Class types are assigned to spectral clusters.
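A minimal NumPy sketch of these K-means steps, assuming the image has already been reshaped to an (N, B) array of pixel vectors in feature space; the function name and parameters are illustrative, not a specific library API.

```python
import numpy as np

def kmeans(pixels, k, max_iter=50, tol=1e-4, rng=None):
    """Cluster (N, B) pixel vectors into k spectral clusters (steps 1-4 above)."""
    rng = np.random.default_rng(rng)
    # Step 1: position k cluster centres randomly in feature space
    centres = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(max_iter):
        # Step 2: assign every pixel to its nearest cluster centre
        dists = np.linalg.norm(pixels[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: recalculate the mean location of each cluster
        new_centres = np.array([pixels[labels == i].mean(axis=0) if np.any(labels == i)
                                else centres[i] for i in range(k)])
        # Step 4: stop when the centres move less than the threshold
        if np.linalg.norm(new_centres - centres) < tol:
            break
        centres = new_centres
    return labels, centres
```

For an image array of shape (rows, cols, bands), this could be called as `labels, centres = kmeans(image.reshape(-1, image.shape[-1]), k=6)` and the labels reshaped back to the image grid; the class types (step 5) are then assigned by the operator.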
Unsupervised classification algorithms
K-means notes
–Number of classes is defined by the operator
–It is important to select a reasonable number of classes
Unsupervised classification algorithms
ISODATA (Iterative Self Organizing Data Analysis Technique)
–Extension of k-means
–No need to know the number of clusters
–The algorithm splits and merges clusters if needed
–Considers standard deviation of clusters
–The algorithm is iterated until a threshold is reached
Unsupervised classification algorithms
Operator controls in ISODATA (Iterative Self Organizing Data Analysis Technique)
- Initial number of clusters
- Thresholds for termination
- Minimum number of members in a cluster
- Minimum distance between clusters
- Maximum standard deviation of each cluster (these controls are grouped into a parameter set in the sketch below)
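These operator controls can be collected into a single parameter set. The sketch below is a hypothetical container (the field names and default values are assumptions), reused by the ISODATA loop sketched after the steps card.

```python
from dataclasses import dataclass

@dataclass
class IsodataParams:
    n_clusters_init: int = 10      # initial number of clusters
    max_iter: int = 20             # termination threshold (maximum iterations)
    min_members: int = 30          # minimum number of members in a cluster
    min_cluster_dist: float = 5.0  # merge clusters whose centres are closer than this
    max_std: float = 15.0          # split a cluster whose std in any band exceeds this
```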
Unsupervised classification algorithms
ISODATA (Iterative Self Organizing Data Analysis Technique) steps
Step 3
•Clusters that are too small are deleted
•Distances between clusters are calculated
•Clusters are merged if their distance is smaller than a threshold
•The standard deviation of each cluster is calculated
•A cluster is split if its standard deviation in any dimension is larger than a threshold
Step 4
•Steps 2 and 3 are repeated until:
–The convergence threshold is reached, or
–The maximum number of iterations is reached
Step 5
•Class types are assigned to spectral clusters (a simplified sketch of the loop follows).
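A simplified, illustrative ISODATA loop using the hypothetical IsodataParams container from above; real implementations differ in details (e.g. how splits are performed and how convergence is tested), so this is a sketch of the delete/merge/split idea, not a reference implementation.

```python
import numpy as np

def isodata(pixels, p: IsodataParams, rng=None):
    """Cluster (N, B) pixel vectors with delete/merge/split steps (Steps 3-5 above)."""
    rng = np.random.default_rng(rng)
    centres = pixels[rng.choice(len(pixels), size=p.n_clusters_init, replace=False)]
    for _ in range(p.max_iter):                      # Step 4: iterate up to max_iter
        # assign every pixel to its nearest centre and form the clusters
        labels = np.linalg.norm(pixels[:, None, :] - centres[None, :, :], axis=2).argmin(axis=1)
        clusters = [pixels[labels == i] for i in range(len(centres))]
        # Step 3: delete clusters that are too small (assumes at least one survives)
        clusters = [c for c in clusters if len(c) >= p.min_members]
        centres = np.array([c.mean(axis=0) for c in clusters])
        # Step 3: merge clusters whose centres are closer than the threshold
        merged, used = [], set()
        for i in range(len(clusters)):
            if i in used:
                continue
            group = clusters[i]
            for j in range(i + 1, len(clusters)):
                if j not in used and np.linalg.norm(centres[i] - centres[j]) < p.min_cluster_dist:
                    group = np.vstack([group, clusters[j]])
                    used.add(j)
            merged.append(group)
        # Step 3: split a cluster if its std in any band exceeds the threshold
        split = []
        for c in merged:
            std = c.std(axis=0)
            if std.max() > p.max_std and len(c) >= 2 * p.min_members:
                band = std.argmax()
                cut = np.median(c[:, band])
                lo, hi = c[c[:, band] <= cut], c[c[:, band] > cut]
                split += [lo, hi] if len(lo) and len(hi) else [c]
            else:
                split.append(c)
        centres = np.array([c.mean(axis=0) for c in split])
    # Step 5: final assignment; the operator then labels these spectral clusters
    labels = np.linalg.norm(pixels[:, None, :] - centres[None, :, :], axis=2).argmin(axis=1)
    return labels, centres
```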
Unsupervised classification algorithms
ISODATA (Iterative Self Organizing Data Analysis Technique) advantages
- No prior information about the data needed
- Interaction with the operator is minimal
- Flexible about the number of clusters
- Very effective at identifying spectral clusters in data
Unsupervised classification algorithms
ISODATA (Iterative Self Organizing Data Analysis Technique) disadvantages
- Might be time-consuming for unstructured data
- Sometimes the algorithm produces an unreasonable number of classes
Unsupervised classification facts
–Useful in areas where no ground truth is available or difficult to obtain
–Can often produce information that is not obvious by visual inspection
–Results may not coincide with desired land cover classes
–Often used to trigger a subsequent supervised classification
Classification validation
- The result of a classification is a raster file in which each pixel is labelled
- Is the classification map appropriate to use?
- The quality of output should be checked.
- It is not possible to check the whole classification map.
•Quality checking is usually done by sampling a number of elements from the classification results and comparing them with ground truth.
•In general it is not a good idea to also use the training sites (which were already used for the classification) for accuracy assessment -> we need independent samples in the area for accuracy assessment
Classification validation
Sampling ground truth data
–principles of statistical sampling must be employed
–Any sampling scheme should:
•Have a low probability of accepting a map of low accuracy
•Have a high probability of accepting a map of high accuracy
•Require a minimum number of ground truth samples
–Sampling methods
•Random sampling
•Stratified random sampling
Sampling methods
Random sampling
+ statistically sound
- small classes/areas may be under-represented
Stratified random sampling
+ representative for all areas/classes
- requires a priori knowledge of the classes (see the sampling sketch below)
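A minimal sketch of drawing stratified random validation samples from a classified raster; the per-class sample size and function name are assumptions.

```python
import numpy as np

def stratified_random_sample(class_map, n_per_class=50, rng=None):
    """Pick up to n_per_class random pixel locations from every class of a classified raster."""
    rng = np.random.default_rng(rng)
    samples = {}
    for cls in np.unique(class_map):
        rows, cols = np.nonzero(class_map == cls)
        n = min(n_per_class, len(rows))                 # a small class may have fewer pixels
        idx = rng.choice(len(rows), size=n, replace=False)
        samples[cls] = list(zip(rows[idx], cols[idx]))  # locations to compare with ground truth
    return samples
```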
Error matrix (confusion matrix)
An error matrix (confusion matrix) is formed by cross-tabulating the ground truth samples against the classification result; from it, different accuracy measures (e.g. overall, producer's and user's accuracy) can be calculated, as sketched below.
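A minimal sketch of building the error matrix from paired ground-truth/classified labels and deriving common accuracy measures; the function name and the row/column convention (rows = reference, columns = classification) are choices made for this illustration.

```python
import numpy as np

def error_matrix(reference, classified, classes):
    """Cross-tabulate reference labels against classified labels and derive accuracies."""
    m = np.zeros((len(classes), len(classes)), dtype=int)
    index = {c: i for i, c in enumerate(classes)}
    for ref, cls in zip(reference, classified):
        m[index[ref], index[cls]] += 1       # rows = ground truth, columns = classified map
    overall = np.trace(m) / m.sum()
    producers = np.diag(m) / m.sum(axis=1)   # producer's accuracy, per reference class
    users = np.diag(m) / m.sum(axis=0)       # user's accuracy, per mapped class
    return m, overall, producers, users
```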