EX2 - Image Classification Flashcards
What is image classification?
Digital image classification is the process of assigning pixels to classes.
Pixels are compared to one another and to pixels of known identity. Then, similar pixels are grouped into classes of interest.
These classes form regions on a map or image.
Pixels within a class are spectrally more similar to one another than they are to pixels in other classes.
Theoretically, each class is homogeneous. Practically, each class has some diversity.
Spectral/point classifiers
Classifier refers loosely to a computer program that implements a specific procedure for image classification
Point classifiers consider each pixel individually, assigning it to a class based on its several values measured in separate spectral bands
Pros: simplicity
Cons: does not exploit the information contained in the relationships between each pixel and its neighbors.
Human interpreters, for example, derive little information from this point-by-point approach; humans derive information from context and from patterns of brightness across groups of pixels
Supervised Classification
4 advantages and 5 disadvantages
Advantages:
1) Analyst controls selection of information classes.
2) Classification is tied to specific areas of known identity (training areas)
3) No need to match spectral classes to informational classes
4) Easier to detect serious errors in training data
Disadvantages:
1) May impose a structure on the data that does not match the natural classes in the data
2) Training areas are defined primarily with respect to informational category and secondarily with respect to spectral properties (100% forest?)
3) If the area to be classified is large and complex, training data may not be representative of conditions encountered throughout the image
4) Selection of training data can be time consuming, tedious and expensive
5) Supervised classification may not be able to recognize and represent special or unique categories not represented in the training data (because they are unknown to the analyst or they occupy very small areas on the image)
Unsupervised Classification
5 advantages and 3 disadvantages
Advantages
1) No extensive prior knowledge required
2) Objective; minimal human influence, bias, or error
3) Unique classes (large and small) are recognized as distinct units
4) Works fast
5) Works the same way always
Disadvantages:
1) Identifies spectrally homogeneous classes that may not correspond to the informational categories of interest to the analyst
2) Analyst has little control
3) Spectral properties of specific informational classes will change over time (seasonally and over years). Relationships defined for one image cannot be extended to others
Informational Classes and Spectral subclasses
FOREST > shadowed > pine > etc.
Categories of interest to users, Object of analysis
Classes that we wish to derive from the data. We derive these using brightness values (BV)
Each information class is composed of numerous spectral subclasses.
In classification, we treat spectral subclasses as distinct units, but then display several spectral subclasses under a single informational class in the final image
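The merging of spectral subclasses into informational classes can be sketched as a simple lookup. This is only an illustration: the subclass ids, the labels, and the helper name to_informational are hypothetical and not taken from any particular software.

```python
# Hypothetical lookup table: spectral subclass id -> informational class.
# The ids and labels are invented for illustration.
SPECTRAL_TO_INFO = {
    0: "forest",  # e.g. shadowed forest
    1: "forest",  # e.g. pine
    2: "water",
    3: "water",   # e.g. turbid water
}

def to_informational(spectral_map):
    """Relabel a 2-D grid of spectral subclass ids with informational class names."""
    return [[SPECTRAL_TO_INFO[c] for c in row] for row in spectral_map]

# Classified image in which each cell holds a spectral subclass id.
spectral_map = [[0, 1, 2],
                [1, 3, 2]]
info_map = to_informational(spectral_map)
```

Four spectral subclasses are kept distinct during classification, but the final map shows only two informational classes.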
Unsupervised classification
Basic strategy
Euclidean Distance Measure
K-means algorithm
Unsupervised classification
Euclidean Distance Measure
- Take the difference between pixel A and pixel B in each band
- Square the differences
- Total the squared differences
- Take the square root of the total
- The result is the Euclidean distance between A and B
If the distance AB is less than the distance AC (computed the same way), then pixel A is closer to B than to C.
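The steps above can be sketched in a few lines of Python. The pixel values and the function name euclidean_distance are made up for illustration; each pixel is assumed to be a sequence of brightness values, one per band.

```python
import math

def euclidean_distance(a, b):
    """Distance between two pixels given as equal-length sequences of band values."""
    if len(a) != len(b):
        raise ValueError("pixels must have the same number of bands")
    # Difference per band, squared, totalled, then square-rooted.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical brightness values in three bands.
pixel_a = (30, 40, 50)
pixel_b = (33, 44, 50)
pixel_c = (60, 80, 50)

d_ab = euclidean_distance(pixel_a, pixel_b)  # sqrt(9 + 16 + 0) = 5.0
d_ac = euclidean_distance(pixel_a, pixel_c)  # sqrt(900 + 1600 + 0) = 50.0
a_closer_to_b = d_ab < d_ac
```

Here the distance AB is smaller than AC, so pixel A is spectrally closer to B than to C.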
Unsupervised classification
K-means algorithm (5 Steps)
- Choose the number of clusters
- Randomly assign initial positions to the cluster centroids
- Assign each point to its nearest centroid
- Recompute each centroid from the points assigned to it
- If the solution converges, stop; otherwise repeat the assign and recompute steps
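A minimal sketch of these five steps, using only the Python standard library. Several details are assumptions rather than part of the flashcard: initial centroids are drawn from the pixels themselves, convergence means that no pixel changes cluster, and an empty cluster keeps its old centroid. The names kmeans and sq_dist are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance; the square root is not needed to rank distances."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(pixels, k, max_iter=100, seed=0):
    """Cluster pixels (tuples of band values) into k groups; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Steps 1-2: k is chosen by the caller; initial centroids are k randomly
    # picked pixels (an assumption, other initialisations exist).
    centroids = rng.sample(pixels, k)
    labels = None
    for _ in range(max_iter):
        # Step 3: assign every pixel to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in pixels
        ]
        # Step 5: converged when no pixel changes cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 4: move each centroid to the mean of its assigned pixels.
        for j in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(band) / len(members) for band in zip(*members))
    return labels, centroids

# Hypothetical two-band pixels forming two obvious groups.
pixels = [(10, 10), (12, 11), (11, 9), (90, 95), (92, 91), (88, 94)]
labels, centroids = kmeans(pixels, k=2)
```

The cluster numbers themselves are arbitrary; only the grouping matters, which is why the analyst still has to assign each spectral cluster to an informational class afterwards.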
Key components of the algorithm:
- Effective methods of measuring distances in data space
- Identifying class centroids
- Testing the distinctness of classes
Objective, but not completely objective, because the analyst decides:
- The data to be examined
- The algorithm to be used
- The # of classes to be found
- (Possibly) the uniformity and distinctness of classes
Assignment of spectral classes to informational classes (2 serious practical problems with unsupervised classification)
Some informational categories may not have direct spectral counterparts, and vice versa. Clear matches are not always possible.
The analyst cannot control the nature of the categories generated, so comparisons between places and over time are hard to make. The same set of informational categories is not always generated for two different images
Supervised Classification
3 Basic Strategies
1) Analyst uses prior knowledge to guide the classification.
2) Analyst identifies “training areas” to represent the typical spectral classes that make up the informational classes. Training areas are digitized polygons
3) The classification algorithm then classifies each pixel in the rest of the image based on comparisons to training data
Supervised Classification
Training fields
Training fields are vector polygons that are digitized over pixels of known identity.
Pixels of unknown identity are assigned to a particular informational class by comparing their spectral signatures to the spectral signature of the pixels within each training field
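One simple way to make this comparison concrete is a minimum-distance-to-means rule. This is an assumption chosen for illustration, since the flashcard does not name a specific decision rule. The training values and the function names are hypothetical.

```python
def mean_signature(training_pixels):
    """Mean brightness per band over the pixels of one training field."""
    n = len(training_pixels)
    return tuple(sum(band) / n for band in zip(*training_pixels))

def classify_nearest_mean(pixel, signatures):
    """Return the class whose mean signature is closest to the pixel."""
    def sq_dist(sig):
        return sum((x - y) ** 2 for x, y in zip(pixel, sig))
    return min(signatures, key=lambda name: sq_dist(signatures[name]))

# Hypothetical training pixels (three bands) digitized over known cover types.
training = {
    "water": [(10, 8, 5), (12, 9, 6), (11, 10, 4)],
    "forest": [(30, 60, 20), (32, 58, 22), (28, 62, 21)],
}
signatures = {name: mean_signature(px) for name, px in training.items()}

# A pixel of unknown identity is assigned to the class it most resembles.
label = classify_nearest_mean((29, 57, 19), signatures)
```

Real training fields would contain far more pixels per class than this toy example.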
Supervised Classification
6 Key characteristics of training fields
# of pixels: at least 100 pixels for each category
Size: Large enough to provide accurate estimates of the spectral characteristics of each category, but not so large as to include spectral inhomogeneity
Location: Each informational category must be represented by several training areas positioned throughout the image (even distribution)
Number: Better to define many small training areas than a few large ones. Ideally, 5-10 training areas for each category (to ensure representation of spectral subclasses)
Placement: Training field boundaries must be placed well away from the edges of contrasting parcels so that they do not encompass edge pixels (avoid mixels!)
Uniformity: Data within each training area should exhibit a unimodal frequency distribution for each spectral band to be used.
Parametric Classification
Decision rules are based entirely on the statistics (min, max, mean, # of pixels in training field, # of bands in input image) produced from the training field.
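As a rough illustration, the statistics listed above could be collected from one training field like this. The function name training_stats and the sample values are hypothetical; pixels are assumed to be tuples with one brightness value per band.

```python
def training_stats(training_pixels):
    """Summarize one training field: pixel count, band count, and per-band min, max, mean."""
    bands = list(zip(*training_pixels))  # regroup the values band by band
    n = len(training_pixels)
    return {
        "n_pixels": n,
        "n_bands": len(bands),
        "min": [min(b) for b in bands],
        "max": [max(b) for b in bands],
        "mean": [sum(b) / n for b in bands],
    }

# Hypothetical training field with three pixels and two bands.
field = [(10, 20), (12, 24), (14, 22)]
stats = training_stats(field)
```

A parametric decision rule would then work from these summary numbers instead of the individual training pixels.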
Non-parametric classification
Decision rules are spatial; based on the location of the pixel in multidimensional spectral space
Common classifiers for supervised classification
a. Parallelepiped classification (STD DEV)
Non-parametric
Classifies solely on the ranges (or standard deviations) of spectral values in the training data, which define regions within multidimensional spectral space
Pixels that fall within the training data range (e.g. 1 or 2 standard deviations from the mean) are assigned to the appropriate category
Can be extended to as many bands or as many categories as needed
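A minimal sketch of the standard-deviation variant of the parallelepiped rule. Some choices here are assumptions: the box is the mean plus or minus n_std sample standard deviations in every band, a pixel that falls in overlapping boxes goes to the first matching class, and a pixel outside all boxes is left unclassified (None). The function names and training values are hypothetical.

```python
import statistics

def build_box(training_pixels, n_std=2.0):
    """Per-band (low, high) limits: mean +/- n_std sample standard deviations."""
    box = []
    for band in zip(*training_pixels):
        mean = statistics.mean(band)
        std = statistics.stdev(band)
        box.append((mean - n_std * std, mean + n_std * std))
    return box

def classify_parallelepiped(pixel, boxes):
    """Return the first class whose box contains the pixel in every band, else None."""
    for name, box in boxes.items():
        if all(lo <= v <= hi for v, (lo, hi) in zip(pixel, box)):
            return name
    return None  # pixel falls outside every parallelepiped

# Hypothetical two-band training pixels for two categories.
training = {
    "water": [(10, 8), (12, 10), (11, 9)],
    "forest": [(30, 60), (32, 62), (31, 61)],
}
boxes = {name: build_box(px) for name, px in training.items()}

water_pixel = classify_parallelepiped((12, 8), boxes)
forest_pixel = classify_parallelepiped((31, 60), boxes)
unclassified = classify_parallelepiped((50, 50), boxes)
```

Adding bands only lengthens each pixel tuple, and adding categories only adds entries to the training dictionary, which reflects how the method extends to as many bands or categories as needed.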