Supervised classification is a technique for extracting information
from image data. The goal is to classify pixels in an image into
different classes based on features of the pixels. There are two
stages: a training stage and a classification stage. During the
training stage, a set of vectors called training samples (one
vector per pixel) is used to train a classifier. Each training
sample vector is made up of the class the pixel belongs to and the
feature values of the pixel. In the classification stage, the
trained classifier is used to classify pixels with known feature
values but unknown class.
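The two stages can be illustrated with a minimal sketch. It uses a
nearest-mean classifier purely as a stand-in, since this page does not
say which classifier the operator uses; the function names and the
sample values are hypothetical.

```python
from collections import defaultdict

def train(samples):
    """Training stage: samples are (class_label, feature_values) pairs.

    Returns the mean feature vector of each class (a stand-in model).
    """
    sums = {}
    counts = defaultdict(int)
    for label, features in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

def classify(model, features):
    """Classification stage: return the label of the nearest class mean."""
    def squared_distance(mean):
        return sum((f - m) ** 2 for f, m in zip(features, mean))
    return min(model, key=lambda label: squared_distance(model[label]))

# Hypothetical training samples: known class plus feature values.
training_samples = [
    ("water", [0.1, 0.2]),
    ("water", [0.2, 0.1]),
    ("forest", [0.8, 0.9]),
    ("forest", [0.9, 0.8]),
]
model = train(training_samples)

# A pixel with known feature values but unknown class.
label = classify(model, [0.15, 0.15])
```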
The user can either
- train (and save) a classifier; or
- load a previously saved classifier
to perform classification.
To do the former, type in a new filename.
To do the latter, select a filename from the list.
Train and save a classifier
The user is required to provide the number of training samples to
use. Note that in addition to training, an evaluation of the
classifier is also performed, so twice as many samples are
extracted. E.g., if the number of training samples to use is 5000,
then 5000 x 2 = 10,000 samples will be extracted. The first 5000
will be used as training samples and the remaining 5000 will be
used as test samples for evaluation.
Two separate files are created for the classifier: one with
extension .class and one with extension .xml. The evaluation
results are written to a file with extension .txt.
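The sample split described above can be sketched as follows. The
function name is hypothetical, and raising an error when too few
samples are available is an assumption; this page does not say how the
operator handles that case.

```python
def split_samples(samples, num_training):
    """Split extracted samples into a training set and a test set.

    The first num_training samples are used for training and the next
    num_training samples are used for evaluation.
    """
    needed = 2 * num_training
    if len(samples) < needed:
        # Assumption: the real operator may behave differently here.
        raise ValueError(f"need {needed} samples, got {len(samples)}")
    training = samples[:num_training]
    test = samples[num_training:needed]
    return training, test

# 5000 training samples requested, so 5000 x 2 = 10,000 are extracted.
extracted = list(range(10_000))
training, test = split_samples(extracted, 5000)
```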
The user can choose to train on a raster or on vectors.
Train on Raster
The user can choose one band (from the first product listed in the
ProductSet-Reader) as the training band. If none is chosen, the
first band will be used as the training band.
The user can choose bands from all the source products as feature
bands. If none is chosen, all bands (except for the training
band) will be used as feature bands.
There is an option to quantize class values if the values of the
chosen training band are not already discrete.
If the training band contains discrete labels such as land cover
classes, then there is no need to quantize.
However, if the training band data is continuous, like biomass, then
there will be as many classes as there are distinct biomass values
in the training set. It is recommended to quantize the values in
such cases.
E.g., if the range of values in the training band is [0.0, 1.0],
the user can set min class value to 0.0, class value step size to
0.1 and class levels to 10 to quantize the values to 10 levels:
0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9.
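The quantization in this example can be sketched as below. The
function names are hypothetical. Mapping a value to the level at or
below it, and clamping out-of-range values (such as 1.0) to the
nearest level, are assumptions; the operator's exact rule is not
stated here.

```python
import math

def quantize_index(value, min_class_value, step_size, class_levels):
    """Return the index (0 .. class_levels - 1) of the level for value."""
    # round() guards against floating-point noise,
    # e.g. 0.3 / 0.1 evaluates to 2.9999999999999996.
    steps = round((value - min_class_value) / step_size, 9)
    index = math.floor(steps)
    # Assumption: values outside the range are clamped to the end levels.
    return max(0, min(class_levels - 1, index))

def level_value(index, min_class_value, step_size):
    """Return the class value that a level index stands for."""
    return round(min_class_value + index * step_size, 9)

# Settings from the example: min 0.0, step size 0.1, 10 levels.
levels = [level_value(i, 0.0, 0.1) for i in range(10)]
biomass_class = level_value(quantize_index(0.37, 0.0, 0.1, 10), 0.0, 0.1)
```

With these settings a training value of 0.37 falls into the level
0.3, so all values in [0.3, 0.4) share one class.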
Train on Vectors
The user can choose a number of training vectors (from the first
product listed in the ProductSet-Reader) as classes. E.g., the
training vectors could be regions (polygons) each representing a
separate class such as water, urban or forest. A training vector
called "water" will become a class label called "water".
Regions can be created using the "New Vector Data Container" tool
and other drawing tools such as "Rectangle drawing tool".
A pixel inside a training vector region will have the name of the
region as its class instead of its data value.
Feature bands are chosen in the same manner as for training on raster.
The operator will endeavour to extract the same number of samples
for each class when constructing the training or test sample set.
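One way to picture this balancing is the sketch below. It is not the
operator's implementation: the function name, the equal per-class
quota and the fallback of taking all available samples from a small
class are assumptions made for illustration.

```python
from collections import defaultdict

def balanced_subset(samples, total):
    """Pick up to total // n_classes samples from each class.

    samples is a list of (class_label, feature_values) pairs.
    """
    by_class = defaultdict(list)
    for label, features in samples:
        by_class[label].append((label, features))
    if not by_class:
        return []
    per_class = total // len(by_class)
    subset = []
    for label in sorted(by_class):
        # A class with fewer samples than the quota contributes all it has.
        subset.extend(by_class[label][:per_class])
    return subset

# Hypothetical input: 6 "water" pixels but only 2 "forest" pixels.
pixels = [("water", [0.1 * i]) for i in range(6)]
pixels += [("forest", [0.9]), ("forest", [0.8])]
subset = balanced_subset(pixels, 6)
```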
Load a previously saved classifier
To use a saved classifier, the user needs to know at least the list
of features, which is stored in the XML file along with other
useful information.
The user can specify more than one source feature product.
For each name in "featureNames" in the XML file, the operator will
search for a band in the feature products whose name
contains it. It loops through the products in the order they are
listed in the ProductSet-Reader and uses the first band it finds
whose name contains the feature name. E.g., if the name is "g0" and there
are two feature products and both contain a band named "g0", then
the band from the first feature product will be used.
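The band lookup described above can be sketched as follows. The
function name and the product and band names are hypothetical, and
products are modelled as plain (name, band list) pairs rather than
real product objects. Leaving unmatched feature names out of the
result is an assumption; this page does not say what the operator
does when no band matches.

```python
def find_feature_bands(feature_names, products):
    """Map each feature name to the first band whose name contains it.

    products is an ordered list of (product_name, band_names) pairs,
    in the same order as they are listed in the ProductSet-Reader.
    """
    selected = {}
    for feature in feature_names:
        for product_name, band_names in products:
            match = next((b for b in band_names if feature in b), None)
            if match is not None:
                selected[feature] = (product_name, match)
                break  # the first product with a match wins
    return selected

# Both products contain a band named "g0"; the first product is used.
products = [
    ("product_1", ["g0", "b1"]),
    ("product_2", ["g0", "x_g1"]),
]
bands = find_feature_bands(["g0", "g1", "zz"], products)
```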