Train a classifier from multiple pairs of images and training vector data.
This application trains a classifier from multiple pairs of input images and training vector data.
Samples are composed of the pixel values in each band, optionally centered and reduced using an XML
statistics file produced by the ComputeImagesStatistics application.
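As an illustration of what "centered and reduced" means for one sample, the sketch below subtracts a per-band mean and divides by a per-band standard deviation. The statistics values and the function name are hypothetical; the application itself reads these statistics from the XML file and is not implemented this way.

```python
def center_and_reduce(sample, means, stddevs):
    """Return (value - mean) / stddev for each band of one pixel sample."""
    if not (len(sample) == len(means) == len(stddevs)):
        raise ValueError("sample, means and stddevs need one entry per band")
    return [(v - m) / s for v, m, s in zip(sample, means, stddevs)]


# Hypothetical per-band statistics for a 3-band image (illustrative values only).
band_means = [100.0, 50.0, 200.0]
band_stddevs = [10.0, 5.0, 20.0]

pixel = [110.0, 45.0, 200.0]
normalized = center_and_reduce(pixel, band_means, band_stddevs)
print(normalized)  # [1.0, -1.0, 0.0]
```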
The training vector data must contain polygons with a positive integer field representing the class label. The
name of this field can be set using the "Class label field" parameter. Training and validation
sample lists are built such that each class is equally represented in both lists. The sample.vtr
parameter controls the ratio between the number of samples in the training and validation sets.
The sample.mt and sample.mv parameters bound the size of the training and validation sets per
class and per image.
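The sketch below shows one way such balanced lists could be built. It is not the application's implementation. It assumes that the ratio is the fraction of samples assigned to validation, that every class is bounded by the size of the smallest class, and that the optional maxima play the role of sample.mt and sample.mv. All function and argument names are hypothetical.

```python
import random


def split_balanced(samples_by_class, validation_ratio=0.5,
                   max_training=None, max_validation=None, seed=0):
    """Split samples into training and validation lists with equal class sizes."""
    rng = random.Random(seed)
    # Every class contributes as many samples as the smallest class can provide.
    per_class = min(len(samples) for samples in samples_by_class.values())
    n_validation = int(per_class * validation_ratio)
    n_training = per_class - n_validation
    # Optional upper bounds on the per-class set sizes.
    if max_training is not None:
        n_training = min(n_training, max_training)
    if max_validation is not None:
        n_validation = min(n_validation, max_validation)

    training, validation = {}, {}
    for label, samples in samples_by_class.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)
        training[label] = shuffled[:n_training]
        validation[label] = shuffled[n_training:n_training + n_validation]
    return training, validation


# Class 1 has 10 samples, class 2 only 4: both end up with 2 + 2 samples.
samples = {1: list(range(10)), 2: list(range(100, 104))}
training, validation = split_balanced(samples, validation_ratio=0.5)
print({label: len(s) for label, s in training.items()})    # {1: 2, 2: 2}
print({label: len(s) for label, s in validation.items()})  # {1: 2, 2: 2}
```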
Several classifier parameters can be set depending on the chosen classifier. In the validation process, the
confusion matrix is organized the following way: rows = reference labels, columns = produced
labels. In the header of the optional confusion matrix output file, the validation (reference) and
predicted (produced) class labels are ordered according to the rows/columns of the confusion
matrix.
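To make the row and column convention concrete, here is a minimal sketch that builds a confusion matrix with rows indexed by reference labels and columns indexed by produced labels. The label lists are made-up illustrative data and the function is hypothetical, not part of the application.

```python
def confusion_matrix(reference, produced):
    """Return (labels, matrix) with rows = reference labels, columns = produced labels."""
    labels = sorted(set(reference) | set(produced))
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for ref, prod in zip(reference, produced):
        matrix[index[ref]][index[prod]] += 1
    return labels, matrix


# Illustrative validation samples: reference label vs. label produced by the classifier.
reference = [1, 1, 1, 2, 2, 3]
produced = [1, 1, 2, 2, 2, 1]

labels, matrix = confusion_matrix(reference, produced)
print(labels)  # [1, 2, 3]
for row in matrix:
    print(row)
# [2, 1, 0]
# [0, 2, 0]
# [1, 0, 0]
```

Reading the first row: of the three samples whose reference label is 1, two were classified as 1 and one as 2.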
This application is based on LibSVM and on OpenCV Machine Learning classifiers, and is compatible with
OpenCV 2.3.1 and later.
This section describes in detail the parameters available for this application. Table 4.128, page 714, presents a summary of these parameters and the parameter keys to be used in command-line and programming languages. The application key is TrainImagesClassifier.
Parameter key | Parameter type | Parameter description
io | Group | Input and output data
io.il | Input image list | Input Image List
io.vd | Input vector data list | Input Vector Data List
io.imstat | Input File name | Input XML image statistics file
io.confmatout | Output File name | Output confusion matrix
io.out | Output File name | Output model
elev | Group | Elevation management
elev.dem | Directory | DEM directory
elev.geoid | Input File name | Geoid File
elev.default | Float | Default elevation
sample | Group | Training and validation samples parameters
sample.mt | Int | Maximum training sample size per class
sample.mv | Int | Maximum validation sample size per class
sample.bm | Int | Bound sample number by minimum
sample.edg | Boolean | On edge pixel inclusion
sample.vtr | Float | Training and validation sample ratio
sample.vfn | String | Name of the discrimination field
classifier | Choices | Classifier to use for the training
classifier libsvm | Choice | LibSVM classifier
classifier boost | Choice | Boost classifier
classifier dt | Choice | Decision Tree classifier
classifier gbt | Choice | Gradient Boosted Tree classifier
classifier ann | Choice | Artificial Neural Network classifier
classifier bayes | Choice | Normal Bayes classifier
classifier rf | Choice | Random forests classifier
classifier knn | Choice | KNN classifier
classifier.libsvm.k | Choices | SVM Kernel Type
classifier.libsvm.k linear | Choice | Linear
classifier.libsvm.k rbf | Choice | Gaussian radial basis function
classifier.libsvm.k poly | Choice | Polynomial
classifier.libsvm.k sigmoid | Choice | Sigmoid
classifier.libsvm.m | Choices | SVM Model Type
classifier.libsvm.m csvc | Choice | C support vector classification
classifier.libsvm.m nusvc | Choice | Nu support vector classification
classifier.libsvm.m oneclass | Choice | Distribution estimation (One Class SVM)
classifier.libsvm.c | Float | Cost parameter C
classifier.libsvm.opt | Boolean | Parameters optimization
classifier.libsvm.prob | Boolean | Probability estimation
classifier.boost.t | Choices | Boost Type
classifier.boost.t discrete | Choice | Discrete AdaBoost
classifier.boost.t real | Choice | Real AdaBoost (technique using confidence-rated predictions and working well with categorical data)
classifier.boost.t logit | Choice | LogitBoost (technique producing good regression fits)
classifier.boost.t gentle | Choice | Gentle AdaBoost (technique setting less weight on outlier data points and, for that reason, being often good with regression data)
classifier.boost.w | Int | Weak count
classifier.boost.r | Float | Weight Trim Rate
classifier.boost.m | Int | Maximum depth of the tree
classifier.dt.max | Int | Maximum depth of the tree
classifier.dt.min | Int | Minimum number of samples in each node
classifier.dt.ra | Float | Termination criteria for regression tree
classifier.dt.cat | Int | Cluster possible values of a categorical variable into K <= cat clusters to find a suboptimal split
classifier.dt.f | Int | K-fold cross-validations
classifier.dt.r | Boolean | Set Use1seRule flag to false
classifier.dt.t | Boolean | Set TruncatePrunedTree flag to false
classifier.gbt.w | Int | Number of boosting algorithm iterations
classifier.gbt.s | Float | Regularization parameter
classifier.gbt.p | Float | Portion of the whole training set used for each algorithm iteration
classifier.gbt.max | Int | Maximum depth of the tree
classifier.ann.t | Choices | Train Method Type
classifier.ann.t reg | Choice | RPROP algorithm
classifier.ann.t back | Choice | Back-propagation algorithm
classifier.ann.sizes | String list | Number of neurons in each intermediate layer
classifier.ann.f | Choices | Neuron activation function type
classifier.ann.f ident | Choice | Identity function
classifier.ann.f sig | Choice | Symmetrical Sigmoid function
classifier.ann.f gau | Choice | Gaussian function (Not completely supported)
classifier.ann.a | Float | Alpha parameter of the activation function
classifier.ann.b | Float | Beta parameter of the activation function
classifier.ann.bpdw | Float | Strength of the weight gradient term in the BACKPROP method
classifier.ann.bpms | Float | Strength of the momentum term (the difference between weights on the 2 previous iterations)
classifier.ann.rdw | Float | Initial value Delta_0 of update-values Delta_ij in RPROP method
classifier.ann.rdwm | Float | Update-values lower limit Delta_min in RPROP method
classifier.ann.term | Choices | Termination criteria
classifier.ann.term iter | Choice | Maximum number of iterations
classifier.ann.term eps | Choice | Epsilon
classifier.ann.term all | Choice | Max. iterations + Epsilon
classifier.ann.eps | Float | Epsilon value used in the Termination criteria
classifier.ann.iter | Int | Maximum number of iterations used in the Termination criteria
classifier.rf.max | Int | Maximum depth of the tree
classifier.rf.min | Int | Minimum number of samples in each node
classifier.rf.ra | Float | Termination Criteria for regression tree
classifier.rf.cat | Int | Cluster possible values of a categorical variable into K <= cat clusters to find a suboptimal split
classifier.rf.var | Int | Size of the randomly selected subset of features at each tree node
classifier.rf.nbtrees | Int | Maximum number of trees in the forest
classifier.rf.acc | Float | Sufficient accuracy (OOB error)
classifier.knn.k | Int | Number of Neighbors
rand | Int | set user defined seed
inxml | XML input parameters file | Load otb application from xml file
outxml | XML output parameters file | Save otb application to xml file
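Input and output data: This group of parameters allows setting input and output data.
Elevation management: This group of parameters allows managing elevation values. Supported formats are SRTM, DTED or any GeoTIFF. The DownloadSRTMTiles application can be a useful tool to list or download the tiles related to a product.
Training and validation samples parameters: This group of parameters allows you to set the parameters of the training and validation sample lists.
Classifier to use for the training: Choice of the classifier to use for the training. Available choices are: LibSVM classifier (libsvm), Boost classifier (boost), Decision Tree classifier (dt), Gradient Boosted Tree classifier (gbt), Artificial Neural Network classifier (ann), Normal Bayes classifier (bayes), Random forests classifier (rf) and KNN classifier (knn). The parameters specific to each classifier are listed in the parameter table above.
set user defined seed: Sets a specific random seed, given as an integer value.
Load otb application from xml file: Loads the application parameters from an XML file.
Save otb application to xml file: Saves the application parameters to an XML file.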
To run this example in command-line, use the following:
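The original command is not reproduced here. As a hedged stand-in, the sketch below assembles a command line for the `otbcli_TrainImagesClassifier` launcher from the parameter keys in the table and prints it so that it can be pasted into a shell. All file names, the field name and the parameter values are hypothetical placeholders, and `build_command` is an illustrative helper, not part of OTB.

```python
import shlex


def build_command(images, vectors, model_path, options=None):
    """Assemble the launcher arguments; each key is passed as "-key value"."""
    cmd = ["otbcli_TrainImagesClassifier", "-io.il", *images, "-io.vd", *vectors]
    for key, value in (options or {}).items():
        cmd += ["-" + key, str(value)]
    cmd += ["-io.out", model_path]
    return cmd


# Hypothetical inputs: one image, one polygon file, and a linear LibSVM classifier.
cmd = build_command(
    images=["image_1.tif"],
    vectors=["training_polygons_1.shp"],
    model_path="model.txt",
    options={
        "io.imstat": "image_statistics.xml",
        "sample.mt": 100,
        "sample.mv": 100,
        "sample.vtr": 0.5,
        "sample.vfn": "Class",
        "classifier": "libsvm",
        "classifier.libsvm.k": "linear",
        "classifier.libsvm.c": 1,
        "io.confmatout": "confusion_matrix.csv",
    },
)

# Print a shell-quoted command line (shlex.join requires Python 3.8 or later).
print(shlex.join(cmd))
```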
To run this example from Python, use the following code snippet:
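The original snippet is not reproduced here. The following is a minimal sketch under stated assumptions: it requires the OTB Python bindings (`otbApplication`), and every file name, field name and parameter value is a hypothetical placeholder. The `configure` helper is illustrative and simply picks a setter according to the Python type of each value; Boolean parameters are not handled.

```python
def configure(app, params):
    """Set each parameter on the application, choosing the setter by value type."""
    for key, value in params.items():
        if isinstance(value, (list, tuple)):
            app.SetParameterStringList(key, list(value))
        elif isinstance(value, int):
            app.SetParameterInt(key, value)
        elif isinstance(value, float):
            app.SetParameterFloat(key, value)
        else:
            app.SetParameterString(key, str(value))


def run_training(params):
    """Create the TrainImagesClassifier application, configure it and run it."""
    import otbApplication  # provided by the OTB Python bindings

    app = otbApplication.Registry.CreateApplication("TrainImagesClassifier")
    configure(app, params)
    app.ExecuteAndWriteOutput()


# Hypothetical parameter values, keyed by the parameter keys of the table.
PARAMS = {
    "io.il": ["image_1.tif"],
    "io.vd": ["training_polygons_1.shp"],
    "io.imstat": "image_statistics.xml",
    "sample.mt": 100,
    "sample.mv": 100,
    "sample.vtr": 0.5,
    "sample.vfn": "Class",
    "classifier": "libsvm",
    "classifier.libsvm.k": "linear",
    "classifier.libsvm.c": 1.0,
    "io.out": "model.txt",
}
```

Calling `run_training(PARAMS)` in an environment where OTB is installed would then launch the training and write the model file.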
None
This application has been written by OTB-Team.
These additional resources can be useful for further information: