Unsupervised KMeans image classification


Unsupervised KMeans image classification. This is a composite application, using existing training and classification applications. The SharkKMeans model is used.

This application is only available if OTB is compiled with Shark support (CMake option OTB_USE_SHARK=ON).

The steps of this composite application:

  1. ImageEnvelope: create a shapefile (1 polygon),
  2. PolygonClassStatistics: compute the statistics,
  3. SampleSelection: select the samples by constant strategy in the shapefile (1000000 samples max),
  4. SampleExtraction: extract the sample descriptors (updates the SampleSelection output file),
  5. ComputeImagesStatistics: compute second-order image statistics,
  6. TrainVectorClassifier: train the SharkKMeans model,
  7. ImageClassifier: perform the classification of the input image according to a model file.

The SampleSelection step can run in either random or periodic mode (see the sampler parameter). To keep the temporary files (selected samples, model file, ...), set the cleanup parameter to false. For more information on the Shark KMeans algorithm, see [1].
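To make the roles of the nc and maxit parameters concrete, here is a minimal pure-Python sketch of the classic Lloyd's k-means iteration on a few pixel vectors. It is an illustration only and not the Shark implementation used by the application; the function name kmeans, the seed argument and the initialization from randomly drawn samples are assumptions made for this example.

```python
import random


def kmeans(samples, nc, maxit, seed=0):
    """Toy Lloyd's k-means. Returns (centroids, labels), labels in 0..nc-1."""
    rng = random.Random(seed)
    # Assumption: initial centroids are nc samples drawn at random.
    centroids = [list(s) for s in rng.sample(samples, nc)]
    labels = None
    for _ in range(maxit):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(
                range(nc),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(s, centroids[k])),
            )
            for s in samples
        ]
        if new_labels == labels:
            break  # assignments are stable: converged before maxit
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for k in range(nc):
            members = [s for s, lab in zip(samples, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Four two-band "pixels" forming two obvious groups.
pixels = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids, labels = kmeans(pixels, nc=2, maxit=100)
print(sorted(centroids), labels)
```

Which of the two groups receives label 0 depends on the initialization, which is one reason a fixed random seed is useful for reproducible outputs.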


Input Image -in image Mandatory
Input image filename.

Output Image -out image [dtype] Mandatory
Output image containing class labels.

Number of classes -nc int Default value: 5
Number of modes, which will be used to generate class membership.

Training set size -ts int Default value: 100
Size of the training set (in pixels).

Maximum number of iterations -maxit int Default value: 1000
Maximum number of iterations for the learning step.

Centroid filename -outmeans filename [dtype]
Output text file containing centroid positions.

Available RAM (MB) -ram int Default value: 256
Available memory for processing (in MB).

Sampler type -sampler [periodic|random] Default value: periodic
Type of sampling (periodic or random).

  • Periodic sampler
    Takes samples regularly spaced
  • Random sampler
    The positions to select are randomly shuffled.

Periodic sampler options

Jitter amplitude -sampler.periodic.jitter int Default value: 0
Jitter amplitude added during sample selection (0 = no jitter)
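The difference between the two sampler types, and the effect of a jitter amplitude, can be sketched in a few lines of Python. This is an illustration only: the function names, and the assumption that jitter is a uniform integer offset in [-jitter, +jitter] added to each regularly spaced position, are hypothetical and may not match how SampleSelection actually applies it.

```python
import random


def periodic_positions(total, needed, jitter=0, seed=0):
    """Pick `needed` indices out of `total`, regularly spaced.

    Assumption: jitter shifts each position by a uniform integer offset
    in [-jitter, +jitter], clamped to the valid index range.
    """
    rng = random.Random(seed)
    step = total / needed
    positions = []
    for i in range(needed):
        base = int(i * step)
        offset = rng.randint(-jitter, jitter) if jitter else 0
        positions.append(min(max(base + offset, 0), total - 1))
    return positions


def random_positions(total, needed, seed=0):
    """Pick `needed` distinct indices out of `total` in random order, then sort."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(total), needed))


print(periodic_positions(100, 5))            # evenly spaced, no jitter
print(periodic_positions(100, 5, jitter=3))  # each position moved by at most 3
print(random_positions(100, 5))
```

With jitter set to 0 the periodic positions are exactly evenly spaced, which matches the "0 = no jitter" convention of the parameter.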

Validity Mask -vm image
Validity mask: only non-zero pixels are used to estimate the KMeans modes.

Label mask value -nodatalabel int Default value: 0
By default, hidden pixels are assigned the label 0 in the output image. You can set the mask label to another value, but take care not to reuse the label of an existing class. This application initializes the labels from 0 to N-1, where N is the number of classes (defined by the ‘nc’ parameter).
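The interaction between the validity mask, the mask label and the class labels can be sketched as follows. This is a plain-Python illustration on nested lists, not the application's code; the helper names apply_validity_mask and collides_with_class are hypothetical.

```python
def apply_validity_mask(labels, mask, nodatalabel=0):
    """Give every hidden pixel (mask value 0) the label `nodatalabel`."""
    return [
        [lab if m != 0 else nodatalabel for lab, m in zip(lab_row, mask_row)]
        for lab_row, mask_row in zip(labels, mask)
    ]


def collides_with_class(nodatalabel, nc):
    """True when the mask label is also one of the class labels 0..nc-1."""
    return 0 <= nodatalabel < nc


labels = [[1, 2], [3, 4]]
mask = [[1, 0], [0, 1]]  # zero marks a hidden pixel
print(apply_validity_mask(labels, mask, nodatalabel=255))
print(collides_with_class(0, nc=5), collides_with_class(255, nc=5))
```

Under the labelling described above (classes numbered 0 to N-1), a mask label outside that range keeps hidden pixels distinguishable from every class in the output image.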

Temporary files cleaning -cleanup bool Default value: true
If activated, the application will try to clean all temporary files it created.

Random seed -rand int
Set a specific random seed with integer value.

Load parameters from XML -inxml filename.xml
Load application parameters from an XML file.

Save parameters to XML -outxml filename.xml
Save application parameters to an XML file.


From the command-line:

otbcli_KMeansClassification -in QB_1_ortho.tif -ts 1000 -nc 5 -maxit 1000 -out ClassificationFilterOutput.tif uint8
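When the command line is assembled from a script, a small helper can turn a parameter dictionary into the argument list. The sketch below only builds the list and does not run OTB. The helper name build_kmeans_command and the file name centroids.txt are hypothetical, and passing the boolean as the literal string false is an assumption about how the command-line launcher parses bool parameters.

```python
def build_kmeans_command(params):
    """Turn {key: value} into an otbcli_KMeansClassification argument list.

    A tuple or list value expands to several tokens, e.g. the output
    file name followed by its pixel type.
    """
    cmd = ["otbcli_KMeansClassification"]
    for key, value in params.items():
        cmd.append("-" + key)
        if isinstance(value, (list, tuple)):
            cmd.extend(str(v) for v in value)
        else:
            cmd.append(str(value))
    return cmd


cmd = build_kmeans_command({
    "in": "QB_1_ortho.tif",
    "out": ("ClassificationFilterOutput.tif", "uint8"),
    "nc": 5,
    "ts": 1000,
    "maxit": 1000,
    "sampler": "random",          # random instead of the default periodic
    "rand": 42,                   # fixed seed for reproducible sampling
    "outmeans": "centroids.txt",  # hypothetical centroid file name
    "cleanup": "false",           # assumption: keeps the temporary files
})
print(" ".join(cmd))
```

The resulting list can be handed to subprocess.run on a machine where the OTB command-line tools are installed.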

From Python:

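import otbApplication

# Create the application instance by name
app = otbApplication.Registry.CreateApplication("KMeansClassification")

# Input image, training set size, number of classes, iteration limit
app.SetParameterString("in", "QB_1_ortho.tif")
app.SetParameterInt("ts", 1000)
app.SetParameterInt("nc", 5)
app.SetParameterInt("maxit", 1000)

# Output label image, written as uint8 (pixel type 1)
app.SetParameterString("out", "ClassificationFilterOutput.tif")
app.SetParameterOutputImagePixelType("out", 1)

# Run the application and write the output image to disk
app.ExecuteAndWriteOutput()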



The application doesn’t support NaN values in the input image.
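Because of this limitation, it can help to check pixel values for NaN before running the application. The sketch below works on a plain 2-D list of floats; reading the raster into such a structure is left out, and both helper names and the fill value of 0.0 are assumptions chosen for illustration. Whether replacing NaN with a constant is appropriate depends on the data.

```python
import math


def count_nan(pixels):
    """Count NaN values in a 2-D list of floats (one inner list per row)."""
    return sum(1 for row in pixels for v in row if math.isnan(v))


def replace_nan(pixels, fill=0.0):
    """Return a copy of `pixels` with every NaN replaced by `fill`."""
    return [[fill if math.isnan(v) else v for v in row] for row in pixels]


band = [[1.0, float("nan")], [2.0, 3.0]]
print(count_nan(band))
print(replace_nan(band))
```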