With the Pleiades launch a few weeks away, the OTB team has been working hard to prepare for these new data. Since many discussions happened off the list, either in phone meetings or on other mailing lists (we will see why later on), and since we now have a comprehensive picture of what Pleiades support in OTB will look like, it is about time we explained it to users and developers.
Reminder on Pleiades images
We will start with a reminder: Pleiades images will be available in JPEG2000 format, which allows a high compression rate, and tiled in 2048×2048 pixel tiles, so a standard Pleiades image will contain a few hundred tiles. A typical Pleiades image is 40 000 x 40 000 pixels (corresponding to a 20 km by 20 km area), and one of the standard products is a pan-sharpened one, which merges the high resolution panchromatic and lower resolution multispectral imagery into a single high resolution color image. These images would be very heavy without JPEG2000 compression: for instance, if the JPEG2000 file weighs 1.7 GB, the decompressed file weighs 7.3 GB. To decode those JPEG2000 files, there are a few commercial libraries, the most popular being Kakadu. There are also some open-source alternatives, among which OpenJPEG seems to be the most advanced. Of course, for Pleiades support in OTB, we need an open-source solution, even if GDAL can be compiled with a driver based on Kakadu (it has to be the commercial version, not the trial one available for free under restrictions of use).
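To make the orders of magnitude concrete, here is a minimal sketch of the arithmetic behind the tile count and the compression ratio. It only uses the figures quoted above; the helper name `tile_grid` is hypothetical and not part of OTB.

```python
import math

TILE_SIZE = 2048      # tile edge in pixels, from the text
IMAGE_SIZE = 40_000   # typical image edge in pixels, from the text


def tile_grid(width, height, tile=TILE_SIZE):
    """Return (columns, rows) of tiles needed to cover the image."""
    return math.ceil(width / tile), math.ceil(height / tile)


cols, rows = tile_grid(IMAGE_SIZE, IMAGE_SIZE)
print(cols, rows, cols * rows)   # 20 20 400

# Compression ratio for the example file sizes quoted above
# (decompressed size divided by JPEG2000 size, same unit for both).
ratio = 7.3 / 1.7
print(round(ratio, 1))           # 4.3
```

The last column and row of tiles are only partially filled, since 40 000 is not a multiple of 2048.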
Open-source JPEG 2000 implementation
Back in 2007, the OTB people here at CNES had already spotted OpenJPEG as the best open-source bet for JPEG2000 support in OTB. Since the library was missing some important features (like partial decoding and MCT support), they set up a CNES contract with the CS Company in order to add these features to OpenJPEG. But when this contract was over, the new features were not integrated into the OpenJPEG trunk but left on a side branch (the so-called v2), because nobody in the OpenJPEG community had in-depth knowledge of what had been done in this new version. Development of the trunk went on with bug fixes and enhancements, while the v2 branch did not evolve. In the meantime, we started a driver in OTB based on the v2 OpenJPEG version, but this was not a sustainable option, because v2 was barely maintained. Thanks to its more advanced functionalities, the v2 branch of OpenJPEG also attracted interest from other FOSS projects in need of advanced decoding capabilities, and got integrated into at least ITK, GDCM and GDAL.
Four months ago, we agreed with the OpenJPEG community that we needed to get v2 merged into the OpenJPEG trunk. Mickaël Savinaud from the CS OTB team got involved in OpenJPEG development to get the merge between v2 and trunk done, and we now have a full-featured version of OpenJPEG in trunk thanks to his great work and to the support of the OpenJPEG community.
Now, it is time to face the truth: even with this new version, the decoding performance of this state-of-the-art open-source software is way worse than that of Kakadu. Decoding one tile is much slower than with Kakadu, and one tile of 2048×2048 pixels is the atomic unit for decoding (i.e. if you need one pixel, you will still need to first decode the whole tile).
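The consequence of the tile being the atomic decoding unit can be sketched as follows: for any requested pixel region, every tile it overlaps has to be decoded in full. This is an illustration only; `tiles_for_region` is a hypothetical helper, not an OTB or OpenJPEG function.

```python
TILE_SIZE = 2048  # tile edge in pixels, from the text


def tiles_for_region(x, y, width, height, tile=TILE_SIZE):
    """List the (column, row) indices of every tile overlapping a region.

    (x, y) is the upper-left pixel; width and height are in pixels.
    """
    first_col, first_row = x // tile, y // tile
    last_col = (x + width - 1) // tile
    last_row = (y + height - 1) // tile
    return [(c, r)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]


# A single pixel still costs one full tile decode.
print(tiles_for_region(3000, 5000, 1, 1))            # [(1, 2)]

# A 100 x 100 window straddling a tile corner costs four tile decodes.
print(len(tiles_for_region(2000, 2000, 100, 100)))   # 4
```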
We looked into optimization: reducing file seeking and implementing some macroscopic code optimization (mainly avoiding pessimization) of the Tier1 part, which is clearly identified as the performance bottleneck (see figure below with profiling reports using kcachegrind). We made some clear progress, but the JPEG2000 standard is complex, and without knowledge of the big picture, we could not gain a lot.
What does it mean for Pleiades data?
It means that simply decoding a full Pleiades image at full resolution will take about 20 minutes on a decent i5 CPU with 4 GB of RAM, while reading the same image in standard TIF format would take 8 minutes and decoding it with Kakadu takes only 4 minutes. Any OTB processing pipeline streaming the whole image will be limited by this decompression time. And we are not even talking about sub-sampling or estimating statistics at some point in the pipeline.
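As a rough back-of-the-envelope reading of these timings, the sketch below derives slowdown factors and an average cost per tile. The timings come from the text; the count of 400 tiles is an assumption (a 40 000 x 40 000 image in 2048×2048 tiles), and so is the idea that decoding time is spread evenly across tiles.

```python
# Full-image decoding times in minutes, from the text.
FULL_DECODE_MIN = {"OpenJPEG": 20, "TIF": 8, "Kakadu": 4}

# Assumption: 40 000 x 40 000 pixels cut into 2048 x 2048 tiles.
N_TILES = 400


def slowdown(reference):
    """How many times slower OpenJPEG is than the reference reader."""
    return FULL_DECODE_MIN["OpenJPEG"] / FULL_DECODE_MIN[reference]


def seconds_per_tile(minutes, n_tiles=N_TILES):
    """Average decoding time per tile, assuming a uniform cost per tile."""
    return minutes * 60 / n_tiles


print(slowdown("Kakadu"))                             # 5.0
print(slowdown("TIF"))                                # 2.5
print(seconds_per_tile(FULL_DECODE_MIN["OpenJPEG"]))  # 3.0
```

Under these assumptions, every tile a pipeline touches costs on the order of a few seconds with OpenJPEG, which is why avoiding repeated decodes of the same tile matters.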
Of course, we could tell our users to buy a Kakadu licence and compile GDAL with the Kakadu driver enabled, but we cannot even tell them how much it will cost, and this clearly contradicts the open-source philosophy of OTB. Still, this remains an option to get high-performance JPEG2000 support in OTB, but you are on your own if you choose this solution.
Now, what will it be possible to do with Pleiades data using the open-source OpenJPEG solution in OTB?
On OTB side: No specific handling will be added. It is the responsibility of the developer to know that the JPEG2000 driver is not as fast as the other drivers. The developer can easily select the appropriate resolution and set the size of the streamed region accordingly (to reduce the number of decoding operations). Of course, metadata related to geometric and radiometric calibration will be supported through OSSIM. Moreover, we will support the GMLJP2 box included in Pleiades images to provide geo-information in the case of ortho-image products.
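To illustrate why the resolution level and the size of the streamed region matter, here is a minimal sketch that counts tile decodes when an image is streamed as full-width strips. It assumes that no decoded tile is kept between two requested regions and that each discarded JPEG2000 resolution level halves the image dimensions. The function names are hypothetical and are not part of the OTB API.

```python
import math

TILE = 2048  # tile edge in pixels, from the text


def size_at_level(size, level):
    """Edge length once `level` resolution levels are discarded."""
    return math.ceil(size / 2 ** level)


def tile_decodes_for_strips(width, height, strip_height, tile=TILE):
    """Count tile decodes when streaming full-width strips without a cache."""
    tiles_per_row = math.ceil(width / tile)
    decodes = 0
    for y0 in range(0, height, strip_height):
        y1 = min(y0 + strip_height, height) - 1
        tile_rows = y1 // tile - y0 // tile + 1   # tile rows this strip crosses
        decodes += tile_rows * tiles_per_row
    return decodes


print(size_at_level(40_000, 2))                       # 10000
print(tile_decodes_for_strips(40_000, 40_000, 512))   # 1580
print(tile_decodes_for_strips(40_000, 40_000, 2048))  # 400
```

With strips of 512 lines, each row of tiles is decoded up to four times; aligning the strip height with the tile height brings the count back to one decode per tile.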
On applications side:
We will provide an application able to convert a Pleiades image, either in full or as an extract, from any JPEG2000 resolution, into another file format (like TIF for instance). Of course, one will need a lot of disk space to handle the decompressed data. Please note that this is not fully supported by the Kakadu trial version, which decodes a maximum of 3 channels (and whose use is restricted, by the way).
All applications will support Pleiades images, but we highly recommend first decompressing the image to disk, both for reasonable performance and whenever more than one processing step is to be run on the image.
On Monteverdi side:
Pleiades images will open as a dedicated new type of data in the Monteverdi interface, and the user will be able to select the resolution level for decoding.
The viewer module will accept this new type of data and we will be able to efficiently navigate within Pleiades images: fast quicklook computation due to JPEG2000 capabilities, instant navigation into a 4 tiles squared area thanks to caching, and quick-enough refresh when moving outside this cached area thanks to parallel tile decoding. We expect the navigation experience to be quite good!
The extract ROI module will accept this new type of data and allow the user to select an extract of the image using the fast quicklook ability. Caching (which will decompress the image to disk in TIF format) will be recommended (this behaviour might be extended to other modules later).
The writer module will accept this new type of data for conversion purposes.
No other module will accept this new type of data: to use them, the user will first have to go through the extract ROI module, which will produce a plain image type accepted by all modules. Caching will be recommended for this purpose.
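The tile caching mentioned for the viewer module can be pictured with a small least-recently-used cache of decoded tiles. This is only a sketch of the general idea, not the Monteverdi implementation: the class name, the capacity of 4 tiles and the stand-in decoder are all assumptions made for illustration.

```python
from collections import OrderedDict


class TileCache:
    """Least-recently-used cache of decoded tiles keyed by (column, row)."""

    def __init__(self, decode, capacity):
        self._decode = decode          # callable (col, row) -> decoded tile
        self._capacity = capacity      # maximum number of tiles kept
        self._tiles = OrderedDict()
        self.decode_calls = 0          # number of actual decodes performed

    def get(self, col, row):
        key = (col, row)
        if key in self._tiles:
            self._tiles.move_to_end(key)      # mark as most recently used
            return self._tiles[key]
        self.decode_calls += 1
        tile = self._decode(col, row)
        self._tiles[key] = tile
        if len(self._tiles) > self._capacity:
            self._tiles.popitem(last=False)   # evict least recently used
        return tile


# Stand-in decoder; a real one would call the JPEG2000 driver.
cache = TileCache(decode=lambda c, r: f"tile-{c}-{r}", capacity=4)
for key in [(0, 0), (1, 0), (0, 1), (1, 1), (0, 0), (1, 1)]:
    cache.get(*key)
print(cache.decode_calls)   # 4
```

Moving back and forth inside the cached area triggers no new decode; only a move to a tile outside it does.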
We can hope that the widening use of JPEG2000 in spatial imagery will encourage improvements in the performance of open-source JPEG2000 libraries. When and if this happens, you will be able to use those datasets in OTB seamlessly, like any other image.
Until then, we can look at the bright side:
OTB will provide free and open-source access to Pleiades data for those who do not own commercial image processing software (please note that GDAL is looking into the new OpenJPEG version and might also support Pleiades through it in the near future).
OTB will provide an efficient solution for Pleiades data decompression and visualization.
Last, the whole set of OTB processing remains fully available, either at the expense of extra decoding time or once your Pleiades data have been decompressed to disk.
This has also been a great experience of collaboration between open-source projects towards a common goal. We would like to thank again the members of the OpenJPEG community, as well as the OTB Team at CS, for their great work!