The TIFF image format is the recommended input format for IIPImage due to its flexibility, unrivaled speed and ability to handle multiple bit depths, scientific imaging and multidimensional datasets. However, how a TIFF is encoded and structured can make a large difference to both the resulting file size and decoding performance.
TIFF Structure
TIFF allows image data to be structured for fast random access by means of tiling and multi-resolution image pyramids. Tiling allows arbitrary regions of an image to be decoded quickly without needing to decode the entire image, while multiple resolutions can be stored in a pyramid structure to allow fast access to the image at any size. The combination of tiling and multi-resolution pyramids provides extremely fast random access to any part of an image at any size, allowing gigapixel or even terapixel images and datasets to be handled quickly and efficiently.
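As a concrete illustration (assuming the libvips command-line tools are available; file names here are placeholders), such a tiled multi-resolution pyramid TIFF can be produced along these lines:

```shell
# Convert a source image into a tiled, multi-resolution pyramid TIFF
# with 256x256 pixel tiles and no compression
vips tiffsave input.tif pyramid.tif \
  --tile --tile-width 256 --tile-height 256 \
  --pyramid --compression none
```

The resulting structure, one TIFF directory per resolution level with each level tiled, can be verified with libtiff's `tiffinfo` tool.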
Compression
TIFF is, in addition, able to support multiple types of compression, applied either to the whole image or, if the image is tiled, to each individual tile. The TIFF specification allows any type of compression to be applied, but only a few are explicitly listed in the official TIFF 6.0 specification: LZW, Deflate and JPEG. These three compression methods are all widely supported in both commercial and open-source software. Beyond these “official” compression methods, the de-facto open-source TIFF reference library, libtiff, also supports several additional ones, providing a wide choice of both lossless and lossy compression, including LZMA, ZStandard, WebP and LERC.
ZStandard, WebP and LERC are relatively recent additions to libtiff. ZStandard and WebP have been supported since libtiff version 4.0.10 (released in November 2018) and LERC since libtiff version 4.3.
Comparing Compression Methods
Each of these compression methods performs differently in terms of the resulting file size, encoding speed and decoding speed. To empirically measure these differences, a set of 1000 high-resolution museum images was used to test the behavior of each type of compression when used within a tiled multi-resolution pyramid TIFF.
Each of these 1000 images was encoded into a tiled multi-resolution pyramid TIFF using each of the different compression methods. This encoding was carried out using vips. For example, for lossy WebP:
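The exact command is not reproduced here, but a representative vips invocation for lossy WebP (file names are placeholders) would be:

```shell
# Tiled multi-resolution pyramid TIFF with lossy WebP-compressed
# tiles at quality factor 90
vips tiffsave input.tif output_webp.tif \
  --tile --tile-width 256 --tile-height 256 \
  --pyramid --compression webp --Q 90
```

The other variants follow the same pattern with a different `--compression` value (e.g. `lzw`, `deflate`, `zstd` or `jpeg`); lossless WebP additionally takes the `--lossless` flag.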
Vips, however, does not support TIFF with LZMA or LERC compression. Thus, only LZW, Deflate, ZStandard, JPEG and WebP (both lossless and lossy) were tested, along with uncompressed tiled multi-resolution pyramid TIFF.
Timing results, of course, measure not only the performance of the compression method itself, but also that of the codec used. For these tests the default reference codec libraries used by libtiff were used: libwebp for WebP, libzstd for ZStandard, libdeflate for Deflate and libjpeg-turbo for JPEG.
File Size
The averaged compressed file sizes for each compression method are shown below with the equivalent result for the JPEG2000 format provided for reference. All sizes have been normalized with respect to the size of the raw uncompressed pixel data. In other words, 1.0 on the y-axis represents the size of the raw uncompressed pixel data. For lossless encoding, the default compression levels for Deflate and ZStandard were used (6 and 9 respectively), whereas for lossy encoding, a quality factor of 90 was used in order to provide a similar perceptual image quality across the three formats. For JPEG2000, images were compressed using Kakadu with the Qfactor parameter (rather than the rate parameter) used to obtain a perceptual quality similar to JPEG.
As we can see from the (left-most) uncompressed TIFF bar, the tiled multi-resolution pyramid structure adds around 40% overhead to the raw data size. Using compression, however, can significantly offset this. For lossless compression, LZW and Deflate result in file sizes of around 90% that of the raw data even with this overhead. WebP and ZStandard both outperform LZW and Deflate, while lossless JPEG2000 produces the smallest files, essentially because of the lack of this structural overhead.
For lossy compression, WebP comfortably outperforms JPEG at an equivalent perceptual quality and even provides better compression than lossy JPEG2000.
Encoding Speed
The following chart shows the encoding speed for each of the compression methods. All 1000 TIFF files were encoded using variations of the vips command shown earlier and carried out on a high-end 16-core Linux production server with RAID 10 storage. Each encoding run was carried out three times with the results averaged to produce a single timing for each image and each compression type. These results were then averaged for each compression type and the results are shown below.
The times are all normalized so that the fastest encoding time (no compression) is set to one. Thus Deflate is around 12x slower than using no compression. Although LZW is the fastest of the lossless encoders, it should be noted that the compression level of both Deflate and ZStandard can be tuned, providing a trade-off between speed and level of compression. The compression times here correspond to the default levels of 6 for Deflate and 9 for ZStandard. Also, the lossless WebP bar has been truncated as its encoding times are very slow, around 10x slower than Deflate.
Decoding
In most cases, encoding time is not a critical factor as encoding is usually only carried out once offline. For use with an image server such as IIPImage, it is decoding time that is the key criterion, as it is an important source of latency, which heavily impacts the performance of the server. This is especially true when using an image server in conjunction with a pan-and-zoom viewer, which makes large numbers of requests for individual image tiles.
To measure the impact of the choice of compression on decoding speed, a subset of the 50 largest images was selected and encoded using the various compression methods. A set of 100 random tile requests at different locations and resolutions that simulate the working of a typical pan-and-zoom viewer was then prepared for each of 50 different high resolution images, resulting in 5000 requests in total. These requests were made to an instance of the latest version of iipsrv running on a high-end Linux production server with the individual tile decoding times taken from the iipsrv log file. This was performed three times for each set of differently compressed images and the times averaged for each compression method. The results can be seen in the following chart, where the times are normalized such that the fastest (no compression) is set to one.
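Each such benchmark request fetches a single tile over the IIP protocol; an individual request of this kind can be reproduced with curl (the host, image path, resolution level and tile index here are all placeholders):

```shell
# Request tile index 42 at resolution level 4 from an iipsrv
# instance via the IIP protocol's JTL (JPEG tile) command
curl "http://localhost/fcgi-bin/iipsrv.fcgi?FIF=/images/example.tif&JTL=4,42"
```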
We can see that not using any compression is, of course, extremely fast. LZW is reasonably fast, but Deflate and lossless WebP are slow (around 15x slower than no compression). However, ZStandard is by far the fastest lossless compression method, being almost 4x faster than Deflate. For lossy decoding, both JPEG and WebP are quite similar to each other with JPEG around 30% faster than WebP and slightly faster than the lossless LZW, but slower than ZStandard.
Conclusion
If speed is the main criterion, uncompressed tiled multi-resolution pyramid TIFF is, by far, the fastest solution. For comparison, tile decoding of an optimally encoded HTJ2K JPEG2000 image is about 50x slower than uncompressed tiled multi-resolution pyramid TIFF. If, however, storage space is an issue and you don’t need compatibility with closed-source software, ZStandard is an excellent choice for lossless compression, with both fast decoding and smaller file sizes than either LZW or Deflate.
For lossy compression, again if compatibility with closed-source software is not an issue, WebP provides a significant improvement in file size over JPEG and even beats lossy JPEG2000 at equivalent levels of quality. Lossy WebP decoding is, however, about 30% slower than JPEG, but is still reasonable and is faster than Deflate.
IIIF Presentation
A presentation of these results was given at the annual IIIF 2024 online meeting, held on the 12th-14th November 2024. The video of the presentation and session is available below.
These results showed that HTJ2K (JPEG2000 Part 15) provides a significant improvement in decoding speed over standard JPEG2000 (Part 1). However, the tests also highlighted the fact that tiled multi-resolution pyramid TIFF comfortably out-performs JPEG2000 for typical IIIF pan-and-zoom viewer use. Performance, however, was closer between HTJ2K and TIFF when used for decoding entire images or regions of images.