use /LZWDecode filter for GIF and matching TIFF images #174
The GIF image format uses LZW compression, which is also a valid compression format for TIFF images. The PDF format provides an `/LZWDecode` filter that seems like a perfect match for embedding GIF and certain TIFF images without transcoding the image data.
A test with img2pdf 3.3.0 (Ubuntu 20.04) shows that GIF and LZW-compressed TIFF images are always transcoded into `/FlateDecode` streams by img2pdf.
Wouldn't it make sense to preserve LZW-encoded image data in these cases?
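For anyone who wants to repeat the check, here is a minimal sketch of listing the filter names used in a PDF file. The helper `stream_filters` is a made-up name, not part of img2pdf, and it only scans the raw bytes for `/Filter` entries, so it will miss dictionaries stored inside compressed object streams.

```python
import re


def stream_filters(pdf_bytes: bytes) -> set:
    """Collect filter names that follow /Filter keys in raw PDF bytes.

    Hypothetical helper: a plain byte scan, not a real PDF parser.
    """
    names = set()
    # A /Filter value is either a single name or an array of names.
    pattern = rb"/Filter\s*(\[[^\]]*\]|/[A-Za-z0-9]+)"
    for match in re.finditer(pattern, pdf_bytes):
        for name in re.findall(rb"/([A-Za-z0-9]+)", match.group(1)):
            names.add(name.decode("ascii"))
    return names
```

Called on the bytes of an img2pdf output file, this should, going by the test above, report `FlateDecode` and not `LZWDecode`.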
There's one caveat, though. The PDF/A standard explicitly forbids `/LZWDecode` data streams. That is, when the `--pdfa` option is given, image data should still be transcoded to `/FlateDecode` streams.
As an example, a PDF file using the `/LZWDecode` filter can be created using GraphicsMagick like this:
Yes, that would make sense if it is possible.
Do you have very large GIF images where re-encoding them slows you down?
One problem is that Pillow does not give me access to the compressed data, and img2pdf would need to learn how to access and extract just the right bits from the input image.
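To illustrate the first half of that problem, here is a rough sketch that reads the Compression tag (259) straight from the first IFD of a classic TIFF file, using only the standard library. The names `tiff_compression` and `may_keep_lzw` are hypothetical and not img2pdf API. The sketch only decides whether passthrough is worth attempting; it does not extract the strips, and it ignores BigTIFF, multi-page files and the Predictor tag.

```python
import struct

# TIFF 6.0: tag 259 is Compression, and the value 5 means LZW.
TAG_COMPRESSION = 259
COMPRESSION_LZW = 5


def tiff_compression(data: bytes) -> int:
    """Return the Compression tag value from the first IFD of a classic TIFF."""
    byte_order = data[:2]
    if byte_order == b"II":
        endian = "<"  # little-endian
    elif byte_order == b"MM":
        endian = ">"  # big-endian
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")
    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    for i in range(entry_count):
        # Each IFD entry is 12 bytes: tag, type, count, value/offset.
        entry_offset = ifd_offset + 2 + 12 * i
        (tag,) = struct.unpack_from(endian + "H", data, entry_offset)
        if tag == TAG_COMPRESSION:
            # Compression is a SHORT held in the first 2 bytes of the value field.
            (value,) = struct.unpack_from(endian + "H", data, entry_offset + 8)
            return value
    return 1  # the tag defaults to 1 (uncompressed) when absent


def may_keep_lzw(data: bytes, pdfa: bool) -> bool:
    """Assumed policy: keep LZW data only for LZW TIFFs, and never for PDF/A."""
    return not pdfa and tiff_compression(data) == COMPRESSION_LZW
```

With something like this, the transcoding path would stay the default, and the LZW path would only be tried when `may_keep_lzw` returns true.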
Would you like to propose a patch?
Well, I do not really have a use case for `/LZWDecode` streams in PDF files. In fact, I'm actually trying to avoid them, because they're forbidden in PDF/A-compliant files. I stumbled across the issue while testing which error my PDF/A validation software (veraPDF) returns when presented with a file using the `LZWDecode` filter. Knowing that img2pdf preserves image data as far as possible, I fired up img2pdf, only to find myself wondering why veraPDF did not raise an alarm. Only after inspection did I find out that the img2pdf output uses `/FlateDecode` streams.
Also, I'm neither a Python nor a C guy. Sorry, all I can contribute is an idea for an enhancement.
The `libtiff-tools` package contains a tool, tiff2pdf, that seems to serve a similar purpose to img2pdf, but only for TIFF images. It doesn't seem to preserve LZW compression, though.
However, it is able to preserve the image data of some sub-formats without transcoding. From the man page:
Maybe the code can be used as a reference.
Thank you! I've tagged this issue as "enhancement" and "help wanted", so if somebody finds some time to implement this I'd be happy to review patches. 😄