forked from josch/img2pdf
As noted by @phmccarty in josch/img2pdf#184 (comment) and subsequent comments, we were not properly stripping end-of-page and end-of-file segments. These are valid segments in a JBIG2 file, but not when embedded in PDF. From the PDF spec:

> The JBIG2 file header, end-of-page segments, and end-of-file segment
> shall not be used in PDF.

We were already stripping out the JBIG2 file header, but not yet the end-of-page and end-of-file segments.

For this, I'm expanding the approach that we were already taking, of only supporting a narrow subset of JBIG2 files. We assert that the input file has such a footer, and then we strip it.

We validated that the issue raised by @phmccarty is indeed resolved by running the following code before and after applying this commit:

```sh
src/img2pdf.py src/tests/input/mono.jb2 > test.pdf
pdfimages -tiff test.pdf img
```

Before this commit, this returned "Syntax Error (1143): Unknown segment type in JBIG2 stream". After this commit, the error is gone.
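The assert-then-strip idea described above can be sketched roughly as follows. This is an illustration only, not the code from this commit: the function name `strip_jbig2_footer` is hypothetical, and it assumes the narrow case where the file ends with an end-of-page segment (type 49) immediately followed by an end-of-file segment (type 51), each using the short 11-byte segment header with no referred-to segments, a one-byte page association, and zero data length.

```python
import struct

# Segment types from the JBIG2 specification.
END_OF_PAGE = 49
END_OF_FILE = 51

# Assumed short segment header layout (11 bytes, big-endian):
# segment number (4), flags (1), referred-to/retention byte (1),
# page association (1), data length (4).
SHORT_HEADER = struct.Struct(">IBBBI")


def strip_jbig2_footer(data: bytes) -> bytes:
    """Hypothetical helper: assert the expected footer exists, then drop it."""
    footer_len = 2 * SHORT_HEADER.size
    if len(data) < footer_len:
        raise ValueError("input too short to contain the expected footer")
    body, footer = data[:-footer_len], data[-footer_len:]
    expected = ((0, END_OF_PAGE), (SHORT_HEADER.size, END_OF_FILE))
    for offset, segment_type in expected:
        _number, flags, _referred, _page, length = SHORT_HEADER.unpack_from(
            footer, offset
        )
        # The low six bits of the flags byte hold the segment type.
        if flags & 0x3F != segment_type or length != 0:
            raise ValueError("unsupported JBIG2 file: unexpected footer")
    return body
```

Rejecting anything that does not match the expected footer, rather than trying to parse arbitrary segment layouts, mirrors the narrow-subset approach the commit message describes.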
- tests
- img2pdf.py
- img2pdf_test.py
- jp2.py