2012-03-29 09:11:23 +00:00

img2pdf
=======

Lossless conversion of images to PDF without unnecessarily re-encoding JPEG and
JPEG2000 files. Thus, no loss of quality and no unnecessarily large output file.

background
----------

PDF is able to embed JPEG and JPEG2000 images as they are, without re-encoding
them (and hence losing quality), but I was missing a tool to do this
automatically, so I wrote this piece of Python code.

If you know how to embed JPEG and JPEG2000 images into a PDF container without
recompression using existing tools, please contact me so that I can put this
code into the garbage bin :D

functionality
-------------

The program takes image filenames from command-line arguments and outputs a
PDF file with them embedded into it. If the input image is a JPEG or JPEG2000
file, it is included as-is without any processing. If it is in any other
format, the image is included as zip-encoded RGB. As a result, this tool
can losslessly wrap any image into a PDF container while performing
better (in terms of quality/filesize ratio) than existing tools when the
input image is a JPEG or JPEG2000 file.
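The decision described above can be illustrated with a minimal sketch. This is not the code img2pdf actually uses: the function names `choose_pdf_filter` and `encode_image` are hypothetical, the format detection by leading magic bytes is an assumption, and decoding other formats to raw RGB samples (which needs an imaging library) is left to the caller. The filter names `DCTDecode`, `JPXDecode` and `FlateDecode` are the standard PDF stream filters for JPEG, JPEG2000 and zip/flate data.

```python
import zlib
from typing import Optional, Tuple

# Leading bytes of a JPEG file (SOI marker followed by another marker).
JPEG_MAGIC = b"\xff\xd8\xff"
# Signature box that opens a JPEG2000 file in the JP2 container format.
JP2_MAGIC = bytes.fromhex("0000000c6a5020200d0a870a")


def choose_pdf_filter(data: bytes) -> str:
    """Return the PDF stream filter that fits the given image file bytes."""
    if data.startswith(JPEG_MAGIC):
        return "DCTDecode"    # JPEG: embed the file unchanged
    if data.startswith(JP2_MAGIC):
        return "JPXDecode"    # JPEG2000: embed the file unchanged
    return "FlateDecode"      # anything else: zip-encode the RGB samples


def encode_image(data: bytes,
                 rgb_samples: Optional[bytes] = None) -> Tuple[str, bytes]:
    """Return (filter name, stream bytes) for one input image.

    rgb_samples is the decoded RGB pixel data; it is only needed when
    the input is neither JPEG nor JPEG2000 (hypothetical interface).
    """
    pdf_filter = choose_pdf_filter(data)
    if pdf_filter != "FlateDecode":
        return pdf_filter, data  # passed through byte for byte
    if rgb_samples is None:
        raise ValueError("decoded RGB samples are required for this format")
    return pdf_filter, zlib.compress(rgb_samples)
```

Both branches are lossless: the JPEG and JPEG2000 bytes are never touched, and `zlib.decompress` restores the RGB samples exactly.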

For the record, the imagemagick command to losslessly convert any image to
PDF using zip-encoding is:

    convert input.jpg -compress Zip output.pdf

The downside is that using imagemagick like this will make the resulting PDF
files a few times bigger than the input JPEG or JPEG2000 file, and it also
cannot output a multipage PDF.

img2pdf is able to output a PDF with multiple pages if more than one input
image is given, losslessly embed JPEG and JPEG2000 files into a PDF container
without adding more overhead than the PDF structure itself, and save all
other graphics formats using lossless zip-compression.

bugs
----

If you find a JPEG or JPEG2000 file that, when embedded, cannot be read by the
Adobe Acrobat Reader, please contact me.

For lossless conversion of formats other than JPEG or JPEG2000, zip/flate
encoding is used. This choice is based on a number of tests I did on images.
I converted them into PDF using imagemagick with every compression it has to
offer and then compared the output sizes of the lossless variants. In all my
tests, zip/flate encoding performed best. You can verify my findings using the
test_comp.sh script with any input image given as a command-line argument. If
you find an input file for which another lossless compression outperforms
zip/flate, contact me.