Isolating whether elements are from the output PDF file or PDF reader being used #162

Closed
opened 2023-04-10 07:08:32 +00:00 by rastereffects · 7 comments

First of all, thanks for all the work. I'm hoping for a reading experience in the PDF that is as close as possible to opening the original images, but I'm having some difficulties:

After converting with `img2pdf --viewer-fullscreen "$@" -o "${filename}.pdf"`, the result does not seem to be lossless: the image looks blurrier, the colors change, the dimensions are slightly warped, and the reader seems to think the PDF is far larger than the source images. I'm wondering whether the conversion to PDF itself failed, or whether the problem lies with the PDF reader I'm using.

Owner

You did not show evidence that img2pdf is not lossless. The effect you see can be explained by your pdf viewer doing bicubic scaling when zooming in instead of showing you very large pixels.

If you think there are a bunch of issues, please file one issue for each issue you see. Otherwise it quickly becomes very messy.

I'm going to treat this issue as "img2pdf is not lossless".

Let me show you that img2pdf is indeed lossless on your input:

img2pdf beverly-004.jpg -o beverly-004.pdf
pdfimages -jpg beverly-004.pdf beverly-004
cmp beverly-004.jpg beverly-004-000.jpg

The original file and the file extracted from the pdf are bit-by-bit identical.
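For anyone who wants to repeat this check without `cmp`, here is a minimal Python sketch of the same comparison. It is not part of img2pdf; the function names `sha256_of` and `files_identical` are made up for illustration, and it assumes the image has already been extracted from the PDF (for example with `pdfimages` as above).

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_identical(path_a, path_b):
    """True if both files have the same size and the same SHA-256 digest."""
    a, b = Path(path_a), Path(path_b)
    if a.stat().st_size != b.stat().st_size:
        return False  # different sizes can never be byte-identical
    return sha256_of(a) == sha256_of(b)
```

Calling `files_identical("beverly-004.jpg", "beverly-004-000.jpg")` should return `True` if the embedded image is an untouched copy of the input.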

Owner

Also note that your first image, `original.png`, has an embedded color profile. Make sure that every viewer application you use can make use of this color profile. Applications that cannot handle color profiles embedded in PNG files will render the colors differently. But that, again, is not a bug in img2pdf.
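To check whether a PNG carries an embedded color profile, one option is to look for an `iCCP` chunk in the file. The following is a rough sketch using only the Python standard library; the helper names `png_chunks` and `embedded_icc_profile_name` are hypothetical, and the code does not validate chunk CRCs or decode the profile itself.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunks(data):
    """Yield (chunk type, chunk body) pairs from PNG bytes, in file order."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    offset = 8
    while offset + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[offset:offset + 8])
        yield chunk_type, data[offset + 8:offset + 8 + length]
        offset += 12 + length  # 4 length + 4 type + body + 4 CRC
        if chunk_type == b"IEND":
            break


def embedded_icc_profile_name(data):
    """Return the name stored in the iCCP chunk, or None if there is none."""
    for chunk_type, body in png_chunks(data):
        if chunk_type == b"iCCP":
            # The chunk body starts with a Latin-1 name ended by a null byte.
            name, _, _ = body.partition(b"\x00")
            return name.decode("latin-1")
    return None
```

Passing the bytes of `original.png` (for example via `open("original.png", "rb").read()`) returns the profile name if one is embedded, and `None` otherwise.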

Author

Thanks. It can be hard to judge the fidelity visually, since different readers render the image differently, as you explain. I only learned about this recently and have edited the issue accordingly. I'll take it from your comparison that the image is unchanged. What about the scaling issue? Is there some metadata in the PDF that tells applications how big it is? I've tried Preview, Acrobat Reader, and Chrome, and they all seem to think the output PDF is made of really huge images.

rastereffects changed title from Overall not lossless to Isolating whether elements are from the output PDF file or PDF reader being used 2023-04-10 08:14:03 +00:00
Owner

Yes. Your `original.png` includes the information that the image has 72 dpi, and that information is embedded into the resulting PDF. If you want to ignore the information in your input file, you can manually force a dpi setting using `--dpi` or manually set a PDF page size using the `-S` option. Refer to the `--help` output for more information.
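To see why a 72 dpi image looks "huge" to a viewer, it helps to do the arithmetic. The sketch below assumes the page is sized to the image at its stored resolution and that the PDF uses the default unit of 1/72 inch. The pixel dimensions are a hypothetical example, not the real size of `original.png`, and the function names are made up for illustration.

```python
POINTS_PER_INCH = 72  # default PDF user-space unit is 1/72 inch


def page_size_points(width_px, height_px, dpi):
    """Page size in PDF points for an image placed at the given resolution."""
    return (width_px * POINTS_PER_INCH / dpi, height_px * POINTS_PER_INCH / dpi)


def page_size_inches(width_px, height_px, dpi):
    """Physical page size in inches for the same image and resolution."""
    return (width_px / dpi, height_px / dpi)


# Hypothetical 2480x3508 pixel scan, once at 72 dpi and once at 300 dpi.
for dpi in (72, 300):
    width_in, height_in = page_size_inches(2480, 3508, dpi)
    print(f"{dpi} dpi: {width_in:.2f} x {height_in:.2f} inches")
```

At 72 dpi each pixel maps to one point, so this example page would be roughly 34.4 by 48.7 inches. The same pixels at 300 dpi give a page close to A4. The pixel data is identical in both cases; only the declared physical size differs.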

Author

Great! One thing though: I'm getting the output below, and there's no mention of the option in `--help`.

usage: img2pdf [-h] [-v] [-V] [--gui] [--from-file FILE] [-o out] [-C colorspace] [-D] [--engine engine] [--first-frame-only] [--pillow-limit-break] [--pdfa [PDFA]] [-S LxL] [-s LxL] [-b L[:L]] [-f FIT] [-a] [-r ROT] [--crop-border L[:L]] [--bleed-border L[:L]]
               [--trim-border L[:L]] [--art-border L[:L]] [--title title] [--author author] [--creator creator] [--producer producer] [--creationdate creationdate] [--moddate moddate] [--subject subject] [--keywords kw [kw ...]] [--viewer-panes PANES]
               [--viewer-initial-page NUM] [--viewer-magnification MAG] [--viewer-page-layout LAYOUT] [--viewer-fit-window] [--viewer-center-window] [--viewer-fullscreen]
               [infile ...]
img2pdf: error: unrecognized arguments: --dpi
Owner

This output is automatically generated. The `-h` option is the short form of `--help`. Try running it with `--help` and you'll see it works as expected.

And sorry, there is indeed no `--dpi` option. Instead you can set the image size in dpi by specifying `-s 999dpi` or `--imgsize 999dpi`.

Have a look at the output of `img2pdf -h` or `img2pdf --help` for more information.
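If you call img2pdf from a script, the resolution override can be added to the argument list like this. This is only a sketch: `img2pdf_command` is a hypothetical helper, the file names are placeholders, and it relies only on the `--imgsize` and `-o` options mentioned in this thread.

```python
import shlex


def img2pdf_command(images, output, dpi=None):
    """Build an img2pdf argument list.

    With dpi=None the resolution stored in the input images is used.
    """
    command = ["img2pdf"]
    if dpi is not None:
        command += ["--imgsize", f"{dpi}dpi"]
    command += [*images, "-o", output]
    return command


# Show the command line that would be run for a 300 dpi override.
print(shlex.join(img2pdf_command(["original.png"], "original.pdf", dpi=300)))
```

The returned list can be passed to `subprocess.run(..., check=True)` on a system where img2pdf is installed.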

Author

That works great. Thanks a lot for all the help and for the script.

Reference: josch/img2pdf#162