Images darker than source #191

Closed
opened 3 months ago by putridterror · 5 comments

I'm having an issue where img2pdf is darkening each image slightly when converting. I tried three separate readers (Sumatra, Okular, Acrobat) and the results are the same. What am I missing?

I'm on Windows 10, here is the command I am using:
python3 -m img2pdf *.jpg -o test.pdf

josch commented 3 months ago
Owner

I just opened your screenshot with an editor and compared color values and am unable to see how one is darker than the other. Could you try an image that is just white and then take a screenshot where it shows as gray?

Poster

I tried what you described and was not able to replicate the issue, as the white pages were identical. However upon trying a test page with basic colors the values are indeed different, if only slightly.

The first example may very well be an issue with my choice of viewer as upon exporting the page they are identical, though I've included two additional examples where the top is the source file and bottom is taken from pdf after conversion.

I am not getting any consistency and feel it may be on my end, but hope you may be able to provide some insight.

josch commented 3 months ago
Owner

I find your Example1.jpg most interesting. I indeed do see the slightly different colors.

First of all, your image includes a color profile, namely sRGB IEC61966-2.1. Do your input images contain a color profile? Different viewers interpret color profiles differently.

Secondly, you said you tried three different PDF viewers with the same result. But did you also try different image viewers? Maybe your image viewer applies some system-specific color correction.

Thirdly, your input image is a JPEG. Even given the exact same JPEG file, there are different ways to decode the data within it into actual pixel values. For example in imagemagick you can say `-define jpeg:dct-method=ifast` and the `djpeg` command has the `-dct` option which allows you to choose between `integer`, `fast` and `float` and depending on the method you choose, your pixels will have a slightly different color. See https://manpages.debian.org/bookworm/libjpeg-turbo-progs/djpeg.1.en.html for details.
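To put a number on how far two decodes of the same JPEG drift apart, one option is to decode it twice to PPM (for example with two `djpeg` runs that use different `-dct` settings) and compare the raw samples. The following is only a minimal sketch using the Python standard library: the function names `read_ppm` and `max_channel_diff` are made up for this illustration, and the parser assumes a plain binary PPM (`P6`, 8 bits per channel, no header comments).

```python
def read_ppm(data: bytes):
    """Parse a binary PPM (P6, maxval <= 255). Header comments are not handled."""
    tokens = []
    pos = 0
    while len(tokens) < 4:
        # Skip whitespace between header tokens.
        while data[pos:pos + 1].isspace():
            pos += 1
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        tokens.append(data[start:pos])
    pos += 1  # exactly one whitespace byte separates the header from the samples
    magic = tokens[0]
    width, height, maxval = int(tokens[1]), int(tokens[2]), int(tokens[3])
    if magic != b"P6" or maxval > 255:
        raise ValueError("only 8-bit binary PPM (P6) is supported in this sketch")
    expected = width * height * 3
    pixels = data[pos:pos + expected]
    if len(pixels) != expected:
        raise ValueError("truncated pixel data")
    return width, height, pixels


def max_channel_diff(ppm_a: bytes, ppm_b: bytes) -> int:
    """Return the largest absolute per-channel difference between two PPM images."""
    wa, ha, pa = read_ppm(ppm_a)
    wb, hb, pb = read_ppm(ppm_b)
    if (wa, ha) != (wb, hb):
        raise ValueError("images have different dimensions")
    return max((abs(x - y) for x, y in zip(pa, pb)), default=0)
```

A result of 0 means both decodes produced identical pixels; a small nonzero value is the kind of difference that decoder settings alone can introduce, independent of any PDF conversion.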

So to get to the bottom of this, start ruling out jpeg decoding differences by using png as your input. If you use bmp as input, you can even rule out embedded color profiles because in contrast to jpg and png, bmp does not support color profiles. Lastly, try a different image viewer to make sure that your current one does not interfere somehow.
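Whether a JPEG carries an embedded color profile can also be checked without any viewer. As a rough sketch, assuming the usual convention that ICC profiles are stored in APP2 segments whose payload starts with `ICC_PROFILE\0`, the standard-library function below walks the segment headers up to the start of the scan data. The name `jpeg_has_icc_profile` is invented for this example, and PNG input (which stores profiles in a different chunk type) is not covered.

```python
def jpeg_has_icc_profile(data: bytes) -> bool:
    """Return True if an APP2 segment tagged ICC_PROFILE precedes the scan data.

    Simplification: markers without a length field are not expected before
    the start of scan and are not handled specially.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break  # lost sync with the segment structure, give up
        marker = data[pos + 1]
        if marker == 0xFF:
            pos += 1  # fill byte before a marker, skip it
            continue
        if marker in (0xDA, 0xD9):
            break  # start of scan or end of image: no more header segments
        # The 2-byte big-endian length includes the length field itself.
        length = int.from_bytes(data[pos + 2:pos + 4], "big")
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE2 and payload.startswith(b"ICC_PROFILE\x00"):
            return True
        pos += 2 + length
    return False
```

It could be called on the bytes of an input file, e.g. `jpeg_has_icc_profile(open("Example1.jpg", "rb").read())`, to see which of the inputs are tagged and which are not.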

Poster

They do not, the other two source images point to Untagged RGB (8bpc). I also tried png as a source and it led to similar behavior.

On using different viewers, however, the results were interesting. The two additional programs I tried (IrfanView & JPEGView) showed the exact same image, whereas the built-in Windows photo viewer I was using differed. This was attempted again with another source image and what was shown was wildly different.

So I can only conclude the issue I am seeing is part of the process you described and not something on img2pdf's end. I'm frankly a bit surprised the built-in viewer decides to decode the images so differently, though given that framework I don't see any other circumstance.

Thank you for your time and for explaining what was going on. I didn't want to continue with a very large batch without being confident img2pdf is not altering the image in any way upon conversion.

josch commented 3 months ago
Owner

I think it is super useful that you brought up this issue. Being lossless is the main reason for img2pdf's existence. If people do not need this property, there are many other converters out there for them to use. So I'm very happy that you started investigating this! It is very rare that people care about differences as small as in your examples. Since I do care, I was eager to get to the bottom of this. I'm happy that you found out that it's your initial image viewer that is the culprit and not img2pdf. But third party analyses are very rare and you are the first one who found this weird discrepancy. At the very least, now we have documented something for the next person who runs into this.

Thank you!! ❤️
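For anyone who wants to confirm this on their own batch without relying on a viewer, here is a minimal sketch of a byte-level check. It rests on an assumption that should be verified against the img2pdf documentation: that for JPEG input the file's bytes are copied unchanged into the PDF, so the complete JPEG should appear as a contiguous run inside the PDF file. The function names are hypothetical, and a miss does not by itself prove that pixels were altered (other input formats may be stored in a re-wrapped form).

```python
from pathlib import Path


def jpeg_embedded_verbatim(pdf_bytes: bytes, jpeg_bytes: bytes) -> bool:
    """Return True if the whole JPEG byte stream occurs unchanged inside the PDF."""
    return len(jpeg_bytes) > 0 and jpeg_bytes in pdf_bytes


def find_missing(pdf_path, jpeg_paths):
    """Return the input paths whose bytes were not found verbatim in the PDF."""
    pdf_bytes = Path(pdf_path).read_bytes()
    return [
        str(p) for p in jpeg_paths
        if not jpeg_embedded_verbatim(pdf_bytes, Path(p).read_bytes())
    ]
```

With the command from the first post, something like `find_missing("test.pdf", sorted(Path(".").glob("*.jpg")))` would be expected to return an empty list if every input was embedded byte for byte.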

josch closed this issue 3 months ago
Reference: josch/img2pdf#191