Patch for ICCBased ColorSpace #65

Closed
opened 3 years ago by josch · 0 comments
josch commented 3 years ago
Owner

By iwata74 on 2019-10-11T06:30:16.318Z

I wrote a patch for ICCBased ColorSpace. It will be helpful for some people. iccbased.patch


By josch on 2019-10-11T06:43:18.037Z


Thanks!

Would you happen to know how I could generate some test input for this new feature via imagemagick?


By josch on 2020-04-05T17:25:35.913Z


I tested your patch. It will not insert a profile if the image is palette based (and not warn about it) and it will fail without pdfrw. I already rewrote your patch producing the same output but I have no idea how to test it. For example what is this ~> character at the end of the stream about?


By josch on 2020-04-05T18:16:11.515Z


I autogenerated a PNG image with ICC profile in it normalicc

The problem is, that when I use your diff (or mine, doesn't matter the output is identical) then I'm unable to render this pdf so that the original input png get produced. I tried rendering it using:

gs -dQUIET -dNOPAUSE -dBATCH -sDEVICE="$gsdevice" -r96 -sOutputFile="$tempdir/gs-%00d.png" "$pdf"
pdftocairo -r 96 -png "$pdf" "$tempdir/poppler"
mutool draw -o "$tempdir/mupdf.png" -r 96 "$pdf" 2>/dev/null

But in all cases, using compare from imagemagick yielded differences -- why?


By josch on 2020-04-23T04:39:31.494Z


I now created a branch icc with my attempt at implementing this in this commit: 188619ef5259f27d85194046237fe4adb887d4ae

As I pointed out earlier, this isn't working yet. The test case I created fails because the green part of my test image gets rendered differently by ghostscript, cairo and mupdf. Any help in finding out where the error lies is greatly appreciated.


By iwata74 on 2020-06-15T07:54:27.862Z


Sorry to be late.

  1. The ~> comes from PDF's ASCII base-85 encoding . See 3.3.2: https://ghostscript.com/~tor/pdfref17.pdf

BTW I used ASCII base-85 encoding for ICC profile because many existing PDF files do so. As far as I tested, there is no problem without ASCII base-85 encoding.

  1. Ghostscript or etc. converts the color space of images. They are not good tools in this case. Use pdfimages in poppler-utils.
sudo apt install -y poppler-utils
pdfimages -j some.pdf foo

The foo-000.jpg should match.


By josch on 2020-06-15T08:19:00.592Z


No problem with "being late" -- we are all doing this in our free time as our hobby, so no rush. :)

The fact that your pdfimages example works is not surprising. PDF embeds JPEG files without touching the JPEG data. So the JPEG that pdfimages output will be bit-by-bit identical to the input. So using pdfimages is not something that can be used to test this functionality. All this tests is, that indeed the JPEG that was put in was not modified. But this is not a test of the new /ICCBased stuff that this commit adds.

Even worse: if you are correct and ghostscript, mupdf and poppler are not rendering the PDF correctly, then what is the point in including the profile in the first place? If they don't do the right thing, then we can never check whether including the profile had the correct effect or not.

I suggest to first add support for icc profiles to at least one of these tools and then we can add it to img2pdf.


By iwata74 on 2020-06-15T13:42:34.113Z


I got it. I made a test image for Ghostscript or etc. It compares the results of Ghostscript and ImageMagick.

sRGB.icm

import img2pdf

pdf = img2pdf.convert(['Lenna_16bit-adobergb.png'])
with open('lenna.pdf', 'wb') as f:
    f.write(pdf)
gs -dQUIET -dNOPAUSE -dBATCH -sDEVICE=png16m -r96 -sOutputFile="lenna-gs.png" lenna.pdf
convert -profile sRGB.icm -depth 8 Lenna_16bit-adobergb.png lenna-im.png
compare -metric ae lenna-gs.png lenna-im.png /dev/null

compare should echo 0.


By josch on 2020-06-15T13:53:48.909Z


Thank you for your input. I will look into this.

But please don't use the photo of a naked person from the playboy magazine as an example image -- there is really no practical need to use precisely that image as test input.

Additionally, since neither of us has the copyright on the image, I will remove it to not violate the copyright of the playboy magazine by distributing their content with this gitlab instance.


By josch on 2020-08-06T22:16:44.189Z


Commit f0b57985ee should implement the missing feature. But I still want to test with more than just ghostscript. See issue #83 for my concerns.


By josch on 2020-08-18T07:53:29.114Z


@iwata74 do you think this issue can be closed now with what is in git?


By iwata74 on 2020-08-19T02:09:02.533Z


I see no problem. Thanks.


By josch on 2020-08-19T03:17:47.681Z


Awesome, thanks!


By josch on 2020-08-19T03:17:48.015Z


Status changed to closed

*By iwata74 on 2019-10-11T06:30:16.318Z* I wrote a patch for ICCBased ColorSpace. It will be helpful for some people. [iccbased.patch](/uploads/c26aca98289dd6c23f38b8db542c4fde/iccbased.patch) --- *By josch on 2019-10-11T06:43:18.037Z* --- Thanks! Would you happen to know how I could generate some test input for this new feature via imagemagick? --- *By josch on 2020-04-05T17:25:35.913Z* --- I tested your patch. It will not insert a profile if the image is palette based (and not warn about it) and it will fail without pdfrw. I already rewrote your patch producing the same output but I have no idea how to test it. For example what is this `~>` character at the end of the stream about? --- *By josch on 2020-04-05T18:16:11.515Z* --- I autogenerated a PNG image with ICC profile in it ![normalicc](/uploads/b741e8d48d697a45eb5c6b9525e238ff/normalicc.png) The problem is, that when I use your diff (or mine, doesn't matter the output is identical) then I'm unable to render this pdf so that the original input png get produced. I tried rendering it using: gs -dQUIET -dNOPAUSE -dBATCH -sDEVICE="$gsdevice" -r96 -sOutputFile="$tempdir/gs-%00d.png" "$pdf" pdftocairo -r 96 -png "$pdf" "$tempdir/poppler" mutool draw -o "$tempdir/mupdf.png" -r 96 "$pdf" 2>/dev/null But in all cases, using `compare` from imagemagick yielded differences -- why? --- *By josch on 2020-04-23T04:39:31.494Z* --- I now created a branch `icc` with my attempt at implementing this in this commit: 188619ef5259f27d85194046237fe4adb887d4ae As I pointed out earlier, this isn't working yet. The test case I created fails because the green part of my test image gets rendered differently by ghostscript, cairo and mupdf. Any help in finding out where the error lies is greatly appreciated. --- *By iwata74 on 2020-06-15T07:54:27.862Z* --- Sorry to be late. 1) The `~>` comes from PDF's ASCII base-85 encoding . See 3.3.2: https://ghostscript.com/~tor/pdfref17.pdf BTW I used ASCII base-85 encoding for ICC profile because many existing PDF files do so. As far as I tested, there is no problem without ASCII base-85 encoding. 2) Ghostscript or etc. converts the color space of images. They are not good tools in this case. Use `pdfimages` in `poppler-utils`. ``` sudo apt install -y poppler-utils pdfimages -j some.pdf foo ``` The `foo-000.jpg` should match. --- *By josch on 2020-06-15T08:19:00.592Z* --- No problem with "being late" -- we are all doing this in our free time as our hobby, so no rush. :) The fact that your pdfimages example works is not surprising. PDF embeds JPEG files without touching the JPEG data. So the JPEG that pdfimages output will be bit-by-bit identical to the input. So using pdfimages is not something that can be used to test this functionality. All this tests is, that indeed the JPEG that was put in was not modified. But this is not a test of the new `/ICCBased` stuff that this commit adds. Even worse: if you are correct and ghostscript, mupdf and poppler are *not* rendering the PDF correctly, then what is the point in including the profile in the first place? If they don't do the right thing, then we can never check whether including the profile had the correct effect or not. I suggest to *first* add support for icc profiles to at least one of these tools and *then* we can add it to img2pdf. --- *By iwata74 on 2020-06-15T13:42:34.113Z* --- I got it. I made a test image for Ghostscript or etc. It compares the results of Ghostscript and ImageMagick. [sRGB.icm](/uploads/a67f63645e6d27801bdf9a7742cf3658/sRGB.icm) ``` import img2pdf pdf = img2pdf.convert(['Lenna_16bit-adobergb.png']) with open('lenna.pdf', 'wb') as f: f.write(pdf) ``` ``` gs -dQUIET -dNOPAUSE -dBATCH -sDEVICE=png16m -r96 -sOutputFile="lenna-gs.png" lenna.pdf convert -profile sRGB.icm -depth 8 Lenna_16bit-adobergb.png lenna-im.png compare -metric ae lenna-gs.png lenna-im.png /dev/null ``` `compare` should echo `0`. --- *By josch on 2020-06-15T13:53:48.909Z* --- Thank you for your input. I will look into this. But please don't use the photo of a naked person from the playboy magazine as an example image -- there is really no practical need to use precisely that image as test input. Additionally, since neither of us has the copyright on the image, I will remove it to not violate the copyright of the playboy magazine by distributing their content with this gitlab instance. --- *By josch on 2020-08-06T22:16:44.189Z* --- Commit f0b57985eee77c2572ae94821d684df7659146c3 should implement the missing feature. But I still want to test with more than just ghostscript. See issue #83 for my concerns. --- *By josch on 2020-08-18T07:53:29.114Z* --- @iwata74 do you think this issue can be closed now with what is in git? --- *By iwata74 on 2020-08-19T02:09:02.533Z* --- I see no problem. Thanks. --- *By josch on 2020-08-19T03:17:47.681Z* --- Awesome, thanks! --- *By josch on 2020-08-19T03:17:48.015Z* --- Status changed to closed
josch closed this issue 3 years ago
Sign in to join this conversation.
No Milestone
No project
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: josch/img2pdf#65
Loading…
There is no content yet.