When I convert 1 bit (black and white only) PNGs created by GIMP they become much larger than the original.
I think it's because the metadata saved with the file is causing the bug.
The metadata are the default metadata that GIMP offered to save PNGs with.
PNG with metadata:
`1bit.png 74,0651 bytes`
Converted to PDF (much larger):
`1bit.pdf 172,594 bytes`
PNG but no metadata:
`1bit_no_metadata.png 69,971 bytes`
Converted (correct size):
`1bit_no_metadata.pdf 71,352 bytes`
I think the bug may be due to the colorspace used.
I tried forcing the colorspace using the --colorspace option.
Surprisingly, forcing the Black and White colorspace results in a PDF that's too large (the input PNG is a 1 bit Black and White image).
But forcing a Grayscale colorspace results in a correctly sized PDF!?
(The input PNG has metadata.)
Why does forcing the correct Black and White colorspace cause a larger PDF?
img2pdf seems to be using a TiffImagePlugin (on a PNG). Is that part of the problem?
These errors are at the end of the Black and White colorspace conversion log:
Fax3SetupState: Bits/sample must be 1 for Group 3/4 encoding/decoding.
DEBUG:img2pdf:encoder error -2 when writing image file
DEBUG:img2pdf:Converting colorspace 1 to L
Full conversion logs
Force Grayscale colorspace (result is correct sized PDF)
$ img2pdf -v -o 1bit_force_colorspace_grayscale.pdf --colorspace L 1bit.png
DEBUG:PIL.PngImagePlugin:STREAM b'IHDR' 16 13
DEBUG:PIL.PngImagePlugin:STREAM b'zTXt' 41 203
DEBUG:PIL.PngImagePlugin:STREAM b'iCCP' 256 388
DEBUG:PIL.PngImagePlugin:iCCP profile name b'ICC profile'
DEBUG:PIL.PngImagePlugin:Compression method 0
DEBUG:PIL.PngImagePlugin:STREAM b'iTXt' 656 3448
DEBUG:PIL.PngImagePlugin:STREAM b'PLTE' 4116 6
DEBUG:PIL.PngImagePlugin:STREAM b'pHYs' 4134 9
DEBUG:PIL.PngImagePlugin:STREAM b'tIME' 4155 7
DEBUG:PIL.PngImagePlugin:b'tIME' 4155 7 (unknown)
DEBUG:PIL.PngImagePlugin:STREAM b'IDAT' 4174 8192
DEBUG:img2pdf:PIL format = PNG
DEBUG:img2pdf:imgformat = PNG
DEBUG:img2pdf:input dpi = 300 x 300
DEBUG:PIL.TiffImagePlugin:tag: ImageWidth (256) - type: long (4) - value: b'\xb0\t\x00\x00'
DEBUG:PIL.TiffImagePlugin:tag: ImageLength (257) - type: long (4) - value: b'\xb4\r\x00\x00'
DEBUG:PIL.TiffImagePlugin:tag: BitsPerSample (258) - type: short (3) Tag Location: 46 - Data Location: 134 - value: b'\x08\x00\x08\x00\x08\x00'
DEBUG:PIL.TiffImagePlugin:tag: Orientation (274) - type: short (3) - value: b'\x01\x00'
DEBUG:PIL.TiffImagePlugin:tag: XResolution (282) - type: rational (5) Tag Location: 70 - Data Location: 140 - value: b'\xfc)\x00\x00[\x00\x00\x00'
DEBUG:PIL.TiffImagePlugin:tag: YResolution (283) - type: rational (5) Tag Location: 82 - Data Location: 148 - value: b'\xfc)\x00\x00[\x00\x00\x00'
DEBUG:PIL.TiffImagePlugin:tag: ResolutionUnit (296) - type: short (3) - value: b'\x03\x00'
DEBUG:PIL.TiffImagePlugin:tag: Software (305) - type: string (2) Tag Location: 106 - Data Location: 156 - value: b'GIMP 2.10.34\x00'
DEBUG:PIL.TiffImagePlugin:tag: DateTime (306) - type: string (2) Tag Location: 118 - Data Location: 170 - value: b'2023:03:29 11:59:41\x00'
DEBUG:PIL.TiffImagePlugin:tag: ExifIFD (34665) - type: long (4) - value: b'\xbe\x00\x00\x00'
DEBUG:PIL.TiffImagePlugin:tag: ColorSpace (40961) - type: short (3) - value: b'\x01\x00'
DEBUG:PIL.TiffImagePlugin:tag: ColorSpace (40961) - type: short (3) - value: b'\x01\x00'
DEBUG:img2pdf:rotation = 0°
DEBUG:img2pdf:input colorspace (forced) = Colorspace.L
DEBUG:img2pdf:width x height = 2480px x 3508px
DEBUG:img2pdf:Converting frame: 0
DEBUG:img2pdf:input dpi = 300 x 300
DEBUG:PIL.TiffImagePlugin:tag: ColorSpace (40961) - type: short (3) - value: b'\x01\x00'
DEBUG:PIL.TiffImagePlugin:tag: ColorSpace (40961) - type: short (3) - value: b'\x01\x00'
DEBUG:img2pdf:rotation = 0°
DEBUG:img2pdf:input colorspace (forced) = Colorspace.L
DEBUG:img2pdf:width x height = 2480px x 3508px
DEBUG:img2pdf:Colorspace is OK: Colorspace.L
DEBUG:img2pdf:read_images() encoded an image as PNG
Force Black and White colorspace (result is an oversized PDF)
I did some more testing and the metadata that is triggering the bug seems to be GIMP's "Save color profile" option. GIMP's default PNG options don't have that selected, but including a color profile is part of the official [PNG spec](https://www.w3.org/TR/PNG/#11iCCP).
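To check whether a given PNG carries an embedded color profile without opening GIMP, the chunk list can be inspected directly. The following is a minimal sketch using only the Python standard library; the function names `iter_png_chunks` and `has_icc_profile` are illustrative and not part of img2pdf or Pillow. It reports the same chunk name, offset and length triples that appear in the `PIL.PngImagePlugin:STREAM` debug lines above.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_png_chunks(data: bytes):
    """Yield (chunk_type, payload_offset, payload_length) for each PNG chunk."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Chunk layout: 4-byte big-endian length, 4-byte type, payload, 4-byte CRC.
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        chunk_type = data[pos + 4:pos + 8]
        yield chunk_type, pos + 8, length
        pos += 12 + length


def has_icc_profile(data: bytes) -> bool:
    """Return True if the PNG contains an iCCP (embedded ICC profile) chunk."""
    return any(chunk_type == b"iCCP" for chunk_type, _, _ in iter_png_chunks(data))
```

Reading a file with `open("1bit.png", "rb").read()` and passing the bytes to `has_icc_profile` should return `True` for a PNG saved with "Save color profile" enabled, assuming GIMP writes the profile as an iCCP chunk as the log above suggests.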
Hi, could you share a png image that causes this so that I can reproduce this problem? Thanks!
Is your bilevel png image just a palette png with only 2 palette entries? The png format itself does not support bilevel images natively.
The reason the TIFF plugin comes into play is that the CCITT fax encoding compresses bilevel image data much better than the PNG Paeth filter does.
I've pushed some commits which will not include the ICC profile saved by GIMP for bilevel images. See #164 and https://gitlab.gnome.org/GNOME/gimp/-/issues/3438 for details.
Can you confirm that fixes this issue and then close it if it does?
> Hi, could you share a png image that causes this so that I can reproduce this problem? Thanks!
>
Sorry for the delay.
I've attached the files that I used in my first comment.
(Attached as zip file, ~~not sure if that will work here?~~)
PNG with metadata
`1bit.png 74,0651 bytes`
Converted to PDF (much larger):
`1bit.pdf 172,594 bytes`
PNG but no metadata
`1bit_no_metadata.png 69,971 bytes`
Converted (correct size):
`1bit_no_metadata.pdf 71,352 bytes`
(Note that `1bit_no_metadata.png` actually has resolution metadata.)
> Is your bilevel png image just a palette png with only 2 palette entries? The png format itself does not support bilevel images natively.
I think so. To create a bilevel image in GIMP I use
`Image > Mode > Indexed`
`Colormap > Use Black and White (1-bit) palette`
I also have this selected `Remove unused and duplicate colors from colormap` in the same dialog.
> I've pushed some commits ...
> Can you confirm that fixes this issue and then close it if it does?
I'm not very competent with compiling apps. I just run with what's in my distribution's repository. But I'll take a look into it.
Aha, whoops, sorry, when you wrote:
> But I'll take a look into it.
I thought that I had to wait for a follow-up. Sorry for the misunderstanding! :)
I now looked into it a bit more and I'm able to confirm your observation and understand the problem. This is indeed another instance of the GIMP bug https://gitlab.gnome.org/GNOME/gimp/-/issues/3438 but this time for PNG images and not for TIFF images. My solution for issue #164 only covered TIFF images and thus this issue is not fixed.
The PNG format supports 1-bit (bilevel) grayscale images but that's not the kind of image produced by GIMP. Instead what you showed here are palette images but with only two colors in the palette: black and white.
So the solution to this problem would be to add some code similar to what was added for TIFF which auto-detects palette PNG images with only two colors as they are created by GIMP and drop the ICC profile for those.
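The distinction between a true 1-bit grayscale PNG and a palette PNG with only black and white entries can be read from the IHDR and PLTE chunks. Below is a minimal standard-library sketch of that check. The function name `describe_png` and its return labels are made up for illustration; the two accepted palette byte strings are the ones used in the patch in this thread.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Black-then-white and white-then-black RGB palettes, as in the patch.
BLACK_WHITE_PALETTES = (b"\x00\x00\x00\xff\xff\xff", b"\xff\xff\xff\x00\x00\x00")


def describe_png(data: bytes) -> str:
    """Classify a PNG as 'bilevel-grayscale', 'two-color-palette' or 'other'."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    bit_depth = color_type = None
    palette = b""
    while pos + 8 <= len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        chunk_type = data[pos + 4:pos + 8]
        payload = data[pos + 8:pos + 8 + length]
        if chunk_type == b"IHDR":
            # IHDR payload: width (4), height (4), bit depth (1), color type (1), ...
            bit_depth, color_type = payload[8], payload[9]
        elif chunk_type == b"PLTE":
            palette = payload
        elif chunk_type == b"IDAT":
            break  # a PLTE chunk, if present, precedes the image data
        pos += 12 + length
    if color_type == 0 and bit_depth == 1:
        return "bilevel-grayscale"  # native 1-bit grayscale PNG
    if color_type == 3 and palette in BLACK_WHITE_PALETTES:
        return "two-color-palette"  # indexed PNG with only black and white
    return "other"
```

Given the 6-byte PLTE chunk visible in the debug log, the attached `1bit.png` would presumably be reported as `two-color-palette`, but that has not been verified against the actual file here.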
Here is a patch that should work:
```patch
diff --git a/src/img2pdf.py b/src/img2pdf.py
index 06f2e7b..ee7acfc 100755
--- a/src/img2pdf.py
+++ b/src/img2pdf.py
@@ -1436,11 +1445,22 @@ def get_imgmetadata(
     iccp = None
     if "icc_profile" in imgdata.info:
         iccp = imgdata.info.get("icc_profile")
-    # GIMP saves bilevel tiff images with an RGB ICC profile which is useless
+    # GIMP saves bilevel TIFF images and palette PNG images with only black and
+    # white in the palette with an RGB ICC profile which is useless
+    # https://gitlab.gnome.org/GNOME/gimp/-/issues/3438
     # and produces an error in Adobe Acrobat, so we ignore it with a warning.
     # imagemagick also used to (wrongly) include an RGB ICC profile for bilevel
     # images: https://github.com/ImageMagick/ImageMagick/issues/2070
-    if iccp is not None and color == Colorspace["1"] and imgformat == ImageFormat.TIFF:
+    if iccp is not None and (
+        (color == Colorspace["1"] and imgformat == ImageFormat.TIFF)
+        or (
+            imgformat == ImageFormat.PNG
+            and color == Colorspace.P
+            and rawdata is not None
+            and parse_png(rawdata)[1]
+            in [b"\x00\x00\x00\xff\xff\xff", b"\xff\xff\xff\x00\x00\x00"]
+        )
+    ):
         with io.BytesIO(iccp) as f:
             prf = ImageCms.ImageCmsProfile(f)
         if (
@@ -1448,7 +1468,14 @@ def get_imgmetadata(
             and prf.profile.manufacturer == "GIMP"
             and prf.profile.profile_description == "GIMP built-in sRGB"
         ):
-            logger.warning("Ignoring RGB ICC profile in bilevel TIFF produced by GIMP.")
+            if imgformat == ImageFormat.TIFF:
+                logger.warning(
+                    "Ignoring RGB ICC profile in bilevel TIFF produced by GIMP."
+                )
+            elif imgformat == ImageFormat.PNG:
+                logger.warning(
+                    "Ignoring RGB ICC profile in 2-color palette PNG produced by GIMP."
+                )
             logger.warning("https://gitlab.gnome.org/GNOME/gimp/-/issues/3438")
             iccp = None
```
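For anyone who cannot apply the patch, one possible workaround is to remove the embedded profile from the PNG before conversion, which should be equivalent to unticking "Save color profile" in GIMP. This is only a sketch under that assumption and is not what img2pdf does internally; `strip_icc_profile` is a hypothetical helper written with the standard library only.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def strip_icc_profile(data: bytes) -> bytes:
    """Return a copy of the PNG bytes with every iCCP chunk removed."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    out = bytearray(PNG_SIGNATURE)
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        end = pos + 12 + length  # length field + type + payload + CRC
        if data[pos + 4:pos + 8] != b"iCCP":
            out += data[pos:end]  # copy the chunk verbatim, CRC included
        pos = end
    return bytes(out)
```

Each PNG chunk carries its own CRC, so dropping an ancillary chunk such as iCCP leaves the remaining chunks valid. The result can be written to a new file and passed to img2pdf in place of the original.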
tl;dr I'm not comfortable using PIP.
I should have replied back. I don't understand PIP and what it does and what changes it makes to a system.
I've seen a few warnings about PIP being risky and a potential source of malware (obviously not your code).
I'm short of time at the moment (who isn't?!) so I can't dig into PIP some more and try to figure out what it does and what its risks are (and if I'm honest I'm not sure I want to!)
I'm not a developer, just a Linux user who has had to look under the hood a few times.
@monobot no problem! I just wanted to give you the opportunity to test if those changes do what you expect. That way we reduce the chance of me fixing something that is not really the problem you observed. If you don't want to try out the diff I posted for whatever reason that's totally fine. I think and hope the diff fixes what I understood as your problem. It is obviously not your responsibility to test any diff.
I'm happy you reported this problem and hope that this is fixed with the next release!
If you don't use pip, where do you obtain img2pdf from?
@josch. Thanks for giving me the chance to test the changes.
It's not that I don't *want* to test the changes, it's that I'm *not able* to!
PIP is totally new to me and I've no idea what it does (and more importantly what it does behind the scenes).
I install img2pdf using Debian's `apt` installer.
I just run `sudo apt install img2pdf` and it gets installed.
Is pip a common way for regular users to install apps? I thought it was more for developers.
Thanks for working on this issue. And thanks for img2pdf, it's an extremely useful app.
Then you are in luck: I'm also maintaining the img2pdf package in Debian so once I release a new version here I'll also upload that version to Debian unstable. :)
EDIT: Actually, for the no-metadata PNG there was some metadata saved: the image resolution.
Software details
GIMP 2.10.34
img2pdf 0.4.4
Debian Testing (12 Bookworm)