new(?) issue with converting 2-color TIFF images to PDF. #164

Closed
opened 1 year ago by smw · 17 comments
smw commented 1 year ago

I've read issue #66, and I'm aware that the problem reported there was fixed in version 0.3.4 of img2pdf. However, I'm currently running into exactly the same symptom with version 0.4.4 on Manjaro Linux (22.1.0).

Specifically, for various reasons I need to be able to work with bilevel TIFF images, and when a number of these (with exactly one image per .tif file) are converted to the same PDF file, most of the time Acrobat Reader will report "insufficient data for an image". This sometimes happens immediately when I try to open the file, but other times it occurs in the middle of the PDF file. The same PDF files work perfectly with every program I've tried on Linux.

Furthermore, the problem occurs regardless of whether I convert many .tif files in one invocation of img2pdf, or whether I convert each image separately and later concatenate them with pdftk.

I can supply example files (both .tif and PDF) if that would help, along with whatever additional information may be useful.

smw commented 12 months ago
Poster

After some experimenting, it seems that the problem occurs only for images that were edited using gimp version 2.10.34 and exported from gimp using its default export options. I can't prove that this is the cause, but images exported this way reliably trigger the problem, and turning off gimp's "Export the image's color profile by default" option produces images that are reliably converted without errors.

The problematic images produce PDF files with this symptom:

% gs -dNOPAUSE -dBATCH -sDEVICE=nullpage borked.pdf
GPL Ghostscript 10.01.1 (2023-03-27)
Copyright (C) 2023 Artifex Software, Inc. All rights reserved.
This software is supplied under the GNU AGPLv3 and comes with NO WARRANTY:
see the file COPYING for details.
Unknown .defaultpapersize: (Letter).
Processing pages 1 through 1.
Page 1

The following errors were encountered at least once while processing this file:
ICCbased space /N value does not match the number of components in the ICC profile

The following warnings were encountered at least once while processing this file:
recoverable image error

  **** This file had errors that were repaired or ignored.
  **** The file was produced by: 
  **** >>>> img2pdf 0.4.4 <<<<
  **** Please notify the author of the software that produced this
  **** file that it does not conform to Adobe's published PDF
  **** specification.

...so it makes sense that the color profile is implicated.

Owner

I seem unable to find out how to create a bilevel image with GIMP. I can do a 256 color palette image or an 8-bit grayscale image. But which buttons do I have to click to make my image have only two colors?

Do you have an example file that triggers the problem?

smw commented 11 months ago
Poster

I already replied to this by email, but I'll copy that reply here for the record.

I don't know how to create bilevel TIFF images in gimp either. The images I'm working with were created by scanning on a Brother MFC-J6510DW all-in-one, using this command:

/bin/scanimage -d "brother4:bus1;dev3" --buffer-size=102400 --source "Flatbed" --resolution 600 --mode "Black & White" -x 228.579 -y 304.772 --batch="page_%03d_A-scanned.tif" --batch-start 2 --format "tiff" --brightness 40

These scanned images are correctly converted by img2pdf.

The problem arises because the scanned images often (but not always) need to be touched up in various ways, which I do using gimp.

It took a while before I realized that the problem occurs repeatably with images saved by gimp, but doesn't occur at all without gimp. That was when I took a closer look and discovered that I could work around the problem by turning off the gimp export option to save the color profile.

As for samples, I've uploaded two files with this message. The first was edited and will fail, and the second wasn't edited and will be converted correctly.

Owner
I believe this is this issue: https://gitlab.gnome.org/GNOME/gimp/-/issues/9518
smw commented 11 months ago
Poster

I think you're right. Is there any chance you'd be willing to modify img2pdf to detect that condition and ignore the false color profile?

Owner

Yes. But that would require me to extract and parse the color profile and I do not know yet how to do that. Patches welcome.

smw commented 11 months ago
Poster

Do you really need to parse the color profile? You already know that the image is a bilevel TIFF with CCITT Group 4 encoding; isn't it safe to forcibly set iccp to None and proceed on that basis?

smw commented 11 months ago
Poster

Just for fun, I added a print statement to the version 0.4.4 source code as follows:

*** img2pdf.py.original 2023-05-29 18:43:46.651841267 -0400
--- img2pdf.py  2023-05-29 18:39:15.568222019 -0400
***************
*** 1427,1432 ****
--- 1427,1433 ----
          iccp = imgdata.info.get("icc_profile")
  
      logger.debug("width x height = %dpx x %dpx", imgwidthpx, imgheightpx)
+     print("iccp = " + str(iccp)) # smw
      return (color, ndpi, imgwidthpx, imgheightpx, rotation, iccp)

Running this version on the uploaded page_002_A-scanned.tif produces the output

iccp = None

as expected. Running it on the uploaded page_001_A-scanned.tif produces this:

iccp = b'\x00\x00\x02\xa0lcms\x04@\x00\x00mntrRGB XYZ \x07\xe7\x00\x05\x00\x01\x00\x07\x00\x1b\x00\x03acspAPPL\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xf6\xd6\x00\x01\x00\x00\x00\x00\xd3-lcms\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\rdesc\x00\x00\x01 \x00\x00\x00@cprt\x00\x00\x01`\x00\x00\x006wtpt\x00\x00\x01\x98\x00\x00\x00\x14chad\x00\x00\x01\xac\x00\x00\x00,rXYZ\x00\x00\x01\xd8\x00\x00\x00\x14bXYZ\x00\x00\x01\xec\x00\x00\x00\x14gXYZ\x00\x00\x02\x00\x00\x00\x00\x14rTRC\x00\x00\x02\x14\x00\x00\x00 gTRC\x00\x00\x02\x14\x00\x00\x00 bTRC\x00\x00\x02\x14\x00\x00\x00 chrm\x00\x00\x024\x00\x00\x00$dmnd\x00\x00\x02X\x00\x00\x00$dmdd\x00\x00\x02|\x00\x00\x00$mluc\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x0cenUS\x00\x00\x00$\x00\x00\x00\x1c\x00G\x00I\x00M\x00P\x00 \x00b\x00u\x00i\x00l\x00t\x00-\x00i\x00n\x00 \x00s\x00R\x00G\x00Bmluc\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x0cenUS\x00\x00\x00\x1a\x00\x00\x00\x1c\x00P\x00u\x00b\x00l\x00i\x00c\x00 \x00D\x00o\x00m\x00a\x00i\x00n\x00\x00XYZ \x00\x00\x00\x00\x00\x00\xf6\xd6\x00\x01\x00\x00\x00\x00\xd3-sf32\x00\x00\x00\x00\x00\x01\x0cB\x00\x00\x05\xde\xff\xff\xf3%\x00\x00\x07\x93\x00\x00\xfd\x90\xff\xff\xfb\xa1\xff\xff\xfd\xa2\x00\x00\x03\xdc\x00\x00\xc0nXYZ \x00\x00\x00\x00\x00\x00o\xa0\x00\x008\xf5\x00\x00\x03\x90XYZ \x00\x00\x00\x00\x00\x00$\x9f\x00\x00\x0f\x84\x00\x00\xb6\xc4XYZ 
\x00\x00\x00\x00\x00\x00b\x97\x00\x00\xb7\x87\x00\x00\x18\xd9para\x00\x00\x00\x00\x00\x03\x00\x00\x00\x02ff\x00\x00\xf2\xa7\x00\x00\rY\x00\x00\x13\xd0\x00\x00\n[chrm\x00\x00\x00\x00\x00\x03\x00\x00\x00\x00\xa3\xd7\x00\x00T|\x00\x00L\xcd\x00\x00\x99\x9a\x00\x00&g\x00\x00\x0f\\mluc\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x0cenUS\x00\x00\x00\x08\x00\x00\x00\x1c\x00G\x00I\x00M\x00Pmluc\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x0cenUS\x00\x00\x00\x08\x00\x00\x00\x1c\x00s\x00R\x00G\x00B'

So you're already extracting the color profile. I don't know how to parse it either, but I'll take a look and see what I can find.
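For reference, the fixed-offset fields at the start of an ICC profile can be read without any imaging library. The sketch below is only an illustration, not img2pdf code: `icc_header_info` and `ICC_CHANNELS` are hypothetical names, and the sample bytes are the first 24 bytes of the dump above, zero-padded to the 128-byte header length.

```python
import struct

# Channel counts implied by a few common ICC data color space signatures.
# (Hypothetical lookup table for this sketch; not exhaustive.)
ICC_CHANNELS = {b"GRAY": 1, b"RGB ": 3, b"CMYK": 4}


def icc_header_info(iccp: bytes) -> dict:
    """Decode a few fixed-offset fields of an ICC profile header."""
    if len(iccp) < 128:
        raise ValueError("an ICC header is 128 bytes, got %d" % len(iccp))
    (size,) = struct.unpack(">I", iccp[0:4])  # big-endian profile size
    color_space = iccp[16:20]                 # data color space signature
    return {
        "size": size,
        "cmm": iccp[4:8],
        "device_class": iccp[12:16],
        "color_space": color_space,
        "pcs": iccp[20:24],
        "channels": ICC_CHANNELS.get(color_space),  # None if not in the table
    }


# First 24 bytes of the profile printed above, padded to a full header.
header = b"\x00\x00\x02\xa0lcms\x04@\x00\x00mntrRGB XYZ ".ljust(128, b"\x00")
info = icc_header_info(header)
print(info["color_space"], info["channels"])
```

The signature at bytes 16 to 19 is `RGB `, i.e. a three-component profile attached to a one-component image, which fits the Ghostscript complaint about the /N value.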

smw commented 11 months ago
Poster

This is probably a stupid question, but why doesn't '-C 1' sidestep the problem?

Owner

> Do you really need to parse the color profile? You already know that the image is a bilevel TIFF with CCITT Group 4 encoding; isn't it safe to forcibly set iccp to None and proceed on that basis?

The TIFFs produced by GIMP are not CCITT Group 4 encoded; they are grayscale images with just two colors.

The problem with just dropping the ICC profile is that the one thing img2pdf does differently from other conversion tools is that it is lossless. So we have to be very careful before we discard any information from the input.
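To illustrate what a cautious check could look like, here is a minimal sketch that drops a profile only when its component count provably disagrees with the image mode, and keeps it whenever that cannot be decided. It is not the change that was eventually committed to img2pdf; `usable_profile` and both lookup tables are hypothetical names made up for this example.

```python
# Channel counts for a few PIL modes and ICC color space signatures.
# Both tables are assumptions of this sketch and deliberately incomplete.
MODE_CHANNELS = {"1": 1, "L": 1, "RGB": 3, "CMYK": 4}
ICC_CHANNELS = {b"GRAY": 1, b"RGB ": 3, b"CMYK": 4}


def usable_profile(mode, iccp):
    """Return iccp unless it clearly contradicts the image mode."""
    if iccp is None:
        return None
    expected = MODE_CHANNELS.get(mode)
    actual = ICC_CHANNELS.get(iccp[16:20])  # data color space signature
    if expected is None or actual is None:
        # Cannot judge: keep the profile so that no input data is lost.
        return iccp
    return iccp if expected == actual else None


# A fake 128-byte header declaring an RGB profile.
rgb_profile = (b"\x00" * 16 + b"RGB ").ljust(128, b"\x00")
print(usable_profile("1", rgb_profile) is None)
print(usable_profile("RGB", rgb_profile) is rgb_profile)
```

Erring on the side of keeping the profile means only the provably inconsistent case (an RGB profile on a one-channel image) results in discarded data.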

smw commented 11 months ago
Poster

I didn't know that about the format produced by gimp, so thank you for clarifying that.

Also, I understand why you need to be careful about losing information, but that's where the -C option comes back into the picture: even when I specify -C 1, the incorrect RGB color profile isn't replaced, and I don't understand why not. Is that meant to happen?

josch closed this issue 11 months ago
Owner

I pushed a commit that should fix this issue. Please re-open if it does not for you.

smw commented 11 months ago
Poster

I just tried it and it works beautifully. Thank you!

smw commented 11 months ago
Poster

Sadly, I spoke too soon. The new update absolutely does correctly handle the case where the image has the incorrect color profile, but for images which don't have that (such as the original page_002_A-scanned.tif I uploaded), I get this:

img2pdf.py page_002_A-scanned.tif -v -o p2.pdf
DEBUG:PIL.TiffImagePlugin:*** TiffImageFile._open ***
DEBUG:PIL.TiffImagePlugin:- __first: 4124744
DEBUG:PIL.TiffImagePlugin:- ifh: b'II*\x00H\xf0>\x00'
DEBUG:PIL.TiffImagePlugin:Seeking to frame 0, on frame -1, __next 4124744, location: 8
DEBUG:PIL.TiffImagePlugin:Loading tags, location: 4124744
DEBUG:PIL.TiffImagePlugin:tag: ImageWidth (256) - type: short (3) - value: b'@\x13'
DEBUG:PIL.TiffImagePlugin:tag: ImageLength (257) - type: short (3) - value: b'(\x1a'
DEBUG:PIL.TiffImagePlugin:tag: BitsPerSample (258) - type: short (3) - value: b'\x01\x00'
DEBUG:PIL.TiffImagePlugin:tag: Compression (259) - type: short (3) - value: b'\x01\x00'
DEBUG:PIL.TiffImagePlugin:tag: PhotometricInterpretation (262) - type: short (3) - value: b'\x01\x00'
DEBUG:PIL.TiffImagePlugin:tag: FillOrder (266) - type: short (3) - value: b'\x01\x00'
DEBUG:PIL.TiffImagePlugin:tag: StripOffsets (273) - type: long (4) Tag Location: 4124830 - Data Location: 4125014 - value: b'\x08\x00\x00\x00\x08\xf1\x0f\x00\x08\xe2\x1f\x00\x08\xd3/\x00'
DEBUG:PIL.TiffImagePlugin:tag: Orientation (274) - type: short (3) - value: b'\x01\x00'
DEBUG:PIL.TiffImagePlugin:tag: SamplesPerPixel (277) - type: short (3) - value: b'\x01\x00'
DEBUG:PIL.TiffImagePlugin:tag: RowsPerStrip (278) - type: short (3) - value: b'\xa0\x06'
DEBUG:PIL.TiffImagePlugin:tag: StripByteCounts (279) - type: long (4) Tag Location: 4124878 - Data Location: 4124998 - value: b'\x00\xf1\x0f\x00\x00\xf1\x0f\x00\x00\xf1\x0f\x00@\x1d\x0f\x00'
DEBUG:PIL.TiffImagePlugin:tag: XResolution (282) - type: rational (5) Tag Location: 4124890 - Data Location: 4124966 - value: b'X\x02\x00\x00\x01\x00\x00\x00'
DEBUG:PIL.TiffImagePlugin:tag: YResolution (283) - type: rational (5) Tag Location: 4124902 - Data Location: 4124974 - value: b'X\x02\x00\x00\x01\x00\x00\x00'
DEBUG:PIL.TiffImagePlugin:tag: PlanarConfiguration (284) - type: short (3) - value: b'\x01\x00'
DEBUG:PIL.TiffImagePlugin:tag: XPosition (286) - type: rational (5) Tag Location: 4124926 - Data Location: 4124982 - value: b'\x00\x00\x00\x00\x01\x00\x00\x00'
DEBUG:PIL.TiffImagePlugin:tag: YPosition (287) - type: rational (5) Tag Location: 4124938 - Data Location: 4124990 - value: b'\x00\x00\x00\x00\x01\x00\x00\x00'
DEBUG:PIL.TiffImagePlugin:tag: ResolutionUnit (296) - type: short (3) - value: b'\x02\x00'
DEBUG:PIL.TiffImagePlugin:tag: PageNumber (297) - type: short (3) - value: b'\x00\x00\x01\x00'
DEBUG:PIL.TiffImagePlugin:tag: ImageWidth (256) - type: short (3) - value: b'@\x13'
DEBUG:PIL.TiffImagePlugin:tag: ImageLength (257) - type: short (3) - value: b'(\x1a'
DEBUG:PIL.TiffImagePlugin:tag: BitsPerSample (258) - type: short (3) - value: b'\x01\x00'
DEBUG:PIL.TiffImagePlugin:tag: Compression (259) - type: short (3) - value: b'\x01\x00'
DEBUG:PIL.TiffImagePlugin:tag: PhotometricInterpretation (262) - type: short (3) - value: b'\x01\x00'
DEBUG:PIL.TiffImagePlugin:tag: FillOrder (266) - type: short (3) - value: b'\x01\x00'
DEBUG:PIL.TiffImagePlugin:tag: StripOffsets (273) - type: long (4) Tag Location: 4124830 - Data Location: 4125014 - value: b'\x08\x00\x00\x00\x08\xf1\x0f\x00\x08\xe2\x1f\x00\x08\xd3/\x00'
DEBUG:PIL.TiffImagePlugin:tag: Orientation (274) - type: short (3) - value: b'\x01\x00'
DEBUG:PIL.TiffImagePlugin:tag: SamplesPerPixel (277) - type: short (3) - value: b'\x01\x00'
DEBUG:PIL.TiffImagePlugin:tag: RowsPerStrip (278) - type: short (3) - value: b'\xa0\x06'
DEBUG:PIL.TiffImagePlugin:tag: StripByteCounts (279) - type: long (4) Tag Location: 4124878 - Data Location: 4124998 - value: b'\x00\xf1\x0f\x00\x00\xf1\x0f\x00\x00\xf1\x0f\x00@\x1d\x0f\x00'
DEBUG:PIL.TiffImagePlugin:tag: XResolution (282) - type: rational (5) Tag Location: 4124890 - Data Location: 4124966 - value: b'X\x02\x00\x00\x01\x00\x00\x00'
DEBUG:PIL.TiffImagePlugin:tag: YResolution (283) - type: rational (5) Tag Location: 4124902 - Data Location: 4124974 - value: b'X\x02\x00\x00\x01\x00\x00\x00'
DEBUG:PIL.TiffImagePlugin:tag: PlanarConfiguration (284) - type: short (3) - value: b'\x01\x00'
DEBUG:PIL.TiffImagePlugin:tag: XPosition (286) - type: rational (5) Tag Location: 4124926 - Data Location: 4124982 - value: b'\x00\x00\x00\x00\x01\x00\x00\x00'
DEBUG:PIL.TiffImagePlugin:tag: YPosition (287) - type: rational (5) Tag Location: 4124938 - Data Location: 4124990 - value: b'\x00\x00\x00\x00\x01\x00\x00\x00'
DEBUG:PIL.TiffImagePlugin:tag: ResolutionUnit (296) - type: short (3) - value: b'\x02\x00'
DEBUG:PIL.TiffImagePlugin:tag: PageNumber (297) - type: short (3) - value: b'\x00\x00\x01\x00'
DEBUG:PIL.TiffImagePlugin:*** Summary ***
DEBUG:PIL.TiffImagePlugin:- compression: raw
DEBUG:PIL.TiffImagePlugin:- photometric_interpretation: 1
DEBUG:PIL.TiffImagePlugin:- planar_configuration: 1
DEBUG:PIL.TiffImagePlugin:- fill_order: 1
DEBUG:PIL.TiffImagePlugin:- YCbCr subsampling: None
DEBUG:PIL.TiffImagePlugin:- size: (4928, 6696)
DEBUG:PIL.TiffImagePlugin:format key: (b'II', 1, (1,), 1, (1,), ())
DEBUG:PIL.TiffImagePlugin:- raw mode: 1
DEBUG:PIL.TiffImagePlugin:- pil mode: 1
DEBUG:__main__:PIL format = TIFF
DEBUG:__main__:imgformat = TIFF
DEBUG:__main__:Converting frame: 0
DEBUG:__main__:input dpi = 600 x 600
DEBUG:__main__:rotation = 0°
DEBUG:__main__:input colorspace = 1
ERROR:__main__:error: cannot open profile from string
Traceback (most recent call last):
  File "/home/smw/bin/img2pdf", line 4342, in main
    convert(
  File "/home/smw/bin/img2pdf", line 2636, in convert
    ) in read_images(
         ^^^^^^^^^^^^
  File "/home/smw/bin/img2pdf", line 2079, in read_images
    color, ndpi, imgwidthpx, imgheightpx, rotation, iccp = get_imgmetadata(
                                                           ^^^^^^^^^^^^^^^^
  File "/home/smw/bin/img2pdf", line 1481, in get_imgmetadata
    prf = ImageCms.ImageCmsProfile(f)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/PIL/ImageCms.py", line 191, in __init__
    self._set(core.profile_frombytes(profile.read()))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: cannot open profile from string
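One plausible reading of this traceback, which is an assumption and not something confirmed in this thread, is that the profile parser was handed an empty buffer because this image has no ICC profile at all (`iccp = None`, as printed earlier for the same file). The standard-library part of that is easy to check: `io.BytesIO(None)` is an empty buffer, so reading it yields zero bytes. `profile_bytes_or_none` below is a hypothetical guard, not img2pdf code.

```python
import io


def profile_bytes_or_none(iccp):
    """Hypothetical guard: only hand non-empty data to a profile parser."""
    if iccp is None:
        return None
    with io.BytesIO(iccp) as f:
        data = f.read()
    return data or None  # treat an empty profile like a missing one


# BytesIO(None) behaves like BytesIO(b""): nothing to parse.
empty = io.BytesIO(None).read()
print(repr(empty))
print(profile_bytes_or_none(None))
```

With a guard of this kind, an image without a profile would skip profile parsing entirely instead of reaching `ImageCms.ImageCmsProfile`.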
smw reopened this issue 11 months ago
Owner

Are you at the most recent git HEAD? I pushed another commit 40 minutes ago.

smw commented 11 months ago
Poster

I'm not sure exactly what happened, but I have the latest commit now, and this one really does work perfectly. My apologies for the inconvenience, and thank you again for doing this!

smw closed this issue 11 months ago
Owner

It wasn't your fault but mine. The original commit missed another condition and thus triggered the bug you saw. I fixed that problem in another commit shortly after.

I'm happy that this is working for you now!

Thank you for helping me track down this problem 💙

Reference: josch/img2pdf#164