Error processing jpegs made on iPhone #36

Closed
opened 3 years ago by josch · 0 comments

By Andrey Gursky on 2017-10-17T12:20:56.686Z

Hi,

I received some screenshots made on iPhone, e.g.:

[attached image: test.jpg]

img2pdf fails to process them on Debian testing:

$ img2pdf --output test.pdf test.jpg
ERROR:root:error: division by zero
$ ls -l test.pdf
-rw-r--r-- 1 andrey andrey 0 Oct 17 14:08 test.pdf

Verbose output:

$ img2pdf -v --output test.pdf test.jpg
DEBUG:root:imgformat = JPEG
DEBUG:root:input dpi = 0 x 0
DEBUG:root:input colorspace = RGB
DEBUG:root:width x height = 1125px x 1446px
ERROR:root:error: division by zero
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/img2pdf.py", line 1719, in main
    first_frame_only=args.first_frame_only)
  File "/usr/lib/python3/dist-packages/img2pdf.py", line 1025, in convert
    layout_fun(imgwidthpx, imgheightpx, ndpi)
  File "/usr/lib/python3/dist-packages/img2pdf.py", line 957, in default_layout_fun
    imgwidthpdf = pagewidth = px_to_pt(imgwidthpx, ndpi[0])
  File "/usr/lib/python3/dist-packages/img2pdf.py", line 747, in px_to_pt
    return 72*length/dpi
ZeroDivisionError: division by zero

From the man page:

Image and page size and layout arguments:

    Every input image will be placed on its own page.
    The image size is controlled by the dpi value of the input image or,
        if unset or missing, the default dpi of 96.00.

So if the X-/Y-resolution is set, but unfortunately to a meaningless value of 0, it should be overridden by the default as well, I guess.
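For illustration, the failure can be reproduced without img2pdf or an image file. The sketch below copies only the formula shown in the traceback (`72*length/dpi`); the variable names `width_px`, `height_px`, `page_at_default` and `failed` are made up for this example and do not come from img2pdf.

```python
def px_to_pt(length, dpi):
    # Same formula as in the traceback: one inch is 72 PDF points.
    return 72 * length / dpi


# Pixel dimensions taken from the debug output above.
width_px, height_px = 1125, 1446

# With the documented default of 96 dpi the conversion is well defined.
page_at_default = (px_to_pt(width_px, 96), px_to_pt(height_px, 96))
print(page_at_default)

# With the reported dpi of 0 the same call raises ZeroDivisionError.
try:
    px_to_pt(width_px, 0)
    failed = False
except ZeroDivisionError:
    failed = True
print(failed)
```

At 96 dpi the page would be 843.75 x 1084.5 points, so a fallback to the default would give a sensible page size for this image.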

Regards,

Andrey


By josch on 2017-10-17T12:27:20.296Z


Hi!

Thanks for your bug report! To investigate this issue further, it would be really helpful to have access to the original input image. I see the version you uploaded (md5sum 3f2a5b5b7c23122c338d6775e46fe2ee), but that image is probably not the same as the input you used. If I run img2pdf on it, it succeeds without any problems. Can you confirm that observation? You could send me a private email to josch@mister-muffin.de with the original image attached.

Thanks!

cheers, josch


By Andrey Gursky on 2017-10-17T14:23:29.226Z


The md5sum is correct, so I'm surprised. As long as you're using the Debian package (0.2.3-1) and you can see at least

DEBUG:root:input dpi = 0 x 0

it cannot get past the line with the division. Or, for some reason, it takes a different execution path...


By josch on 2017-10-17T14:57:41.272Z


Okay, this is most surprising. If this is about the Debian package, you should have reported this in the Debian bug tracker, but for convenience let's continue the discussion here.

I'm also trying this out with the Debian package:

$ dpkg -l | grep img2pdf                                                        
ii  img2pdf                                 0.2.3-1                              all          onversion of raster images to PDF
ii  python3-img2pdf                         0.2.3-1                              all          onversion of raster images to PDF (library)
$ md5sum test.jpg
3f2a5b5b7c23122c338d6775e46fe2ee  test.jpg
$ img2pdf -v --output test.pdf test.jpg
DEBUG:root:imgformat = JPEG
DEBUG:root:input dpi = 96 x 96
DEBUG:root:input colorspace = RGB
DEBUG:root:width x height = 1125px x 1446px
DEBUG:root:ImageFormat.JPEG

So this might not even be about the version of img2pdf. Specifically, the image properties are retrieved using PIL. What is the version of the python3-pil package on your system?


By Andrey Gursky on 2017-10-17T15:10:17.855Z


> Okay, this is most surprising. If this is about the Debian package, you should have reported this in the Debian bug tracker, but for convenience let's continue the discussion here.

For some reason the Debian package has not been updated for 5 months, so I assumed my bug report would not get attention there either.

> DEBUG:root:input dpi = 96 x 96

geeqie shows X-Resolution = 0 and Y-Resolution = 0 in the EXIF window for this image. I wonder how PIL arrives at 96x96.

> So this might not even be about the version of img2pdf. Specifically, the image properties are retrieved using PIL. What is the version of the python3-pil package on your system?

The version of python3-pil is 4.2.1-1.


By josch on 2017-10-17T15:27:58.822Z


Bingo! Once I upgrade my version of python3-pil from 4.0.0-4 to 4.2.1-1, I get your error. I'll investigate. Thanks!


By josch on 2017-10-18T08:25:44.284Z


Okay, the problem was introduced by Pillow git commits 07a96209597c5e8dfe785c757d7051ce67a980fb and 53df62647af19f47819380373b7b4abd1ffe79ff. Since those commits, Pillow also uses the JPEG EXIF data to determine the DPI if the DPI is not recorded in the JPEG header. The problem is that the respective EXIF value, stored in tag 0x011A (XResolution), can be zero. In that case, Pillow no longer returns "None" as it did before but instead a DPI of (0, 0), which doesn't make much sense. I think a good solution would be for img2pdf to fall back to the default dpi if the image claims that it has zero dots per inch.
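The proposed fallback could look roughly like the sketch below. This is only an illustration of the idea, not the code from the fixing commit: `normalize_dpi` and `DEFAULT_DPI` are hypothetical names, and the sketch assumes the reported value is either `None` or an `(x, y)` pair, as one would typically get from Pillow via `image.info.get("dpi")`.

```python
# Default taken from the man page excerpt quoted earlier in this thread.
DEFAULT_DPI = (96.0, 96.0)


def normalize_dpi(reported, default=DEFAULT_DPI):
    """Return a usable (x, y) dpi pair.

    `reported` is None when the image carries no resolution, or an
    (x, y) pair that may contain zeros when the metadata is bogus.
    """
    if reported is None:
        return default
    x, y = reported
    # A resolution of zero (or less) is meaningless, so treat it as unset.
    if x <= 0 or y <= 0:
        return default
    return (float(x), float(y))


print(normalize_dpi(None))      # no dpi recorded
print(normalize_dpi((0, 0)))    # the case from this report
print(normalize_dpi((72, 72)))  # a valid value is passed through
```

The sketch replaces the whole pair when either component is unusable. Substituting only the broken component would also be possible, but could distort the aspect ratio of the page.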


By josch on 2017-10-18T08:36:08.459Z


Status changed to closed by commit 9836b976d3

josch closed this issue 3 years ago
Reference: josch/img2pdf#36