Massive increase in file size #7

Closed
opened 2021-04-25 19:57:29 +00:00 by josch · 0 comments
Owner

By josch on 2015-03-15T09:41:47.742Z

Created by: ComFreek

My JPG is 166 KB, but the PDF that img2pdf outputs is 3.18 MB. Is there anything that can be done?

Imported comments:

By josch on 2014-12-25 00:07:56 UTC

This should not happen.

Can you somehow show me the JPG for which this happens?

By ComFreek on 2014-12-25 11:26:44 UTC

@josch Thanks for your response. Could I send the picture via mail since I don't want to publicly upload it? You can find my mail address on my profile: https://github.com/ComFreek

By josch on 2014-12-25 11:30:41 UTC

@ComFreek of course! Please send it to my email j.schauer@email.de

By josch on 2014-12-25 12:56:29 UTC

Hi, using your image I cannot reproduce the problem.

If I do:

img2pdf -o out.pdf xxx.jpg

then the resulting out.pdf is 159K in size, not 732K like the one you sent me.

Your PDF is that big because the image somehow got encoded using FlateDecode (lossless zlib compression of the decoded pixel data), which takes significantly more space than the original JPEG.

So what we have to find out is why the jpeg gets converted to FlateDecode in your case and stays in JPEG format (as it should) in my case.
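One way to check which encoding ended up in a given PDF is to look at the `/Filter` entries in the file. The following is only a rough sketch: `stream_filters` is a hypothetical helper, not part of img2pdf, and it scans the raw bytes rather than parsing the PDF, so it also reports filters on non-image streams and cannot see inside compressed object streams. A JPEG that was embedded as-is should show up as `DCTDecode`, a re-encoded image as `FlateDecode`.

```python
import re

# Matches the first PDF name that follows a /Filter key, e.g.
# "/Filter /DCTDecode" or "/Filter [ /FlateDecode ]".
_FILTER_RE = re.compile(rb"/Filter\s*\[?\s*/([A-Za-z0-9]+)")


def stream_filters(pdf_bytes):
    """Return the filter names found in the raw PDF bytes, in file order.

    Heuristic only: this is a byte scan, not a PDF parser.
    """
    return [m.group(1).decode("ascii") for m in _FILTER_RE.finditer(pdf_bytes)]


if __name__ == "__main__":
    import sys

    # Usage (assumed file name): python check_filters.py out.pdf
    with open(sys.argv[1], "rb") as f:
        print(stream_filters(f.read()))
```
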

Can you tell me which platform/OS and which Python and PIL versions you are using?

Can you also see if the problem is the same for you with other input jpeg images?

By ComFreek on 2014-12-25 18:38:49 UTC

  • Windows 8.1 Pro 64-bit
  • Python 2.7.8 (32-bit)
  • PIL 1.1.7 as per the following commands:
from PIL import Image
Image.VERSION

The problem occurs with every JPG image I used. The JPG images have been created from scanned TIFFs using PowerShell (.NET System.Drawing.Image::Save).

By josch on 2014-12-25 19:37:37 UTC

Okay, the JPG you created does not look special in any way, especially since the same JPEG resulted in a perfectly fine PDF on my system.

So it looks like img2pdf does not correctly detect the input file as a JPEG on your system.

Could you run the conversion with --verbose on your system and paste the terminal output here?

By ComFreek on 2014-12-25 19:59:43 UTC

Never mind, I found my own silly error.
The PowerShell script I used for automation fed img2pdf.exe with the original TIFF instead of the JPG file.

Nonetheless, thanks for your effort and sorry for having wasted your time!
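For anyone automating a similar pipeline, a quick check of the file's leading bytes before calling img2pdf can catch this kind of mix-up. This is a minimal sketch under stated assumptions: `sniff_image_type` and `sniff_file` are hypothetical helper names, and only the JPEG and TIFF signatures are recognized.

```python
def sniff_image_type(header):
    """Guess the image type from the first bytes of a file.

    JPEG files start with FF D8 FF; TIFF files start with
    "II*\\x00" (little-endian) or "MM\\x00*" (big-endian).
    """
    if header.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if header[:4] in (b"II*\x00", b"MM\x00*"):
        return "tiff"
    return "unknown"


def sniff_file(path):
    """Read the first four bytes of the file at path and classify them."""
    with open(path, "rb") as f:
        return sniff_image_type(f.read(4))
```

If `sniff_file` returns `"tiff"` for a file the script believes is a JPG, the wrong input is being passed on.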

josch closed this issue 2021-04-25 19:57:29 +00:00
Reference: josch/img2pdf#7