large jp2000 file doesn't convert #18
By josch on 2015-03-15T09:41:52.904Z
Created by: marc-p
Hello,
I've created both a jp2 and a j2k file of a large image (36x48") in a COTS program (Geosoft Oasis montaj). I'm using img2pdf in a 64-bit Windows environment, so I installed Pillow and updated your script to say "from PIL import Image". It runs, but creates a large (26x200"), apparently empty image. I've tried viewing the result in Adobe Reader X and Nitro Reader 3.
I am able to import the image back into geosoft, and view it in another GIS application (FME Universal viewer), but not in Corel PhotoPaint X5. Gimp 2.86 can view the images but they come up greyscale for some reason.
The same image as a png works beautifully.
I'm unable to attach the file here, as it's over 30 MB. Can I send it to you via YouSendIt?
Imported comments:
By josch on 2013-08-29 19:37:24 UTC
img2pdf just dumps the content of your jpeg into the pdf. Possible sources of error are:
Are you python savvy enough to have a look at what values img2pdf figures out for width/height/color/dpi? Otherwise I can quickly add a --verbose option which will tell you that.
If you want me to try the file, just upload it somewhere. I would not like to have a 30MB file in my email inbox.
By josch on 2013-08-30 09:05:51 UTC
Hi,
I discovered a problem. Your jpeg2000 file had the width/height at a different position in the file (4 byte offset) than my other jpeg2000 test files. This is why width and height were read wrongly. I think I found the value which specifies that offset. Now your file gets read as being of 11929 x 7145 pixels. This looks not too wrong.
Now your jpeg2000 gets copied correctly into the pdf and the sizes are set correctly. If acrobat still can't read it, then it is because your jpeg2000 files are "weird".
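To illustrate why a fixed byte offset is fragile: in a JP2 file the dimensions live in the `ihdr` box inside the `jp2h` superbox, and earlier boxes such as `ftyp` can vary in length. The sketch below is not img2pdf's actual code. It is a minimal illustration that walks the box structure instead of assuming an offset, and `iter_boxes` and `jp2_dimensions` are hypothetical helper names. It covers only the JP2 container, not a raw j2k codestream.

```python
import struct


def iter_boxes(data, start=0, end=None):
    """Yield (box_type, payload_start, payload_end) for boxes in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        # Every box starts with a 4-byte big-endian length and a 4-byte type.
        length, box_type = struct.unpack(">I4s", data[pos:pos + 8])
        header = 8
        if length == 1 and pos + 16 <= end:
            # Length 1 means a 64-bit extended length follows the type.
            (length,) = struct.unpack(">Q", data[pos + 8:pos + 16])
            header = 16
        elif length == 0:
            # Length 0 means the box runs to the end of its container.
            length = end - pos
        if length < header or pos + length > end:
            raise ValueError("malformed box at offset %d" % pos)
        yield box_type, pos + header, pos + length
        pos += length


def jp2_dimensions(data):
    """Return (width, height) from the ihdr box of a JP2 file held in bytes."""
    for box_type, start, end in iter_boxes(data):
        if box_type != b"jp2h":
            continue
        for sub_type, sub_start, _sub_end in iter_boxes(data, start, end):
            if sub_type == b"ihdr":
                # ihdr stores height first, then width, as 32-bit integers.
                height, width = struct.unpack(">II", data[sub_start:sub_start + 8])
                return width, height
    raise ValueError("no ihdr box found")
```

Something like `jp2_dimensions(open("image.jp2", "rb").read())` would then return the size regardless of how long the boxes before `jp2h` are.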
By bitsgalore on 2013-08-30 11:22:53 UTC
Hi,
Just saw this discussion. If this is about "weird" jpeg 2000 files, you might want to check out jpylyzer, a validator tool for JPEG 2000 Part 1 (aka JP2) that is able to detect all sorts of JPEG 2000 weirdness:
http://www.openplanetsfoundation.org/software/jpylyzer
(Incidentally I'm the main author of that tool.) Don't know if this is of any help, but just thought I'd mention it.
Cheers,
Johan
By josch on 2013-08-31 07:07:47 UTC
Hi Johan,
Thanks a lot for that hint! Now, by reading the jpylyzer code I can even figure
out how jpeg2000 actually is supposed to work! Somehow I was not able to find
any documentation for the jpeg2000 file format online. Even finding out at what
offset it saves width and height seemed impossible so I ended up figuring it
out by looking at hex dumps.
The only thing which I found weird about jpylyzer was that its output is XML.
At first I thought that I was using it wrongly because the only thing it output
was a big XML blob. Only when I looked deeper and ran the XML through a
prettifier did I figure out that XML was actually the intended output of
jpylyzer. Maybe you should write somewhere in the beginning that the default
output of jpylyzer is in XML format? Using other terminal applications on a
regular basis it was quite unexpected that it was using XML output. Which is
also why sentences in the readme like "In the above example, output is
redirected to the file 'rubbish.xml'." only confused me because I was asking
myself: "why would I want to save the output to an XML file???". Maybe you can
mention this fact some place in the beginning of the README or docs? The user
manual only states this on page 20.
Thanks a lot for this tool - now I can actually learn how jpeg2000 works. No
idea why I was not able to find any actual documentation on it.
PS: the jp2 file by marc-p seemed to be valid according to jpylyzer :)
cheers, josch
By bitsgalore on 2013-08-31 13:36:57 UTC
Hi Josch,
Actually the filespec for JP2 is here (downloadable for free):
http://www.jpeg.org/public/15444-1annexi.pdf
However that doesn't include the image codestream syntax. The spec for that is behind a paywall, but there is a free (though partially outdated) Final Committee Draft that'll give you the general idea:
http://www.jpeg.org/public/fcd15444-1.pdf
Another useful link, just in case you're interested in any of the other JPEG 2000 formats:
http://fileformats.archiveteam.org/wiki/JPEG2000
As for your comments regarding XML output: yes, I might emphasize that a bit at the top of the readme. From the outset jpylyzer was really designed to be used as a component in automated workflows, and for that XML is much easier to use/process than human-readable text. Also, originally jpylyzer's output was pretty-printed, but under certain circumstances that would result in weird Unicode errors under Python 3.x. Some recent improvements of the code should have fixed that, so if I have a bit of time I'll see if I can re-introduce pretty printing in an upcoming version. Meanwhile I'd suggest using a dedicated XML viewer/editor or even a web browser for inspecting jpylyzer's output, as I fully agree it does look pretty terrible in a text editor!
Cheers,
Johan
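For anyone who wants readable output in a terminal in the meantime, a small sketch using only Python's standard library is shown here. It assumes the validator's XML has been captured as a string (for example by redirecting its output to a file and reading that file). `prettify` is a hypothetical helper name, and the element names in the usage comment are made up for illustration rather than taken from jpylyzer's real output.

```python
import xml.dom.minidom


def prettify(xml_text, indent="  "):
    """Return xml_text re-serialized with one element per line and indentation."""
    document = xml.dom.minidom.parseString(xml_text)
    return document.toprettyxml(indent=indent)


# Example with made-up element names:
#   print(prettify("<report><valid>True</valid></report>"))
```

This only changes whitespace for display; an automated workflow can keep consuming the compact XML unchanged.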
By josch on 2013-10-21 13:57:54 UTC
I just added a tiny new parsing module based on jpylyzer to read jpeg2000 files more reliably than before. With this change, the colorspace should now also be detected correctly.
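As a rough illustration of what colorspace detection in a JP2 file involves (a sketch under stated assumptions, not the module that was added to img2pdf): the `jp2h` superbox contains a `colr` box whose first byte is the specification method, and for method 1 an enumerated colorspace code follows after two further one-byte fields. `find_box` and `jp2_colorspace` are hypothetical names, and only the three enumerated values defined for plain JP2 are mapped.

```python
import struct

# Enumerated colorspace codes defined for plain JP2 files.
ENUM_CS = {16: "sRGB", 17: "greyscale", 18: "sYCC"}


def find_box(data, wanted, start=0, end=None):
    """Return (payload_start, payload_end) of the first box of type wanted, or None."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        length, box_type = struct.unpack(">I4s", data[pos:pos + 8])
        header = 8
        if length == 1 and pos + 16 <= end:
            # 64-bit extended length follows the box type.
            (length,) = struct.unpack(">Q", data[pos + 8:pos + 16])
            header = 16
        elif length == 0:
            # Box extends to the end of its container.
            length = end - pos
        if length < header or pos + length > end:
            raise ValueError("malformed box at offset %d" % pos)
        if box_type == wanted:
            return pos + header, pos + length
        pos += length
    return None


def jp2_colorspace(data):
    """Describe the colorspace declared by the first colr box in the JP2 header."""
    jp2h = find_box(data, b"jp2h")
    if jp2h is None:
        raise ValueError("no jp2h box found")
    colr = find_box(data, b"colr", *jp2h)
    if colr is None:
        raise ValueError("no colr box found")
    start, end = colr
    if end - start < 3:
        raise ValueError("colr box too short")
    method = data[start]  # 1 = enumerated colorspace, 2 = restricted ICC profile
    if method == 1 and end - start >= 7:
        (enum_cs,) = struct.unpack(">I", data[start + 3:start + 7])
        return ENUM_CS.get(enum_cs, "unknown enumerated colorspace %d" % enum_cs)
    if method == 2:
        return "restricted ICC profile"
    return "unknown"
```

Reading the declared colorspace this way, rather than guessing from the number of components, is one way a file could be distinguished as greyscale versus sRGB.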