HEIC support? #105

Closed
opened 3 years ago by Ghost · 16 comments
Ghost commented 3 years ago

Issue:
The iPhone records photographs in a format known as HEIC.
The converter does not support this format.

Could it be added as a future feature?

josch commented 3 years ago
Owner

img2pdf relies on PIL/Pillow for all image formats that are not supported by the PDF format: https://github.com/python-pillow/Pillow/issues/2806

Note, though, that even once Pillow gains support for the format, img2pdf will only be able to include HEIF images at the cost of a much larger file size because PDF doesn't support the format. If you do not mind losing some information, you are better off converting your images to JPEGs, in which case you can already use img2pdf today.
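
The JPEG route mentioned above could look roughly like the sketch below. It assumes that the third-party `pillow_heif` plugin (not part of Pillow itself) is installed to decode the HEIC input. The helper names `jpeg_path_for` and `heic_to_pdf`, the quality value, and the file names are illustrative only, and the re-encoding to JPEG is lossy.

```python
from pathlib import Path


def jpeg_path_for(heic_path: str) -> Path:
    """Return the path of the JPEG written next to the HEIC input."""
    return Path(heic_path).with_suffix(".jpg")


def heic_to_pdf(heic_paths: list[str], pdf_path: str, quality: int = 90) -> None:
    """Re-encode HEIC files as JPEG (lossy), then wrap the JPEGs in a PDF."""
    # Imported lazily so that this module loads without the optional packages.
    import img2pdf
    from PIL import Image
    from pillow_heif import register_heif_opener  # assumption: third-party plugin

    register_heif_opener()  # lets Image.open() read HEIF/HEIC files

    jpeg_paths = []
    for heic_path in heic_paths:
        target = jpeg_path_for(heic_path)
        with Image.open(heic_path) as image:
            # JPEG has no alpha channel, so normalise to RGB first.
            image.convert("RGB").save(target, "JPEG", quality=quality)
        jpeg_paths.append(str(target))

    # img2pdf embeds JPEG data as-is, so no further re-encoding happens here.
    with open(pdf_path, "wb") as pdf_file:
        pdf_file.write(img2pdf.convert(*jpeg_paths))
```

A call such as `heic_to_pdf(["IMG_0001.HEIC"], "photos.pdf")` would write `IMG_0001.jpg` next to the input and then produce `photos.pdf`.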


You could use one of these external Pillow HEIF openers as an optional dependency, e.g. pillow-heif (Apache 2 licensed).

```python3
from importlib.util import find_spec

# Register the HEIF opener only if the optional dependency is installed.
if find_spec('pillow_heif'):
    from pillow_heif import register_heif_opener
    register_heif_opener()
```
josch commented 2 years ago
Owner

There is nothing that img2pdf can do about this. There are workarounds listed in https://github.com/python-pillow/Pillow/issues/2806

josch closed this issue 2 years ago

> There is nothing that img2pdf can do about this.

Could you clarify why you think pillow_heif is not an option for img2pdf?

josch commented 2 years ago
Owner

Because that is something that the library user is to set up, no?


Hmm, yes, but what about CLI users? Couldn't img2pdf initialise pillow_heif in its main() function?
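
A rough sketch of what this suggestion could mean in practice follows. It is not img2pdf's actual code: `register_optional_openers` and the stand-in `main()` are hypothetical names, and only `register_heif_opener()` comes from `pillow_heif`.

```python
from importlib import import_module
from importlib.util import find_spec


def register_optional_openers() -> bool:
    """Register the HEIF opener with Pillow if pillow_heif is installed.

    Returns True when the plugin was found and registered, False otherwise.
    """
    if find_spec("pillow_heif") is None:
        return False
    import_module("pillow_heif").register_heif_opener()
    return True


def main(argv=None) -> int:
    # Hypothetical stand-in for a CLI entry point, not img2pdf's real main().
    heif_available = register_optional_openers()
    print("HEIF input enabled" if heif_available else "HEIF input unavailable")
    # ... argument parsing and conversion would follow here ...
    return 0
```

Keeping the registration in the entry point leaves library users free to decide for themselves which Pillow plugins to load.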

josch commented 2 years ago
Owner

I'm hesitant to include this in the CLI before I have a way to verify that the result will be bit-by-bit identical. As far as I can see, neither Debian nor Ubuntu nor Fedora includes a HEIF Python binding. So this feature would be untested.
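
One way to read "bit-by-bit identical" is that the pixels decoded from the input and the pixels recovered from the generated PDF must match byte for byte. A minimal sketch of such a comparison, assuming both sides have already been decoded to raw bytes (for instance with Pillow's `Image.tobytes()`); the helper names `pixel_digest` and `pixels_identical` are made up for illustration and are not taken from img2pdf's test suite:

```python
import hashlib


def pixel_digest(raw_pixels: bytes) -> str:
    """Return a SHA-256 hex digest of a raw pixel buffer."""
    return hashlib.sha256(raw_pixels).hexdigest()


def pixels_identical(original: bytes, roundtripped: bytes) -> bool:
    """Return True only if both raw pixel buffers have the same digest."""
    return pixel_digest(original) == pixel_digest(roundtripped)
```

Comparing digests rather than the buffers themselves makes it possible to store a known-good reference value alongside a test image.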


Sorry, I don't understand the connection - pillow_heif has wheels on PyPI. The fact that it is not currently included in Linux distribution repositories doesn't mean it's not testable, does it?

josch commented 2 years ago
Owner

It does for me. Anybody can upload stuff to PyPI. If it's included in one of the big distros, then somebody vouched with their GPG key that the software:

  • will be maintained for multiple years
  • does not do things with my system that I do not want
  • is stable and mature
  • integrates well with GNU/Linux distributions
  • is maintainable at all

I do not have these assurances with software uploaded to PyPI. Anybody can upload their stuff there and it does not go through another pair of eyes. Another thing that distribution maintainers do before including software in their distro is:

  • choose the one implementation of functionality X that best meets the above quality standards

There are multiple Python HEIF implementations. Which one do I choose for long-term stability? I do not want to make the effort to investigate and find the answer myself.

So I'm not going to install PyPI and download packages from there.

josch commented 2 years ago
Owner

It's also interesting that you mention that they offer wheels. I'm absolutely not just downloading and running some other random person's binary on my machine. That's like downloading a setup.exe on Windows and running that and hoping for the best...


Well, as you mentioned above, converting HEIF with img2pdf is rather inadvisable due to the increase in file size anyway, so this is of low importance.

However, I do have some comments on your general points regarding PyPI / Linux distros. While you're right that the distribution packaging process may have security advantages, it also tends to be clumsy and to lag behind. As you say, no major distribution provides a Python HEIF package. Rather than abstaining from the functionality, I prefer to check PyPI. Generally, PyPI is great for development to keep your code base in sync with API changes in dependencies. If you work with the Python packages in Debian stable, your code base may well be outdated by 3 years' worth of development in dependencies.

In the end, any system is only as safe as the person who's using it. Most OSS users are mature enough to compare different packages and choose the solution they like best. Regardless of whether the binary is built by a distribution or by the person who's writing the software, there will always be someone whom you have to trust to some extent. While one can't check every single line, one can well take an overall look that may provide a good judgement whether the codebase is safe or not. It should also be noted that CI setups like GH Actions make the build process quite transparent (as in the case of pikepdf and pillow_heif).

That said, I doubt if the additional security provided by distributions is large. You'd be hard pushed to review huge and frequently updated codebases like Chromium with every single release. I suspect that often it rather comes down to just building the thing. Also it should be noted that some packages are maintained by their own author, as is the case for img2pdf in Debian.

josch commented 2 years ago
Owner

Instead of replying to your points (because that would serve no purpose as far as this bug is concerned), let's just agree to disagree on this. I'm not going to enable PyPI on my machine to add support for this.

If, on the other hand, there is somebody else willing to work on this and maintain it long term, I'd be merging that patch.

So if somebody wants to do the work, please just reopen this bug and file a merge request.

Thanks!


> If, on the other hand, there is somebody else willing to work on this and maintain it long term, I'd be merging that patch.

I could take care of it, but "long term" is a difficult topic. Hoping that Pillow will eventually implement native support for HEIF, this would only be an intermediate solution. Also I obviously can't guarantee that pillow_heif will continue to be maintained. Anything's transient. But within these constraints, I could do it.

https://gitlab.mister-muffin.de/josch/img2pdf/pulls/149
josch commented 2 years ago
Owner

@mara0004 "best effort" on your part would be enough for me. I know that you have been a long-term contributor to this project so your word would've certainly been enough for me.

Even if you ended up closing #149 I think your comments there are valuable for any future contributors and discussions on this topic.

Thanks!


Thanks, sorry it took me so long to realise that it's not really appropriate to do this in img2pdf itself.
Sometimes I need to attempt an implementation to actually get the problem.

Reference: josch/img2pdf#105