Support for transparent images (png) #76

Closed

opened 2021-04-25 19:58:45 +00:00 by josch · 1 comment

josch commented

2021-04-25 19:58:45 +00:00

Owner

By jose1711 on 2020-05-04T14:14:22.200Z

> Input images with alpha channels are not allowed. PDF doesn't support alpha channels in images and thus, the alpha channel of the input would have to be discarded.

Is this still true? See this post: https://stackoverflow.com/questions/14220221/how-to-insert-transparent-png-in-pdf

wget http://pd4ml.com/i/pd4ml18130.pdf
pdftopng pd4ml18130.pdf out1
identify -format '%[channels]' out1-000001.png # srgb
pdftopng -alpha pd4ml18130.pdf out2
identify -format '%[channels]' out2-000001.png # srgba

By josch on 2020-05-04T14:56:51.029Z

As the link you quote also shows, PDF uses masks for transparency. This means that the result is not lossless.

If you know of a way to turn a png with alpha channel into a pdf and then back into a raster image with the exact same RGBA information, please tell.

Lastly, what would the use-case be? Why would you need transparency in the first place?

By jose1711 on 2020-05-04T19:58:51.875Z

My use-case would be a couple of background images with transparent overlays placed over them. I'd like to put them into a PDF, but with the possibility of breaking them into separate images again if needed. I know that img2pdf currently does not support stacking of images, but who knows, some day it might. For now I'll probably go with something like this:

# prepare sample images (one background, 3 overlays)
convert -size 100x150 rose: rose.png
convert -size 100x150 xc:transparent -pointsize 72 -fill black -annotate +20+100 'A' -trim text1.png
convert -size 100x150 xc:transparent -pointsize 72 -fill black -annotate +20+100 'B' -trim text2.png
convert -size 100x150 xc:transparent -pointsize 72 -fill black -annotate +20+100 'C' -trim text3.png

cat <<HERE > gen_pdf.py
import fitz  # requires pymupdf
doc = fitz.open()  # new, empty PDF
page = doc.new_page()  # older PyMuPDF releases spell this newPage()
rect = page.rect
# read the background and the three overlays as raw PNG bytes
bg = open("rose.png", "rb").read()
layer1 = open("text1.png", "rb").read()
layer2 = open("text2.png", "rb").read()
layer3 = open("text3.png", "rb").read()
# draw the background first, then each overlay on top (insertImage() in older releases)
page.insert_image(rect, stream=bg)
page.insert_image(rect, stream=layer1)
page.insert_image(rect, stream=layer2)
page.insert_image(rect, stream=layer3)
doc.save("foo.pdf", deflate=True)
HERE

python gen_pdf.py  # generates pdf
mkdir out
cd out
# re-extract images
python -m fitz extract ../foo.pdf -images

visually compare images in both locations
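
Instead of comparing by eye, the decoded pixel data could also be compared programmatically. The following is only a rough sketch: `pixel_digest`, `load_pixels` and `same_pixels` are made-up helper names, and it assumes PyMuPDF's `Pixmap` can decode both files. An extracted image may come back in a different layout than the input (for example with its transparency stored separately), in which case the digests differ even though the page looks the same.

```python
import hashlib


def pixel_digest(width, height, channels, samples):
    """Return a SHA-256 digest over the image geometry and raw samples.

    Two decoded images get the same digest only if width, height,
    channel count and every sample byte are identical.
    """
    header = f"{width}x{height}x{channels}:".encode("ascii")
    return hashlib.sha256(header + bytes(samples)).hexdigest()


def load_pixels(path):
    """Decode an image file with PyMuPDF into (width, height, channels, samples)."""
    import fitz  # imported here so pixel_digest() works without pymupdf

    pix = fitz.Pixmap(path)
    return pix.width, pix.height, pix.n, pix.samples


def same_pixels(path_a, path_b):
    """True if both files decode to identical pixel data."""
    return pixel_digest(*load_pixels(path_a)) == pixel_digest(*load_pixels(path_b))
```

It would be called as `same_pixels("../text1.png", "extracted.png")`, where `extracted.png` is a placeholder for whatever file name the extract step actually wrote.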

By josch on 2020-05-04T20:45:38.045Z

I can assure you that even in the future, for as long as I maintain the software, img2pdf will not support stacking images. If somebody thinks that this feature should be in img2pdf, they can always fork the software.

I see you already figured out how to use pymupdf to do what you need. Another alternative would be repeated use of pdftk stamp to put one image over the other.

Can this issue be closed then because you solved your problem?

By jose1711 on 2020-05-04T21:46:19.261Z

> img2pdf will not support stacking .. if somebody thinks this feature should be.. they can always fork

Isn't this bit kind of an oxymoron? :-)

Correct, I've solved the problem for me. I still was kinda hoping for a view from an expert. Or maybe the statement in the Bugs section (the one cited at the beginning of this issue) needs to be slightly corrected? Because, to be honest, after reading it I almost trashed the whole idea (yet it seems to work quite OK).

By josch on 2020-05-04T22:06:48.445Z

Should anybody fork img2pdf I hope they also change the name -- hence I don't think it's a contradiction.

No correction is needed. PDF doesn't support alpha channels. PDF supports masks but those are binary masks, so it's impossible to retain the 8-bit alpha channel of an input image.
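
To illustrate why a 1-bit mask cannot carry an 8-bit alpha channel, here is a minimal sketch. It assumes a simple threshold at 128; the threshold and both function names are purely illustrative and do not describe anything img2pdf or a PDF writer actually does.

```python
def alpha_to_binary_mask(alpha, threshold=128):
    """Reduce 8-bit alpha samples (0-255) to 1-bit mask values (0 or 1)."""
    return [1 if a >= threshold else 0 for a in alpha]


def binary_mask_to_alpha(mask):
    """Expand a 1-bit mask back to 8-bit alpha: fully opaque or fully transparent."""
    return [255 if m else 0 for m in mask]


original = [0, 64, 127, 128, 200, 255]
restored = binary_mask_to_alpha(alpha_to_binary_mask(original))

print(original)              # [0, 64, 127, 128, 200, 255]
print(restored)              # [0, 0, 0, 255, 255, 255]
print(restored == original)  # False
```

Every partially transparent sample collapses to either 0 or 255, so the original values cannot be recovered from the mask alone.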

If you still think an improvement could be made, please suggest the change you'd like to make.

By jose1711 on 2020-05-05T21:02:54.692Z

Here's my suggestion:
Input images with alpha channels are not allowed. PDF doesn't support alpha channels in images **(note that images with binary transparency masks are supported though)** and thus, the alpha channel of the input would have to be discarded.

But probably you can improve it further. Thanks for your fast responses and feel free to close.

By josch on 2020-05-06T06:56:00.122Z

Status changed to closed by commit 17dd59e72207f8f810141bb9a73c7015c74dca3a
