Compare commits

30 commits
main ... main

Author SHA1 Message Date
819b366bf5
release version 0.5.1 2023-11-26 06:33:10 +01:00
cc8c708295
HACKING: how to bisect 2023-11-25 09:47:53 +01:00
fb9537d8b7
src/img2pdf.py: allow PNG input without dpi units but non-square dpi aspect ratio
Closes: #181
2023-11-25 09:47:52 +01:00
7678435eb7
validate icc profile and no default location on windows
closes: #179
2023-11-07 18:50:07 +01:00
ba7a360866
release version 0.5.0 2023-10-28 08:35:54 +02:00
7f0bf47ff3
src/img2pdf.py: reformat with black 2023-10-28 08:35:53 +02:00
Leo
5cd0918d50 Issue #175 related. The original was SmartAlbums, but another case with 'Adobe PS', so delete the exif_software check part 2023-10-18 13:33:44 +08:00
Leo
f157ced05d
ignore RGB icc profile for grayscale jpegs produced by SmartAlbums
closes: #175
2023-10-17 11:32:25 +02:00
09064e8e70
jp2: rudimentary support for raw jpeg2000 without jp2 boxes 2023-08-08 07:40:38 +02:00
2f736d7891
allow 'matte' to be missing in MIFF 2023-08-06 19:43:19 +02:00
e05580a49a
src/img2pdf_test.py: IM7 dropped 'baseType' in json output, so use 'type' instead which works for both IM6 and IM7 2023-08-06 19:27:01 +02:00
acc25a4926
Support JPEG2000 images with transparency
Closes: #173
2023-08-05 16:06:30 +02:00
f597887088
The GIMP ICC bug does not only apply to 1-bit tiff but also to black/white palette PNG
https://gitlab.gnome.org/GNOME/gimp/-/issues/3438

Closes: #159
2023-08-05 14:43:18 +02:00
3e832fbcc2
add information about how to convert images to 8 bit (closes: #170) 2023-08-05 14:43:07 +02:00
1e8557cef1
src/img2pdf_test.py: drop check for endianness for tests where it does not matter
IM7 defaults to big-endian on architectures other than x86 even if they
are little endian: https://github.com/ImageMagick/ImageMagick/issues/6300

Closes: #152
2023-08-05 14:42:48 +02:00
29921eeabd
the default PDF/A icc profile is /usr/share/color/icc/sRGB.icc, /usr/share/color/icc/OpenICC/sRGB.icc or /usr/share/color/icc/colord/sRGB.icc depending on which one exists 2023-06-11 21:56:21 +02:00
33139612f8
src/img2pdf_test.py: make endianness dependant on sys.byteorder (closes: #152) 2023-06-11 14:45:09 +02:00
64d27f4a8b
src/img2pdf_test.py: allow Bilevel as well as Grayscale type for png_gray1_img (closes: #161) 2023-06-11 13:24:30 +02:00
85cbe1d128
factor out argparse.ArgumentParser to allow for generating completions via shtab 2023-06-11 08:09:46 +02:00
b25429a4c1
src/img2pdf_test.py: add tests for timestamps 2023-06-11 08:01:36 +02:00
c703e9df06
fix date(1) based timestamp parser 2023-06-11 07:48:23 +02:00
79e9985f35
src/img2pdf_test.py: black 2023-06-11 07:47:22 +02:00
cb2644c34f
do not include thumbnails in the output by default unless --include-thumbnails is used
This is relevant for the MPO format which otherwise would result in PDF
files containing the same image in different sizes multiple times. With
this change, the default is to only have a single page containing the
full MPO. This means that extracting that MPO also gets the thumbnails
back.

With the --include-thumbnails option, each frame gets stored on its own
page as it is done for multi-frame GIF, for example.

Closes: #135
2023-06-11 07:31:07 +02:00
81502f21af Convert creation/modification dates to UTC (fixes #155)
Ensure that timezones are correctly interpreted in the input by calling
`.astimezone()` as appropriate on datetime objects, and store the
resulting date fields as UTC.

One could argue that datetimes in the local timezone be stored in the
PDF, but then the date string handling becomes more complicated; the PDF
and XMP date specs both use the `Z` suffix to indicate UTC time, but
other +/- offsets require different syntax between the two specs.
2023-06-10 17:53:03 -07:00
0cbcb8fa12
avoid converting palette PNG with alpha to RGB (closes: #158) 2023-06-08 08:54:37 +02:00
e9e04b6dd9
extend comments around dropping ICC profile stored by GIMP for bilevel input 2023-06-08 08:53:22 +02:00
fc059ee471
use quotes around caret in examples for windows users
Closes: #167
2023-06-08 07:14:17 +02:00
25466113e9
another small fixup for the last commit 2023-05-30 08:06:36 +02:00
7405635b72
only check whether icc profile can be dropped if there is any 2023-05-30 07:10:32 +02:00
aea472101b
strip off RGB color profile from bilevel TIFF images produced by gimp (closes: #164) 2023-05-30 06:25:26 +02:00
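
Commit 81502f21af in the list above stores creation and modification dates as UTC. The following is a minimal sketch of that conversion, modelled on the `datetime_to_pdfdate` and `datetime_to_xmpdate` helpers touched in the diff. The standalone function names `to_pdfdate` and `to_xmpdate` and the sample timestamp are assumptions made for illustration and are not part of img2pdf.

```python
from datetime import datetime, timedelta, timezone


def to_pdfdate(dt):
    # astimezone() converts an aware datetime to UTC; a naive datetime is
    # interpreted as local time first.
    return dt.astimezone(tz=timezone.utc).strftime("%Y%m%d%H%M%SZ")


def to_xmpdate(dt):
    # Same instant, formatted the way XMP expects it, again with the Z suffix.
    return dt.astimezone(tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Hypothetical sample: 17:53:03 in a UTC-7 timezone
sample = datetime(2023, 6, 10, 17, 53, 3, tzinfo=timezone(timedelta(hours=-7)))
print(to_pdfdate(sample))  # 20230611005303Z
print(to_xmpdate(sample))  # 2023-06-11T00:53:03Z
```

Storing UTC means both formats can use the plain `Z` suffix, which sidesteps the differing offset syntax of the PDF and XMP date specs that the commit message mentions.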
6 changed files with 657 additions and 232 deletions

@ -2,6 +2,22 @@
CHANGES CHANGES
======= =======
0.5.1 (2023-11-26)
------------------
- no default ICC profile location for PDF/A-1b on Windows
- workaround for PNG input without dpi units but non-square dpi aspect ratio
0.5.0 (2023-10-28)
------------------
- support MIFF for 16 bit CMYK input
- accept pathlib.Path objects as input
- don't store RGB ICC profiles from bilevel or grayscale TIFF, PNG and JPEG
- thumbnails are no longer included by default and --include-thumbnails has to
be used if you want them
- support for pikepdf (>= 6.2.0)
0.4.4 (2022-04-07) 0.4.4 (2022-04-07)
------------------ ------------------
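
The 0.5.1 entry about PNG input without dpi units concerns PNG files whose pHYs chunk gives a pixel aspect ratio but no physical unit. The sketch below follows the fallback in the `get_imgmetadata` hunk of this comparison. The helper name `dpi_from_aspect` is hypothetical, and the 96 dpi default is taken from `default_dpi` in the source.

```python
DEFAULT_DPI = 96.0


def dpi_from_aspect(aspect, default_dpi=DEFAULT_DPI):
    """Derive (hdpi, vdpi) from a unitless pixel aspect ratio.

    The smaller axis is pinned to default_dpi, so neither value falls
    below the default.
    """
    if aspect is None:
        return (default_dpi, default_dpi)
    if aspect[0] > aspect[1]:
        return (default_dpi * aspect[0] / aspect[1], default_dpi)
    return (default_dpi, default_dpi * aspect[1] / aspect[0])


print(dpi_from_aspect((2, 1)))  # (192.0, 96.0)
print(dpi_from_aspect((1, 3)))  # (96.0, 288.0)
print(dpi_from_aspect(None))    # (96.0, 96.0)
```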

HACKING

@ -27,6 +27,41 @@ Making a new release
- Build and upload to pypi: - Build and upload to pypi:
$ rm dist/* $ rm -rf dist/*
$ python3 setup.py sdist $ python3 setup.py sdist
$ twine upload --sign dist/* $ twine upload dist/*
Using debbisect to find regressions
-----------------------------------
$ debbisect --cache=./cache --depends="git,ca-certificates,python3,
ghostscript,imagemagick,mupdf-tools,poppler-utils,python3-pil,
python3-pytest,python3-numpy,python3-scipy,python3-pikepdf" \
--verbose 2023-09-16 2023-10-24 \
'chroot "$1" sh -c "
git clone https://gitlab.mister-muffin.de/josch/img2pdf.git
&& cd img2pdf
&& pytest src/img2pdf_test.py::test_jpg_2000_rgba8[internal]"'
Using debbisect cache
---------------------
$ mmdebstrap --variant=apt --aptopt='Acquire::Check-Valid-Until "false"' \
--include=git,ca-certificates,python3,ghostscript,imagemagick \
--include=mupdf-tools,poppler-utils,python3-pil,python3-pytest \
--include=python3-numpy,python3-scipy,python3-pikepdf \
--hook-dir=/usr/share/mmdebstrap/hooks/file-mirror-automount \
--setup-hook='mkdir -p "$1/home/josch/git/devscripts/cache/pool/"' \
--setup-hook='mount -o ro,bind /home/josch/git/devscripts/cache/pool/ "$1/home/josch/git/devscripts/cache/pool/"' \
--chrooted-customize-hook=bash
unstable /dev/null
file:///home/josch/git/devscripts/cache/archive/debian/20231022T090139Z/
Bisecting imagemagick
---------------------
$ git clean -fdx && git reset --hard
$ ./configure --prefix=$(pwd)/prefix
$ make -j$(nproc)
$ make install
$ LD_LIBRARY_PATH=$(pwd)/prefix/lib prefix/bin/compare ...

@ -1,7 +1,7 @@
import sys import sys
from setuptools import setup from setuptools import setup
VERSION = "0.4.4" VERSION = "0.5.1"
INSTALL_REQUIRES = ( INSTALL_REQUIRES = (
"Pillow", "Pillow",

@ -22,7 +22,7 @@ import sys
import os import os
import zlib import zlib
import argparse import argparse
from PIL import Image, TiffImagePlugin, GifImagePlugin from PIL import Image, TiffImagePlugin, GifImagePlugin, ImageCms
if hasattr(GifImagePlugin, "LoadingStrategy"): if hasattr(GifImagePlugin, "LoadingStrategy"):
# Pillow 9.0.0 started emitting all frames but the first as RGB instead of # Pillow 9.0.0 started emitting all frames but the first as RGB instead of
@ -36,8 +36,8 @@ if hasattr(GifImagePlugin, "LoadingStrategy"):
# TiffImagePlugin.DEBUG = True # TiffImagePlugin.DEBUG = True
from PIL.ExifTags import TAGS from PIL.ExifTags import TAGS
from datetime import datetime from datetime import datetime, timezone
from jp2 import parsejp2 import jp2
from enum import Enum from enum import Enum
from io import BytesIO from io import BytesIO
import logging import logging
@ -46,6 +46,7 @@ import platform
import hashlib import hashlib
from itertools import chain from itertools import chain
import re import re
import io
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
@ -61,7 +62,7 @@ try:
except ImportError: except ImportError:
have_pikepdf = False have_pikepdf = False
__version__ = "0.4.4" __version__ = "0.5.1"
default_dpi = 96.0 default_dpi = 96.0
papersizes = { papersizes = {
"letter": "8.5inx11in", "letter": "8.5inx11in",
@ -721,7 +722,7 @@ class pdfdoc(object):
self.writer.docinfo = PdfDict(indirect=True) self.writer.docinfo = PdfDict(indirect=True)
def datetime_to_pdfdate(dt): def datetime_to_pdfdate(dt):
return dt.strftime("%Y%m%d%H%M%SZ") return dt.astimezone(tz=timezone.utc).strftime("%Y%m%d%H%M%SZ")
for k in ["Title", "Author", "Creator", "Producer", "Subject"]: for k in ["Title", "Author", "Creator", "Producer", "Subject"]:
v = locals()[k.lower()] v = locals()[k.lower()]
@ -731,7 +732,7 @@ class pdfdoc(object):
v = PdfString.encode(v) v = PdfString.encode(v)
self.writer.docinfo[getattr(PdfName, k)] = v self.writer.docinfo[getattr(PdfName, k)] = v
now = datetime.now() now = datetime.now().astimezone()
for k in ["CreationDate", "ModDate"]: for k in ["CreationDate", "ModDate"]:
v = locals()[k.lower()] v = locals()[k.lower()]
if v is None and nodate: if v is None and nodate:
@ -751,7 +752,7 @@ class pdfdoc(object):
) )
def datetime_to_xmpdate(dt): def datetime_to_xmpdate(dt):
return dt.strftime("%Y-%m-%dT%H:%M:%SZ") return dt.astimezone(tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
self.xmp = b"""<?xpacket begin='\xef\xbb\xbf' id='W5M0MpCehiHzreSzNTczkc9d'?> self.xmp = b"""<?xpacket begin='\xef\xbb\xbf' id='W5M0MpCehiHzreSzNTczkc9d'?>
<x:xmpmeta xmlns:x='adobe:ns:meta/' x:xmptk='XMP toolkit 2.9.1-13, framework 1.6'> <x:xmpmeta xmlns:x='adobe:ns:meta/' x:xmptk='XMP toolkit 2.9.1-13, framework 1.6'>
@ -826,8 +827,10 @@ class pdfdoc(object):
artborder=None, artborder=None,
iccp=None, iccp=None,
): ):
assert (color != Colorspace.RGBA and color != Colorspace.LA) or ( assert (
imgformat == ImageFormat.PNG and smaskdata is not None color not in [Colorspace.RGBA, Colorspace.LA]
or (imgformat == ImageFormat.PNG and smaskdata is not None)
or imgformat == ImageFormat.JPEG2000
) )
if self.engine == Engine.pikepdf: if self.engine == Engine.pikepdf:
@ -851,6 +854,12 @@ class pdfdoc(object):
if color == Colorspace["1"] or color == Colorspace.L or color == Colorspace.LA: if color == Colorspace["1"] or color == Colorspace.L or color == Colorspace.LA:
colorspace = PdfName.DeviceGray colorspace = PdfName.DeviceGray
elif color == Colorspace.RGB or color == Colorspace.RGBA: elif color == Colorspace.RGB or color == Colorspace.RGBA:
if color == Colorspace.RGBA and imgformat == ImageFormat.JPEG2000:
# there is no DeviceRGBA and for JPXDecode it is okay to have
# no colorspace as the pdf reader is supposed to get this info
# from the jpeg2000 payload itself
colorspace = None
else:
colorspace = PdfName.DeviceRGB colorspace = PdfName.DeviceRGB
elif color == Colorspace.CMYK or color == Colorspace["CMYK;I"]: elif color == Colorspace.CMYK or color == Colorspace["CMYK;I"]:
colorspace = PdfName.DeviceCMYK colorspace = PdfName.DeviceCMYK
@ -922,6 +931,7 @@ class pdfdoc(object):
image[PdfName.Filter] = ofilter image[PdfName.Filter] = ofilter
image[PdfName.Width] = imgwidthpx image[PdfName.Width] = imgwidthpx
image[PdfName.Height] = imgheightpx image[PdfName.Height] = imgheightpx
if colorspace is not None:
image[PdfName.ColorSpace] = colorspace image[PdfName.ColorSpace] = colorspace
image[PdfName.BitsPerComponent] = depth image[PdfName.BitsPerComponent] = depth
@ -1291,7 +1301,7 @@ def get_imgmetadata(
if imgformat == ImageFormat.JPEG2000 and rawdata is not None and imgdata is None: if imgformat == ImageFormat.JPEG2000 and rawdata is not None and imgdata is None:
# this codepath gets called if the PIL installation is not able to # this codepath gets called if the PIL installation is not able to
# handle JPEG2000 files # handle JPEG2000 files
imgwidthpx, imgheightpx, ics, hdpi, vdpi = parsejp2(rawdata) imgwidthpx, imgheightpx, ics, hdpi, vdpi, channels, bpp = jp2.parse(rawdata)
if hdpi is None: if hdpi is None:
hdpi = default_dpi hdpi = default_dpi
@ -1301,7 +1311,19 @@ def get_imgmetadata(
else: else:
imgwidthpx, imgheightpx = imgdata.size imgwidthpx, imgheightpx = imgdata.size
ndpi = imgdata.info.get("dpi", (default_dpi, default_dpi)) ndpi = imgdata.info.get("dpi")
if ndpi is None:
# the PNG plugin of PIL adds the undocumented "aspect" field instead of
# the "dpi" field if the PNG pHYs chunk unit is not set to meters
if imgformat == ImageFormat.PNG and imgdata.info.get("aspect") is not None:
aspect = imgdata.info["aspect"]
# make sure not to go below the default dpi
if aspect[0] > aspect[1]:
ndpi = (default_dpi * aspect[0] / aspect[1], default_dpi)
else:
ndpi = (default_dpi, default_dpi * aspect[1] / aspect[0])
else:
ndpi = (default_dpi, default_dpi)
# In python3, the returned dpi value for some tiff images will # In python3, the returned dpi value for some tiff images will
# not be an integer but a float. To make the behaviour of # not be an integer but a float. To make the behaviour of
# img2pdf the same between python2 and python3, we convert that # img2pdf the same between python2 and python3, we convert that
@ -1311,7 +1333,7 @@ def get_imgmetadata(
ics = imgdata.mode ics = imgdata.mode
# GIF and PNG files with transparency are supported # GIF and PNG files with transparency are supported
if (imgformat == ImageFormat.PNG or imgformat == ImageFormat.GIF) and ( if imgformat in [ImageFormat.PNG, ImageFormat.GIF, ImageFormat.JPEG2000] and (
ics in ["RGBA", "LA"] or "transparency" in imgdata.info ics in ["RGBA", "LA"] or "transparency" in imgdata.info
): ):
# Must check the IHDR chunk for the bit depth, because PIL would lossily # Must check the IHDR chunk for the bit depth, because PIL would lossily
@ -1321,6 +1343,10 @@ def get_imgmetadata(
if depth > 8: if depth > 8:
logger.warning("Image with transparency and a bit depth of %d." % depth) logger.warning("Image with transparency and a bit depth of %d." % depth)
logger.warning("This is unsupported due to PIL limitations.") logger.warning("This is unsupported due to PIL limitations.")
logger.warning(
"If you accept a lossy conversion, you can manually convert "
"your images to 8 bit using `convert -depth 8` from imagemagick"
)
raise AlphaChannelError( raise AlphaChannelError(
"Refusing to work with multiple >8bit channels." "Refusing to work with multiple >8bit channels."
) )
@ -1431,6 +1457,53 @@ def get_imgmetadata(
iccp = None iccp = None
if "icc_profile" in imgdata.info: if "icc_profile" in imgdata.info:
iccp = imgdata.info.get("icc_profile") iccp = imgdata.info.get("icc_profile")
# GIMP saves bilevel TIFF images and palette PNG images with only black and
# white in the palette with an RGB ICC profile which is useless
# https://gitlab.gnome.org/GNOME/gimp/-/issues/3438
# and produces an error in Adobe Acrobat, so we ignore it with a warning.
# imagemagick also used to (wrongly) include an RGB ICC profile for bilevel
# images: https://github.com/ImageMagick/ImageMagick/issues/2070
if iccp is not None and (
(color == Colorspace["1"] and imgformat == ImageFormat.TIFF)
or (
imgformat == ImageFormat.PNG
and color == Colorspace.P
and rawdata is not None
and parse_png(rawdata)[1]
in [b"\x00\x00\x00\xff\xff\xff", b"\xff\xff\xff\x00\x00\x00"]
)
):
with io.BytesIO(iccp) as f:
prf = ImageCms.ImageCmsProfile(f)
if (
prf.profile.model == "sRGB"
and prf.profile.manufacturer == "GIMP"
and prf.profile.profile_description == "GIMP built-in sRGB"
):
if imgformat == ImageFormat.TIFF:
logger.warning(
"Ignoring RGB ICC profile in bilevel TIFF produced by GIMP."
)
elif imgformat == ImageFormat.PNG:
logger.warning(
"Ignoring RGB ICC profile in 2-color palette PNG produced by GIMP."
)
logger.warning("https://gitlab.gnome.org/GNOME/gimp/-/issues/3438")
iccp = None
# SmartAlbums old version (found 2.2.6) exports JPG with only 1 component
# with an RGB ICC profile which is useless.
# This produces an error in Adobe Acrobat, so we ignore it with a warning.
# Update: Found another case, the JPG is created by Adobe PhotoShop, so we
# don't check software anymore.
if iccp is not None and (
(color == Colorspace["L"] and imgformat == ImageFormat.JPEG)
):
with io.BytesIO(iccp) as f:
prf = ImageCms.ImageCmsProfile(f)
if prf.profile.xcolor_space != "GRAY":
logger.warning("Ignoring non-GRAY ICC profile in Grayscale JPG")
iccp = None
logger.debug("width x height = %dpx x %dpx", imgwidthpx, imgheightpx) logger.debug("width x height = %dpx x %dpx", imgwidthpx, imgheightpx)
@ -1649,7 +1722,7 @@ def parse_miff(data):
elif hdata["colorspace"] == "Gray": elif hdata["colorspace"] == "Gray":
numchannels = 1 numchannels = 1
colorspace = Colorspace.L colorspace = Colorspace.L
if hdata["matte"]: if hdata.get("matte"):
numchannels += 1 numchannels += 1
if hdata.get("profile"): if hdata.get("profile"):
# there is no key encoding the length of icc or exif data # there is no key encoding the length of icc or exif data
@ -1699,7 +1772,7 @@ def parse_miff(data):
# case "PseudoClass": # case "PseudoClass":
elif hdata["class"] == "PseudoClass": elif hdata["class"] == "PseudoClass":
assert "colors" in hdata assert "colors" in hdata
if hdata["matte"]: if hdata.get("matte"):
numchannels = 2 numchannels = 2
else: else:
numchannels = 1 numchannels = 1
@ -1734,7 +1807,9 @@ def parse_miff(data):
# fmt: on # fmt: on
def read_images(rawdata, colorspace, first_frame_only=False, rot=None): def read_images(
rawdata, colorspace, first_frame_only=False, rot=None, include_thumbnails=False
):
im = BytesIO(rawdata) im = BytesIO(rawdata)
im.seek(0) im.seek(0)
imgdata = None imgdata = None
@ -1788,10 +1863,13 @@ def read_images(rawdata, colorspace, first_frame_only=False, rot=None):
raise JpegColorspaceError("jpeg can't be monochrome") raise JpegColorspaceError("jpeg can't be monochrome")
if color == Colorspace["P"]: if color == Colorspace["P"]:
raise JpegColorspaceError("jpeg can't have a color palette") raise JpegColorspaceError("jpeg can't have a color palette")
if color == Colorspace["RGBA"]: if color == Colorspace["RGBA"] and imgformat != ImageFormat.JPEG2000:
raise JpegColorspaceError("jpeg can't have an alpha channel") raise JpegColorspaceError("jpeg can't have an alpha channel")
logger.debug("read_images() embeds a JPEG") logger.debug("read_images() embeds a JPEG")
cleanup() cleanup()
depth = 8
if imgformat == ImageFormat.JPEG2000:
*_, depth = jp2.parse(rawdata)
return [ return [
( (
color, color,
@ -1803,7 +1881,7 @@ def read_images(rawdata, colorspace, first_frame_only=False, rot=None):
imgheightpx, imgheightpx,
[], [],
False, False,
8, depth,
rotation, rotation,
iccp, iccp,
) )
@ -1820,6 +1898,77 @@ def read_images(rawdata, colorspace, first_frame_only=False, rot=None):
if imgformat == ImageFormat.MPO: if imgformat == ImageFormat.MPO:
result = [] result = []
img_page_count = 0 img_page_count = 0
assert len(imgdata._MpoImageFile__mpoffsets) == len(imgdata.mpinfo[0xB002])
num_frames = len(imgdata.mpinfo[0xB002])
# An MPO file can be a main image together with one or more thumbnails
# if that is the case, then we only include all frames if the
# --include-thumbnails option is given. If it is not, such an MPO file
# will be embedded as is, so including its thumbnails but showing up
# as a single image page in the resulting PDF.
num_main_frames = 0
num_thumbnail_frames = 0
for i, mpent in enumerate(imgdata.mpinfo[0xB002]):
# check only the first frame for being the main image
if (
i == 0
and mpent["Attribute"]["DependentParentImageFlag"]
and not mpent["Attribute"]["DependentChildImageFlag"]
and mpent["Attribute"]["RepresentativeImageFlag"]
and mpent["Attribute"]["MPType"] == "Baseline MP Primary Image"
):
num_main_frames += 1
elif (
not mpent["Attribute"]["DependentParentImageFlag"]
and mpent["Attribute"]["DependentChildImageFlag"]
and not mpent["Attribute"]["RepresentativeImageFlag"]
and mpent["Attribute"]["MPType"]
in [
"Large Thumbnail (VGA Equivalent)",
"Large Thumbnail (Full HD Equivalent)",
]
):
num_thumbnail_frames += 1
logger.debug(f"number of frames: {num_frames}")
logger.debug(f"number of main frames: {num_main_frames}")
logger.debug(f"number of thumbnail frames: {num_thumbnail_frames}")
# this MPO file is a main image plus zero or more thumbnails
# embed as-is unless the --include-thumbnails option was given
if num_frames == 1 or (
not include_thumbnails
and num_main_frames == 1
and num_thumbnail_frames + 1 == num_frames
):
color, ndpi, imgwidthpx, imgheightpx, rotation, iccp = get_imgmetadata(
imgdata, imgformat, default_dpi, colorspace, rawdata, rot
)
if color == Colorspace["1"]:
raise JpegColorspaceError("jpeg can't be monochrome")
if color == Colorspace["P"]:
raise JpegColorspaceError("jpeg can't have a color palette")
if color == Colorspace["RGBA"]:
raise JpegColorspaceError("jpeg can't have an alpha channel")
logger.debug("read_images() embeds an MPO verbatim")
cleanup()
return [
(
color,
ndpi,
ImageFormat.JPEG,
rawdata,
None,
imgwidthpx,
imgheightpx,
[],
False,
8,
rotation,
iccp,
)
]
# If the control flow reaches here, the MPO has more than a single
# frame but was not detected to be a main image followed by multiple
# thumbnails. We thus treat this MPO as we do other multi-frame images
# and include all its frames as individual pages.
for offset, mpent in zip( for offset, mpent in zip(
imgdata._MpoImageFile__mpoffsets, imgdata.mpinfo[0xB002] imgdata._MpoImageFile__mpoffsets, imgdata.mpinfo[0xB002]
): ):
@ -2085,7 +2234,16 @@ def read_images(rawdata, colorspace, first_frame_only=False, rot=None):
) )
) )
else: else:
if ( if color in [Colorspace.P, Colorspace.PA] and iccp is not None:
# PDF does not support palette images with icc profile
if color == Colorspace.P:
newcolor = Colorspace.RGB
newimg = newimg.convert(mode="RGB")
elif color == Colorspace.PA:
newcolor = Colorspace.RGBA
newimg = newimg.convert(mode="RGBA")
smaskidat = None
elif (
color == Colorspace.RGBA color == Colorspace.RGBA
or color == Colorspace.LA or color == Colorspace.LA
or color == Colorspace.PA or color == Colorspace.PA
@ -2099,25 +2257,21 @@ def read_images(rawdata, colorspace, first_frame_only=False, rot=None):
newcolor = color newcolor = color
l, a = newimg.split() l, a = newimg.split()
newimg = l newimg = l
elif color == Colorspace.PA or (
color == Colorspace.P and "transparency" in newimg.info
):
newcolor = color
a = newimg.convert(mode="RGBA").split()[-1]
else: else:
newcolor = Colorspace.RGBA newcolor = Colorspace.RGBA
r, g, b, a = newimg.convert(mode="RGBA").split() r, g, b, a = newimg.convert(mode="RGBA").split()
newimg = Image.merge("RGB", (r, g, b)) newimg = Image.merge("RGB", (r, g, b))
smaskidat, _, _ = to_png_data(a) smaskidat, *_ = to_png_data(a)
logger.warning( logger.warning(
"Image contains an alpha channel. Computing a separate " "Image contains an alpha channel. Computing a separate "
"soft mask (/SMask) image to store transparency in PDF." "soft mask (/SMask) image to store transparency in PDF."
) )
elif color in [Colorspace.P, Colorspace.PA] and iccp is not None:
# PDF does not support palette images with icc profile
if color == Colorspace.P:
newcolor = Colorspace.RGB
newimg = newimg.convert(mode="RGB")
elif color == Colorspace.PA:
newcolor = Colorspace.RGBA
newimg = newimg.convert(mode="RGBA")
smaskidat = None
else: else:
newcolor = color newcolor = color
smaskidat = None smaskidat = None
@ -2488,6 +2642,7 @@ def convert(*images, **kwargs):
artborder=None, artborder=None,
pdfa=None, pdfa=None,
rotation=None, rotation=None,
include_thumbnails=False,
) )
for kwname, default in _default_kwargs.items(): for kwname, default in _default_kwargs.items():
if kwname not in kwargs: if kwname not in kwargs:
@ -2580,6 +2735,7 @@ def convert(*images, **kwargs):
kwargs["colorspace"], kwargs["colorspace"],
kwargs["first_frame_only"], kwargs["first_frame_only"],
kwargs["rotation"], kwargs["rotation"],
kwargs["include_thumbnails"],
): ):
pagewidth, pageheight, imgwidthpdf, imgheightpdf = kwargs["layout_fun"]( pagewidth, pageheight, imgwidthpdf, imgheightpdf = kwargs["layout_fun"](
imgwidthpx, imgheightpx, ndpi imgwidthpx, imgheightpx, ndpi
@ -2955,7 +3111,7 @@ def valid_date(string):
else: else:
try: try:
return parser.parse(string) return parser.parse(string)
except TypeError: except:
pass pass
# as a last resort, try the local date utility # as a last resort, try the local date utility
try: try:
@ -2968,7 +3124,7 @@ def valid_date(string):
except subprocess.CalledProcessError: except subprocess.CalledProcessError:
pass pass
else: else:
return datetime.utcfromtimestamp(int(utime)) return datetime.fromtimestamp(int(utime))
raise argparse.ArgumentTypeError("cannot parse date: %s" % string) raise argparse.ArgumentTypeError("cannot parse date: %s" % string)
@ -3670,7 +3826,35 @@ def gui():
app.mainloop() app.mainloop()
def main(argv=sys.argv): def file_is_icc(fname):
with open(fname, "rb") as f:
data = f.read(40)
if len(data) < 40:
return False
return data[36:] == b"acsp"
def validate_icc(fname):
if not file_is_icc(fname):
raise argparse.ArgumentTypeError('"%s" is not an ICC profile' % fname)
return fname
def get_default_icc_profile():
for profile in [
"/usr/share/color/icc/sRGB.icc",
"/usr/share/color/icc/OpenICC/sRGB.icc",
"/usr/share/color/icc/colord/sRGB.icc",
]:
if not os.path.exists(profile):
continue
if not file_is_icc(profile):
continue
return profile
return "/usr/share/color/icc/sRGB.icc"
def get_main_parser():
rendered_papersizes = "" rendered_papersizes = ""
for k, v in sorted(papersizes.items()): for k, v in sorted(papersizes.items()):
rendered_papersizes += " %-8s %s\n" % (papernames[k], v) rendered_papersizes += " %-8s %s\n" % (papernames[k], v)
@ -3711,7 +3895,9 @@ Paper sizes:
the value in the second column has the same effect as giving the short hand the value in the second column has the same effect as giving the short hand
in the first column. Appending ^T (a caret/circumflex followed by the letter in the first column. Appending ^T (a caret/circumflex followed by the letter
T) turns the paper size from portrait into landscape. The postfix thus T) turns the paper size from portrait into landscape. The postfix thus
symbolizes the transpose. The values are case insensitive. symbolizes the transpose. Note that on Windows cmd.exe the caret symbol is
the escape character, so you need to put quotes around the option value.
The values are case insensitive.
%s %s
@ -3778,7 +3964,7 @@ Examples:
while preserving its aspect ratio and a print border of 2 cm on the top and while preserving its aspect ratio and a print border of 2 cm on the top and
bottom and 2.5 cm on the left and right hand side. bottom and 2.5 cm on the left and right hand side.
$ img2pdf --output out.pdf --pagesize A4^T --border 2cm:2.5cm *.jpg $ img2pdf --output out.pdf --pagesize "A4^T" --border 2cm:2.5cm *.jpg
On each A4 page, fit images into a 10 cm times 15 cm rectangle but keep the On each A4 page, fit images into a 10 cm times 15 cm rectangle but keep the
original image size if the image is smaller than that. original image size if the image is smaller than that.
@ -3913,6 +4099,17 @@ RGB.""",
"input image be converted into a page in the resulting PDF.", "input image be converted into a page in the resulting PDF.",
) )
outargs.add_argument(
"--include-thumbnails",
action="store_true",
help="Some multi-frame formats like MPO carry a main image and "
"one or more scaled-down copies of the main image (thumbnails). "
"In such a case, img2pdf will only include the main image and "
"not create additional pages for each of the thumbnails. If this "
"option is set, img2pdf will instead create one page per frame and "
"thus store each thumbnail on its own page.",
)
outargs.add_argument( outargs.add_argument(
"--pillow-limit-break", "--pillow-limit-break",
action="store_true", action="store_true",
@ -3924,13 +4121,28 @@ RGB.""",
% Image.MAX_IMAGE_PIXELS, % Image.MAX_IMAGE_PIXELS,
) )
if sys.platform == "win32":
# on Windows, there are no default paths to search for an ICC profile
# so make the argument required instead of optional
outargs.add_argument(
"--pdfa",
type=validate_icc,
help="Output a PDF/A-1b compliant document. The argument to this "
"option is the path to the ICC profile that will be embedded into "
"the resulting PDF.",
)
else:
outargs.add_argument( outargs.add_argument(
"--pdfa", "--pdfa",
nargs="?", nargs="?",
const="/usr/share/color/icc/sRGB.icc", const=get_default_icc_profile(),
default=None, default=None,
type=validate_icc,
help="Output a PDF/A-1b compliant document. By default, this will " help="Output a PDF/A-1b compliant document. By default, this will "
"embed /usr/share/color/icc/sRGB.icc as the color profile.", "embed either /usr/share/color/icc/sRGB.icc, "
"/usr/share/color/icc/OpenICC/sRGB.icc or "
"/usr/share/color/icc/colord/sRGB.icc as the color profile, whichever "
"is found to exist first.",
) )
sizeargs = parser.add_argument_group( sizeargs = parser.add_argument_group(
@ -4220,8 +4432,11 @@ and left/right, respectively. It is not possible to specify asymmetric borders.
action="store_true", action="store_true",
help="Instruct the PDF viewer to open the PDF in fullscreen mode", help="Instruct the PDF viewer to open the PDF in fullscreen mode",
) )
return parser
args = parser.parse_args(argv[1:])
def main(argv=sys.argv):
args = get_main_parser().parse_args(argv[1:])
if args.verbose: if args.verbose:
logging.basicConfig(level=logging.DEBUG) logging.basicConfig(level=logging.DEBUG)
@ -4248,7 +4463,7 @@ and left/right, respectively. It is not possible to specify asymmetric borders.
print( print(
"Reading image from standard input...\n" "Reading image from standard input...\n"
"Re-run with -h or --help for usage information.", "Re-run with -h or --help for usage information.",
file=sys.stderr file=sys.stderr,
) )
try: try:
images = [sys.stdin.buffer.read()] images = [sys.stdin.buffer.read()]
@ -4310,6 +4525,7 @@ and left/right, respectively. It is not possible to specify asymmetric borders.
artborder=args.art_border, artborder=args.art_border,
pdfa=args.pdfa, pdfa=args.pdfa,
rotation=args.rotation, rotation=args.rotation,
include_thumbnails=args.include_thumbnails,
) )
except Exception as e: except Exception as e:
logger.error("error: " + str(e)) logger.error("error: " + str(e))
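
The `file_is_icc` helper added in this diff recognizes an ICC profile by the `acsp` signature at byte offset 36 of the profile header. Below is a self-contained sketch of the same check on in-memory bytes. The name `looks_like_icc` and the synthetic header are assumptions for illustration only, not img2pdf code.

```python
def looks_like_icc(data: bytes) -> bool:
    # The ICC profile header carries the file signature b"acsp" in
    # bytes 36..39, so at least 40 bytes are needed to decide.
    return len(data) >= 40 and data[36:40] == b"acsp"


# Synthetic 128-byte header that is all zeros except for the signature
fake_header = bytes(36) + b"acsp" + bytes(88)

print(looks_like_icc(fake_header))       # True
print(looks_like_icc(b"not a profile"))  # False
```

Checking only the signature is cheap enough to run at argument-parsing time, which is how `validate_icc` uses it as the `type=` callback for `--pdfa`.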

@ -19,6 +19,8 @@ from packaging.version import parse as parse_version
import warnings import warnings
import json import json
import pathlib import pathlib
import itertools
import xml.etree.ElementTree as ET
img2pdfprog = os.getenv("img2pdfprog", default="src/img2pdf.py") img2pdfprog = os.getenv("img2pdfprog", default="src/img2pdf.py")
@ -37,6 +39,14 @@ for glob in ICC_PROFILE_PATHS:
ICC_PROFILE = path ICC_PROFILE = path
break break
HAVE_FAKETIME = True
try:
ver = subprocess.check_output(["faketime", "--version"])
if b"faketime: Version " not in ver:
HAVE_FAKETIME = False
except FileNotFoundError:
HAVE_FAKETIME = False
HAVE_MUTOOL = True HAVE_MUTOOL = True
try: try:
ver = subprocess.check_output(["mutool", "-v"], stderr=subprocess.STDOUT) ver = subprocess.check_output(["mutool", "-v"], stderr=subprocess.STDOUT)
@ -130,6 +140,25 @@ psnr_re = re.compile(rb"((?:inf|(?:0|[1-9][0-9]*)(?:\.[0-9]+)?))(?: \([0-9.]+\))
############################################################################### ###############################################################################
# Interpret a datetime string in a given timezone and format it according to a
# given format string in UTC.
# We avoid using the Python datetime module for this job because doing so would
# just replicate the code we want to test for correctness.
def tz2utcstrftime(string, fmt, timezone):
return (
subprocess.check_output(
[
"date",
"--utc",
f'--date=TZ="{timezone}" {string}',
f"+{fmt}",
]
)
.decode("utf8")
.removesuffix("\n")
)
def find_closest_palette_color(color, palette): def find_closest_palette_color(color, palette):
if color.ndim == 0: if color.ndim == 0:
idx = (numpy.abs(palette - color)).argmin() idx = (numpy.abs(palette - color)).argmin()
@ -332,6 +361,8 @@ def compare(im1, im2, exact, icc, cmyk):
+ [ + [
"-metric", "-metric",
"AE", "AE",
"-alpha",
"off",
im1, im1,
im2, im2,
"null:", "null:",
@ -603,7 +634,7 @@ def alpha_value():
alpha = numpy.zeros((60, 60, 4), dtype=numpy.dtype("int64")) alpha = numpy.zeros((60, 60, 4), dtype=numpy.dtype("int64"))
# draw three circles # draw three circles
- for (xpos, ypos, color) in [
+ for xpos, ypos, color in [
(12, 3, [0xFFFF, 0, 0, 0xFFFF]), (12, 3, [0xFFFF, 0, 0, 0xFFFF]),
(21, 21, [0, 0xFFFF, 0, 0xFFFF]), (21, 21, [0, 0xFFFF, 0, 0xFFFF]),
(3, 21, [0, 0, 0xFFFF, 0xFFFF]), (3, 21, [0, 0, 0xFFFF, 0xFFFF]),
@ -1187,6 +1218,74 @@ def jpg_2000_img(tmp_path_factory, tmp_normal_png):
in_img.unlink() in_img.unlink()
@pytest.fixture(scope="session")
def jpg_2000_rgba8_img(tmp_path_factory, tmp_alpha_png):
in_img = tmp_path_factory.mktemp("jpg_2000_rgba8") / "in.jp2"
subprocess.check_call(CONVERT + [str(tmp_alpha_png), "-depth", "8", str(in_img)])
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "JP2", str(identify)
assert identify[0]["image"].get("mimeType") == "image/jp2", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColorAlpha", str(identify)
assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("compression") == "JPEG2000", str(identify)
yield in_img
in_img.unlink()
@pytest.fixture(scope="session")
def jpg_2000_rgba16_img(tmp_path_factory, tmp_alpha_png):
in_img = tmp_path_factory.mktemp("jpg_2000_rgba16") / "in.jp2"
subprocess.check_call(CONVERT + [str(tmp_alpha_png), str(in_img)])
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "JP2", str(identify)
assert identify[0]["image"].get("mimeType") == "image/jp2", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColorAlpha", str(identify)
assert identify[0]["image"].get("depth") == 16, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("compression") == "JPEG2000", str(identify)
yield in_img
in_img.unlink()
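Both new JPEG 2000 fixtures repeat the step that copes with older ImageMagick versions emitting a bare dictionary instead of an array. A minimal sketch of that step in isolation follows. `normalize_identify` is a hypothetical name, and the two JSON strings are hand-written stand-ins, not real ImageMagick output.

```python
import json


def normalize_identify(raw):
    """Return ImageMagick json: output as a list of per-image dicts."""
    identify = json.loads(raw)
    # Older ImageMagick versions emit a bare dictionary instead of an array,
    # so wrap it to give both variants the same shape.
    if "image" in identify:
        identify = [identify]
    return identify


# Hand-written stand-ins for the old (bare dict) and new (array) output.
old_style = '{"image": {"format": "JP2", "depth": 8}}'
new_style = '[{"image": {"format": "JP2", "depth": 8}}]'

print(normalize_identify(old_style) == normalize_identify(new_style))
```

After normalization, the fixtures can index `identify[0]["image"]` regardless of which ImageMagick version produced the output.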
@pytest.fixture(scope="session") @pytest.fixture(scope="session")
def png_rgb8_img(tmp_normal_png): def png_rgb8_img(tmp_normal_png):
in_img = tmp_normal_png in_img = tmp_normal_png
@ -1599,7 +1698,7 @@ def png_gray1_img(tmp_path_factory, tmp_gray1_png):
"y": 0, "y": 0,
}, str(identify) }, str(identify)
assert identify[0]["image"].get("colorspace") == "Gray", str(identify) assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
- assert identify[0]["image"].get("type") == "Bilevel", str(identify)
+ assert identify[0]["image"].get("type") in ["Bilevel", "Grayscale"], str(identify)
assert identify[0]["image"].get("depth") == 1, str(identify) assert identify[0]["image"].get("depth") == 1, str(identify)
assert identify[0]["image"].get("pageGeometry") == { assert identify[0]["image"].get("pageGeometry") == {
"width": 60, "width": 60,
@ -2330,10 +2429,6 @@ def tiff_float_img(tmp_path_factory, tmp_normal_png):
}, str(identify) }, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColor", str(identify) assert identify[0]["image"].get("type") == "TrueColor", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
identify
) # FIXME: should be LSB
assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("baseDepth") == 32, str(identify) assert identify[0]["image"].get("baseDepth") == 32, str(identify)
assert identify[0]["image"].get("pageGeometry") == { assert identify[0]["image"].get("pageGeometry") == {
@ -2349,9 +2444,6 @@ def tiff_float_img(tmp_path_factory, tmp_normal_png):
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified" identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified"
), str(identify) ), str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str(
identify
)
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB" identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB"
), str(identify) ), str(identify)
@ -2391,10 +2483,6 @@ def tiff_cmyk8_img(tmp_path_factory, tmp_normal_png):
}, str(identify) }, str(identify)
assert identify[0]["image"].get("colorspace") == "CMYK", str(identify) assert identify[0]["image"].get("colorspace") == "CMYK", str(identify)
assert identify[0]["image"].get("type") == "ColorSeparation", str(identify) assert identify[0]["image"].get("type") == "ColorSeparation", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
identify
) # FIXME: should be LSB
assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("pageGeometry") == { assert identify[0]["image"].get("pageGeometry") == {
"width": 60, "width": 60,
@ -2405,9 +2493,6 @@ def tiff_cmyk8_img(tmp_path_factory, tmp_normal_png):
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified" identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified"
), str(identify) ), str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str(
identify
)
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") identify[0]["image"].get("properties", {}).get("tiff:photometric")
== "separated" == "separated"
@ -2450,10 +2535,6 @@ def tiff_cmyk16_img(tmp_path_factory, tmp_normal_png):
}, str(identify) }, str(identify)
assert identify[0]["image"].get("colorspace") == "CMYK", str(identify) assert identify[0]["image"].get("colorspace") == "CMYK", str(identify)
assert identify[0]["image"].get("type") == "ColorSeparation", str(identify) assert identify[0]["image"].get("type") == "ColorSeparation", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
identify
) # FIXME: should be LSB
assert identify[0]["image"].get("depth") == 16, str(identify) assert identify[0]["image"].get("depth") == 16, str(identify)
assert identify[0]["image"].get("pageGeometry") == { assert identify[0]["image"].get("pageGeometry") == {
"width": 60, "width": 60,
@ -2464,9 +2545,6 @@ def tiff_cmyk16_img(tmp_path_factory, tmp_normal_png):
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified" identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified"
), str(identify) ), str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str(
identify
)
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") identify[0]["image"].get("properties", {}).get("tiff:photometric")
== "separated" == "separated"
@ -2499,10 +2577,6 @@ def tiff_rgb8_img(tmp_path_factory, tmp_normal_png):
}, str(identify) }, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColor", str(identify) assert identify[0]["image"].get("type") == "TrueColor", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
identify
) # FIXME: should be LSB
assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("pageGeometry") == { assert identify[0]["image"].get("pageGeometry") == {
"width": 60, "width": 60,
@ -2513,9 +2587,6 @@ def tiff_rgb8_img(tmp_path_factory, tmp_normal_png):
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified" identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified"
), str(identify) ), str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str(
identify
)
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB" identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB"
), str(identify) ), str(identify)
@ -2555,10 +2626,6 @@ def tiff_rgb12_img(tmp_path_factory, tmp_normal16_png):
}, str(identify) }, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColor", str(identify) assert identify[0]["image"].get("type") == "TrueColor", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
identify
) # FIXME: should be LSB
assert identify[0]["image"].get("baseDepth") == 12, str(identify) assert identify[0]["image"].get("baseDepth") == 12, str(identify)
assert identify[0]["image"].get("pageGeometry") == { assert identify[0]["image"].get("pageGeometry") == {
"width": 60, "width": 60,
@ -2569,9 +2636,6 @@ def tiff_rgb12_img(tmp_path_factory, tmp_normal16_png):
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified" identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified"
), str(identify) ), str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str(
identify
)
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB" identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB"
), str(identify) ), str(identify)
@ -2611,10 +2675,6 @@ def tiff_rgb14_img(tmp_path_factory, tmp_normal16_png):
}, str(identify) }, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColor", str(identify) assert identify[0]["image"].get("type") == "TrueColor", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
identify
) # FIXME: should be LSB
assert identify[0]["image"].get("baseDepth") == 14, str(identify) assert identify[0]["image"].get("baseDepth") == 14, str(identify)
assert identify[0]["image"].get("pageGeometry") == { assert identify[0]["image"].get("pageGeometry") == {
"width": 60, "width": 60,
@ -2625,9 +2685,6 @@ def tiff_rgb14_img(tmp_path_factory, tmp_normal16_png):
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified" identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified"
), str(identify) ), str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str(
identify
)
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB" identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB"
), str(identify) ), str(identify)
@ -2667,10 +2724,6 @@ def tiff_rgb16_img(tmp_path_factory, tmp_normal16_png):
}, str(identify) }, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColor", str(identify) assert identify[0]["image"].get("type") == "TrueColor", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
identify
) # FIXME: should be LSB
assert identify[0]["image"].get("depth") == 16, str(identify) assert identify[0]["image"].get("depth") == 16, str(identify)
assert identify[0]["image"].get("pageGeometry") == { assert identify[0]["image"].get("pageGeometry") == {
"width": 60, "width": 60,
@ -2681,9 +2734,6 @@ def tiff_rgb16_img(tmp_path_factory, tmp_normal16_png):
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified" identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified"
), str(identify) ), str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str(
identify
)
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB" identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB"
), str(identify) ), str(identify)
@ -2724,10 +2774,6 @@ def tiff_rgba8_img(tmp_path_factory, tmp_alpha_png):
}, str(identify) }, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColorAlpha", str(identify) assert identify[0]["image"].get("type") == "TrueColorAlpha", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
identify
) # FIXME: should be LSB
assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("pageGeometry") == { assert identify[0]["image"].get("pageGeometry") == {
"width": 60, "width": 60,
@ -2738,9 +2784,6 @@ def tiff_rgba8_img(tmp_path_factory, tmp_alpha_png):
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unassociated" identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unassociated"
), str(identify) ), str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str(
identify
)
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB" identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB"
), str(identify) ), str(identify)
@ -2781,10 +2824,6 @@ def tiff_rgba16_img(tmp_path_factory, tmp_alpha_png):
}, str(identify) }, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColorAlpha", str(identify) assert identify[0]["image"].get("type") == "TrueColorAlpha", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
identify
) # FIXME: should be LSB
assert identify[0]["image"].get("depth") == 16, str(identify) assert identify[0]["image"].get("depth") == 16, str(identify)
assert identify[0]["image"].get("pageGeometry") == { assert identify[0]["image"].get("pageGeometry") == {
"width": 60, "width": 60,
@ -2795,9 +2834,6 @@ def tiff_rgba16_img(tmp_path_factory, tmp_alpha_png):
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unassociated" identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unassociated"
), str(identify) ), str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str(
identify
)
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB" identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB"
), str(identify) ), str(identify)
@ -2837,10 +2873,6 @@ def tiff_gray1_img(tmp_path_factory, tmp_gray1_png):
}, str(identify) }, str(identify)
assert identify[0]["image"].get("colorspace") == "Gray", str(identify) assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
assert identify[0]["image"].get("type") == "Bilevel", str(identify) assert identify[0]["image"].get("type") == "Bilevel", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
identify
) # FIXME: should be LSB
assert identify[0]["image"].get("depth") == 1, str(identify) assert identify[0]["image"].get("depth") == 1, str(identify)
assert identify[0]["image"].get("pageGeometry") == { assert identify[0]["image"].get("pageGeometry") == {
"width": 60, "width": 60,
@ -2851,9 +2883,6 @@ def tiff_gray1_img(tmp_path_factory, tmp_gray1_png):
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified" identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified"
), str(identify) ), str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str(
identify
)
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") identify[0]["image"].get("properties", {}).get("tiff:photometric")
== "min-is-black" == "min-is-black"
@ -2894,10 +2923,6 @@ def tiff_gray2_img(tmp_path_factory, tmp_gray2_png):
}, str(identify) }, str(identify)
assert identify[0]["image"].get("colorspace") == "Gray", str(identify) assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
assert identify[0]["image"].get("type") == "Grayscale", str(identify) assert identify[0]["image"].get("type") == "Grayscale", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
identify
) # FIXME: should be LSB
assert identify[0]["image"].get("depth") == 2, str(identify) assert identify[0]["image"].get("depth") == 2, str(identify)
assert identify[0]["image"].get("pageGeometry") == { assert identify[0]["image"].get("pageGeometry") == {
"width": 60, "width": 60,
@ -2908,9 +2933,6 @@ def tiff_gray2_img(tmp_path_factory, tmp_gray2_png):
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified" identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified"
), str(identify) ), str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str(
identify
)
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") identify[0]["image"].get("properties", {}).get("tiff:photometric")
== "min-is-black" == "min-is-black"
@ -2951,10 +2973,6 @@ def tiff_gray4_img(tmp_path_factory, tmp_gray4_png):
}, str(identify) }, str(identify)
assert identify[0]["image"].get("colorspace") == "Gray", str(identify) assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
assert identify[0]["image"].get("type") == "Grayscale", str(identify) assert identify[0]["image"].get("type") == "Grayscale", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
identify
) # FIXME: should be LSB
assert identify[0]["image"].get("depth") == 4, str(identify) assert identify[0]["image"].get("depth") == 4, str(identify)
assert identify[0]["image"].get("pageGeometry") == { assert identify[0]["image"].get("pageGeometry") == {
"width": 60, "width": 60,
@ -2965,9 +2983,6 @@ def tiff_gray4_img(tmp_path_factory, tmp_gray4_png):
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified" identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified"
), str(identify) ), str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str(
identify
)
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") identify[0]["image"].get("properties", {}).get("tiff:photometric")
== "min-is-black" == "min-is-black"
@ -3008,10 +3023,6 @@ def tiff_gray8_img(tmp_path_factory, tmp_gray8_png):
}, str(identify) }, str(identify)
assert identify[0]["image"].get("colorspace") == "Gray", str(identify) assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
assert identify[0]["image"].get("type") == "Grayscale", str(identify) assert identify[0]["image"].get("type") == "Grayscale", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
identify
) # FIXME: should be LSB
assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("pageGeometry") == { assert identify[0]["image"].get("pageGeometry") == {
"width": 60, "width": 60,
@ -3022,9 +3033,6 @@ def tiff_gray8_img(tmp_path_factory, tmp_gray8_png):
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified" identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified"
), str(identify) ), str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str(
identify
)
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") identify[0]["image"].get("properties", {}).get("tiff:photometric")
== "min-is-black" == "min-is-black"
@ -3065,10 +3073,6 @@ def tiff_gray16_img(tmp_path_factory, tmp_gray16_png):
}, str(identify) }, str(identify)
assert identify[0]["image"].get("colorspace") == "Gray", str(identify) assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
assert identify[0]["image"].get("type") == "Grayscale", str(identify) assert identify[0]["image"].get("type") == "Grayscale", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
identify
) # FIXME: should be LSB
assert identify[0]["image"].get("depth") == 16, str(identify) assert identify[0]["image"].get("depth") == 16, str(identify)
assert identify[0]["image"].get("pageGeometry") == { assert identify[0]["image"].get("pageGeometry") == {
"width": 60, "width": 60,
@ -3079,9 +3083,6 @@ def tiff_gray16_img(tmp_path_factory, tmp_gray16_png):
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified" identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified"
), str(identify) ), str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str(
identify
)
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") identify[0]["image"].get("properties", {}).get("tiff:photometric")
== "min-is-black" == "min-is-black"
@ -3124,10 +3125,6 @@ def tiff_multipage_img(tmp_path_factory, tmp_normal_png, tmp_inverse_png):
}, str(identify) }, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColor", str(identify) assert identify[0]["image"].get("type") == "TrueColor", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
identify
) # FIXME: should be LSB
assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("pageGeometry") == { assert identify[0]["image"].get("pageGeometry") == {
"width": 60, "width": 60,
@ -3138,9 +3135,6 @@ def tiff_multipage_img(tmp_path_factory, tmp_normal_png, tmp_inverse_png):
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified" identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified"
), str(identify) ), str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str(
identify
)
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB" identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB"
), str(identify) ), str(identify)
@ -3164,10 +3158,6 @@ def tiff_multipage_img(tmp_path_factory, tmp_normal_png, tmp_inverse_png):
}, str(identify) }, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColor", str(identify) assert identify[0]["image"].get("type") == "TrueColor", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
identify
) # FIXME: should be LSB
assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("pageGeometry") == { assert identify[0]["image"].get("pageGeometry") == {
"width": 60, "width": 60,
@ -3178,9 +3168,6 @@ def tiff_multipage_img(tmp_path_factory, tmp_normal_png, tmp_inverse_png):
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified" identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified"
), str(identify) ), str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str(
identify
)
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB" identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB"
), str(identify) ), str(identify)
@ -3213,10 +3200,6 @@ def tiff_palette1_img(tmp_path_factory, tmp_palette1_png):
}, str(identify) }, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "Palette", str(identify) assert identify[0]["image"].get("type") == "Palette", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
identify
) # FIXME: should be LSB
assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("baseDepth") == 1, str(identify) assert identify[0]["image"].get("baseDepth") == 1, str(identify)
assert identify[0]["image"].get("colormapEntries") == 2, str(identify) assert identify[0]["image"].get("colormapEntries") == 2, str(identify)
@ -3229,9 +3212,6 @@ def tiff_palette1_img(tmp_path_factory, tmp_palette1_png):
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified" identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified"
), str(identify) ), str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str(
identify
)
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "palette" identify[0]["image"].get("properties", {}).get("tiff:photometric") == "palette"
), str(identify) ), str(identify)
@ -3263,10 +3243,6 @@ def tiff_palette2_img(tmp_path_factory, tmp_palette2_png):
}, str(identify) }, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "Palette", str(identify) assert identify[0]["image"].get("type") == "Palette", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
identify
) # FIXME: should be LSB
assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("baseDepth") == 2, str(identify) assert identify[0]["image"].get("baseDepth") == 2, str(identify)
assert identify[0]["image"].get("colormapEntries") == 4, str(identify) assert identify[0]["image"].get("colormapEntries") == 4, str(identify)
@ -3279,9 +3255,6 @@ def tiff_palette2_img(tmp_path_factory, tmp_palette2_png):
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified" identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified"
), str(identify) ), str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str(
identify
)
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "palette" identify[0]["image"].get("properties", {}).get("tiff:photometric") == "palette"
), str(identify) ), str(identify)
@ -3313,10 +3286,6 @@ def tiff_palette4_img(tmp_path_factory, tmp_palette4_png):
}, str(identify) }, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "Palette", str(identify) assert identify[0]["image"].get("type") == "Palette", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
identify
) # FIXME: should be LSB
assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("baseDepth") == 4, str(identify) assert identify[0]["image"].get("baseDepth") == 4, str(identify)
assert identify[0]["image"].get("colormapEntries") == 16, str(identify) assert identify[0]["image"].get("colormapEntries") == 16, str(identify)
@ -3329,9 +3298,6 @@ def tiff_palette4_img(tmp_path_factory, tmp_palette4_png):
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified" identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified"
), str(identify) ), str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str(
identify
)
assert ( assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "palette" identify[0]["image"].get("properties", {}).get("tiff:photometric") == "palette"
), str(identify) ), str(identify)
@@ -3363,10 +3329,6 @@ def tiff_palette8_img(tmp_path_factory, tmp_palette8_png):
     }, str(identify)
     assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
     assert identify[0]["image"].get("type") == "Palette", str(identify)
-    endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
-    assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
-        identify
-    )  # FIXME: should be LSB
     assert identify[0]["image"].get("depth") == 8, str(identify)
     assert identify[0]["image"].get("colormapEntries") == 256, str(identify)
     assert identify[0]["image"].get("pageGeometry") == {
@@ -3378,9 +3340,6 @@ def tiff_palette8_img(tmp_path_factory, tmp_palette8_png):
     assert (
         identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unspecified"
     ), str(identify)
-    assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str(
-        identify
-    )
     assert (
         identify[0]["image"].get("properties", {}).get("tiff:photometric") == "palette"
     ), str(identify)
@@ -3427,9 +3386,10 @@ def tiff_ccitt_lsb_m2l_white_img(tmp_path_factory, tmp_gray1_png):
     assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
     assert identify[0]["image"].get("type") == "Bilevel", str(identify)
     endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
-    assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
-        identify
-    )  # FIXME: should be LSB
+    assert identify[0]["image"].get(endian) in [
+        "Undefined",
+        "LSB",
+    ], str(identify)
     assert identify[0]["image"].get("depth") == 1, str(identify)
     assert identify[0]["image"].get("pageGeometry") == {
         "width": 60,
@@ -3677,9 +3637,10 @@ def tiff_ccitt_lsb_m2l_black_img(tmp_path_factory, tmp_gray1_png):
     assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
     assert identify[0]["image"].get("type") == "Bilevel", str(identify)
     endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
-    assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
-        identify
-    )  # FIXME: should be LSB
+    assert identify[0]["image"].get(endian) in [
+        "Undefined",
+        "LSB",
+    ], str(identify)
     assert identify[0]["image"].get("depth") == 1, str(identify)
     assert identify[0]["image"].get("pageGeometry") == {
         "width": 60,
@@ -3767,9 +3728,10 @@ def tiff_ccitt_nometa1_img(tmp_path_factory, tmp_gray1_png):
     assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
     assert identify[0]["image"].get("type") == "Bilevel", str(identify)
     endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
-    assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
-        identify
-    )  # FIXME: should be LSB
+    assert identify[0]["image"].get(endian) in [
+        "Undefined",
+        "LSB",
+    ], str(identify)
     assert identify[0]["image"].get("depth") == 1, str(identify)
     assert identify[0]["image"].get("pageGeometry") == {
         "width": 60,
@@ -3851,9 +3813,10 @@ def tiff_ccitt_nometa2_img(tmp_path_factory, tmp_gray1_png):
     assert identify[0]["image"].get("units") == "PixelsPerInch", str(identify)
     assert identify[0]["image"].get("type") == "Bilevel", str(identify)
     endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
-    assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
-        identify
-    )  # FIXME: should be LSB
+    assert identify[0]["image"].get(endian) in [
+        "Undefined",
+        "LSB",
+    ], str(identify)
     assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
     assert identify[0]["image"].get("depth") == 1, str(identify)
     assert identify[0]["image"].get("compression") == "Group4", str(identify)
@@ -3910,7 +3873,7 @@ def miff_cmyk8_img(tmp_path_factory, tmp_normal_png):
     assert "image" in identify[0]
     assert identify[0]["image"].get("format") == "MIFF", str(identify)
     assert identify[0]["image"].get("class") == "DirectClass"
-    assert identify[0]["image"].get("baseType") == "ColorSeparation"
+    assert identify[0]["image"].get("type") == "ColorSeparation"
     assert identify[0]["image"].get("geometry") == {
         "width": 60,
         "height": 60,
@@ -3919,10 +3882,6 @@ def miff_cmyk8_img(tmp_path_factory, tmp_normal_png):
     }, str(identify)
     assert identify[0]["image"].get("colorspace") == "CMYK", str(identify)
     assert identify[0]["image"].get("type") == "ColorSeparation", str(identify)
-    endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
-    assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
-        identify
-    )  # FIXME: should be LSB
     assert identify[0]["image"].get("depth") == 8, str(identify)
     assert identify[0]["image"].get("pageGeometry") == {
         "width": 60,
@@ -3958,7 +3917,7 @@ def miff_cmyk16_img(tmp_path_factory, tmp_normal_png):
     assert "image" in identify[0]
     assert identify[0]["image"].get("format") == "MIFF", str(identify)
     assert identify[0]["image"].get("class") == "DirectClass"
-    assert identify[0]["image"].get("baseType") == "ColorSeparation"
+    assert identify[0]["image"].get("type") == "ColorSeparation"
     assert identify[0]["image"].get("geometry") == {
         "width": 60,
         "height": 60,
@@ -3967,10 +3926,6 @@ def miff_cmyk16_img(tmp_path_factory, tmp_normal_png):
     }, str(identify)
     assert identify[0]["image"].get("colorspace") == "CMYK", str(identify)
     assert identify[0]["image"].get("type") == "ColorSeparation", str(identify)
-    endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
-    assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
-        identify
-    )  # FIXME: should be LSB
     assert identify[0]["image"].get("depth") == 16, str(identify)
     assert identify[0]["image"].get("baseDepth") == 16, str(identify)
     assert identify[0]["image"].get("pageGeometry") == {
@@ -3997,7 +3952,7 @@ def miff_rgb8_img(tmp_path_factory, tmp_normal_png):
     assert "image" in identify[0]
     assert identify[0]["image"].get("format") == "MIFF", str(identify)
     assert identify[0]["image"].get("class") == "DirectClass"
-    assert identify[0]["image"].get("baseType") == "TrueColor"
+    assert identify[0]["image"].get("type") == "TrueColor"
     assert identify[0]["image"].get("geometry") == {
         "width": 60,
         "height": 60,
@@ -4006,10 +3961,6 @@ def miff_rgb8_img(tmp_path_factory, tmp_normal_png):
     }, str(identify)
     assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
     assert identify[0]["image"].get("type") == "TrueColor", str(identify)
-    endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
-    assert identify[0]["image"].get(endian) in ["Undefined", "LSB",], str(
-        identify
-    )  # FIXME: should be LSB
     assert identify[0]["image"].get("depth") == 8, str(identify)
     assert identify[0]["image"].get("pageGeometry") == {
         "width": 60,
@@ -4187,6 +4138,60 @@ def jpg_2000_pdf(tmp_path_factory, jpg_2000_img, request):
     out_pdf.unlink()


+@pytest.fixture(scope="session", params=["internal", "pikepdf"])
+def jpg_2000_rgba8_pdf(tmp_path_factory, jpg_2000_rgba8_img, request):
+    out_pdf = tmp_path_factory.mktemp("jpg_2000_rgba8_pdf") / "out.pdf"
+    subprocess.check_call(
+        [
+            img2pdfprog,
+            "--producer=",
+            "--nodate",
+            "--engine=" + request.param,
+            "--output=" + str(out_pdf),
+            jpg_2000_rgba8_img,
+        ]
+    )
+    with pikepdf.open(str(out_pdf)) as p:
+        assert (
+            p.pages[0].Contents.read_bytes()
+            == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
+        )
+        assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
+        assert not hasattr(p.pages[0].Resources.XObject.Im0, "ColorSpace")
+        assert p.pages[0].Resources.XObject.Im0.Filter == "/JPXDecode"
+        assert p.pages[0].Resources.XObject.Im0.Height == 60
+        assert p.pages[0].Resources.XObject.Im0.Width == 60
+    yield out_pdf
+    out_pdf.unlink()
+
+
+@pytest.fixture(scope="session", params=["internal", "pikepdf"])
+def jpg_2000_rgba16_pdf(tmp_path_factory, jpg_2000_rgba16_img, request):
+    out_pdf = tmp_path_factory.mktemp("jpg_2000_rgba16_pdf") / "out.pdf"
+    subprocess.check_call(
+        [
+            img2pdfprog,
+            "--producer=",
+            "--nodate",
+            "--engine=" + request.param,
+            "--output=" + str(out_pdf),
+            jpg_2000_rgba16_img,
+        ]
+    )
+    with pikepdf.open(str(out_pdf)) as p:
+        assert (
+            p.pages[0].Contents.read_bytes()
+            == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
+        )
+        assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 16
+        assert not hasattr(p.pages[0].Resources.XObject.Im0, "ColorSpace")
+        assert p.pages[0].Resources.XObject.Im0.Filter == "/JPXDecode"
+        assert p.pages[0].Resources.XObject.Im0.Height == 60
+        assert p.pages[0].Resources.XObject.Im0.Width == 60
+    yield out_pdf
+    out_pdf.unlink()
+
+
 @pytest.fixture(scope="session", params=["internal", "pikepdf"])
 def png_rgb8_pdf(tmp_path_factory, png_rgb8_img, request):
     out_pdf = tmp_path_factory.mktemp("png_rgb8_pdf") / "out.pdf"
@@ -4276,9 +4281,10 @@ def gif_transparent_pdf(tmp_path_factory, gif_transparent_img, request):
             == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
         )
         assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
-        assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceRGB"
+        assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed"
+        assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB"
         assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8
-        assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 3
+        assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1
         assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
         assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
         assert p.pages[0].Resources.XObject.Im0.Height == 60
@@ -5579,6 +5585,39 @@ def test_jpg_2000(tmp_path_factory, jpg_2000_img, jpg_2000_pdf):
     compare_pdfimages_jp2(tmpdir, jpg_2000_img, jpg_2000_pdf)


+@pytest.mark.skipif(
+    sys.platform in ["win32"],
+    reason="test utilities not available on Windows and MacOS",
+)
+@pytest.mark.skipif(
+    not HAVE_JP2, reason="requires imagemagick with support for jpeg2000"
+)
+def test_jpg_2000_rgba8(tmp_path_factory, jpg_2000_rgba8_img, jpg_2000_rgba8_pdf):
+    tmpdir = tmp_path_factory.mktemp("jpg_2000_rgba8")
+    compare_ghostscript(tmpdir, jpg_2000_rgba8_img, jpg_2000_rgba8_pdf)
+    compare_poppler(tmpdir, jpg_2000_rgba8_img, jpg_2000_rgba8_pdf)
+    # compare_mupdf(tmpdir, jpg_2000_rgba8_img, jpg_2000_rgba8_pdf)
+    compare_pdfimages_jp2(tmpdir, jpg_2000_rgba8_img, jpg_2000_rgba8_pdf)
+
+
+@pytest.mark.skipif(
+    sys.platform in ["win32"],
+    reason="test utilities not available on Windows and MacOS",
+)
+@pytest.mark.skipif(
+    not HAVE_JP2, reason="requires imagemagick with support for jpeg2000"
+)
+def test_jpg_2000_rgba16(tmp_path_factory, jpg_2000_rgba16_img, jpg_2000_rgba16_pdf):
+    tmpdir = tmp_path_factory.mktemp("jpg_2000_rgba16")
+    compare_ghostscript(
+        tmpdir, jpg_2000_rgba16_img, jpg_2000_rgba16_pdf, gsdevice="tiff48nc"
+    )
+    # poppler outputs 8-bit RGB so the comparison will not be exact
+    # compare_poppler(tmpdir, jpg_2000_rgba16_img, jpg_2000_rgba16_pdf, exact=False)
+    # compare_mupdf(tmpdir, jpg_2000_rgba16_img, jpg_2000_rgba16_pdf)
+    compare_pdfimages_jp2(tmpdir, jpg_2000_rgba16_img, jpg_2000_rgba16_pdf)
+
+
 @pytest.mark.skipif(
     sys.platform in ["win32"],
     reason="test utilities not available on Windows and MacOS",
@@ -6831,6 +6870,96 @@ def general_input(request):
     return request.param


+@pytest.mark.skipif(not HAVE_FAKETIME, reason="requires faketime")
+@pytest.mark.parametrize(
+    "engine,testdata,timezone,pdfa",
+    itertools.product(
+        ["internal", "pikepdf"],
+        ["2021-02-05 17:49:00"],
+        ["Europe/Berlin", "GMT+12"],
+        [True, False],
+    ),
+)
+def test_faketime(tmp_path_factory, jpg_img, engine, testdata, timezone, pdfa):
+    expected = tz2utcstrftime(testdata, "D:%Y%m%d%H%M%SZ", timezone)
+    out_pdf = tmp_path_factory.mktemp("faketime") / "out.pdf"
+    subprocess.check_call(
+        ["env", f"TZ={timezone}", "faketime", "-f", testdata, img2pdfprog]
+        + (["--pdfa"] if pdfa else [])
+        + [
+            "--producer=",
+            "--engine=" + engine,
+            "--output=" + str(out_pdf),
+            str(jpg_img),
+        ]
+    )
+    with pikepdf.open(str(out_pdf)) as p:
+        assert p.docinfo.CreationDate == expected
+        assert p.docinfo.ModDate == expected
+        if pdfa:
+            assert p.Root.Metadata.Subtype == "/XML"
+            assert p.Root.Metadata.Type == "/Metadata"
+            expected = tz2utcstrftime(testdata, "%Y-%m-%dT%H:%M:%SZ", timezone)
+            root = ET.fromstring(p.Root.Metadata.read_bytes())
+            for k in ["ModifyDate", "CreateDate"]:
+                assert (
+                    root.find(
+                        f".//xmp:{k}", {"xmp": "http://ns.adobe.com/xap/1.0/"}
+                    ).text
+                    == expected
+                )
+    out_pdf.unlink()
+
+
+@pytest.mark.parametrize(
+    "engine,testdata,timezone,pdfa",
+    itertools.product(
+        ["internal", "pikepdf"],
+        [
+            "2021-02-05 17:49:00",
+            "2021-02-05T17:49:00",
+            "Fri, 05 Feb 2021 17:49:00 +0100",
+            "last year 12:00",
+        ],
+        ["Europe/Berlin", "GMT+12"],
+        [True, False],
+    ),
+)
+def test_date(tmp_path_factory, jpg_img, engine, testdata, timezone, pdfa):
+    # we use the date utility to convert the timestamp from the local
+    # timezone into UTC with the format used by PDF
+    expected = tz2utcstrftime(testdata, "D:%Y%m%d%H%M%SZ", timezone)
+    out_pdf = tmp_path_factory.mktemp("faketime") / "out.pdf"
+    subprocess.check_call(
+        ["env", f"TZ={timezone}", img2pdfprog]
+        + (["--pdfa"] if pdfa else [])
+        + [
+            f"--moddate={testdata}",
+            f"--creationdate={testdata}",
+            "--producer=",
+            "--engine=" + engine,
+            "--output=" + str(out_pdf),
+            str(jpg_img),
+        ]
+    )
+    with pikepdf.open(str(out_pdf)) as p:
+        assert p.docinfo.CreationDate == expected
+        assert p.docinfo.ModDate == expected
+        if pdfa:
+            assert p.Root.Metadata.Subtype == "/XML"
+            assert p.Root.Metadata.Type == "/Metadata"
+            expected = tz2utcstrftime(testdata, "%Y-%m-%dT%H:%M:%SZ", timezone)
+            root = ET.fromstring(p.Root.Metadata.read_bytes())
+            for k in ["ModifyDate", "CreateDate"]:
+                assert (
+                    root.find(
+                        f".//xmp:{k}", {"xmp": "http://ns.adobe.com/xap/1.0/"}
+                    ).text
+                    == expected
+                )
+    out_pdf.unlink()
+
+
 @pytest.mark.parametrize("engine", ["internal", "pikepdf"])
 def test_general(general_input, engine):
     inputf = os.path.join(os.path.dirname(__file__), "tests", "input", general_input)


@@ -37,9 +37,8 @@ def getBox(data, byteStart, noBytes):


 def parse_ihdr(data):
-    height = struct.unpack(">I", data[0:4])[0]
-    width = struct.unpack(">I", data[4:8])[0]
-    return width, height
+    height, width, channels, bpp = struct.unpack(">IIHB", data[:11])
+    return width, height, channels, bpp + 1


 def parse_colr(data):
@@ -59,8 +58,8 @@ def parse_colr(data):

 def parse_resc(data):
     hnum, hden, vnum, vden, hexp, vexp = struct.unpack(">HHHHBB", data)
-    hdpi = ((hnum / hden) * (10 ** hexp) * 100) / 2.54
-    vdpi = ((vnum / vden) * (10 ** vexp) * 100) / 2.54
+    hdpi = ((hnum / hden) * (10**hexp) * 100) / 2.54
+    vdpi = ((vnum / vden) * (10**vexp) * 100) / 2.54
     return hdpi, vdpi
@@ -85,13 +84,13 @@ def parse_jp2h(data):
     while byteStart < noBytes and boxLengthValue != 0:
         boxLengthValue, boxType, byteEnd, boxContents = getBox(data, byteStart, noBytes)
         if boxType == b"ihdr":
-            width, height = parse_ihdr(boxContents)
+            width, height, channels, bpp = parse_ihdr(boxContents)
         elif boxType == b"colr":
             colorspace = parse_colr(boxContents)
         elif boxType == b"res ":
             hdpi, vdpi = parse_res(boxContents)
         byteStart = byteEnd
-    return (width, height, colorspace, hdpi, vdpi)
+    return (width, height, colorspace, hdpi, vdpi, channels, bpp)


 def parsejp2(data):
@@ -102,7 +101,9 @@ def parsejp2(data):
     while byteStart < noBytes and boxLengthValue != 0:
         boxLengthValue, boxType, byteEnd, boxContents = getBox(data, byteStart, noBytes)
         if boxType == b"jp2h":
-            width, height, colorspace, hdpi, vdpi = parse_jp2h(boxContents)
+            width, height, colorspace, hdpi, vdpi, channels, bpp = parse_jp2h(
+                boxContents
+            )
             break
         byteStart = byteEnd
     if not width:
@@ -112,13 +113,41 @@ def parsejp2(data):
     if not colorspace:
         raise Exception("no colorspace in jp2 header")
     # retrieving the dpi is optional so we do not error out if not present
-    return (width, height, colorspace, hdpi, vdpi)
+    return (width, height, colorspace, hdpi, vdpi, channels, bpp)
+
+
+def parsej2k(data):
+    lsiz, rsiz, xsiz, ysiz, xosiz, yosiz, _, _, _, _, csiz = struct.unpack(
+        ">HHIIIIIIIIH", data[4:42]
+    )
+    ssiz = [None] * csiz
+    xrsiz = [None] * csiz
+    yrsiz = [None] * csiz
+    for i in range(csiz):
+        ssiz[i], xrsiz[i], yrsiz[i] = struct.unpack(
+            "BBB", data[42 + 3 * i : 42 + 3 * (i + 1)]
+        )
+    assert ssiz == [7, 7, 7]
+    return xsiz - xosiz, ysiz - yosiz, None, None, None, csiz, 8
+
+
+def parse(data):
+    if data[:4] == b"\xff\x4f\xff\x51":
+        return parsej2k(data)
+    else:
+        return parsejp2(data)


 if __name__ == "__main__":
     import sys

-    width, height, colorspace = parsejp2(open(sys.argv[1]).read())
-    sys.stdout.write("width = %d" % width)
-    sys.stdout.write("height = %d" % height)
-    sys.stdout.write("colorspace = %s" % colorspace)
+    width, height, colorspace, hdpi, vdpi, channels, bpp = parse(
+        open(sys.argv[1], "rb").read()
+    )
+    print("width = %d" % width)
+    print("height = %d" % height)
+    print("colorspace = %s" % colorspace)
+    print("hdpi = %s" % hdpi)
+    print("vdpi = %s" % vdpi)
+    print("channels = %s" % channels)
+    print("bpp = %s" % bpp)
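
The `parsej2k` function in this diff only accepts codestreams with three 8-bit components (`assert ssiz == [7, 7, 7]`) and returns a fixed depth of 8. As a rough sketch of how the depth could instead be read from the SIZ marker, the standalone example below decodes each component's `Ssiz` byte, whose low 7 bits store the bit depth minus one and whose high bit flags signed samples. The name `parse_siz_depths` and the synthetic header are hypothetical and not part of this change; the field layout is the same one `parsej2k` unpacks.

```python
import struct

SOC_SIZ = b"\xff\x4f\xff\x51"  # SOC marker followed directly by the SIZ marker


def parse_siz_depths(data):
    """Hypothetical helper: size, channel count and per-component bit depths."""
    if data[:4] != SOC_SIZ:
        raise ValueError("not a raw JPEG 2000 codestream")
    # Lsiz, Rsiz, Xsiz, Ysiz, XOsiz, YOsiz, four tile fields (ignored), Csiz
    _lsiz, _rsiz, xsiz, ysiz, xosiz, yosiz, _, _, _, _, csiz = struct.unpack(
        ">HHIIIIIIIIH", data[4:42]
    )
    depths = []
    for i in range(csiz):
        ssiz, _xrsiz, _yrsiz = struct.unpack("BBB", data[42 + 3 * i : 45 + 3 * i])
        # low 7 bits hold depth - 1; the high bit marks signed samples
        depths.append((ssiz & 0x7F) + 1)
    return xsiz - xosiz, ysiz - yosiz, csiz, depths


# Synthetic 60x60 header with three 16-bit components (Ssiz = 15)
header = SOC_SIZ + struct.pack(">HHIIIIIIIIH", 47, 0, 60, 60, 0, 0, 60, 60, 0, 0, 3)
header += bytes([15, 1, 1] * 3)
print(parse_siz_depths(header))  # (60, 60, 3, [16, 16, 16])
```

Returning a list of depths rather than a single value leaves it to the caller to decide whether mixed-depth components are acceptable.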