Slightly simplify the getexif procedure #202

mara0004 · 2024-08-25T15:56:00Z

mara0004 commented

2024-08-25 15:56:00 +00:00

No need for a loop here either.
(Note that you could also use a dictionary rather than an if-elif tree for the value -> rotation translation, where valid/supported.)

No need for a loop here either. (Note that you could also use a dictionary rather than an if-elif tree for the value -> rotation translation, where valid/supported.)

mara0004 added 1 commit 2024-08-25 15:56:02 +00:00

Slightly simplify the getexif procedure 4fa160259d

No need for a loop here either.
Note that you could also use a dictionary rather than an if-elif tree
for the value -> rotation mapping (where valid/supported).

mara0004 reviewed 2024-08-25 19:06:07 +00:00

src/img2pdf.py

					
				@ -1410,0 +1386,4 @@

				                rotation = 180

				            elif value == 8:

				                rotation = 270

				            elif value in (2, 4, 5, 7):

mara0004 commented

2024-08-25 19:06:07 +00:00

BTW, I think you could theoretically support flipped orientation modes using content stream matrices.
Negative width should give horizontal flip (X inversion), and negative height vertical flip (Y inversion).
Assuming /Rotate is clockwise and applied after the matrix, I believe the procedures would be:

2: H, 0
4: V, 0
5: V, 90
7: V, 270

See also Adam M. Costello's illustration in http://sylvana.net/jpegcrop/exif_orientation.html

BTW, I think you could theoretically support flipped orientation modes using [content stream matrices](https://gitlab.mister-muffin.de/josch/img2pdf/src/commit/819b366bf5751e475b32b8f4670fed819fb5e7af/src/img2pdf.py#L994). Negative width should give horizontal flip (X inversion), and negative height vertical flip (Y inversion). Assuming `/Rotate` is clockwise and applied after the matrix, I believe the procedures would be: ```python 2: H, 0 4: V, 0 5: V, 90 7: V, 270 ``` See also Adam M. Costello's illustration in http://sylvana.net/jpegcrop/exif_orientation.html

mara0004 added 1 commit 2024-08-26 11:17:07 +00:00

remove now-unnecessary import fd2306fdb2

josch reviewed 2024-08-27 08:13:34 +00:00

src/img2pdf.py

					
				@ -1372,41 +1371,42 @@ def get_imgmetadata(

				    rotation = 0

				    if rotreq in (None, Rotation.auto, Rotation.ifvalid):

				        if hasattr(imgdata, "_getexif") and imgdata._getexif() is not None:

josch commented

2024-08-27 08:13:34 +00:00

checks like these for _getexif are there for a reason. Could you try and find out what changed that does not make these necessary anymore and since when it is like that?

checks like these for `_getexif` are there for a reason. Could you try and find out what changed that does not make these necessary anymore and since when it is like that?

mara0004 commented

2024-08-27 11:57:40 +00:00

getexif() (without underscore) is available since Pillow 6.0.0, that is more than 5 years ago.
Is requiring that acceptable to you?

Otherwise, I could add a fallback to _getexif(), provided that also returns a subscriptable dict which can be treated the same way as getexif().
But IMHO there's not much point in supporting far outdated versions of Pillow, as that makes the code harder to read.

`getexif()` (without underscore) is available since Pillow [6.0.0](https://pillow.readthedocs.io/en/stable/releasenotes/6.0.0.html#added-exif-class), that is more than 5 years ago. Is requiring that acceptable to you? Otherwise, I could add a fallback to `_getexif()`, provided that also returns a subscriptable dict which can be treated the same way as `getexif()`. But IMHO there's not much point in supporting far outdated versions of Pillow, as that makes the code harder to read.

josch commented

2024-09-15 05:54:40 +00:00

It makes the code harder to read, yes. But the advantage is, that this way, recent versions of img2pdf keep working on older platforms and distributions that ship Pillow before version 6 are still receiving support and security updates. I do not think it's smart to break img2pdf on a platform just to have 5 lines less of code. I mean imagine somebody from a library, archive or school (which at least where i'm from are known to update rarely) reports a bug that img2pdf broke for them. Will you explain them that this is because we wanted to have a few lines of code less?

mara0004 commented

2024-09-15 13:57:13 +00:00

Indeed, I see Debian Buster is still on Pillow 5.4 :/
I understand your BW compat concerns and respect your decision, but would not do it that way in my own projects.

IMHO, if you are on a legacy base system, you should be prepared to accept that newer software may not be easily installable. If the institutions you mention do not update the base system, why should they update img2pdf? Sure, new version provide fixes and add features. But that applies to many, many packages throughout the whole system, so if you care about these improvements, you should upgrade your distribution.
I think it's legitimate to break compatibility with legacy versions of dependencies at some point and ask people to upgrade, and many (most?) other projects are doing so.
Sure, one specific case may not be relevant, but if you removed all workarounds for older versions, I bet it'll be more than 5 lines ;)

Anyway, I will look into adding the fallback.

Indeed, I see Debian Buster is still on Pillow 5.4 :/ I understand your BW compat concerns and respect your decision, but would not do it that way in my own projects. IMHO, if you are on a legacy base system, you should be prepared to accept that newer software may not be easily installable. If the institutions you mention do not update the base system, why should they update img2pdf? Sure, new version provide fixes and add features. But that applies to many, many packages throughout the whole system, so if you care about these improvements, you should upgrade your distribution. I think it's legitimate to break compatibility with legacy versions of dependencies at some point and ask people to upgrade, and many (most?) other projects are doing so. Sure, one specific case may not be relevant, but if you removed all workarounds for older versions, I bet it'll be more than 5 lines ;) Anyway, I will look into adding the fallback.

mara0004 referenced this pull request

2024-09-13 21:42:16 +00:00

Add support for JBIG2 (generic coding) #184

mara0004 referenced this pull request from josch/plakativ

2024-09-26 00:15:23 +00:00

fix AttributeError with insertTextbox #10

josch commented

2025-02-15 13:39:08 +00:00

Hi, I reworked your commit such that your code will be run if the getexif() function is provided and the old one is run if only _getexif() exists. In the future, the latter can be removed easily.

Hi, I reworked your commit such that your code will be run if the `getexif()` function is provided and the old one is run if only `_getexif()` exists. In the future, the latter can be removed easily.

josch closed this pull request

2025-02-15 13:39:08 +00:00

mara0004 commented

2025-02-15 17:04:57 +00:00

Thanks a lot for coming back to this and finishing up the PRs!

If I read the diff of 23436114f8 correctly, you just duplicated the whole code block? IMHO that's not very nice.
We only need a small fallback when retrieving the exif dict, but the following code can be shared.

I mean something like

if hasattr(imgdata, "getexif"):
    exif_dict = imgdata.getexif()
elif hasattr(imgdata, "_getexif"):
    exif_dict = imgdata._getexif()
else:
    exif_dict = None

o_key = ExifTags.Base.Orientation.value  # 274 rsp. 0x112
if exif_dict and o_key in exif_dict:
    value = exif_dict[o_key]
    ...

Thanks a lot for coming back to this and finishing up the PRs! If I read the diff of https://gitlab.mister-muffin.de/josch/img2pdf/commit/23436114f821c8f8b89ea463d215fbbea7b7cd79 correctly, you just duplicated the whole code block? IMHO that's not very nice. We only need a small fallback when retrieving the exif dict, but the following code can be shared. I mean something like ```python if hasattr(imgdata, "getexif"): exif_dict = imgdata.getexif() elif hasattr(imgdata, "_getexif"): exif_dict = imgdata._getexif() else: exif_dict = None o_key = ExifTags.Base.Orientation.value # 274 rsp. 0x112 if exif_dict and o_key in exif_dict: value = exif_dict[o_key] ... ```

mara0004 commented

2025-02-15 17:08:01 +00:00

Well, unless ExifTags.Base.Orientation.value is not available in older versions yet. But in that case we could just change it to the raw int value.

Well, unless `ExifTags.Base.Orientation.value` is not available in older versions yet. But in that case we could just change it to the raw int value.