Slightly simplify the getexif procedure #202

Closed
mara0004 wants to merge 2 commits from mara0004/img2pdf:getexif into main
Contributor

No need for a loop here either.
(Note that you could also use a dictionary rather than an if-elif tree for the value -> rotation translation, where valid/supported.)

No need for a loop here either. (Note that you could also use a dictionary rather than an if-elif tree for the value -> rotation translation, where valid/supported.)
mara0004 added 1 commit 2024-08-25 15:56:02 +00:00
No need for a loop here either.
Note that you could also use a dictionary rather than an if-elif tree
for the value -> rotation mapping (where valid/supported).
mara0004 reviewed 2024-08-25 19:06:07 +00:00
@ -1410,0 +1386,4 @@
rotation = 180
elif value == 8:
rotation = 270
elif value in (2, 4, 5, 7):
Author
Contributor

BTW, I think you could theoretically support flipped orientation modes using content stream matrices.
Negative width should give horizontal flip (X inversion), and negative height vertical flip (Y inversion).
Assuming /Rotate is clockwise and applied after the matrix, I believe the procedures would be:

2: H, 0
4: V, 0
5: V, 90
7: V, 270

See also Adam M. Costello's illustration in http://sylvana.net/jpegcrop/exif_orientation.html

BTW, I think you could theoretically support flipped orientation modes using [content stream matrices](https://gitlab.mister-muffin.de/josch/img2pdf/src/commit/819b366bf5751e475b32b8f4670fed819fb5e7af/src/img2pdf.py#L994). Negative width should give horizontal flip (X inversion), and negative height vertical flip (Y inversion). Assuming `/Rotate` is clockwise and applied after the matrix, I believe the procedures would be: ```python 2: H, 0 4: V, 0 5: V, 90 7: V, 270 ``` See also Adam M. Costello's illustration in http://sylvana.net/jpegcrop/exif_orientation.html
mara0004 added 1 commit 2024-08-26 11:17:07 +00:00
josch reviewed 2024-08-27 08:13:34 +00:00
@ -1372,41 +1371,42 @@ def get_imgmetadata(
rotation = 0
if rotreq in (None, Rotation.auto, Rotation.ifvalid):
if hasattr(imgdata, "_getexif") and imgdata._getexif() is not None:
Owner

checks like these for _getexif are there for a reason. Could you try and find out what changed that does not make these necessary anymore and since when it is like that?

checks like these for `_getexif` are there for a reason. Could you try and find out what changed that does not make these necessary anymore and since when it is like that?
Author
Contributor

getexif() (without underscore) is available since Pillow 6.0.0, that is more than 5 years ago.
Is requiring that acceptable to you?

Otherwise, I could add a fallback to _getexif(), provided that also returns a subscriptable dict which can be treated the same way as getexif().
But IMHO there's not much point in supporting far outdated versions of Pillow, as that makes the code harder to read.

`getexif()` (without underscore) is available since Pillow [6.0.0](https://pillow.readthedocs.io/en/stable/releasenotes/6.0.0.html#added-exif-class), that is more than 5 years ago. Is requiring that acceptable to you? Otherwise, I could add a fallback to `_getexif()`, provided that also returns a subscriptable dict which can be treated the same way as `getexif()`. But IMHO there's not much point in supporting far outdated versions of Pillow, as that makes the code harder to read.
Owner

It makes the code harder to read, yes. But the advantage is, that this way, recent versions of img2pdf keep working on older platforms and distributions that ship Pillow before version 6 are still receiving support and security updates. I do not think it's smart to break img2pdf on a platform just to have 5 lines less of code. I mean imagine somebody from a library, archive or school (which at least where i'm from are known to update rarely) reports a bug that img2pdf broke for them. Will you explain them that this is because we wanted to have a few lines of code less?

It makes the code harder to read, yes. But the advantage is, that this way, recent versions of img2pdf keep working on older platforms and distributions that ship Pillow before version 6 are still receiving support and security updates. I do not think it's smart to break img2pdf on a platform just to have 5 lines less of code. I mean imagine somebody from a library, archive or school (which at least where i'm from are known to update rarely) reports a bug that img2pdf broke for them. Will you explain them that this is because we wanted to have a few lines of code less?
Author
Contributor

Indeed, I see Debian Buster is still on Pillow 5.4 :/
I understand your BW compat concerns and respect your decision, but would not do it that way in my own projects.

IMHO, if you are on a legacy base system, you should be prepared to accept that newer software may not be easily installable. If the institutions you mention do not update the base system, why should they update img2pdf? Sure, new version provide fixes and add features. But that applies to many, many packages throughout the whole system, so if you care about these improvements, you should upgrade your distribution.
I think it's legitimate to break compatibility with legacy versions of dependencies at some point and ask people to upgrade, and many (most?) other projects are doing so.
Sure, one specific case may not be relevant, but if you removed all workarounds for older versions, I bet it'll be more than 5 lines ;)

Anyway, I will look into adding the fallback.

Indeed, I see Debian Buster is still on Pillow 5.4 :/ I understand your BW compat concerns and respect your decision, but would not do it that way in my own projects. IMHO, if you are on a legacy base system, you should be prepared to accept that newer software may not be easily installable. If the institutions you mention do not update the base system, why should they update img2pdf? Sure, new version provide fixes and add features. But that applies to many, many packages throughout the whole system, so if you care about these improvements, you should upgrade your distribution. I think it's legitimate to break compatibility with legacy versions of dependencies at some point and ask people to upgrade, and many (most?) other projects are doing so. Sure, one specific case may not be relevant, but if you removed all workarounds for older versions, I bet it'll be more than 5 lines ;) Anyway, I will look into adding the fallback.
Owner

Hi, I reworked your commit such that your code will be run if the getexif() function is provided and the old one is run if only _getexif() exists. In the future, the latter can be removed easily.

Hi, I reworked your commit such that your code will be run if the `getexif()` function is provided and the old one is run if only `_getexif()` exists. In the future, the latter can be removed easily.
josch closed this pull request 2025-02-15 13:39:08 +00:00
Author
Contributor

Thanks a lot for coming back to this and finishing up the PRs!

If I read the diff of 23436114f8 correctly, you just duplicated the whole code block? IMHO that's not very nice.
We only need a small fallback when retrieving the exif dict, but the following code can be shared.

I mean something like

if hasattr(imgdata, "getexif"):
    exif_dict = imgdata.getexif()
elif hasattr(imgdata, "_getexif"):
    exif_dict = imgdata._getexif()
else:
    exif_dict = None

o_key = ExifTags.Base.Orientation.value  # 274 rsp. 0x112
if exif_dict and o_key in exif_dict:
    value = exif_dict[o_key]
    ...
Thanks a lot for coming back to this and finishing up the PRs! If I read the diff of https://gitlab.mister-muffin.de/josch/img2pdf/commit/23436114f821c8f8b89ea463d215fbbea7b7cd79 correctly, you just duplicated the whole code block? IMHO that's not very nice. We only need a small fallback when retrieving the exif dict, but the following code can be shared. I mean something like ```python if hasattr(imgdata, "getexif"): exif_dict = imgdata.getexif() elif hasattr(imgdata, "_getexif"): exif_dict = imgdata._getexif() else: exif_dict = None o_key = ExifTags.Base.Orientation.value # 274 rsp. 0x112 if exif_dict and o_key in exif_dict: value = exif_dict[o_key] ... ```
Author
Contributor

Well, unless ExifTags.Base.Orientation.value is not available in older versions yet. But in that case we could just change it to the raw int value.

Well, unless `ExifTags.Base.Orientation.value` is not available in older versions yet. But in that case we could just change it to the raw int value.

Pull request closed

Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: josch/img2pdf#202
No description provided.