josch/img2pdf

Fork 10

Some 0.4.0 test failures on Fedora #85

New issue

Open

opened 2021-04-25 19:58:55 +00:00 by josch · 4 comments

josch commented

2021-04-25 19:58:55 +00:00

Owner

By Georg Sauthoff on 2020-09-20T10:38:44.913Z

So I've bumped the Fedora package to 0.4.0. AFAICS, the test suite switched to pytest although a test shell script is still included.

However, I execute the tests now like this:

PYTHONPATH=src python3 -m pytest src/img2pdf_test.py

Some tests fail because of the 'endianess' vs. 'endianness' key difference in ImageMagick - I think this was previously fixed in test.sh. FWIW, ImageMagick on Fedora >= 31 uses 'endiannes'. I quick fixed this with:

$ cat test-byteorder.diff 
--- a/src/img2pdf_test.py 11:09:08.279377606 +0200
+++ b/src/img2pdf_test.py	2020-09-20 11:15:42.996704883 +0200
@@ -835,6 +835,12 @@
     yield tmp_palette8_png
     tmp_palette8_png.unlink()

+def get_byteorder(identify):
+    r = identify[0]["image"].get("endianess")
+    if r is None:
+       return identify[0]["image"].get("endianness")
+    return r
+

 @pytest.fixture(scope="session")
 def jpg_img(tmp_path_factory, tmp_normal_png):
$ patch -p1 < test-byteorder.diff
$ sed -i 's/assert identify\[0\]\["image"\]\.get("endianess")/assert get_byteorder(identify)/' src/img2pdf_test.py

Another problem is the location of the sRGB.icc some tests use now. On Fedora it's located elsewhere. Thus, I quick fixed (?) it with:

sed -i 's@"/usr/share/color/icc/sRGB.icc"@"/usr/share/color/icc/colord/sRGB.icc"@' src/img2pdf_test.py

Unfortunately, after there are still 3 failing tests (9 errors) - on x86_64:

By josch on 2020-09-20T13:32:01.569Z

So I've bumped the Fedora package to 0.4.0. AFAICS, the test suite switched to pytest although a test shell script is still included.

Yes, that's an oversight and has been fixed in c4fb1d886f

Some tests fail because of the 'endianess' vs. 'endianness' key difference in ImageMagick

Yes, there is some difference in the -verbose output depending on the version which was fixed. Since I didn't want to have these kind of problems in the future, I switched from parsing -verbose which is meant to be human readable to the json output, which is meant to be machine readable and is "guaranteed not to change other than the possibility of additional attributes": https://github.com/ImageMagick/ImageMagick6/issues/90#issuecomment-668281074

Unfortunately, there was another bug in the json output which was quickly fixed: https://github.com/ImageMagick/ImageMagick6/issues/90#issuecomment-682266441

So the correct fix, is to query the json version and then choose to retrieve either endianess or endianness -- I still have to make a commit for that.

The temporary fix for Debian is: https://sources.debian.org/src/img2pdf/0.4.0-1/debian/patches/endianess-endianness.patch/

Another problem is the location of the sRGB.icc some tests use now.

Yes, the tests should not rely on the icc profiles shipped by the OS. This is solved by creating our own ICC profiles in 11907242a5.

Unfortunately, after there are still 3 failing tests

At least the failures are now easier to read than they were before with the shell script. :)

The problem with icc.png is probably, that the sRGB.icc shipped by fedora has a different checksum than the one shipped in Debian and thus the resulting in a different checksum. The problem is fixed by 11907242a5 as well, because no external icc profile is needed anymore.

The next failure is the difference between depth and baseDepth which in Debian I fixed like this:

https://sources.debian.org/src/img2pdf/0.4.0-1/debian/patches/imdepth.patch/

The last failure is a wrong ghostscript rendering of the ccitt encoded pdf. I have no clue about that one. Maybe it's a ghostscript problem? Then it'll probably pop up in Debian soon as well.

By Georg Sauthoff on 2020-09-21T10:12:50.954Z

Ok, I've added your imdepth.patch and I'm skipping test_png_icc and test_tiff_ccitt_nometa2 for this release.

With this release I'm also adding pikepdf as hard dependency. I understand that pdfrw is deprecated and that pikepdf is the future.

Previously, the fedora package didn't depend on pdfrw - which is also ok, because img2pdf also has built-in support for PDF writing.

However, AFAICS, you are basically recommending pikepdf over the internal PDF engine because it offers additional features, such as better compression. Is this a fair summary?

By josch on 2020-09-21T14:51:22.232Z

Yes, that's part of the reason. The additional features are stuff like being able to produce linearized pdfs (fast web view) or the better compression you mentioned. The internal PDF engine will never have these features. The test suite makes sure that both, the pikepdf as well as the internal engine produce pdf files that render equally. My reason to keep the internal engine nevertheless are:

pure Python (some platforms might not have extra modules available)
safe fallback if pdf libraries vanish (pdfrw already got abandoned)
forces the codebase to remain modular (maybe there will be better pdf libraries in the future)

For distributions like Fedora or Debian, there is no reason not to add a hard dependency on pikepdf, because the library is readily available in the respective repos.

*By Georg Sauthoff on 2020-09-20T10:38:44.913Z* So I've bumped the Fedora package to 0.4.0. AFAICS, the test suite switched to pytest although a test shell script is still included. However, I execute the tests now like this: ``` PYTHONPATH=src python3 -m pytest src/img2pdf_test.py ``` Some tests fail because of the 'endianess' vs. 'endianness' key difference in ImageMagick - I think this was [previously fixed][1] in `test.sh`. FWIW, ImageMagick on Fedora >= 31 uses 'endiannes'. I quick fixed this with: ``` $ cat test-byteorder.diff --- a/src/img2pdf_test.py 11:09:08.279377606 +0200 +++ b/src/img2pdf_test.py 2020-09-20 11:15:42.996704883 +0200 @@ -835,6 +835,12 @@ yield tmp_palette8_png tmp_palette8_png.unlink() +def get_byteorder(identify): + r = identify[0]["image"].get("endianess") + if r is None: + return identify[0]["image"].get("endianness") + return r + @pytest.fixture(scope="session") def jpg_img(tmp_path_factory, tmp_normal_png): $ patch -p1 < test-byteorder.diff $ sed -i 's/assert identify\[0\]\["image"\]\.get("endianess")/assert get_byteorder(identify)/' src/img2pdf_test.py ``` Another problem is the location of the `sRGB.icc` some tests use now. On Fedora it's located elsewhere. Thus, I quick fixed (?) it with: ``` sed -i 's@"/usr/share/color/icc/sRGB.icc"@"/usr/share/color/icc/colord/sRGB.icc"@' src/img2pdf_test.py ``` Unfortunately, after there are still 3 failing tests (9 errors) - on x86_64: - https://download.copr.fedorainfracloud.org/results/gsauthof/fedora/fedora-31-x86_64/01678718-python-img2pdf/build.log.gz - https://copr.fedorainfracloud.org/coprs/gsauthof/fedora/build/1678718/ [1]: https://gitlab.mister-muffin.de/josch/img2pdf/commit/c6d04acc4bfc72906e7d7d71850a709da4c5debb --- *By josch on 2020-09-20T13:32:01.569Z* --- > So I've bumped the Fedora package to 0.4.0. AFAICS, the test suite switched to pytest although a test shell script is still included. Yes, that's an oversight and has been fixed in c4fb1d886f0208f50e3f2593fb3c483b4a42a8a2 > Some tests fail because of the 'endianess' vs. 'endianness' key difference in ImageMagick Yes, there is some difference in the `-verbose` output depending on the version which was fixed. Since I didn't want to have these kind of problems in the future, I switched from parsing `-verbose` which is meant to be human readable to the json output, which is meant to be machine readable and is "guaranteed not to change other than the possibility of additional attributes": https://github.com/ImageMagick/ImageMagick6/issues/90#issuecomment-668281074 Unfortunately, there was another bug in the json output which was quickly fixed: https://github.com/ImageMagick/ImageMagick6/issues/90#issuecomment-682266441 So the correct fix, is to query the json version and then choose to retrieve either endianess or endianness -- I still have to make a commit for that. The temporary fix for Debian is: https://sources.debian.org/src/img2pdf/0.4.0-1/debian/patches/endianess-endianness.patch/ > Another problem is the location of the sRGB.icc some tests use now. Yes, the tests should not rely on the icc profiles shipped by the OS. This is solved by creating our own ICC profiles in 11907242a5b5cebb8c267e8f8d4b945d7357baf0. > Unfortunately, after there are still 3 failing tests At least the failures are now easier to read than they were before with the shell script. :) The problem with `icc.png` is probably, that the sRGB.icc shipped by fedora has a different checksum than the one shipped in Debian and thus the resulting in a different checksum. The problem is fixed by 11907242a5b5cebb8c267e8f8d4b945d7357baf0 as well, because no external icc profile is needed anymore. The next failure is the difference between depth and baseDepth which in Debian I fixed like this: https://sources.debian.org/src/img2pdf/0.4.0-1/debian/patches/imdepth.patch/ The last failure is a wrong ghostscript rendering of the ccitt encoded pdf. I have no clue about that one. Maybe it's a ghostscript problem? Then it'll probably pop up in Debian soon as well. --- *By Georg Sauthoff on 2020-09-21T10:12:50.954Z* --- Ok, I've added your `imdepth.patch` and I'm skipping test_png_icc and test_tiff_ccitt_nometa2 for this release. With this release I'm also adding `pikepdf` as hard dependency. I understand that `pdfrw` is deprecated and that `pikepdf` is the future. Previously, the fedora package didn't depend on `pdfrw` - which is also ok, because img2pdf also has built-in support for PDF writing. However, AFAICS, you are basically recommending `pikepdf` over the internal PDF engine because it offers additional features, such as better compression. Is this a fair summary? --- *By josch on 2020-09-21T14:51:22.232Z* --- Yes, that's part of the reason. The additional features are stuff like being able to produce linearized pdfs (fast web view) or the better compression you mentioned. The internal PDF engine will never have these features. The test suite makes sure that both, the pikepdf as well as the internal engine produce pdf files that render equally. My reason to keep the internal engine nevertheless are: - pure Python (some platforms might not have extra modules available) - safe fallback if pdf libraries vanish (pdfrw already got abandoned) - forces the codebase to remain modular (maybe there will be better pdf libraries in the future) For distributions like Fedora or Debian, there is no reason not to add a hard dependency on pikepdf, because the library is readily available in the respective repos.

gms commented

2021-07-04 10:16:51 +00:00

Apparently, the accounts weren't migrated when switching from gitlab to gitea. I had to re-register my account here.

Another problem is the location of the sRGB.icc some tests use now.

Yes, the tests should not rely on the icc profiles shipped by the OS. This is solved by creating our own ICC profiles in 11907242a5.

Hm, this didn't end up in release 0.4.1? I'm still seeing the /usr/share/color/icc/sRGB.icc location errors in the current main branch.

Thus, the related unit test cases still fail in 0.4.1:

https://kojipkgs.fedoraproject.org//work/tasks/6227/71286227/build.log

The 'endianess' vs. 'endianness' issue is still an issue on Fedora Rawhide.

Your check

endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"

doesn't work there because version uses a completely different scheme:

'version': 'ImageMagick 6.9.11-27 Q16 x86_64 2021-05-21 https://imagemagick.org'

So, as before, I propose to just test for both key names and don't bother with any version checking. I consider the probability that the dictionary ends up including endianess and endianness, at some point, as extremely low.

And I would put the check in a helper function (see my previous comment) in case things change again then just one code location need to be patched.

Apparently, the accounts weren't migrated when switching from gitlab to gitea. I had to re-register my account here. --- > > Another problem is the location of the sRGB.icc some tests use now. > > Yes, the tests should not rely on the icc profiles shipped by the OS. This is solved by creating our own ICC profiles in 11907242a5. Hm, this didn't end up in release 0.4.1? I'm still seeing the `/usr/share/color/icc/sRGB.icc` location errors in the [current main branch](https://gitlab.mister-muffin.de/josch/img2pdf/src/branch/main/src/img2pdf_test.py#L307). Thus, the related unit test cases still fail in 0.4.1: https://kojipkgs.fedoraproject.org//work/tasks/6227/71286227/build.log --- The 'endianess' vs. 'endianness' issue is still an issue on Fedora Rawhide. Your check ``` endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness" ``` doesn't work there because version uses a completely different scheme: ``` 'version': 'ImageMagick 6.9.11-27 Q16 x86_64 2021-05-21 https://imagemagick.org' ``` So, as before, I propose to just test for both key names and don't bother with any version checking. I consider the probability that the dictionary ends up including `endianess` and `endianness`, at some point, as extremely low. And I would put the check in a helper function (see my previous comment) in case things change again then just one code location need to be patched.

gms commented

2021-07-04 10:49:10 +00:00

On Fedora 33 I have a bunch of test case failures for depth assertions which look similar to the old ones, e.g.:

        )  # FIXME: should be LSB
        if identify[0].get("version", "0") < "1.0":
>           assert identify[0]["image"].get("depth") == 12, str(identify)
E           AssertionError: [{'image': {'name': '/tmp/pytest-of-gms/pytest-1/tiff_rgb84/in.tiff', 'format': 'TIFF', 'formatDescription': 'Tagged Image File Format', 'mimeType': 'image/tiff', 'class': 'DirectClass', 'geometry': {'width': 60, 'height': 60, 'x': 0, 'y': 0}, 'units': 'PixelsPerInch', 'type': 'TrueColor', 'baseType': 'TrueColor', 'endianness': 'Undefined', 'colorspace': 'sRGB', 'depth': 16, 'baseDepth': 12, 'channelDepth': {'red': 12, 'green': 12, 'blue': 12}, 'pixels': 3600, 'imageStatistics': {'all': {'min': 0, 'max': 4096, 'mean': 1195.22, 'standardDeviation': 27320.8, 'kurtosis': -1.001, 'skewness': 0.915043, 'entropy': 0.455603}}, 'channelStatistics': {'red': {'min': 0, 'max': 4096, 'mean': 1195.34, 'standardDeviation': 27313.7, 'kurtosis': -1.00085, 'skewness': 0.915059, 'entropy': 0.456815}, 'green': {'min': 0, 'max': 4096, 'mean': 1197.3, 'standardDeviation': 27334.9, 'kurtosis': -1.0066, 'skewness': 0.912182, 'entropy': 0.455643}, 'blue': {'min': 0, 'max': 4096, 'mean': 1193.01, 'standardDeviation': 27313.9, 'kurtosis': -0.997758, 'skewness': 0.917129, 'entropy': 0.454351}}, 'renderingIntent': 'Perceptual', 'gamma': 0.454545, 'chromaticity': {'redPrimary': {'x': 0.64, 'y': 0.33}, 'greenPrimary': {'x': 0.3, 'y': 0.6}, 'bluePrimary': {'x': 0.15, 'y': 0.06}, 'whitePrimary': {'x': 0.3127, 'y': 0.329}}, 'backgroundColor': '#FFFFFFFFFFFF', 'borderColor': '#DFDFDFDFDFDF', 'matteColor': '#BDBDBDBDBDBD', 'transparentColor': '#000000000000', 'interlace': 'None', 'intensity': 'Undefined', 'compose': 'Over', 'pageGeometry': {'width': 60, 'height': 60, 'x': 0, 'y': 0}, 'dispose': 'Undefined', 'iterations': 0, 'compression': 'Zip', 'orientation': 'TopLeft', 'properties': {'date:create': '2021-07-04T10:29:21+00:00', 'date:modify': '2021-07-04T10:29:21+00:00', 'signature': '89627a773d2fe8cbe84652cb34dacbbb84ae2a141b1c8c78caec7cfb84db51bd', 'tiff:alpha': 'unspecified', 'tiff:endian': 'lsb', 'tiff:photometric': 'RGB', 'tiff:rows-per-strip': '60'}, 'artifacts': {'filename': '/tmp/pytest-of-gms/pytest-1/tiff_rgb84/in.tiff'}, 'tainted': False, 'filesize': '5190B', 'numberPixels': '3600', 'pixelsPerSecond': '4.03318MB', 'userTime': '0.000u', 'elapsedTime': '0:01.000', 'version': 'ImageMagick 6.9.11-27 Q16 x86_64 2020-08-11 https://imagemagick.org'}}]
E           assert 16 == 12
E            +  where 16 = <built-in method get of dict object at 0x7fc28980bd00>('depth')
E            +    where <built-in method get of dict object at 0x7fc28980bd00> = {'artifacts': {'filename': '/tmp/pytest-of-gms/pytest-1/tiff_rgb84/in.tiff'}, 'backgroundColor': '#FFFFFFFFFFFF', 'baseDepth': 12, 'baseType': 'TrueColor', ...}.get

src/img2pdf_test.py:2568: AssertionError

So, again, the version check probably goes in the wrong direction.

The src/img2pdf_test.py::test_tiff_ccitt_nometa2 still fail on Fedora 33, e.g.:


popenargs = (['compare', '-metric', 'AE', '/tmp/pytest-of-gms/pytest-2/tiff_ccitt_nometa2_img0/in.tiff', '/tmp/pytest-of-gms/pytest-2/tiff_ccitt_nometa20/gs-1.png', 'null:'],), kwargs = {}, retcode = 1
cmd = ['compare', '-metric', 'AE', '/tmp/pytest-of-gms/pytest-2/tiff_ccitt_nometa2_img0/in.tiff', '/tmp/pytest-of-gms/pytest-2/tiff_ccitt_nometa20/gs-1.png', 'null:']

    def check_call(*popenargs, **kwargs):
        """Run command with arguments.  Wait for command to complete.  If
        the exit code was zero then return, otherwise raise
        CalledProcessError.  The CalledProcessError object will have the
        return code in the returncode attribute.
    
        The arguments are the same as for the call function.  Example:
    
        check_call(["ls", "-l"])
        """
        retcode = call(*popenargs, **kwargs)
        if retcode:
            cmd = kwargs.get("args")
            if cmd is None:
                cmd = popenargs[0]
>           raise CalledProcessError(retcode, cmd)
E           subprocess.CalledProcessError: Command '['compare', '-metric', 'AE', '/tmp/pytest-of-gms/pytest-2/tiff_ccitt_nometa2_img0/in.tiff', '/tmp/pytest-of-gms/pytest-2/tiff_ccitt_nometa20/gs-1.png', 'null:']' returned non-zero exit status 1.

/usr/lib64/python3.9/subprocess.py:373: CalledProcessError
------------------------------------------------------------------------------------------------------------------- Captured stderr call -------------------------------------------------------------------------------------------------------------------
970
_____________________________________________________________________________________________________________ test_tiff_ccitt_nometa2[pikepdf]

On Fedora 33 I have a bunch of test case failures for depth assertions which look similar to the old ones, e.g.: ``` ) # FIXME: should be LSB if identify[0].get("version", "0") < "1.0": > assert identify[0]["image"].get("depth") == 12, str(identify) E AssertionError: [{'image': {'name': '/tmp/pytest-of-gms/pytest-1/tiff_rgb84/in.tiff', 'format': 'TIFF', 'formatDescription': 'Tagged Image File Format', 'mimeType': 'image/tiff', 'class': 'DirectClass', 'geometry': {'width': 60, 'height': 60, 'x': 0, 'y': 0}, 'units': 'PixelsPerInch', 'type': 'TrueColor', 'baseType': 'TrueColor', 'endianness': 'Undefined', 'colorspace': 'sRGB', 'depth': 16, 'baseDepth': 12, 'channelDepth': {'red': 12, 'green': 12, 'blue': 12}, 'pixels': 3600, 'imageStatistics': {'all': {'min': 0, 'max': 4096, 'mean': 1195.22, 'standardDeviation': 27320.8, 'kurtosis': -1.001, 'skewness': 0.915043, 'entropy': 0.455603}}, 'channelStatistics': {'red': {'min': 0, 'max': 4096, 'mean': 1195.34, 'standardDeviation': 27313.7, 'kurtosis': -1.00085, 'skewness': 0.915059, 'entropy': 0.456815}, 'green': {'min': 0, 'max': 4096, 'mean': 1197.3, 'standardDeviation': 27334.9, 'kurtosis': -1.0066, 'skewness': 0.912182, 'entropy': 0.455643}, 'blue': {'min': 0, 'max': 4096, 'mean': 1193.01, 'standardDeviation': 27313.9, 'kurtosis': -0.997758, 'skewness': 0.917129, 'entropy': 0.454351}}, 'renderingIntent': 'Perceptual', 'gamma': 0.454545, 'chromaticity': {'redPrimary': {'x': 0.64, 'y': 0.33}, 'greenPrimary': {'x': 0.3, 'y': 0.6}, 'bluePrimary': {'x': 0.15, 'y': 0.06}, 'whitePrimary': {'x': 0.3127, 'y': 0.329}}, 'backgroundColor': '#FFFFFFFFFFFF', 'borderColor': '#DFDFDFDFDFDF', 'matteColor': '#BDBDBDBDBDBD', 'transparentColor': '#000000000000', 'interlace': 'None', 'intensity': 'Undefined', 'compose': 'Over', 'pageGeometry': {'width': 60, 'height': 60, 'x': 0, 'y': 0}, 'dispose': 'Undefined', 'iterations': 0, 'compression': 'Zip', 'orientation': 'TopLeft', 'properties': {'date:create': '2021-07-04T10:29:21+00:00', 'date:modify': '2021-07-04T10:29:21+00:00', 'signature': '89627a773d2fe8cbe84652cb34dacbbb84ae2a141b1c8c78caec7cfb84db51bd', 'tiff:alpha': 'unspecified', 'tiff:endian': 'lsb', 'tiff:photometric': 'RGB', 'tiff:rows-per-strip': '60'}, 'artifacts': {'filename': '/tmp/pytest-of-gms/pytest-1/tiff_rgb84/in.tiff'}, 'tainted': False, 'filesize': '5190B', 'numberPixels': '3600', 'pixelsPerSecond': '4.03318MB', 'userTime': '0.000u', 'elapsedTime': '0:01.000', 'version': 'ImageMagick 6.9.11-27 Q16 x86_64 2020-08-11 https://imagemagick.org'}}] E assert 16 == 12 E + where 16 = <built-in method get of dict object at 0x7fc28980bd00>('depth') E + where <built-in method get of dict object at 0x7fc28980bd00> = {'artifacts': {'filename': '/tmp/pytest-of-gms/pytest-1/tiff_rgb84/in.tiff'}, 'backgroundColor': '#FFFFFFFFFFFF', 'baseDepth': 12, 'baseType': 'TrueColor', ...}.get src/img2pdf_test.py:2568: AssertionError ``` So, again, the version check probably goes in the wrong direction. --- The `src/img2pdf_test.py::test_tiff_ccitt_nometa2` still fail on Fedora 33, e.g.: ``` popenargs = (['compare', '-metric', 'AE', '/tmp/pytest-of-gms/pytest-2/tiff_ccitt_nometa2_img0/in.tiff', '/tmp/pytest-of-gms/pytest-2/tiff_ccitt_nometa20/gs-1.png', 'null:'],), kwargs = {}, retcode = 1 cmd = ['compare', '-metric', 'AE', '/tmp/pytest-of-gms/pytest-2/tiff_ccitt_nometa2_img0/in.tiff', '/tmp/pytest-of-gms/pytest-2/tiff_ccitt_nometa20/gs-1.png', 'null:'] def check_call(*popenargs, **kwargs): """Run command with arguments. Wait for command to complete. If the exit code was zero then return, otherwise raise CalledProcessError. The CalledProcessError object will have the return code in the returncode attribute. The arguments are the same as for the call function. Example: check_call(["ls", "-l"]) """ retcode = call(*popenargs, **kwargs) if retcode: cmd = kwargs.get("args") if cmd is None: cmd = popenargs[0] > raise CalledProcessError(retcode, cmd) E subprocess.CalledProcessError: Command '['compare', '-metric', 'AE', '/tmp/pytest-of-gms/pytest-2/tiff_ccitt_nometa2_img0/in.tiff', '/tmp/pytest-of-gms/pytest-2/tiff_ccitt_nometa20/gs-1.png', 'null:']' returned non-zero exit status 1. /usr/lib64/python3.9/subprocess.py:373: CalledProcessError ------------------------------------------------------------------------------------------------------------------- Captured stderr call ------------------------------------------------------------------------------------------------------------------- 970 _____________________________________________________________________________________________________________ test_tiff_ccitt_nometa2[pikepdf] ```