Output is not deterministic unless using --engine=internal
#150
Using img2pdf 0.4.4 from Debian bookworm. The issue also occurs with img2pdf 0.4.0 from Debian bullseye. It does not occur with img2pdf 0.3.3 from Debian buster.
The manpage says --nodate "makes the output deterministic between individual runs". However:
produces:
But if I instead use --engine=internal, I get:
The need to use --engine=internal to produce pdf files deterministically is not documented anywhere. Version 0.3.3 used to produce deterministic pdf files just using --nodate as documented.

Thank you. This is absolutely a bug.
The statement about deterministic output dates from the time when img2pdf supported only the internal engine and pdfrw. Today, pdfrw is unmaintained and has been removed from the tests. It might be completely broken. Instead, we now have the new pikepdf engine, which has since become the default. The pikepdf engine is the problem because it produces non-deterministic /ID values.

I will investigate.
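
To check that the /ID entry really is the only thing that differs between two runs, something like the following sketch might help. It uses only the Python standard library and looks at the raw bytes. It assumes the /ID array is written as two hex strings in plain text, which may not hold for every PDF writer. The function names extract_id and differs_only_in_id are made up for this illustration and are not part of img2pdf or pikepdf.

```python
import re

# Matches an /ID array made of two hex strings, e.g. /ID [<ab12> <cd34>].
# Assumption: the IDs are hex strings and appear uncompressed in the file.
_ID_RE = re.compile(rb"/ID\s*\[\s*<([0-9A-Fa-f]*)>\s*<([0-9A-Fa-f]*)>\s*\]")


def extract_id(pdf_bytes):
    """Return the last /ID pair found in the raw bytes, or None if absent."""
    matches = _ID_RE.findall(pdf_bytes)
    if not matches:
        return None
    first, second = matches[-1]
    return first.lower(), second.lower()


def differs_only_in_id(a, b):
    """True if a and b differ, but are identical once /ID arrays are blanked."""
    blank = b"/ID[]"
    return a != b and _ID_RE.sub(blank, a) == _ID_RE.sub(blank, b)
```

Reading the two output files with open(path, "rb").read() and passing the bytes to differs_only_in_id would then tell whether anything besides /ID changed between the runs.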
I think one can pass static_id=True to pikepdf.Pdf.save(). However, --nodate only implies dates, so maybe a separate option should be added for this.

Yes, this should be independent of --nodate. It would be wrong to overload --nodate with additional functionality other than not embedding the current date and time.

But it would also be wrong to pass static_id=True to pikepdf.Pdf.save(), because then the /ID metadata would not be generated at all anymore, which would make the PDF files generated by img2pdf no longer uniquely identifiable according to PDF 1.7 reference section 10.3 "File Identifiers".

I think this is the correct solution:
https://github.com/pikepdf/pikepdf/pull/400
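
The linked pull request is not reproduced here. As a rough illustration of the property argued for in this thread (an identifier that comes out the same on every run for the same input, yet still differs between documents), a minimal sketch could derive the ID from the document content alone instead of from the time or file name. The helper names content_id and trailer_id_entry are hypothetical, the choice of a truncated SHA-256 is an assumption made for this example, and none of this is pikepdf's or img2pdf's actual code.

```python
import hashlib


def content_id(body: bytes) -> str:
    """Hypothetical helper: 16-byte identifier derived only from the content.

    Assumption: a truncated SHA-256 digest stands in for whatever hash a
    real PDF writer would use.
    """
    return hashlib.sha256(body).digest()[:16].hex()


def trailer_id_entry(body: bytes) -> str:
    """Format an /ID array whose two entries are the content-derived ID."""
    ident = content_id(body)
    return "/ID [<%s> <%s>]" % (ident, ident)
```

With this approach two runs over the same images would yield the same /ID, while a different input would still get a different one, which is what a fixed static ID cannot offer.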