Output is not deterministic unless using --engine=internal #150
Using img2pdf 0.4.4 from debian bookworm. The issue also occurs on img2pdf 0.4.0 from debian bullseye. The issue does not occur on img2pdf 0.3.3 from debian buster.
The manpage says --nodate "makes the output deterministic between individual runs". However:
produces:
But if I instead use --engine=internal:
I get:
The need to use --engine=internal to produce pdf files deterministically is not documented anywhere. Version 0.3.3 used to produce deterministic pdf files just using --nodate as documented.

Thank you. This is absolutely a bug.
The statement about deterministic output comes from the time when img2pdf only supported the internal engine as well as pdfrw. Today, pdfrw is unmaintained and has been removed from the tests. It might be completely broken. Instead, we now have the new pikepdf engine, which has since become the default. The pikepdf engine is the problem because it produces non-deterministic /ID values.

I will investigate.
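A minimal sketch of how the report's observation could be checked programmatically: generate the same output several times and compare hashes. The helper name is_deterministic and the produce callable are hypothetical; in practice produce would wrap whatever conversion is being tested (for example an img2pdf run with --nodate and a chosen engine).

```python
import hashlib
from typing import Callable


def is_deterministic(produce: Callable[[], bytes], runs: int = 3) -> bool:
    """Return True if produce() yields byte-identical output on every run.

    produce is a hypothetical zero-argument callable that returns the
    generated file as bytes.
    """
    # Collect the SHA-256 digest of each run; identical outputs collapse
    # into a single set element.
    digests = {hashlib.sha256(produce()).hexdigest() for _ in range(runs)}
    return len(digests) == 1
```

If the outputs differ only in the /ID entry of the trailer, this check still reports them as non-deterministic, which matches the behaviour described in the report.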
I think one can pass static_id=True to pikepdf.Pdf.save().

However, --nodate only implies dates, so maybe a separate option should be added for this.

Yes, this should be independent of --nodate. It would be wrong to overload --nodate with additional functionality other than not embedding the current date and time.

But it would also be wrong to pass static_id=True to pikepdf.Pdf.save() because then the /ID metadata would not be generated anymore at all, which would make the PDF files generated by img2pdf not uniquely identifiable anymore according to PDF 1.7 reference section 10.3 "File Identifiers".

I think this is the correct solution:
https://github.com/pikepdf/pikepdf/pull/400
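To illustrate the trade-off discussed above, here is a rough sketch of the difference between an identifier that mixes in the current time and one derived only from the file contents. This is not the algorithm used by pikepdf or qpdf, and it is not a description of what the linked pull request implements; the function name file_identifier and its parameters are hypothetical.

```python
import hashlib
from typing import Optional


def file_identifier(content: bytes, timestamp: Optional[str] = None) -> str:
    """Return a hex MD5 digest serving as an illustrative file identifier.

    If timestamp is given, it is mixed into the hash, so the identifier
    changes between runs. If it is omitted, only the content is hashed,
    so identical content always yields the same identifier.
    """
    h = hashlib.md5()
    if timestamp is not None:
        h.update(timestamp.encode("ascii"))
    h.update(content)
    return h.hexdigest()
```

With a content-only hash, two runs over the same input give the same identifier, while different inputs still get different identifiers, so files remain distinguishable without the output depending on the time of the run.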