Hello.

Working on the “reproducible builds” effort [1], we have noted that a lot of software packages use pdftex/xetex/luatex to build documents that are included in the binary package. These builds embed timestamps in the documents, which prevents reproducible builds. The SOURCE_DATE_EPOCH specification [2] defines an environment variable that is set while a software package is being built and that can be used to replace time() calls (or equivalent) for build timestamps.

I would like to promote SOURCE_DATE_EPOCH support for luatex, which would need two patches in the engine code:

1) one to set the timestamps in the PDF file built by luatex from SOURCE_DATE_EPOCH, in the UTC timezone, if this environment variable is set;
2) one to set \today (and related commands, through the \year, \month and \day primitives) from SOURCE_DATE_EPOCH if it is set.

A lot of documentation files are built with the use of \today. Using the build date for it is not as relevant as the "source code date" that is given by SOURCE_DATE_EPOCH during the build. Moreover, the build date prevents reproducible builds. Changing this date before compilation requires effort in every software package, and it cannot easily be changed in the PDF file after compilation. So adding SOURCE_DATE_EPOCH support to luatex would be very efficient.

When SOURCE_DATE_EPOCH is not set, luatex should obviously work in exactly the same way as without the patches.

Please find attached a starting point for these two features. The code I propose implements them, but it can be enhanced to make more checks on the SOURCE_DATE_EPOCH value (if you think this is needed), as done in the present pdftex engine code (which already implements the first part) [3], or as described in [4].
See also a little script, test-luatex.sh, that illustrates the required result. When SOURCE_DATE_EPOCH is not set, one should get

    TODAY=April 30, 2016
    LOG: This is LuaTeX, Version 0.95.0 (TeX Live 2016) (format=lualatex 2016.4.30) 30 APR 2016 10:52
    /ModDate (D:20160430105258-11'00')
    /CreationDate (D:20160430105258-11'00')
    /Creator (TeX)

and when SOURCE_DATE_EPOCH is set to 31536000 (epoch for 1971-01-01), one should get

    TODAY=January 1, 1971
    LOG: This is LuaTeX, Version 0.95.0 (TeX Live 2016) (format=lualatex 2016.4.30) 1 JAN 1971 00:00
    /ModDate (D:19710101000000Z)
    /CreationDate (D:19710101000000Z)
    /Creator (TeX)

Thanks in advance for considering including these features in luatex.

Regards,
Alexis Bienvenüe.

[1]: https://wiki.debian.org/ReproducibleBuilds
[2]: https://reproducible-builds.org/specs/source-date-epoch/
[3]: https://www.tug.org/svn/texlive/trunk/Build/source/texk/web2c/lib/texmfmp.c?...
[4]: https://wiki.debian.org/ReproducibleBuilds/TimestampsProposal#Examples
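The behaviour requested in this message can be sketched in a few lines of Python. This is only an illustration of the logic, not the attached engine patch: the function names `build_time`, `pdf_date` and `today` are hypothetical, and rejecting malformed values with a `ValueError` is an assumption rather than something the proposal specifies.

```python
import os
from datetime import datetime, timedelta, timezone

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def build_time(environ=os.environ):
    """Return (moment, from_epoch) for the current build."""
    raw = environ.get("SOURCE_DATE_EPOCH")
    if raw is None:
        # Variable unset: keep the usual behaviour, local clock and timezone.
        return datetime.now().astimezone(), False
    # Assumption: a malformed value raises ValueError instead of being ignored.
    return datetime.fromtimestamp(int(raw), tz=timezone.utc), True


def pdf_date(moment):
    """Format a timezone-aware datetime as a PDF date string."""
    stamp = moment.strftime("D:%Y%m%d%H%M%S")
    offset = moment.utcoffset()
    if offset is None or offset == timedelta(0):
        return stamp + "Z"
    minutes = int(offset.total_seconds()) // 60
    sign = "+" if minutes >= 0 else "-"
    hours, mins = divmod(abs(minutes), 60)
    return f"{stamp}{sign}{hours:02d}'{mins:02d}'"


def today(moment):
    """Render the date the way the default \\today output above looks."""
    return f"{MONTHS[moment.month - 1]} {moment.day}, {moment.year}"


moment, from_epoch = build_time({"SOURCE_DATE_EPOCH": "31536000"})
print(today(moment))     # January 1, 1971
print(pdf_date(moment))  # D:19710101000000Z
```

Passing the environment as a parameter keeps the sketch testable without touching the real process environment.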
On Mon, May 2, 2016 at 2:27 PM, Alexis Bienvenüe wrote:
Hello.
Working on the “reproducible builds” effort [1], we have noted that a lot of software packages use pdftex/xetex/luatex to build documents that are included in the binary package. These builds embed timestamps in the documents, which prevents reproducible builds. The SOURCE_DATE_EPOCH specification defines an environment variable that is set while a software package is being built and that can be used to replace time() calls (or equivalent) for build timestamps.
I would like to promote SOURCE_DATE_EPOCH support for luatex, which would need two patches in the engine code:
Thank you very much for the patches. The source code now is almost frozen, so it's unlikely (but not impossible) that we can apply the patches proposed. But for sure we will consider them for the next release of luatex.

--
luigi
On 04/05/2016 at 12:17, luigi scarso wrote:
Thank you very much for the patches. The source code now is almost frozen, so it's unlikely (but not impossible) that we can apply the patches proposed. But for sure we will consider them for the next release of luatex.
Thank you very much.

I think you have already read the discussion at tex-k@tug.org [1], which mentioned a slightly different behavior and introduced the SOURCE_DATE_EPOCH_TEX_PRIMITIVES environment variable [2].

Regards,
Alexis Bienvenüe.

[1] https://www.tug.org/pipermail/tex-k/2016-May/002691.html
[2] https://www.tug.org/pipermail/tex-k/2016-May/002696.html
On 5/4/2016 12:17 PM, luigi scarso wrote:
Thank you very much for the patches. The source code now is almost frozen, so it's unlikely (but not impossible) that we can apply the patches proposed. But for sure we will consider them for the next release of luatex.
a few remarks:

(1) The name SOURCE_DATE_EPOCH is not a nice one. Also, if some value has to be taken from the environment, it should not sound like some hack but be a proper public environment variable, working on all platforms (linux, windows, osx, ...)

(2) Any environment variable should be part of the formal web2c specification, i.e. it is bound to the progname, so it should then be something like initial_time_stamp.pdftex etc.

(3) We already have ways to set all these pdf state variables in the resulting file and it's already complicated enough.

(4) I understand it's needed for some testing, so one can even wonder if it's something user level (one can even consider this to be a security issue, faking dates and so)

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------
On 5/4/2016 12:34 PM, Hans Hagen wrote:
btw, i can imagine a \systemtime primitive counter that one can set and consult, that will initialize from systemtime.<progname>, and that a macro package can consult and use to set whatever it wants to set in front- and backend, which is more generic and flexible because in pdf files there are all kinds of data with time related properties (xmp date etc)

Hans
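The initialization Hans describes could look roughly like the following sketch. Only the key name `systemtime.<progname>` comes from the message; falling back to a plain `systemtime` key and then to the current clock, as well as the function name `initial_system_time` and the use of a plain dictionary for the settings, are assumptions made for illustration.

```python
import time


def initial_system_time(settings, progname, clock=time.time):
    """Pick the initial value of a hypothetical \\systemtime counter."""
    # The program-specific key wins over the generic one (assumed order).
    for key in (f"systemtime.{progname}", "systemtime"):
        value = settings.get(key)
        if value is not None:
            return int(value)
    # Nothing configured: fall back to the real clock, in whole seconds.
    return int(clock())


settings = {"systemtime.luatex": "31536000"}
print(initial_system_time(settings, "luatex"))  # 31536000
```

A macro package would then read this single counter and decide for itself which front-end and back-end dates to derive from it.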
On 04/05/2016 at 12:34, Hans Hagen wrote:
(1) The name SOURCE_DATE_EPOCH is not a nice one. Also, if some value has to be taken from the environment, it should not sound like some hack but be a proper public environment variable, working on all platforms (linux, windows, osx, ...)
This name was chosen in the specification:
https://reproducible-builds.org/specs/source-date-epoch/
It is already used in various programs.
(3) We already have ways to set all these pdf state variables in the resulting file and it's already complicated enough.
Other ways to set both the PDF metadata dates and the \today date require modifying the documents or the build process of each individual software package that uses luatex to build documentation. Using an environment variable allows this work to be shared.
(4) I understand it's needed for some testing, so one can even wonder if it's something user level (one can even consider this to be a security issue, faking dates and so)
Anyone who has the document source can modify it in any way and produce a PDF document with any date and any content. This is not a security issue.
Regards, Alexis Bienvenüe.
Hi Luigi,
The source code now is almost frozen, so it's unlikely (but not impossible) that we can apply the patches
Similar patches were included in pdftex and xetex (via the shared code), as well as xdvipdfmx and dvips, in the last 3 days or so in the TeX Live repository.

There is no urgent need, though. If it is not in this release, I will include the respective patches in the Debian builds only.

Norbert

------------------------------------------------------------------------
PREINING, Norbert http://www.preining.info
JAIST, Japan TeX Live & Debian Developer
GPG: 0x860CDC13 fp: F7D8 A928 26E3 16A1 9FA0 ACF0 6CAC A448 860C DC13
------------------------------------------------------------------------
On Thu, May 5, 2016 at 5:01 AM, Norbert Preining wrote:
Hi Luigi,
The source code now is almost frozen, so it's unlikely (but not impossible) that we can apply the patches
Similar patches were included in pdftex and xetex (via the shared code), as well as xdvipdfmx and dvips, in the last 3 days or so in the TeX Live repository.
There is no urgent need, though. If it is not in this release, I will include the respective patches in the Debian builds, only.
Ok. We (Hans and I) are still discussing these patches. There are pros and cons; luatex is different from pdftex & xetex, and these are not patches on the TeX core. Probably we will exchange some private emails on this just after 0.95.0, when we return to the experimental branch. As a personal note, it's a pity that we will have luatex/Debian != luatex/TeXLive, given that we try to be OS aware as much as possible.

--
luigi
As a personal note, it's a pity that we will have luatex/Debian != luatex/TeXLive,
It is just support for the env var that is different. Not much of a pain. And to be honest, I am 99% sure that you will again come up with a different solution/approach, and I will still need to put in the Debian patches ;-) This env var will not be used by anyone anyway, except by those who want to run reproducibility tests. I see no reasonable chance that a user will tinker around with the env var.
given that we try to be OS aware as much as possible.
??? What do you mean by "OS aware" here? "unaware"? I don't see the relation to Debian and/or TeX Live.

Norbert
??? What do you mean by "OS aware" here? "unaware"? I don't see the relation to Debian and/or TeX Live.
OS independent as much as possible
2016-05-04 Karl Berry
On Thu, 05 May 2016, luigi scarso wrote:
What happens if Windows decides to reserve SOURCE_DATE_EPOCH for its own use?
Well, sure, but that can happen with each and everything.
Is SOURCE_DATE_EPOCH standard?
Well, at least several distributions and projects are working on similar stuff, see https://reproducible-builds.org/ and https://reproducible-builds.org/specs/source-date-epoch/

Of course, windows is not one of them.

All the best

Norbert
On Thu, May 5, 2016 at 3:06 PM, Norbert Preining wrote:
On Thu, 05 May 2016, luigi scarso wrote:
What happens if Windows decides to reserve SOURCE_DATE_EPOCH for its own use?
Well, sure, but that can happen with each and everything.
Sure, and the side effects are something that we want to avoid.
Is SOURCE_DATE_EPOCH standard?
Well, at least several distributions and projects are working on similar stuff, see https://reproducible-builds.org/ and https://reproducible-builds.org/specs/source-date-epoch/
Of course, windows is not one of them.
Weren't texmf.cnf & kpathsea also created to solve this kind of problem?

On the other side, if you really want a reproducible doc from source, you have to use Knuth's tex & dvi & mf & your own format and mf fonts, and then convert to pdf in a predictable way. If you want a pdf directly from sources, it is better to provide *tex sources & pdf and say that either the *tex sources are the reference or the pdf is the reference (which is the best solution, if the pdf is pdf/a-2), but not both.

If you want to reproduce a pdf from the source with luatex, you must use a very carefully crafted format, and the builders and "reproducers" must agree at least on the same engine, format & fonts, as well as on the meaning of "=" for pdfs, and then cross their fingers. Nobody can exclude that a library (say poppler) will use a hash table for (some of) the pdf objs of the backend; even if you "control" the libs of Luatex, nobody can prevent someone from using Luatex with an API-compatible shared lib; and even if you make a static Luatex, nobody can prevent someone from using a format that uses a Lua (hash) table for the pdf objs... All these can lead to two pdfs that are "visually" the same but different as binaries. And comparing the visual appearance of two pdfs is not simple unless you fix a dpi, which means A=B at, say, 300 dpi, but perhaps A!=B at 1200 dpi.

Of course you can solve these problems if the builder freezes the set of engine & format & fonts etc. used and the reproducers use exactly the same set, even 20 years after the first official build.

This is a really interesting subject, but time is only one component and, as I have said, we are discussing it.

--
luigi
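The strictest meaning of "=" mentioned above, two pdfs being identical as binaries, can be checked by comparing digests of the two outputs. The sketch below is only an illustration: the helper names `digest` and `same_binary` and the byte strings standing in for build outputs are hypothetical, and a visual comparison at a fixed dpi would need entirely different tooling.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of one build output."""
    return hashlib.sha256(data).hexdigest()


def same_binary(first: bytes, second: bytes) -> bool:
    """True only when the two outputs are byte-for-byte identical."""
    return digest(first) == digest(second)


# Hypothetical fragments of two builds that differ only in a timestamp.
build_a = b"%PDF-1.5\n/CreationDate (D:20160430105258-11'00')\n"
build_b = b"%PDF-1.5\n/CreationDate (D:20160501091500-11'00')\n"

print(same_binary(build_a, build_b))  # False
```

Any single differing byte, whether a timestamp or a reordered object, is enough to make this check fail, which is why time is only one of the components discussed here.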
On 5/5/2016 2:51 PM, Norbert Preining wrote:
As a personal note, it's a pity that we will have luatex/Debian != luatex/TeXLive,
It is just support for the env var that is different. Not much of a pain.
And to be honest, I am 99% sure that you will again come up with a different solution/approach, and I will still need to put in the Debian patches ;-)
sure, because if you want a reproducible document there is more involved than a time

in luatex (and pdftex) one can add extra objects and data structures, and these can also have times

one can use random elements, and then runs can differ too

and when one uses lua there can be sequential differences (when flushing hash based data), because each run has a different hashing for security reasons

so, the most robust way to deal with a reproducible document is that the macro package provides an option (because it knows what gets done)

(in luatex one can omit a lot of these time dependent entries anyway, and if needed we provide more tuning options, as then the macro package can provide the right solutions / option)

Hans
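The ordering problem with hash based data can be illustrated independently of Lua. The sketch below assumes a hypothetical helper `serialize_dict` that writes a PDF-style dictionary; it shows the general technique of flushing keys in sorted order so that the output no longer depends on how the table happens to be traversed. It is not how luatex or any macro package actually writes its objects.

```python
def serialize_dict(entries):
    """Write a PDF-style dictionary with its keys in sorted order."""
    # Sorting removes any dependence on insertion or hash order.
    parts = [f"/{key} {entries[key]}" for key in sorted(entries)]
    return "<< " + " ".join(parts) + " >>"


first_run = {"Type": "/Catalog", "Pages": "2 0 R"}
second_run = {"Pages": "2 0 R", "Type": "/Catalog"}

print(serialize_dict(first_run))  # << /Pages 2 0 R /Type /Catalog >>
print(serialize_dict(first_run) == serialize_dict(second_run))  # True
```

This is the kind of decision a macro package is in a good position to make, since it knows which tables it flushes to the output.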
(in luatex one can omit a lot of these time dependent entries anyway, and if needed we provide more tuning options, as then the macro package can provide the right solutions / option)
Indeed, and I use them already for my own testing environment for TeX Live (in the works via l3build since BachoTeX). Thanks, Hans, for telling me the details.

Yes, I am aware that with an embedded interpreter, guaranteed reproducibility is not necessarily possible. This is also not what I want/need. Of course there are also plain TeX documents that are not reproducibly buildable, because they do some strange things (write18 calls to some other program or so).

But we are talking here about the majority of documents. The remaining ones, which are special, always need to be treated specially.

But I am sure you can come up with a good, even more general solution!

Thanks

Norbert (slowly falling apart after 36++h travel from BachoTeX ;-)
participants (4)
- Alexis Bienvenüe
- Hans Hagen
- luigi scarso
- Norbert Preining