[Dev-luatex] Bug#1009196: texlive-binaries: Reproducible content of .fmt files

Roland Clobus rclobus at rclobus.nl
Tue Apr 12 08:44:00 CEST 2022


Hello luigi and others,

On 11/04/2022 20:28, luigi scarso wrote:
...
> I am perplexed, perhaps I misunderstood something.
> The distinction between "the regular environments that users will use"
> and the "build environments"
> seems to be made at runtime for the same binary by setting an env. variable
> -- but in this case a malicious "regular" user could also set
> LUA_HASH_SEED, breaking the
> security property.

That's why the documentation for such potentially security-breaking 
features mentions how they are to be used. One is typically not expected 
to set the seed values, but if you do set them, it's your own 
responsibility.

E.g. Python's man page:
<quote>
   PYTHONHASHSEED
               If this variable is set to "random", a random value is used
               to seed the hashes of str and bytes objects.

               If PYTHONHASHSEED is set to an integer value, it is used as a
               fixed seed for generating the hash() of the types covered by
               the hash randomization. Its purpose is to allow repeatable
               hashing, such as for selftests for the interpreter itself, or
               to allow a cluster of python processes to share hash values.

               The integer must be a decimal number in the range
               [0,4294967295]. Specifying the value 0 will disable hash
               randomization.

</quote>
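
A minimal sketch of the "repeatable hashing" use case described in that man page, assuming python3 is on the PATH. The seed value 42 and the hashed string are arbitrary choices for illustration:

```shell
#!/bin/sh
# Hash the same string in two separate interpreter processes, both started
# with the same fixed seed. The seed (42) and the string are arbitrary.
a=$(PYTHONHASHSEED=42 python3 -c 'print(hash("hyphenation"))')
b=$(PYTHONHASHSEED=42 python3 -c 'print(hash("hyphenation"))')

if [ "$a" = "$b" ]; then
  echo "fixed seed: both runs produced the same hash"
else
  echo "fixed seed: the hashes differ"
fi
```

Without the variable (or with PYTHONHASHSEED=random) the two values would normally differ from run to run, which is the protection that a fixed seed gives up.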

Perl has a more severe disclaimer: 
https://perldoc.perl.org/perlrun#PERL_HASH_SEED
<quote>
PLEASE NOTE: The hash seed is sensitive information. Hashes are 
randomized to protect against local and remote attacks against Perl 
code. By manually setting a seed, this protection may be partially or 
completely lost.
</quote>

> In this *specific* case, one can check by sorting -- as done by the patch:
> 
> #!/bin/sh
> export FORCE_SOURCE_DATE=1
> export SOURCE_DATE_EPOCH=$(date +%s)
> for i in `seq 1 10`; do
>   luahbtex -ini -jobname=luahbtex -progname=luahbtex luatex.ini 1>/dev/null;
>   gunzip -d -c luahbtex.fmt|tail -1 |xxd -i |perl -pe 's{,\s*}{\n}g;s{^\s*}{}g;'|sort|md5sum ;
>   md5sum luahbtex.log;
> done

This checks the whole file, but the order of the bytes differs only at 
one specific location in the file: the list of hyphenation exceptions. 
Only that specific part needs special handling.
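
To illustrate why a sorted-bytes checksum cannot tell where the order differs, here is a small sketch that uses two made-up files as stand-ins for two .fmt dumps. The file contents and the helper names (sorted_sum, plain_sum) are invented for this example, and od is used instead of xxd/perl so that only POSIX tools plus md5sum are needed:

```shell
#!/bin/sh
# Two files holding the same bytes in a different order, standing in for
# two dumps whose hyphenation exceptions were written in a different order.
tmp=$(mktemp -d)
printf 'ta-ble\nas-so-ciate\n' > "$tmp/a"
printf 'as-so-ciate\nta-ble\n' > "$tmp/b"

# Checksum of the file as it is: sensitive to byte order.
plain_sum() { md5sum < "$1" | cut -d' ' -f1; }

# Checksum of the bytes after sorting them (one hex byte per line):
# all ordering information is thrown away.
sorted_sum() {
  od -An -v -tx1 "$1" | tr -s ' ' '\n' | grep . | sort | md5sum | cut -d' ' -f1
}

plain_a=$(plain_sum "$tmp/a");   plain_b=$(plain_sum "$tmp/b")
sorted_a=$(sorted_sum "$tmp/a"); sorted_b=$(sorted_sum "$tmp/b")

[ "$plain_a" != "$plain_b" ] && echo "plain checksums differ"
[ "$sorted_a" = "$sorted_b" ] && echo "sorted-bytes checksums match"

rm -rf "$tmp"
```

The sorted variant would match for any reordering anywhere in the file, not only for a reordering inside the exceptions list.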

For completeness, this issue is present in at least 3 .fmt files. Each 
is generated by 'fmtutil --sys --all', which in turn does:
luahbtex -ini   -jobname=luahbtex -progname=luahbtex luatex.ini
luatex -ini   -jobname=dviluatex -progname=dviluatex dviluatex.ini
luatex -ini   -jobname=luatex -progname=luatex luatex.ini

In the case of texlive: setting *both* FORCE_SOURCE_DATE and 
SOURCE_DATE_EPOCH will be IMHO sufficiently special to allow disabling 
the random hashing seed.
I'll follow up soon with an updated patch.
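
The condition I have in mind can be sketched in shell as follows. This is only an illustration of the decision, not the patch itself (which would live in the engine's source); the function name wants_fixed_seed is hypothetical, and it assumes FORCE_SOURCE_DATE has to be exactly 1, as in the script quoted above:

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when BOTH variables are set the way a
# reproducible build sets them. Assumes FORCE_SOURCE_DATE must equal 1.
wants_fixed_seed() {
  [ "${FORCE_SOURCE_DATE:-}" = "1" ] && [ -n "${SOURCE_DATE_EPOCH:-}" ]
}

if wants_fixed_seed; then
  echo "reproducible build: use a fixed hash seed"
else
  echo "regular use: keep the random hash seed"
fi
```

A regular user who sets only one of the two variables would still get the random seed.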

With kind regards,
Roland