On 6/26/2015 3:40 AM, Heiko Oberdiek wrote:
> pdfTeX calculates the non-deterministic ID values in "utils.c",
> function "printID". It uses the MD5 sum of the following data for
> the ID values:
> * the current time by calling function "time" (resolution is second),
> * the current working directory by calling "getcwd" and
> * the output file name.
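
For illustration, here is a rough Python sketch of the scheme Heiko describes. It is not a transcription of "printID": the function name sketch_id, the plain concatenation of the three inputs, and the example path and file name are all assumptions made for the example, and the real byte layout in utils.c may well differ. It only shows why the one-second time stamp makes the ID change from run to run.

```python
import hashlib
import os
import time


def sketch_id(output_name, now=None, cwd=None):
    # Hypothetical model: MD5 over the time (whole seconds), the working
    # directory and the output file name, simply concatenated as text.
    # Assumption: pdfTeX's actual input string is formatted differently.
    if now is None:
        now = time.time()
    if cwd is None:
        cwd = os.getcwd()
    data = f"{int(now)}{cwd}/{output_name}".encode("utf-8")
    return hashlib.md5(data).hexdigest().upper()


# Same second, same directory, same file name: same ID.
a = sketch_id("paper.pdf", now=1435286400, cwd="/home/user/doc")
b = sketch_id("paper.pdf", now=1435286400.9, cwd="/home/user/doc")
# One second later: a different ID, so two otherwise identical runs differ.
c = sketch_id("paper.pdf", now=1435286401, cwd="/home/user/doc")
print(a == b, a == c)  # prints: True False
```

Fixing all three inputs gives a fixed ID, so the time stamp is the main obstacle to byte-identical output across runs in the same directory.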
The trailer dictionary content is defined on page 43 (Table 15) of

  http://wwwimages.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/PDF32000_200...

In addition, Section 14.4 (page 551) suggests an MD5 input string,
similar to the list Heiko gave above, to determine the ID value:

  • The current time
  • A string representation of the file’s location, usually a pathname
  • The size of the file in bytes
  • The values of all entries in the file’s document information
    dictionary

But what exactly should or should not be fed into the ID-generating
hash function surely depends on workflow requirements. Some may want
the time in there, others not. Some may want the entire source file
in there, others not.

How about adding a new primitive that takes as input the string that
pdfTeX will feed into MD5 in order to generate the file's identifier?
Then the user could override the above default choice, e.g. along the
lines of

  \usepackage{currfile}
  \pdftrailerid{\today\currfilepath\input\currfilename}

if I wanted the ID to be calculated based on the date, pathname and
content of the source file, for example.

I could then make the ID depend on whatever strings TeX has access to.
In particular, I could also use

  \pdftrailerid{}

to make it a constant, or

  \pdftrailerid{\jobname}

to make it only depend on the filename, etc.

Markus

--
Markus Kuhn, Computer Laboratory, University of Cambridge
http://www.cl.cam.ac.uk/~mgk25/ || CB3 0FD, Great Britain
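
P.S. To make the intended behaviour of the proposed primitive concrete, here is a minimal Python model of it. Everything in it is hypothetical: trailer_id is just an illustrative name, "paper" stands in for a job name, and whether pdfTeX would hash the expanded string exactly like this is an assumption. The point is only that the ID becomes a pure function of the string the user supplies.

```python
import hashlib


def trailer_id(user_string):
    # Hypothetical model of the proposed \pdftrailerid{...}: the ID is the
    # MD5 sum of whatever string the user supplies, and nothing else.
    return hashlib.md5(user_string.encode("utf-8")).hexdigest().upper()


# Like \pdftrailerid{}: an empty input gives the same ID for every document.
constant_id = trailer_id("")

# Like \pdftrailerid{\jobname}: the ID depends only on the job name
# ("paper" is an assumed example value).
by_name = trailer_id("paper")

print(constant_id)  # D41D8CD98F00B204E9800998ECF8427E
print(by_name == trailer_id("paper"))  # prints: True
```

With such a primitive, time and working directory only enter the ID if the user chooses to put them into the string.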