[NTG-pdftex] Creating identical PDF files with different pdfTeX runs

Reinhard Kotucha reinhard.kotucha at web.de
Thu Mar 16 22:09:51 CET 2006

>>>>> "Hartmut" == Hartmut Henkel <hartmut_henkel at gmx.de> writes:

  > On Thu, 16 Mar 2006, Frank Küster wrote:
  >> Reinhard Kotucha <reinhard.kotucha at web.de> wrote:
  >> >
  >> > <pagenumber> <md5sum of the bitmap file>
  >> >
  >> > The bitmap files can be removed by the script when it is
  >> > finished and standard UNIX tools can be used to examine the
  >> > output files.
  >> >
  >> > Particularly, diff(1) can be used efficiently.  It will tell
  >> > you the numbers of the pages which are different.
  >> That's a very good suggestion, thanks!

  > this looks pretty fragile to me. Characters will end up in bitmaps
  > with interpolated gray pixels, and so it depends not only on
  > pdftex but also on any subtlety of the rendering engine.

No, not every Ghostscript output device does antialiasing.  Usually
antialiasing is done for screen rendering only, and even there you
can use -sDEVICE=x11 instead of x11alpha.  For this purpose you can
try a monochrome device such as faxg3 or pcxmono.

  > And if the md5sum doesn't match, you know nothing without the
  > original file. Maybe some crosscorrelation between images with
  > some given tolerance limit would be safer.

...and you don't know anything either.

The question was whether files are identical, not similar.  This can
be achieved as I described, given that the bitmaps are produced with a
reasonably high resolution (and it does not matter whether antialiasing
is turned on).  Of course, you always have to use the same version of
the program which produces the bitmaps.
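
The procedure described above could be sketched in Python roughly as
follows.  This is only an illustration under assumptions: Ghostscript
is taken to be installed as `gs`, and the function names, the
page-%04d.pcx file name pattern and the 600 dpi default are made up
for the example and do not come from this thread.

```python
import hashlib
import subprocess
from pathlib import Path


def gs_command(pdf, out_dir, device="pcxmono", dpi=600):
    """Build a Ghostscript command that writes one bitmap per page."""
    # Ghostscript expands %04d in the output file name to the page number.
    pattern = str(Path(out_dir) / "page-%04d.pcx")
    return [
        "gs", "-q", "-dNOPAUSE", "-dBATCH", "-dSAFER",
        f"-sDEVICE={device}", f"-r{dpi}",
        f"-sOutputFile={pattern}", str(pdf),
    ]


def page_checksums(out_dir):
    """Return '<pagenumber> <md5sum>' lines for the bitmaps in out_dir."""
    lines = []
    # Zero-padded names sort in page order.
    for number, path in enumerate(sorted(Path(out_dir).glob("page-*.pcx")), 1):
        digest = hashlib.md5(path.read_bytes()).hexdigest()
        lines.append(f"{number} {digest}")
    return lines


def write_manifest(pdf, out_dir, manifest):
    """Render pdf, write the checksum list, then remove the bitmaps."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(gs_command(pdf, out_dir), check=True)
    Path(manifest).write_text("\n".join(page_checksums(out_dir)) + "\n")
    for path in Path(out_dir).glob("page-*.pcx"):
        path.unlink()
```

Running write_manifest() once per pdfTeX run gives two small text
files which can then be compared with diff(1).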

If you want to see the differences you need a program which displays
all pixels which are different in two bitmap files.  But I suppose
that you want to check whether two bitmaps are different before you
use such a tool.
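
For monochrome bitmaps, the core of such a pixel comparison is small.
The sketch below assumes both pages are already available as raw
1-bit-per-pixel buffers of the same size (rows padded to whole bytes,
most significant bit first, as in raw PBM data without its header);
the function name and this input format are assumptions made for the
example, and a real tool would draw the result instead of listing
coordinates.

```python
def differing_pixels(a, b, width):
    """Return (x, y) for every pixel that differs between two bitmaps."""
    if len(a) != len(b):
        raise ValueError("bitmaps have different sizes")
    row_bytes = (width + 7) // 8  # bytes per row, including padding bits
    diffs = []
    for offset, (byte_a, byte_b) in enumerate(zip(a, b)):
        changed = byte_a ^ byte_b  # set bits mark differing pixels
        if not changed:
            continue
        y, col_byte = divmod(offset, row_bytes)
        for bit in range(8):
            x = col_byte * 8 + bit
            # Ignore padding bits beyond the image width.
            if x < width and changed & (0x80 >> bit):
                diffs.append((x, y))
    return diffs
```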

  > pnmpsnr: Y  color component: 59.18 dB

Well, it just tells you that the files are different; the actual value
does not provide any useful information.

I think two tools are needed, one which tells you which pages are
different and one which makes the changes visible.
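
The first of these tools can be very small.  As a sketch, assuming the
checksums of each run are stored as text with one
"<pagenumber> <md5sum>" line per page, the following hypothetical
function returns the numbers of the pages which differ, including
pages that exist in only one of the two runs.

```python
def differing_pages(manifest_a, manifest_b):
    """Return sorted page numbers whose checksums differ or are missing."""
    def parse(text):
        pages = {}
        for line in text.splitlines():
            if line.strip():
                number, digest = line.split()
                pages[int(number)] = digest
        return pages

    a, b = parse(manifest_a), parse(manifest_b)
    # A page present in only one manifest compares against None.
    return sorted(n for n in a.keys() | b.keys() if a.get(n) != b.get(n))
```

The returned page numbers can then be passed to whatever tool makes
the changes visible.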


