[NTG-pdftex] [ pdftex-Bugs-824 ] Bus error caused by loading an image into a format file

noreply at sarovar.org noreply at sarovar.org
Mon Jul 2 13:21:20 CEST 2007


Bugs item #824, was opened at 2007-06-26 00:22
You can respond by visiting: 
http://sarovar.org/tracker/?func=detail&atid=493&aid=824&group_id=106

Category: Image inclusion
Group: v1.40.3
Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Robin Houston (robinhouston)
Assigned to: Martin Schröder (oneiros)
Summary: Bus error caused by loading an image into a format file

Initial Comment:
I have written a tool that speeds up repeated
compilation of a source file, by generating a format
from the preamble of the document, and then using it to
compile the body.

One user reported a crash when using this tool with a
custom letterhead class; the class in question may be
found at
http://www.soe.ucsc.edu/~elm/LaTeX/ucletter.cls (though
note that a simpler demonstration is attached to this
report).

The bug is triggered by the fact that this class loads
an image during the processing of the preamble (i.e.
during the processing of the .ini file) and saves it in
a box. When this box is used, during the compilation of
the body, pdftex crashes.

This does work with ordinary (non-PDF) TeX, so the
problem is specific to PDFTeX.

The attachment contains a simple demonstration of the
problem. Unpack it and run 'make'.


----------------------------------------------------------------------

>Comment By: Taco Hoekwater (taco)
Date: 2007-07-02 13:21

Message:
Logged In: YES 
user_id=1608

> Has this been changed recently? 

In my (development) version of the source there is a state
'scheduled', which I assume comes from a fix to something
else. In any case, we just have to ask Martin to be extra
careful about that line while creating 1.40.4 :-)

It is probably wise to dump the three heads you mentioned,
just in case: I really need the ximage one in my tree, and
dumping the others seems a logical extension. 

Well done to you on finding a way around the regression. 


----------------------------------------------------------------------

Comment By: Robin Houston (robinhouston)
Date: 2007-07-02 13:08

Message:
Logged In: YES 
user_id=4579

That's odd. The source I'm working from has

@d is_obj_written(#) == (obj_offset(#) <> 0)

Has this been changed recently? (Am I right in thinking that
the development version of the source tree is not publicly
available?)

With regard to which linked list headers to dump, we seem to
be reaching the point of making subtle improvements. The
original problem (whatsit nodes in dumped boxes referencing
non-existent objects) is solved without dumping any of the
headers.

One might conceivably want to load images/forms/raw objects
in the .ini, and then reference them from the .tex: this is
addressed by dumping head_tab[obj_type_ximage],
head_tab[obj_type_xform] and head_tab[obj_type_obj].
(Neither forms nor raw objects have any additional
associated metadata, as far as I can tell.)
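
A minimal sketch of such a test, assuming a plain TeX format and
an image file pic.pdf in the current directory. The file names
and the macros \savedimage, \savedform and \savedobj are
hypothetical and are not taken from the attached tests. The
first part is meant to be run under pdftex -ini, the second with
the format it dumps; whether the second run succeeds depends on
the patch under discussion.

```tex
% objects.ini (hypothetical name): run under pdftex -ini
\input plain
\pdfoutput=1
% each \pdflast... value is an absolute PDF object number
\pdfximage{pic.pdf}
\edef\savedimage{\the\pdflastximage}
\setbox0=\hbox{form}
\pdfxform0
\edef\savedform{\the\pdflastxform}
\pdfobj{(raw object)}
\edef\savedobj{\the\pdflastobj}
\dump

% objects.tex (hypothetical name): run with the format dumped above
\pdfrefximage\savedimage\relax
\pdfrefxform\savedform\relax
\pdfrefobj\savedobj\relax
\end
```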

I can't see any benefit from dumping the other entries,
whether or not doing so would cause potential crashes.

At any rate, what we have now seems a definite improvement
over the status quo, and avoids the undesirable
feature-regression of forbidding \dump after \shipout.

----------------------------------------------------------------------

Comment By: Taco Hoekwater (taco)
Date: 2007-07-02 12:53

Message:
Logged In: YES 
user_id=1608

WRT type image object: 

over here, it is only included twice after I do a small fix
to your patch:
  obj_tab[k].int2 := -1;
this could just be a simple difference between my and your
pdftex source trees, but please re-check. 

I definitely have *no* runtime image with obj_tab[k].int2 :=
0, and that is correct/expected, because I can see in the
source that the is_obj_written() test checks for >-1.

The thing with head_tab[obj_type_xform] is why I initially
dumped all of them. The xforms should be safe, but fonts are
definitely not OK (not unless a whole lot more stuff is dumped
to the format). I don't know about the other heads. It
depends on whether or not the objects need additional data
structures besides obj_tab and pdf_mem, and I am not very
much at home in that section of the pdftex code.





----------------------------------------------------------------------

Comment By: Robin Houston (robinhouston)
Date: 2007-07-02 10:58

Message:
Logged In: YES 
user_id=4579

Taco, do you have an example of the "another problem" that
you mention? I thought I had dealt with that, by not dumping
obj_tab[k].int2 and resetting it to 0, and indeed it seems
to work when I try it.

To be precise about what I tried, I added the lines

%
\pdfrefximage\pdflastximage
\par\vfil\penalty-10000
%

before the \dump command in test-fmt2.tex. Both the
generated PDFs (test-fmt2.pdf and test2.pdf) seem okay.

Incidentally, it's easy to make a test, along the lines of
test2, that requires head_tab[obj_type_xform]. So I guess
that should be dumped too.

----------------------------------------------------------------------

Comment By: Taco Hoekwater (taco)
Date: 2007-07-02 10:34

Message:
Logged In: YES 
user_id=1608

The objects (and their numbers) are saved in the formats as well

----------------------------------------------------------------------

Comment By: Hartmut Henkel (hhenkel)
Date: 2007-07-02 10:30

Message:
Logged In: YES 
user_id=929

How can test2 work anyway? \pdfximage reserves an
absolute object number (not something like an image number),
and \pdflastximage gives this object number. But later, in
the real document, objects are numbered in the natural way in
ascending order, so how can an image object number already
fixed in the format fit into this sequence?

Regards, Hartmut

----------------------------------------------------------------------

Comment By: Taco Hoekwater (taco)
Date: 2007-07-02 09:11

Message:
Logged In: YES 
user_id=1608

I got an example from Hartmut Henkel that needs
head_tab[obj_type_ximage]; it is test2 in the attached tar. I
tried a quick test and it indeed seems to work OK if I dump
*only* that head_tab entry. 

But you have to solve another problem as well. If you use
the image in the initex-dumped PDF, then the image will not
be included in the runtime PDF (its state is 'written').
Somehow this state needs to be reset.

Taco


----------------------------------------------------------------------


Comment By: Robin Houston (robinhouston)
Date: 2007-06-30 18:56

Message:
Logged In: YES 
user_id=4579

The attached patch (initex-objects.patch) implements the
suggestions made below, together with a few other
improvements. It incorporates Taco Hoekwater's code, so
should be applied to a clean source tree.

In particular, it reinstates the ability to produce both a
PDF document and a memory dump in the same run. I've tested
it with \pdfxform as well as \pdfximage.

Comments? Counterexamples?

Robin

----------------------------------------------------------------------

Comment By: Robin Houston (robinhouston)
Date: 2007-06-30 14:38

Message:
Logged In: YES 
user_id=4579

Fortunately the "two picture" problem (demonstrated in the
attachment twopic.tar.gz) is trivial to solve: simply dump
and undump the variable pdf_ximage_count.

Presumably the same should be done for pdf_obj_count and
pdf_xform_count. Whether this is sufficient, I have no idea:
none of us have tested forms or raw objects, have we?


----------------------------------------------------------------------

Comment By: Robin Houston (robinhouston)
Date: 2007-06-30 12:08

Message:
Logged In: YES 
user_id=4579

A minor bug in the initex-refximage-dump.patch: the test for
fixed_pdfoutput=1 ought to be fixed_pdfoutput>0, otherwise
it will fail with \pdfoutput=2, say.

Another unrelated problem is that, if you load an image from
the .ini file, and then load and display another image from
the .tex file, the first image will appear where the second
one ought to be. I'll try and find out why this is happening.
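
A reconstruction of that situation from the description above,
as a sketch only: it assumes a plain TeX format and two image
files, and the names twopic.ini, twopic.tex, first.pdf and
second.pdf are hypothetical.

```tex
% twopic.ini (hypothetical name): run under pdftex -ini
\input plain
\pdfoutput=1
\pdfximage{first.pdf}        % loaded into the format, not displayed
\dump

% twopic.tex (hypothetical name): run with the format dumped above
\pdfximage{second.pdf}
\pdfrefximage\pdflastximage  % should show second.pdf; reported to show first.pdf
\end
```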

(Incidentally, I haven't (yet?) discovered any problems
stemming from not (un)dumping the linked list headers.)

----------------------------------------------------------------------

Comment By: Robin Houston (robinhouston)
Date: 2007-06-29 20:08

Message:
Logged In: YES 
user_id=4579

Argh! Interesting, thanks.

Clearly my understanding of the pdftex code is patchy at
best. That said, what is gained by dumping and restoring the
head_tab? If I remove the parts of your code that dump and
undump the head_tab, then your example works (and mine
continue to work).

Robin

----------------------------------------------------------------------

Comment By: Taco Hoekwater (taco)
Date: 2007-06-29 18:32

Message:
Logged In: YES 
user_id=1608

Try a document that contains items besides a single image.
Many of the auxiliary tables are not dumped, resulting in
assertion failures in the non-ini run.

You don't even need to save or reuse boxes for those
crashes, just make a pair of:

% test.ini
\input plain
\pdfoutput=1
hello
\dump


% test.tex
world!
\end




----------------------------------------------------------------------

Comment By: Robin Houston (robinhouston)
Date: 2007-06-29 17:24

Message:
Logged In: YES 
user_id=4579

PS. I have just attached a new tar file, containing some
simple tests that create PDF from the IniTeX run. They all
appear to work, when my patch is applied.

----------------------------------------------------------------------

Comment By: Robin Houston (robinhouston)
Date: 2007-06-29 17:19

Message:
Logged In: YES 
user_id=4579

I don't think creating a PDF file from the IniTeX run is a
real problem. The attached tiny patch (to be applied on top
of Taco Hoekwater's first patch) seems to make it work.

Am I missing something?

----------------------------------------------------------------------

Comment By: Taco Hoekwater (taco)
Date: 2007-06-29 12:34

Message:
Logged In: YES 
user_id=1608

Here is the promised extra patch. No error is generated,
just a simple warning. Pdftex continues executing as if \end
was given instead.

----------------------------------------------------------------------

Comment By: Nobody (None)
Date: 2007-06-28 22:33

Message:
Logged In: NO 

Hi, the inability to _both_ dump a format _and_ produce
output in the same iniTeX run is probably tolerable as long
as no crashes occur.  I don't know the workings of WhizzyTeX
in detail: it is conceivable that their in-document dump
would be affected.  However, it is unlikely that WhizzyTeX
will indeed require both the output as well as the dump (the
dump may well contain pictures, for example from floats that
are going to be placed later), so there is at least a
reasonable way to tackle this from the macro level (divert
\shipout in the manner of the everyshi.sty package).
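
A minimal sketch of what diverting \shipout at the macro level
could look like. This is an illustration under the assumption
that pages shipped out before the dump may simply be parked in a
box; it is not the code of everyshi.sty, and the names
\divertedpage and \realshipout are hypothetical.

```tex
% in the part that gets dumped: park pages instead of writing them
\newbox\divertedpage                % hypothetical box name
\let\realshipout=\shipout           % keep the primitive reachable
\def\shipout{\global\setbox\divertedpage=}

% at the start of the document proper: restore normal output
\let\shipout=\realshipout
```

Each diverted page overwrites the previous one, so this form only
suits a preamble whose early pages can be thrown away.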

At least mylatex.ltx (and consequently preview-latex) should
work fine: \shipout before \begin{document} would be
extremely unusual and would result in strange results, anyway.

David

----------------------------------------------------------------------

Comment By: Taco Hoekwater (taco)
Date: 2007-06-28 21:09

Message:
Logged In: YES 
user_id=1608

Small limitation that I have yet to implement: the ability
to move stuff over from the initex run means that if initex
actually creates a PDF file, dumping a format should be
disallowed, or silently ignored. Attempting to keep track of
two disjoint PDF documents is just too hard for me. 

Note that I do not want to prohibit pdf creation from
initex. Just that you can't create a pdf document _as well
as_ perform a \dump.



----------------------------------------------------------------------

Comment By: Robin Houston (robinhouston)
Date: 2007-06-28 17:01

Message:
Logged In: YES 
user_id=4579

Bravo!

I was ready to embark on this myself, so I'm delighted to be
spared the trouble. It passes all the tests I've tried so far.

----------------------------------------------------------------------

Comment By: Taco Hoekwater (taco)
Date: 2007-06-28 12:57

Message:
Logged In: YES 
user_id=1608

Here is a patch that fixes the problem by dumping and
restoring the image _meta information_ to/from any generated
format. 

The image _data_ is not included (that would be really hard
to do in a portable way), and therefore the undumping
routines have to redo most of the work of \pdfximage, but
this is transparent to the user except that the tests
for \pdfminorversion and \pdfinclusionerrorlevel are
redone.

The patch also saves some pdftex arrays that are needed to
rediscover the object. This could eventually be extended to
make sure \pdfrefxform et al. work as well. As of now, that is
untested for lack of an example.

Comments and testing welcome, as always.



----------------------------------------------------------------------

Comment By: Martin Schröder (oneiros)
Date: 2007-06-26 23:01

Message:
Logged In: YES 
user_id=421

Sorry, PDF things work differently from DVI, so we cannot
make \pdfximage etc. work in ini mode without a lot of work; we
will probably disable a bunch of primitives.

----------------------------------------------------------------------

Comment By: Robin Houston (robinhouston)
Date: 2007-06-26 15:49

Message:
Logged In: YES 
user_id=4579

Simplifying the example by eliminating the graphics package:

==> foo.ini <==
\documentclass{minimal}

\newbox\picbox
\pdfximage{pic.pdf}
\setbox\picbox\hbox{\pdfrefximage\pdflastximage}

\dump

==> foo.tex <==
\begin{document}
        \box\picbox
\end{document}


Also, I see now that my comparison with dvi-mode is indeed a
false one, because the mechanisms are quite different, and
the DVI graphics driver only has to include a \special,
rather than the graphic itself.

A simple solution, I suppose, would be to make \pdfrefximage
(and, presumably, the other \pdfrefx commands) invalid in
IniTeX mode. Perhaps this is too brutal, since it forbids
certain harmless activities such as finding the dimensions
of an image from an .ini file; and of course I would be
delighted if it were made to work instead.
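
The kind of harmless use meant here might look like the sketch
below, which assumes a plain TeX format and the pic.pdf of the
example above; the file name measure.ini is hypothetical. The
box is set inside a group so that nothing referring to the image
remains in a box register when \dump is reached.

```tex
% measure.ini (hypothetical name): run under pdftex -ini
\input plain
\pdfoutput=1
\pdfximage{pic.pdf}
\begingroup
  % box 0 is restored at \endgroup, so the image whatsit is not dumped
  \setbox0=\hbox{\pdfrefximage\pdflastximage}
  \message{pic.pdf: \the\wd0\space wide, \the\ht0\space high}
\endgroup
\dump
```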


----------------------------------------------------------------------

Comment By: Reinhard Kotucha (reinhard)
Date: 2007-06-26 01:26

Message:
Logged In: YES 
user_id=4195

Martin,
you probably can't do everything in a format file.  However,
it would be nice to be able to put graphics into a format file
at least.  Suppose that a web server has to generate PDF
files on the fly, each containing a company logo.  The best
place for the logo is the format file if speed
matters.  It would be nice if there were no restrictions.

Regards,
  Reinhard



----------------------------------------------------------------------

Comment By: Robin Houston (robinhouston)
Date: 2007-06-26 00:42

Message:
Logged In: YES 
user_id=4579

I'm impressed by the speed of your response! Thanks.

If you replace the PDF image by an EPS file, and use
-output-format dvi, then it does work. That suggests to me
that the problem is not fundamentally caused by a limitation
of IniTeX. But I don't know enough about the inner workings
of the \includegraphics command to be sure about that.
(Perhaps PDF graphics are handled in a sufficiently
different way for this to be a false comparison?)

In any case, as you say, it shouldn't crash!

----------------------------------------------------------------------

Comment By: Martin Schröder (oneiros)
Date: 2007-06-26 00:30

Message:
Logged In: YES 
user_id=421

I can reproduce it, and it obviously shouldn't crash, but
I'm really not sure if it's supposed to work, as the format
of course will not save the picture. You can't do everything
in IniTeX...

----------------------------------------------------------------------
