Bugs item #824, was opened at 2007-06-26 00:22
You can respond by visiting:
http://sarovar.org/tracker/?func=detail&atid=493&aid=824&group_id=106

Category: Image inclusion
Group: v1.40.3
Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Robin Houston (robinhouston)
Assigned to: Martin Schröder (oneiros)
Summary: Bus error caused by loading an image into a format file

Initial Comment:
I have written a tool that speeds up repeated compilation of a source file, by generating a format from the preamble of the document, and then using it to compile the body.

One user reported a crash when using this tool with a custom letterhead class; the class in question may be found at http://www.soe.ucsc.edu/~elm/LaTeX/ucletter.cls (though note that a simpler demonstration is attached to this report).

The bug is triggered by the fact that this class loads an image during the processing of the preamble (i.e. during the processing of the .ini file) and saves it in a box. When this box is used, during the compilation of the body, pdftex crashes. This does work with ordinary (non-PDF) TeX, so the problem is specific to PDFTeX.

The attachment contains a simple demonstration of the problem. Unpack it and run 'make'.

----------------------------------------------------------------------

Comment By: Taco Hoekwater (taco)
Date: 2007-07-02 18:58

Message:
Logged In: YES
user_id=1608

Robin, I have merged those extra variables and heads with your previous patch. Can you verify that all is still well?

----------------------------------------------------------------------

Comment By: Robin Houston (robinhouston)
Date: 2007-07-02 13:59

Message:
Logged In: YES
user_id=4579

A small additional improvement:

  dump_int(pdf_last_xform);
  dump_int(pdf_last_ximage);
  dump_int(pdf_last_obj);
  ...
  undump_int(pdf_last_xform);
  undump_int(pdf_last_ximage);
  undump_int(pdf_last_obj);

(so that \pdflastximage and friends will work correctly after loading the format file).

----------------------------------------------------------------------

Comment By: Hartmut Henkel (hhenkel)
Date: 2007-07-02 13:33

Message:
Logged In: YES
user_id=929

> That's odd. The source I'm working from has
>   @d is_obj_written(#) == (obj_offset(#) <> 0)
> Has this been changed recently?

This has been changed by the patch to bug 799, where 'scheduled' has also been introduced.

Regards, Hartmut

----------------------------------------------------------------------

Comment By: Taco Hoekwater (taco)
Date: 2007-07-02 13:21

Message:
Logged In: YES
user_id=1608

> Has this been changed recently?

In my (development) version of the source there is a state 'scheduled', which I assume comes from a fix to something else. In any case, we just have to ask Martin to be extra careful about that line while creating 1.40.4 :-)

It is probably wise to dump the three heads you mentioned, just in case: I really need the ximage one in my tree, and dumping the others seems a logical extension.

Well done to you on finding a way around the regression.

----------------------------------------------------------------------

Comment By: Robin Houston (robinhouston)
Date: 2007-07-02 13:08

Message:
Logged In: YES
user_id=4579

That's odd. The source I'm working from has

  @d is_obj_written(#) == (obj_offset(#) <> 0)

Has this been changed recently? (Am I right in thinking that the development version of the source tree is not publicly available?)

With regard to which linked list headers to dump, we seem to be reaching the point of making subtle improvements. The original problem (whatsit nodes in dumped boxes referencing non-existent objects) is solved without dumping any of the headers.

One might conceivably want to load images/forms/raw objects in the .ini, and then reference them from the .tex: this is addressed by dumping head_tab[obj_type_ximage], head_tab[obj_type_xform] and head_tab[obj_type_obj]. (Neither forms nor raw objects have any additional associated metadata, as far as I can tell.) I can't see any benefit from dumping the other entries, whether or not doing so would cause potential crashes.

At any rate, what we have now seems a definite improvement over the status quo, and avoids the undesirable feature-regression of forbidding \dump after \shipout.

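The choice of which heads to dump can be illustrated with a toy model. The Python sketch below is not pdfTeX code: `ToyState`, `dump_state` and `undump_state` are hypothetical names, the format file is reduced to a plain list of integers, and 0 is assumed to mean "no object". It only shows that the three heads named in the comment above, plus the pdf_last_* values, survive the round trip, while a head that is not dumped (a font head here) is simply absent after the format is loaded.

```python
from dataclasses import dataclass, field

# The three heads the thread proposes to dump; everything else is left out.
DUMPED_HEADS = ("obj_type_obj", "obj_type_xform", "obj_type_ximage")


@dataclass
class ToyState:
    """Hypothetical stand-in for the small part of pdfTeX's state discussed here."""
    head_tab: dict = field(default_factory=dict)  # object type -> newest object number
    pdf_last_obj: int = 0
    pdf_last_xform: int = 0
    pdf_last_ximage: int = 0


def dump_state(state):
    """Write the selected integers in a fixed order, like a run of dump_int calls."""
    fmt = [state.head_tab.get(t, 0) for t in DUMPED_HEADS]
    fmt += [state.pdf_last_xform, state.pdf_last_ximage, state.pdf_last_obj]
    return fmt


def undump_state(fmt):
    """Read the integers back in exactly the order dump_state wrote them."""
    values = iter(fmt)
    state = ToyState()
    for t in DUMPED_HEADS:
        head = next(values)
        if head:  # 0 is assumed to mean "no object of this type"
            state.head_tab[t] = head
    state.pdf_last_xform = next(values)
    state.pdf_last_ximage = next(values)
    state.pdf_last_obj = next(values)
    return state


# An ini run that loaded one image (object 3) and one font (object 7).
ini = ToyState(head_tab={"obj_type_ximage": 3, "obj_type_font": 7},
               pdf_last_ximage=3)
restored = undump_state(dump_state(ini))
```

After the round trip, `restored.head_tab` holds only the ximage head and `restored.pdf_last_ximage` is 3, which is what a later \pdfrefximage\pdflastximage in the .tex file would rely on.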
----------------------------------------------------------------------

Comment By: Taco Hoekwater (taco)
Date: 2007-07-02 12:53

Message:
Logged In: YES
user_id=1608

WRT type image object: over here, it is only included twice after I do a small fix to your patch:

  obj_tab[k].int2 := -1;

This could just be a simple difference between my and your pdftex source trees, but please re-check. I definitely have *no* runtime image with obj_tab[k].int2 := 0, and that is correct/expected, because I can see in the source that the is_obj_written() test checks for >-1.

The thing with head_tab[obj_type_xform] is why I initially dumped all of them. The xforms should be safe, but fonts are definitely not ok (not unless a whole lot more stuff is dumped to the format). I don't know about the other heads. It depends on whether or not the objects need additional data structures besides obj_tab and pdf_mem, and I am not very much at home in that section of the pdftex code.

----------------------------------------------------------------------

Comment By: Robin Houston (robinhouston)
Date: 2007-07-02 10:58

Message:
Logged In: YES
user_id=4579

Taco, do you have an example of the "another problem" that you mention? I thought I had dealt with that, by not dumping obj_tab[k].int2 and resetting it to 0, and indeed it seems to work when I try it.

To be precise about what I tried, I added the lines

  %
  \pdfrefximage\pdflastximage
  \par\vfil\penalty-10000
  %

before the \dump command in test-fmt2.tex. Both the generated PDFs (test-fmt2.pdf and test2.pdf) seem okay.

Incidentally, it's easy to make a test, along the lines of test2, that requires head_tab[obj_type_xform]. So I guess that should be dumped too.

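The disagreement about resetting obj_tab[k].int2 to 0 or to -1 comes down to which value the source tree treats as "not yet written". The Python sketch below is a toy model under that assumption, not pdfTeX code: `is_obj_written` mirrors the test only loosely (it assumes offsets are never below the sentinel), and `undump_obj_offsets` and the sample offsets are hypothetical.

```python
def is_obj_written(offset, unwritten):
    """Toy test: an object counts as written once its offset exceeds the sentinel."""
    return offset > unwritten


def undump_obj_offsets(ini_offsets, unwritten):
    """Discard offsets recorded during the ini run; every object starts out unwritten."""
    return [unwritten] * len(ini_offsets)


# Hypothetical byte offsets of three objects written to the ini-run PDF.
ini_offsets = [15, 230, 4711]

# Resetting to 0 in a tree whose test is "offset > -1" leaves the objects
# marked as written, so they would be skipped in the runtime PDF.
reset_to_zero = undump_obj_offsets(ini_offsets, 0)
mismatched = [is_obj_written(o, unwritten=-1) for o in reset_to_zero]

# Resetting to the sentinel that the tree's own test uses clears the state.
reset_to_minus_one = undump_obj_offsets(ini_offsets, -1)
matched = [is_obj_written(o, unwritten=-1) for o in reset_to_minus_one]
```

In this model `mismatched` is all true and `matched` is all false, which is consistent with a reset to 0 working in one source tree and needing to be -1 in the other.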
----------------------------------------------------------------------

Comment By: Taco Hoekwater (taco)
Date: 2007-07-02 10:34

Message:
Logged In: YES
user_id=1608

The objects (and their numbers) are saved in the formats as well.

----------------------------------------------------------------------

Comment By: Hartmut Henkel (hhenkel)
Date: 2007-07-02 10:30

Message:
Logged In: YES
user_id=929

How can test2 work anyway? \pdfximage reserves an absolute object number (not something like an image number), and \pdflastximage gives this object number. But later, in the real document, objects are numbered in the natural way in ascending order, so how can an image object number already fixed in the format fit into this sequence?

Regards, Hartmut

----------------------------------------------------------------------

Comment By: Taco Hoekwater (taco)
Date: 2007-07-02 09:11

Message:
Logged In: YES
user_id=1608

I got an example from Hartmut Henkel that needs head_tab[pdf_obj_ximage]; it is test2 in the attached tar. I tried a quick test and it indeed seems to work ok if I dump *only* that head_tab entry.

But you have to solve another problem as well. If you use the image in the initex-dumped pdf, then the image will not be included in the runtime pdf (its state is 'written'). Somehow this state needs to be reset.

Taco

----------------------------------------------------------------------

Comment By: Robin Houston (robinhouston)
Date: 2007-06-30 18:56

Message:
Logged In: YES
user_id=4579

The attached patch (initex-objects.patch) implements the suggestions made below, together with a few other improvements. It incorporates Taco Hoekwater's code, so it should be applied to a clean source tree. In particular, it reinstates the ability to produce both a PDF document and a memory dump in the same run.

I've tested it with \pdfxform as well as \pdfximage. Comments? Counterexamples?

Robin

----------------------------------------------------------------------

Comment By: Robin Houston (robinhouston)
Date: 2007-06-30 14:38

Message:
Logged In: YES
user_id=4579

Fortunately the "two picture" problem (demonstrated in the attachment twopic.tar.gz) is trivial to solve: simply dump and undump the variable pdf_ximage_count.

Presumably the same should be done for pdf_obj_count and pdf_xform_count. Whether this is sufficient, I have no idea: none of us have tested forms or raw objects, have we?

----------------------------------------------------------------------

Comment By: Robin Houston (robinhouston)
Date: 2007-06-30 12:08

Message:
Logged In: YES
user_id=4579

A minor bug in the initex-refximage-dump.patch: the test for fixed_pdfoutput=1 ought to be fixed_pdfoutput>0, otherwise it will fail with \pdfoutput=2, say.

Another, unrelated problem is that, if you load an image from the .ini file, and then load and display another image from the .tex file, the first image will appear where the second one ought to be. I'll try to find out why this is happening.

(Incidentally, I haven't (yet?) discovered any problems stemming from not (un)dumping the linked list headers.)

----------------------------------------------------------------------

Comment By: Robin Houston (robinhouston)
Date: 2007-06-29 20:08

Message:
Logged In: YES
user_id=4579

Argh! Interesting, thanks.
Clearly my understanding of the pdftex code is patchy at best.

That said, what is gained by dumping and restoring the head_tab? If I remove the parts of your code that dump and undump the head_tab, then your example works (and mine continue to work).

Robin

----------------------------------------------------------------------

Comment By: Taco Hoekwater (taco)
Date: 2007-06-29 18:32

Message:
Logged In: YES
user_id=1608

Try a document that contains items besides a single image. Many of the auxiliary tables are not dumped, resulting in assertion failures in the non-ini run. You don't even need to save or reuse boxes for those crashes, just make a pair of:

  % test.ini
  \input plain
  \pdfoutput=1
  hello
  \dump

  % test.tex
  world!
  \end

----------------------------------------------------------------------

Comment By: Robin Houston (robinhouston)
Date: 2007-06-29 17:24

Message:
Logged In: YES
user_id=4579

PS. I have just attached a new tar file, containing some simple tests that create PDF from the IniTeX run. They all appear to work when my patch is applied.

----------------------------------------------------------------------

Comment By: Robin Houston (robinhouston)
Date: 2007-06-29 17:19

Message:
Logged In: YES
user_id=4579

I don't think creating a PDF file from the IniTeX run is a real problem. The attached tiny patch (to be applied on top of Taco Hoekwater's first patch) seems to make it work. Am I missing something?

----------------------------------------------------------------------

Comment By: Taco Hoekwater (taco)
Date: 2007-06-29 12:34

Message:
Logged In: YES
user_id=1608

Here is the promised extra patch. No error is generated, just a simple warning. Pdftex continues executing as if \end was given instead.

----------------------------------------------------------------------

Comment By: Nobody (None)
Date: 2007-06-28 22:33

Message:
Logged In: NO

Hi,

the inability to _both_ dump a format _and_ produce output in the same iniTeX run is probably tolerable as long as no crashes occur.

I don't know the workings of WhizzyTeX in detail: it is conceivable that their in-document dump would be affected. However, it is unlikely that WhizzyTeX will indeed require both the output as well as the dump (the dump may well contain pictures, for example from floats that are going to be placed later), so there is at least a reasonable way to tackle this from the macro level (divert \shipout in the manner of the everyshi.sty package).

At least mylatex.ltx (and consequently preview-latex) should work fine: \shipout before \begin{document} would be extremely unusual and would result in strange results, anyway.

David

----------------------------------------------------------------------

Comment By: Taco Hoekwater (taco)
Date: 2007-06-28 21:09

Message:
Logged In: YES
user_id=1608

Small limitation that I have yet to implement: the ability to move stuff over from the initex run means that if initex actually creates a pdf file, dumping a format should be disallowed, or silently ignored. Attempting to keep track of two disjunct pdf documents is just too hard for me.

Note that I do not want to prohibit pdf creation from initex. Just that you can't create a pdf document _as well as_ perform a \dump.

----------------------------------------------------------------------

Comment By: Robin Houston (robinhouston)
Date: 2007-06-28 17:01

Message:
Logged In: YES
user_id=4579

Bravo! I was ready to embark on this myself, so I'm delighted to be spared the trouble. It passes all the tests I've tried so far.

----------------------------------------------------------------------

Comment By: Taco Hoekwater (taco)
Date: 2007-06-28 12:57

Message:
Logged In: YES
user_id=1608

Here is a patch that fixes the problem by dumping and restoring the image _meta information_ to/from any generated format.

The image _data_ is not included (that would be really hard to do in a portable way), and therefore the undumping routines have to redo most of the work of \pdfximage, but this is transparent to the user except that the tests for \pdfminorversion and \pdfinclusionerrorlevel are re-done.

The patch also saves some pdftex arrays that are needed to rediscover the object. This could eventually be extended to make sure \pdfrefxform et al. work as well. As of now, that is untested for lack of an example.

Comments and testing welcome, as always.

----------------------------------------------------------------------

Comment By: Martin Schröder (oneiros)
Date: 2007-06-26 23:01

Message:
Logged In: YES
user_id=421

Sorry, PDF things work differently than DVI, so we cannot make \pdfximage etc. work in ini mode without a lot of work; we will probably disable a bunch of primitives.

----------------------------------------------------------------------

Comment By: Robin Houston (robinhouston)
Date: 2007-06-26 15:49

Message:
Logged In: YES
user_id=4579

Simplifying the example by eliminating the graphics package:

==> foo.ini <==
\documentclass{minimal}
\newbox\picbox
\pdfximage{pic.pdf}
\setbox\picbox\hbox{\pdfrefximage\pdflastximage}
\dump

==> foo.tex <==
\begin{document}
\box\picbox
\end{document}

Also, I see now that my comparison with dvi-mode is indeed a false one, because the mechanisms are quite different, and the DVI graphics driver only has to include a \special, rather than the graphic itself.

A simple solution, I suppose, would be to make \pdfrefximage (and, presumably, the other \pdfrefx commands) invalid in IniTeX mode.
Perhaps this is too brutal, since it forbids certain harmless activities such as finding the dimensions of an image from an .ini file; and of course I would be delighted if it were made to work instead.

----------------------------------------------------------------------

Comment By: Reinhard Kotucha (reinhard)
Date: 2007-06-26 01:26

Message:
Logged In: YES
user_id=4195

Martin, you probably can't do everything in a format file. However, it would be nice to be able to put graphics into a format file at least.

Suppose that a web server has to generate PDF files on the fly, each of which contains a company logo. The best place for the logo is the format file if speed matters.

It would be nice if there were no restrictions.

Regards, Reinhard

----------------------------------------------------------------------

Comment By: Robin Houston (robinhouston)
Date: 2007-06-26 00:42

Message:
Logged In: YES
user_id=4579

I'm impressed by the speed of your response! Thanks.

If you replace the PDF image by an EPS file, and use -output-format dvi, then it does work. That suggests to me that the problem is not fundamentally caused by a limitation of IniTeX. But I don't know enough about the inner workings of the \includegraphics command to be sure about that. (Perhaps PDF graphics are handled in a sufficiently different way for this to be a false comparison?)

In any case, as you say, it shouldn't crash!

----------------------------------------------------------------------

Comment By: Martin Schröder (oneiros)
Date: 2007-06-26 00:30

Message:
Logged In: YES
user_id=421

I can reproduce it, and it obviously shouldn't crash, but I'm really not sure if it's supposed to work, as the format of course will not save the picture. You can't do everything in IniTeX...

----------------------------------------------------------------------