[ pdftex-Bugs-824 ] Bus error caused by loading an image into a format file
Bugs item #824, was opened at 2007-06-26 00:22
You can respond by visiting: http://sarovar.org/tracker/?func=detail&atid=493&aid=824&group_id=106

Category: Image inclusion
Group: v1.40.3
Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Robin Houston (robinhouston)
Assigned to: Martin Schröder (oneiros)
Summary: Bus error caused by loading an image into a format file

Initial Comment:
I have written a tool that speeds up repeated compilation of a source file, by generating a format from the preamble of the document, and then using it to compile the body.

One user reported a crash when using this tool with a custom letterhead class; the class in question may be found at http://www.soe.ucsc.edu/~elm/LaTeX/ucletter.cls (though note that a simpler demonstration is attached to this report).

The bug is triggered by the fact that this class loads an image during the processing of the preamble (i.e. during the processing of the .ini file) and saves it in a box. When this box is used, during the compilation of the body, pdftex crashes. This does work with ordinary (non-PDF) TeX, so the problem is specific to PDFTeX.

The attachment contains a simple demonstration of the problem. Unpack it and run 'make'.

----------------------------------------------------------------------
Comment By: Martin Schröder (oneiros)
Date: 2007-06-26 23:01

Message:
Logged In: YES
user_id=421

Sorry, pdf things work differently than dvi, so we cannot make \pdfximage etc. work in ini mode without a lot of work; we will probably disable a bunch of primitives.

----------------------------------------------------------------------

Comment By: Robin Houston (robinhouston)
Date: 2007-06-26 15:49

Message:
Logged In: YES
user_id=4579

Simplifying the example by eliminating the graphics package:

==> foo.ini <==
\documentclass{minimal}
\newbox\picbox
\pdfximage{pic.pdf}
\setbox\picbox\hbox{\pdfrefximage\pdflastximage}
\dump

==> foo.tex <==
\begin{document}
\box\picbox
\end{document}

Also, I see now that my comparison with dvi-mode is indeed a false one, because the mechanisms are quite different, and the DVI graphics driver only has to include a \special, rather than the graphic itself.

A simple solution, I suppose, would be to make \pdfrefximage (and, presumably, the other \pdfrefx commands) invalid in IniTeX mode. Perhaps this is too brutal, since it forbids certain harmless activities such as finding the dimensions of an image from an .ini file; and of course I would be delighted if it were made to work instead.

----------------------------------------------------------------------

Comment By: Reinhard Kotucha (reinhard)
Date: 2007-06-26 01:26

Message:
Logged In: YES
user_id=4195

Martin, you probably can't do everything in a format file. However, it would be nice to be able to put graphics into a format file at least. Suppose that a web server has to generate PDF files on the fly, each containing a company logo. The best place for the logo is the format file if speed matters.

It would be nice if there were no restrictions.

Regards,
Reinhard

----------------------------------------------------------------------

Comment By: Robin Houston (robinhouston)
Date: 2007-06-26 00:42

Message:
Logged In: YES
user_id=4579

I'm impressed by the speed of your response! Thanks.

If you replace the PDF image by an EPS file, and use -output-format dvi, then it does work. That suggests to me that the problem is not fundamentally caused by a limitation of IniTeX. But I don't know enough about the inner workings of the \includegraphics command to be sure about that. (Perhaps PDF graphics are handled in a sufficiently different way for this to be a false comparison?)

In any case, as you say, it shouldn't crash!

----------------------------------------------------------------------

Comment By: Martin Schröder (oneiros)
Date: 2007-06-26 00:30

Message:
Logged In: YES
user_id=421

I can reproduce it, and it obviously shouldn't crash, but I'm really not sure if it's supposed to work, as the format of course will not save the picture. You can't do everything in IniTeX...

----------------------------------------------------------------------
noreply@sarovar.org wrote:
Bugs item #824, was opened at 2007-06-26 00:22 You can respond by visiting: http://sarovar.org/tracker/?func=detail&atid=493&aid=824&group_id=106
etc. etc.

A very quick, fast, and non-crashing solution is to disable setting \pdfoutput to non-zero in initex mode :-)

I do not think it is feasible to support the actual inclusion of image files into the format, but I don't believe that is really needed: the crash is a result of the fact that the image_array and pdf_mem are not written to the format.

So, that leaves two options: bluntly disabling \pdfrefximage et al. in ini mode, or adding dump/undump code for those arrays. It would not be a very invasive patch, and that has my preference.

What do you think?

Taco
Taco Hoekwater
noreply@sarovar.org wrote:
Bugs item #824, was opened at 2007-06-26 00:22 You can respond by visiting: http://sarovar.org/tracker/?func=detail&atid=493&aid=824&group_id=106
etc. etc.
A very quick, fast, and non-crashing solution is to disable setting \pdfoutput to non-zero in initex mode :-)
I do not think it is feasible to support the actual inclusion of image files into the format, but I don't believe that is really needed: the crash is a result of the fact that the image_array and pdf_mem are not written to the format.
So, that leaves two options: bluntly disabling \pdfrefximage et al. in ini mode, or adding dump/undump code for those arrays. It would not be a very invasive patch, and that has my preference.
What do you think?
Well, LaTeX has a command \AtBeginDvi which has the principal purpose of saving PostScript header specials (and similar stuff) across a format dump, as used by mylatex.ltx (for example). web2c also has a trend of allowing initex for everything and making it closer to virtex (the main motivation for the latter's existence, squeezing a few dozen kilobytes of code, is no longer present).

The ability to transparently dump a snapshot of TeX's state (modulo open files, but including already typeset boxes) is something which we should not lightly give up or make less functional. So if it is feasible, my vote would be on dumping the required arrays.

-- David Kastrup
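For readers unfamiliar with the hook mentioned above, here is a minimal sketch of how a dvips header special is typically carried across a format dump. The file name myprolog.pro is a made-up placeholder, and the way the format actually gets dumped (for example by mylatex.ltx) is assumed rather than shown.

```tex
% Preamble material, processed before the format is dumped.
\documentclass{article}

% \AtBeginDvi typesets its argument right away and appends the result to
% an internal LaTeX box. Box contents belong to TeX's state, so the
% \special node survives the dump and is shipped out with the first page.
\AtBeginDvi{\special{header=myprolog.pro}}% myprolog.pro: hypothetical file

% ... the format would be dumped here, e.g. at \begin{document} ...
```

The point of the sketch is that what gets dumped is already-typeset material (a whatsit node inside a box), not a stored list of commands.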
2007/6/27, David Kastrup:
Taco Hoekwater writes:
A very quick, fast, and non-crashing solution is to disable setting \pdfoutput to non-zero in initex mode :-)
No, because that gets dumped and the pdf*-formats (like pdflatex.ini) set it.
So, that leaves two options: bluntly disabling \pdfrefximage et al. in ini mode, or adding dump/undump code for those arrays. It would not be a very invasive patch, and that has my preference.
What do you think?
That's exactly what I propose to do: Disable \pdfximage and other commands (like \pdfxform and \pdfobj) that cause problems in ini mode.
So if it is feasible, my vote would be on dumping the required arrays.
It's not easily done; a patch would be welcomed, but be warned that this is _hard_. And I seriously doubt that luaTeX needs it.

Best
Martin
"Martin Schröder"
2007/6/27, David Kastrup:
Taco Hoekwater writes:
A very quick, fast, and non-crashing solution is to disable setting \pdfoutput to non-zero in initex mode :-)
No, because that gets dumped and the pdf*-formats (like pdflatex.ini) set it.
My experience from the time when there was a pdftex.cfg around was that this overrode the value of \pdfoutput in the dumped format. Is this still similar, or has this changed?
So, that leaves two options: bluntly disabling \pdfrefximage et al. in ini mode, or adding dump/undump code for those arrays. It would not be a very invasive patch, and that has my preference.
What do you think?
That's exactly what I propose to do: Disable \pdfximage and other commands (like \pdfxform and \pdfobj) that cause problems in ini mode.
So if it is feasible, my vote would be on dumping the required arrays.
It's not easily done; a patch would be welcomed, but be warned that this is _hard_.
I would not volunteer for further patches until I have managed to find the time to do the already promised ones. I was not sure what solution Taco had been referring to with his "not be a very invasive patch" comment above. It sounded like dump/undump, and if that would be feasible, I still think it the best idea not to make iniTeX less capable than necessary. For example, preview-latex uses mylatex.ltx to dump the state of the TeX at \begin{document} time. Not allowing any image references (like in \savebox commands) before that point of time is a rather brutal restriction.
And I seriously doubt that luaTeX needs it.
Actually, the current "dump core if hyphenation in iniTeX is attempted" inheritance from Aleph is likely even worse as an iniTeX restriction. But this is IIRC slated to get fixed (doing hyphenation on utf-8 rather than ucs-16 might help). And the image problem is in a similar ballpark concerning its undesirability.

Again: the \AtBeginDvi command was added to LaTeX precisely in order to conserve pretypeset stuff that would otherwise get lost at a format dump.

-- David Kastrup
2007/6/27, David Kastrup
For example, preview-latex uses mylatex.ltx to dump the state of the TeX at \begin{document} time. Not allowing any image references (like in \savebox commands) before that point of time is a rather brutal restriction.
Why do you need it and what do you gain by using \savebox? The speed increase is most likely minimal and the document needs to read the image anyway -- unless you want to dump fragments of pdf code into the format. If we expanded the dvi model we would dump the meta information about pdf things and then undump them later. This needs code for dumping and undumping this meta information. And \immediate\pdfximage would still fail.
Again: the \AtBeginDvi command was added to LaTeX precisely in order to conserve pretypeset stuff that would otherwise get lost at a format dump.
So store the commands there. Do you really need their expansion? Don't forget: pdfTeX is one-pass, while TeX->dvips is two-pass. Best Martin
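The idea of storing commands rather than their result can be sketched at the macro level with \AtBeginDocument, which saves its argument unexecuted (\AtBeginDvi, by contrast, typesets its argument immediately). This is a hypothetical rewrite of the foo.ini/foo.tex pair from the bug report, not something proposed in the thread; it assumes the two files are processed the same way as the originals, and it does not help when the typeset box itself has to survive the dump.

```tex
%%% foo.ini
\documentclass{minimal}
\newbox\picbox
% Only the token list is stored in the format; no image object exists yet.
\AtBeginDocument{%
  \pdfximage{pic.pdf}%
  \setbox\picbox\hbox{\pdfrefximage\pdflastximage}%
}
\dump

%%% foo.tex
\begin{document}% the hook runs here, in the job that writes the PDF
\box\picbox
\end{document}
```

With this arrangement the image file is opened in the same run that ships out the box, so nothing in the format refers to image data that was never dumped.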
"Martin Schröder"
2007/6/27, David Kastrup:
For example, preview-latex uses mylatex.ltx to dump the state of the TeX at \begin{document} time. Not allowing any image references (like in \savebox commands) before that point of time is a rather brutal restriction.
Why do you need it
What? preview-latex? mylatex.ltx?
and what do you gain by using \savebox?
Uh, it is _there_. For a reason. Saying that it (and similar functionality) should not get used anymore is creating a rather sordid restriction and backward incompatibility.
The speed increase is most likely minimal
The speed increase of what is most likely minimal? Of dumping a format with the preamble? Absolutely not. It is very relevant in preview-latex, for common editing work on single formulas easily a factor of 5. And we are talking about an interactive editing task and its response time here.
and the document needs to read the image anyway -- unless you want to dump fragments of pdf code into the format.
In DVI mode, a pointer to the image gets dumped in the form of a special. A similarly sufficient amount of information would have to get there in PDF mode.
If we expanded the dvi model we would dump the meta information about pdf things and then undump them later. This needs code for dumping and undumping this meta information. And \immediate\pdfximage would still fail.
Again: the \AtBeginDvi command was added to LaTeX precisely in order to conserve pretypeset stuff that would otherwise get lost at a format dump.
So store the commands there. Do you really need their expansion?
Not the expansion, the typeset result. This is what LaTeX uses and preserves. Since it is not untypical to put something in there which would rely on \@onlypreamble stuff, I don't see that anything except preserving the actual h/vlists could be depended on to work.
Don't forget: pdfTeX is one-pass, while TeX->dvips is two-pass.
Where is the relevance? -- David Kastrup
2007/6/27, David Kastrup:
"Martin Schröder" writes:
2007/6/27, David Kastrup:
For example, preview-latex uses mylatex.ltx to dump the state of the TeX at \begin{document} time. Not allowing any image references (like in \savebox commands) before that point of time is a rather brutal restriction.
Why do you need it
What? preview-latex? mylatex.ltx?
\savebox
and what do you gain by using \savebox?
Uh, it is _there_. For a reason. Saying that it (and similar functionality) should not get used anymore is creating a rather sordid restriction and backward incompatibility.
The speed increase is most likely minimal
The speed increase of what is most likely minimal? Of dumping a format with the preamble? Absolutely not. It is very relevant in preview-latex, for common editing work on single formulas easily a factor of 5. And we are talking about an interactive editing task and its response time here.
The speed increase of using \savebox.
and the document needs to read the image anyway -- unless you want to dump fragments of pdf code into the format.
In DVI mode, a pointer to the image gets dumped in the form of a special. A similarly sufficient amount of information would have to get there in PDF mode.
If we expanded the dvi model we would dump the meta information about pdf things and then undump them later. This needs code for dumping and undumping this meta information. And \immediate\pdfximage would still fail.
[...]
Don't forget: pdfTeX is one-pass, while TeX->dvips is two-pass.
Where is the relevance?
It was not designed to make two-pass-like solutions (which we are discussing here) work; instead, the two passes are intermingled. The "sufficient amount of information" aka "meta information" is most likely not in the code in one piece and there is currently no way to dump and undump that. And a proper solution would not only handle \pdfximage, but also \pdfobj, \pdfxform, ...

Best
Martin
"Martin Schröder"
2007/6/27, David Kastrup:
"Martin Schröder" writes:
2007/6/27, David Kastrup:
For example, preview-latex uses mylatex.ltx to dump the state of the TeX at \begin{document} time. Not allowing any image references (like in \savebox commands) before that point of time is a rather brutal restriction.
Why do you need it
What? preview-latex? mylatex.ltx?
\savebox
It is the mechanism LaTeX _offers_ to users (it will commonly get used in connection with fancyhdr.sty, for example). And it uses a similar mechanism itself in \AtBeginDvi. Saying that it shouldn't is a bit late in the game.
and what do you gain by using \savebox?
Uh, it is _there_. For a reason. Saying that it (and similar functionality) should not get used anymore is creating a rather sordid restriction and backward incompatibility.
The speed increase is most likely minimal
The speed increase of what is most likely minimal?
The speed increase of using \savebox.
It is there and gets used.
and the document needs to read the image anyway -- unless you want to dump fragments of pdf code into the format.
In DVI mode, a pointer to the image gets dumped in the form of a special. A similarly sufficient amount of information would have to get there in PDF mode.
If we expanded the dvi model we would dump the meta information about pdf things and then undump them later. This needs code for dumping and undumping this meta information. And \immediate\pdfximage would still fail.
[...]
Don't forget: pdfTeX is one-pass, while TeX->dvips is two-pass.
Where is the relevance?
It was not designed to make two-pass-like solutions (which we are discussing here) work; instead, the two passes are intermingled.
I don't see that. PDFTeX has to deal with boxes and hlists/vlists being boxed, reboxed, unboxed, thrown away and duplicated. So this is basically a two-pass process between creation of an object and a list-capable reference to it, and shipout. An object might appear in arbitrary duplication (including zero times) in multiple shipouts, so the multipass abstraction that makes it possible to copy boxes already needs to be present in PDFTeX.
The "sufficient amount of information" aka "meta information" is most likely not in the code in one piece and there is currently no way to dump and undump that. And a proper solution would not only handle \pdfximage, but also \pdfobj, \pdfxform, ...
Sure. But as I said, PDFTeX already _needs_ to organize its meta information in order to deal with box manipulations. -- David Kastrup
2007/6/27, David Kastrup:
"Martin Schröder" writes:
\savebox
It is the mechanism LaTeX _offers_ to users (it will commonly get used in connection with fancyhdr.sty, for example). And it uses a similar mechanism itself in \AtBeginDvi.
Saying that it shouldn't is a bit late in the game.
This is the first user complaining in years (it has always been broken). I don't foresee many people complaining if we disable it. How many people do you know that do \includegraphics in the ini?

Disabling \pdfximage is easy, enabling it is probably much harder (Taco is investigating that), and only very few users need it (the ones raising the bug don't really unless pdfTeX dumps the complete image in the format; meta-infos won't be enough).

Best
Martin
"Martin Schröder"
2007/6/27, David Kastrup:
"Martin Schröder" writes:
\savebox
It is the mechanism LaTeX _offers_ to users (it will commonly get used in connection with fancyhdr.sty, for example). And it uses a similar mechanism itself in \AtBeginDvi.
Saying that it shouldn't is a bit late in the game.
This is the first user complaining in years (it has always been broken). I don't foresee many people complaining if we disable it. How many people do you know that do \includegraphics in the ini?
Disabling \pdfximage is easy, enabling it is probably much harder (Taco is investigating that), and only very few users need it (the ones raising the bug don't really unless pdfTeX dumps the complete image in the format; meta-infos won't be enough).
In DVI mode, the dumped meta-info in the special _is_ enough. It is not expected that the actual contents of included files make it into hlists/vlists or even the DVI file.

If external references break when moving a format around in the tree or changing the referenced files, nobody will be very much surprised. PDFTeX should likely try to avoid crashing or acting all too unpredictably, but that's more or less all. It would seem reasonable to not change the reserved space at all in the hlist/vlist, and align the image at the upper left corner of it, regardless of what size it has now.

There is the question of just when to complain when an image source has disappeared. Probably format read time would be correct.

-- David Kastrup
participants (4)
- David Kastrup
- Martin Schröder
- noreply@sarovar.org
- Taco Hoekwater