Hi,

    latex -ini latex.ltx

gives

    real 0m0.498s
    user 0m0.080s
    sys  0m0.408s

    luatex -ini latex.8bit

(where the latter just sets up the bytes->utf8 conversion through

    \directlua0{
      callback.register("process_input_buffer",
        function(buf)
          return unicode.utf8.char(unicode.latin1.byte(buf,1,-1))
        end)}

like in the last mail and then loads LaTeX) is

    real 0m2.730s
    user 0m1.836s
    sys  0m0.700s

Sadly, I have no good way to see how much of this is caused by the callback, and how much is due to other LuaTeX particularities.

However, one can run tex.tex through LuaTeX with and without this translation, and just run the normal TeX engine, too.

    tex tex.tex

gives us

    real 0m1.253s
    user 0m0.496s
    sys  0m0.664s

    luatex tex.tex

gives us

    real 0m14.329s
    user 0m13.413s
    sys  0m0.872s

and using

    time luatex "&plain" '\directlua0{callback.register("process_input_buffer",function(buf)return unicode.utf8.char(unicode.latin1.byte(buf,1,-1))end)}\input tex'

gives us

    real 0m14.801s
    user 0m12.709s
    sys  0m1.304s

So the good news is that using the callback makes LuaTeX faster (more probably the difference gets lost in the noise). And the bad news is that it is about a factor of 25 slower than the normal TeX executable in either case.

Some of it might be the difference in table sizes for the plain TeX executable. But the factor of 25 seems to fit rather well also with the LaTeX format test. Any idea where the bulk of this would be from? What would somebody wanting to use LuaTeX in a production environment do (apart from getting his head examined, I mean)?

-- 
David Kastrup
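As a rough check on the "factor of 25", the timings quoted in this message can be compared directly. This is only a minimal Python sketch: `parse_time`, `RUNS` and `slowdown` are illustrative names invented here, and the numbers are copied from the `time` output above.

```python
import re

def parse_time(text):
    """Convert a `time` field such as '0m14.329s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", text)
    if match is None:
        raise ValueError(f"unrecognised time format: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))

# Timings copied from the message above (tex.tex runs only).
RUNS = {
    "tex tex.tex": {"real": "0m1.253s", "user": "0m0.496s"},
    "luatex tex.tex": {"real": "0m14.329s", "user": "0m13.413s"},
    "luatex tex.tex + callback": {"real": "0m14.801s", "user": "0m12.709s"},
}

def slowdown(run, baseline="tex tex.tex", field="user"):
    """Ratio of one run's time to the baseline's, for the given field."""
    return parse_time(RUNS[run][field]) / parse_time(RUNS[baseline][field])

if __name__ == "__main__":
    for run in ("luatex tex.tex", "luatex tex.tex + callback"):
        user = slowdown(run)
        real = slowdown(run, field="real")
        print(f"{run}: user x{user:.1f}, real x{real:.1f}")
```

On these figures the ratio is roughly 26 to 27 in user CPU time and roughly 11 to 12 in wall-clock time, so the "factor of 25" appears to refer to user time.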
David Kastrup wrote:
Some of it might be the difference in table sizes for the plain TeX executable. But the factor of 25 seems to fit rather well also with the LaTeX format test. Any idea where the bulk of this would be from? What would somebody wanting to use LuaTeX in a production environment do (apart from getting his head examined, I mean)?
Luatex is slower than normal tex, and I expect the values for dumping latex are about right. Dumping any format is quite a bit slower, because the format is a) bigger, b) more complex, and c) compressed.

What is wrong with that tex.tex file is a mystery. I have not seen such slowness here and do not (yet) comprehend what is going on. Is there any particular part where it hesitates, or is it just overall much slower?

Best, Taco
Taco Hoekwater writes:
David Kastrup wrote:
Some of it might be the difference in table sizes for the plain TeX executable. But the factor of 25 seems to fit rather well also with the LaTeX format test. Any idea where the bulk of this would be from? What would somebody wanting to use LuaTeX in a production environment do (apart from getting his head examined, I mean)?
Luatex is slower than normal tex, and I expect the values for dumping latex are about right. Dumping any format is quite a bit slower, because the format is a) bigger, b) more complex, and c) compressed.
What is wrong with that tex.tex file is a mystery. I have not seen such slowness here and do not (yet) comprehend what is going on. Is there any particular part where it hesitates, or is it just overall much slower?
No, just going slowly overall the way it looks, so it can't be kpathsea, I guess. The file is just generated by

    weave tex.web

and then compiled either with

    tex tex

or with (after luatex -ini plain.tex "\dump")

    luatex "&plain" tex

I should probably repeat the test with pdfTeX rather than TeX, but I have my doubts that it will account for _such_ a difference. In particular when it is running in DVI mode.

I have not yet looked at the optimization options with which LuaTeX gets compiled.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum
Hi David, David Kastrup wrote:
What is wrong with that tex.tex file is a mystery. I have not seen such slowness here and do not (yet) comprehend what is going on. Is there any particular part where it hesitates, or is it just overall much slower?
No, just going slowly overall the way it looks, so it can't be kpathsea, I guess. The file is just generated by
After some testing with a profiled binary, it turned out that LuaTeX spends nearly 90% of its run time inside the get_node() function when it is processing tex.tex completely (535 pages), but only 10% if it runs only the first 20 or so pages.

Since get_node() is tex's internal version of malloc() more or less, I deduced that it was likely that there was an internal memory leak (unfreed node) that makes it harder for get_node() to find a new one when it is asked.

Running a test file with \tracingstats=2 shows the variable memory usage gradually going up in both luatex and aleph, but not at all in pdftex, so the leak probably comes from omega. That makes the 'dir_node' the most likely suspect. More later.

If interested, here is a small (context) test file:

    % tex=luatex
    \tracingstats=2
    \dorecurse{50}{\dorecurse{20}{Hi\par}\page}
    \bye

Best wishes, Taco
Taco Hoekwater wrote:
After some testing with a profiled binary, it turned out that LuaTeX spends nearly 90% of its run time inside the get_node() function when it is processing tex.tex completely (535 pages), but only 10% if it runs only the first 20 or so pages.
I should have formulated this a bit differently: Of course it spends a smaller percentage of its time in get_node() when the format loading becomes more important. But it is also overtaken by functions that are not related to format loading, so I am confident that the gist of my original post is really true.

Taco
Taco Hoekwater writes:
Hi David,
David Kastrup wrote:
What is wrong with that tex.tex file is a mystery. I have not seen such slowness here and do not (yet) comprehend what is going on. Is there any particular part where it hesitates, or is it just overall much slower?
No, just going slowly overall the way it looks, so it can't be kpathsea, I guess. The file is just generated by
After some testing with a profiled binary, it turned out that LuaTeX spends nearly 90% of its run time inside the get_node() function when it is processing tex.tex completely (535 pages), but only 10% if it runs only the first 20 or so pages.
Since get_node() is tex's internal version of malloc() more or less, I deduced that it was likely that there was an internal memory leak (unfreed node) that makes it harder for get_node() to find a new one when it is asked.
Running a test file with \tracingstats=2 shows the variable memory usage gradually going up in both luatex and aleph, but not at all in pdftex, so the leak probably comes from omega. That makes the 'dir_node' the most likely suspect. More later.
Wow. I do the first stupid thing that comes into my mind, and hit upon some problem.

I am not sure that the normal effect of a memory leak would be to make it "harder to find a new node": after all, when there are no free nodes, allocation is fast. And it can happen that repeatedly a large node gets freed, a smaller node gets allocated from the large node, and the next allocation of a large node has to look elsewhere. Unless adjacent small nodes get coalesced when freed, this could keep allocating more memory without being able to use free memory.

I am afraid that I am too tired to dig into the allocation routines of TeX right now.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum
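The fragmentation scenario described above (a large node freed, a small node carved out of it, the next large request forced elsewhere) can be made concrete with a toy model. This Python sketch is an assumption-laden illustration, not TeX's actual get_node(): the `Arena` class, the most-recently-freed-first search order and the block sizes are all invented for the example.

```python
class Arena:
    """Toy free-list allocator; blocks are (start, size) pairs."""

    def __init__(self, coalesce):
        self.coalesce = coalesce  # merge adjacent free blocks on release?
        self.free = []            # free blocks, most recently freed first
        self.top = 0              # high-water mark of memory ever used

    def alloc(self, size):
        # Take the first free block that is large enough, splitting it.
        for i, (start, free_size) in enumerate(self.free):
            if free_size >= size:
                if free_size == size:
                    del self.free[i]
                else:
                    self.free[i] = (start + size, free_size - size)
                return start
        # Nothing fits: grow the arena.
        start = self.top
        self.top += size
        return start

    def release(self, start, size):
        if self.coalesce:
            for other in list(self.free):
                o_start, o_size = other
                if o_start + o_size == start:    # free neighbour on the left
                    self.free.remove(other)
                    start, size = o_start, o_size + size
                elif start + size == o_start:    # free neighbour on the right
                    self.free.remove(other)
                    size += o_size
        self.free.insert(0, (start, size))


def simulate(coalesce, rounds=100, large=10, small=4):
    """Alternate large and small allocations; report memory and list length."""
    arena = Arena(coalesce)
    for _ in range(rounds):
        big = arena.alloc(large)
        arena.release(big, large)
        little = arena.alloc(small)   # carved out of the block just freed
        arena.release(little, small)
    return arena.top, len(arena.free)


if __name__ == "__main__":
    print("without coalescing:", simulate(False))
    print("with coalescing:   ", simulate(True))
```

In this model nothing is ever lost, yet without coalescing the arena grows by one large block per round and the free list gets two entries longer each time, so every failed search for a large block walks a longer list. With coalescing both stay constant. Whether LuaTeX's slowdown has the same cause is not established by this sketch.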
David Kastrup wrote:
I am afraid that I am too tired to dig into the allocation routines of TeX right now.
That's what I thought as well. I did some empirical tests to reach my conclusion and they could be skewed. But even if this is not the cause of all the slowness, it still needs to be resolved. Leaking memory is never a good thing.

Taco
Taco Hoekwater writes:
David Kastrup wrote:
I am afraid that I am too tired to dig into the allocation routines of TeX right now.
That's what I thought as well. I did some empirical tests to reach my conclusion and they could be skewed. But even if this is not the cause of all the slowness, it still needs to be resolved. Leaking memory is never a good thing.
I had one idea, namely dumping the format file after compiling the file, and see whether I find large repetitive patterns (in which case I'd look up their node type). Unfortunately, I got the following:

    luatex -ini -fmt ztex plain "\let\end\dump \input tex"
    This is luaTeX, Version 3.141592-snapshot-2007040210 (Web2C 7.5.6) (INITEX)
    (/usr/local/texlive/2007/texmf-dist/tex/plain/base/plain.tex
    Preloading the plain format: codes, registers, parameters, fonts, more fonts,
    macros, math definitions, output routines, hyphenation
    (/usr/local/texlive/2007/texmf/tex/generic/hyphen/hyphen.tex))
    (tex.tex (/usr/local/texlive/2007/texmf-dist/tex/plain/base/webmac.tex)
    *1 [3] [4] [5] [6] [7] [8] [9] *17 [10] [11]Segmentation fault (core dumped)

Now the LuaTeX documentation explicitly says that hyphenation in iniTeX may crash the system. Any idea how hard it would be to fix?

In the mean time, I'll probably switch off hyphenation and hope that the situation turns out similar enough for the bug to show.

-- 
David Kastrup
David Kastrup wrote:
Now the LuaTeX documentation explicitly says that hyphenation in iniTeX may crash the system. Any idea how hard it would be to fix?
Probably fairly easy, but there is little point in fixing code that will be removed soon. If you want to jump into this issue at this moment, just use Aleph instead. The whole problem is inherited from Omega 1 anyway. Taco
Taco Hoekwater writes:
David Kastrup wrote:
Now the LuaTeX documentation explicitly says that hyphenation in iniTeX may crash the system. Any idea how hard it would be to fix?
Probably fairly easy, but there is little point in fixing code that will be removed soon. If you want to jump into this issue at this moment, just use Aleph instead. The whole problem is inherited from Omega 1 anyway.
I am not interested in fixing bugs as an intellectual exercise. If you are planning on removing that code, no point in trying to get acquainted with the current code. The dump file now contains some large sections of zeros at the end, but I presume that those will just be Omega's 64k register arrays. There is vast seemingly repetitive content earlier on in the dump; I'll see whether I can make anything of it. Of course, a large number of character nodes is to be expected, but there might be more involved. -- David Kastrup
David Kastrup writes:
Taco Hoekwater writes:
David Kastrup wrote:
Now the LuaTeX documentation explicitly says that hyphenation in iniTeX may crash the system. Any idea how hard it would be to fix?
Probably fairly easy, but there is little point in fixing code that will be removed soon. If you want to jump into this issue at this moment, just use Aleph instead. The whole problem is inherited from Omega 1 anyway.
I am not interested in fixing bugs as an intellectual exercise. If you are planning on removing that code, no point in trying to get acquainted with the current code.
The dump file now contains some large sections of zeros at the end, but I presume that those will just be Omega's 64k register arrays. There is vast seemingly repetitive content earlier on in the dump; I'll see whether I can make anything of it. Of course, a large number of character nodes is to be expected, but there might be more involved.
There is a fair amount of directional whatsits in the dump, but it is in plausible relation to the other nodes. Lots of hlist nodes, too. The total size of the dump is about 8MB uncompressed. That does not look too leaky to me. Still, I tried clearing out all box registers and finish typesetting before dumping. If I understood dumping correctly, it was supposed to compact memory before dumping. One also has to keep in mind that tex.tex can be typeset with a TeX that has 64k words of memory. So those 8MB, of which the main part is main memory, could still be considered fishy. But it seems like there is leakage of general average material. dir nodes are just part of the matter. Doing the test with normal TeX leads to a format 300k in size. Doing it with PDFTeX results in 360k. So the 8MB from LuaTeX _do_ look out of kilter, even considering the larger register arrays at the end of the dump. -- David Kastrup
David Kastrup writes:
David Kastrup writes:
Taco Hoekwater writes:
David Kastrup wrote:
Now the LuaTeX documentation explicitly says that hyphenation in iniTeX may crash the system. Any idea how hard it would be to fix?
Probably fairly easy, but there is little point in fixing code that will be removed soon. If you want to jump into this issue at this moment, just use Aleph instead. The whole problem is inherited from Omega 1 anyway.
I am not interested in fixing bugs as an intellectual exercise. If you are planning on removing that code, no point in trying to get acquainted with the current code.
The dump file now contains some large sections of zeros at the end, but I presume that those will just be Omega's 64k register arrays. There is vast seemingly repetitive content earlier on in the dump; I'll see whether I can make anything of it. Of course, a large number of character nodes is to be expected, but there might be more involved.
There is a fair amount of directional whatsits in the dump, but it is in plausible relation to the other nodes. Lots of hlist nodes, too. The total size of the dump is about 8MB uncompressed. That does not look too leaky to me. Still, I tried clearing out all box registers and finish typesetting before dumping. If I understood dumping correctly, it was supposed to compact memory before dumping. One also has to keep in mind that tex.tex can be typeset with a TeX that has 64k words of memory. So those 8MB, of which the main part is main memory, could still be considered fishy.
But it seems like there is leakage of general average material. dir nodes are just part of the matter.
Doing the test with normal TeX leads to a format 300k in size. Doing it with PDFTeX results in 360k.
So the 8MB from LuaTeX _do_ look out of kilter, even considering the larger register arrays at the end of the dump.
What is this?

    @ Conversely, when \TeX\ is finished on the current level, the former
    state is restored by calling |pop_nest|. This routine will never be
    called at the lowest semantic level, nor will it be called unless |head|
    is a node that should be returned to free memory.

    @p procedure pop_nest; {leave a semantic level, re-enter the old}
    begin
    if local_par<>null then begin
      if local_par_bool then begin end {|tail_append(local_par)|}
      else free_node(local_par,local_par_size);
      end;
    free_avail(head); decr(nest_ptr); cur_list:=nest[nest_ptr];
    end;

The local_par stuff pretty much looks like an _intentional_ memory leak. But it is probably not triggered by tex.tex.

I think I am getting annoyed by the hackish appearance of Omega, and the quality of the code documentation does not exactly help.

I guess that's enough headaches for today.

-- 
David Kastrup
Taco Hoekwater writes:
After some testing with a profiled binary, it turned out that LuaTeX spends nearly 90% of its run time inside the get_node() function when it is processing tex.tex completely (535 pages), but only 10% if it runs only the first 20 or so pages.
Interesting.
Since get_node() is tex's internal version of malloc() more or less, I deduced that it was likely that there was an internal memory leak (unfreed node) that makes it harder for get_node() to find a new one when it is asked.
Running a test file with \tracingstats=2 shows the variable memory usage gradually going up in both luatex and aleph, but not at all in pdftex, so the leak probably comes from omega. That makes the 'dir_node' the most likely suspect. More later.
Well, looking through, I noticed something of sub-Knuthian quality:

    @d push_dir(#)==
    begin dir_tmp:=new_dir(#);
    link(dir_tmp):=dir_ptr; dir_ptr:=dir_tmp;
    dir_ptr:=dir_tmp;
    end

Note the duplication of the last assignment. Not tragic, but ugly.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum
David Kastrup wrote:
Well, looking through, I noticed something of sub-Knuthian quality:
    @d push_dir(#)==
    begin dir_tmp:=new_dir(#);
    link(dir_tmp):=dir_ptr; dir_ptr:=dir_tmp;
    dir_ptr:=dir_tmp;
    end
Note the duplication of the last assignment. Not tragic, but ugly.
That could be related to the deallocation error. The node free-ing code tests whether dir_ptr is equal to null, but in fact there are often two dir_ptr-s: a global one and a local one. I will get back to this later, after the more serious issues are solved. It'll take me a loong time to find out why the code does what it does, let alone where and why it does it wrongly. Taco
Taco Hoekwater writes:
David Kastrup wrote:
Well, looking through, I noticed something of sub-Knuthian quality:
    @d push_dir(#)==
    begin dir_tmp:=new_dir(#);
    link(dir_tmp):=dir_ptr; dir_ptr:=dir_tmp;
    dir_ptr:=dir_tmp;
    end
Note the duplication of the last assignment. Not tragic, but ugly.
That could be related to the deallocation error. The node free-ing code tests whether dir_ptr is equal to null, but in fact there are often two dir_ptr-s: a global one and a local one.
With a program as simple and readable as TeX, such shortcuts make a lot of sense and add a new dimension of fun to debugging. A good thing none of the Omega developers will be at EuroTeX this year, I'd have a mind to tell them... Incidentally, I noticed that quite a few variables were renamed in LuaTeX as compared to the upstream tex.web (underscores added or removed). Again, I don't know whether eTeX, Aleph, Omega, PDFTeX or whoever else are to blame, but it increases the size of diffs, and if done globally, carries the potential to introduce problems like global/local variable shadowing.
I will get back to this later, after the more serious issues are solved. It'll take me a loong time to find out why the code does what it does, let alone where and why it does it wrongly.
Who are you telling. What a mess. -- David Kastrup
David Kastrup wrote:
A good thing none of the Omega developers will be at EuroTeX this year, I'd have a mind to tell them...
I'll send an email off to Gabor, since this problem will probably be present in Omega 2 as well.
Incidentally, I noticed that quite a few variables were renamed in LuaTeX as compared to the upstream tex.web (underscores added or removed).
I did that, I did it absolutely on purpose, it helped me catch a few bugs that went unnoticed *because of* tangle's underscore removal, and I do not care at all about the size of diff files. I will now send that message to Gabor Bella, and then go back to my normal mode of actually trying to do actual coding. Best wishes, Taco
Taco Hoekwater writes:
David Kastrup wrote:
A good thing none of the Omega developers will be at EuroTeX this year, I'd have a mind to tell them...
I'll send an email off to Gabor, since this problem will probably be present in Omega 2 as well.
Incidentally, I noticed that quite a few variables were renamed in LuaTeX as compared to the upstream tex.web (underscores added or removed).
I did that, I did it absolutely on purpose, it helped me catch a few bugs that went unnoticed *because of* tangle's underscore removal, and I do not care at all about the size of diff files.
Hm. I'd have expected that making an identifier table would have been the more robust solution rather than changing the source around. Can't one make web2c or tangle deliver that information? Do you think that the duplicated assignment I pointed out earlier might be due to different spellings of the dirptr variable before you made those changes?
I will now send that message to Gabor Bella, and then go back to my normal mode of actually trying to do actual coding.
I am aware that these sidetracks take a lot of time. Still, we _do_ get some bugs flattened in the process. -- David Kastrup
Taco Hoekwater writes:
David Kastrup wrote:
Do you think that the duplicated assignment I pointed out earlier might be due to different spellings of the dirptr variable before you made those changes?
Of course not. Otherwise the bug wouldn't be present in Aleph.
If tangle removes the underlines... -- David Kastrup
Taco Hoekwater writes:
David Kastrup wrote:
Do you think that the duplicated assignment I pointed out earlier might be due to different spellings of the dirptr variable before you made those changes?
Of course not. Otherwise the bug wouldn't be present in Aleph.
The duplicated assignment is there in omdir.ch in Aleph already, and with identical spelling both times. So tangle's underscore removal would not play into this here. -- David Kastrup
Taco Hoekwater writes:
David Kastrup wrote:
Incidentally, I noticed that quite a few variables were renamed in LuaTeX as compared to the upstream tex.web (underscores added or removed).
I did that, I did it absolutely on purpose, it helped me catch a few bugs that went unnoticed *because of* tangle's underscore removal, and I do not care at all about the size of diff files.
Stupid question: would some of the following have helped? In particular the -strict option sounds helpful.

File: web2c.info, Node: tangle invocation, Next: weave invocation, Up: WEB

8.1 Tangle: Translate WEB to Pascal
===================================

Tangle creates a compilable Pascal program from a WEB source file (*note WEB::). Synopsis:

    tangle [OPTION]... WEBFILE[.web] [CHANGEFILE[.ch]]

[...]

`-length=NUMBER'
    The number of characters that are considered significant in an identifier. Whether underline characters are counted depends on the `-underline' option. The default value is 32, the original tangle used 7, but this proved too restrictive for use by Web2c.

`-lowercase'
`-mixedcase'
`-uppercase'
    These options specify the case of identifiers in the output of tangle. If `-uppercase' (`-lowercase') is specified, tangle will convert all identifiers to uppercase (lowercase). The default is `-mixedcase', which specifies that the case will not be changed.

`-underline'
    When this option is given, tangle does not strip underline characters from identifiers.

`-loose'
`-strict'
    These options specify how strict tangle must be when checking identifiers for equality. The default is `-loose', which means that tangle will follow the rules set by the case-smashing and underline options above. If `-strict' is set, then identifiers will always be stripped of underlines and converted to uppercase before checking whether they collide.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum
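The `-strict` rule quoted above (strip underlines, convert to uppercase, then compare) can be sketched in a few lines of Python. This is not tangle's code; `strict_key` and `find_collisions` are hypothetical helpers, and the identifier list in the usage note is made up for illustration.

```python
from collections import defaultdict

def strict_key(identifier):
    """Normalise as the -strict description says: no underlines, uppercase."""
    return identifier.replace("_", "").upper()

def find_collisions(identifiers):
    """Group identifiers that become equal after normalisation."""
    groups = defaultdict(set)
    for name in identifiers:
        groups[strict_key(name)].add(name)
    # Only keys with more than one distinct spelling are collisions.
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}
```

For example, `find_collisions(["dir_ptr", "dirptr", "font_ec", "input_ln"])` reports `dir_ptr` and `dirptr` under the shared key `DIRPTR`, while `font_ec` and `input_ln` have no second spelling and are not reported.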
David Kastrup wrote:
Incidentally, I noticed that quite a few variables were renamed in LuaTeX as compared to the upstream tex.web (underscores added or removed).
I did that, I did it absolutely on purpose, it helped me catch a few bugs that went unnoticed *because of* tangle's underscore removal, and I do not care at all about the size of diff files.
Stupid question: would some of the following have helped? In particular the -strict option sounds helpful.
No, that would be even worse. ;-)

The problem was not identifier conflicts, but I wanted to get rid of automatic renaming of variables. By default, tangle removes underscores (web2c adds a 'z' prefix to a number of variables).

Quite a bit of luatex is being written in C (and will become noweb source eventually), and has to interface with lua and the pascal web source at the same time. It was annoying and confusing that, for instance, the web identifier |input_ln| as seen in the pascal web source (a web2c library function) really is defined as |inputln|, without the underscore. Similarly, the pascal array |font_ec| was translated automatically into |fontec|.

I now run (lua)tangle with --underline, so that all _ characters are kept, thereby keeping a much closer connection between the pascal web and the generated C code. But it also meant that I had to go through the web and change the web2c and pdftex library function identifiers back to how they should have been coded in the first place: not using underscores at all.

Overall, I succeeded fairly well. There are about a dozen identifiers left that I am not happy about, and I have let those slip because I did not want to change the web2c library and tools. For example, fixwrites.c depends on pascal's |write_ln| being renamed to |writeln|.

Best wishes, Taco
David Kastrup wrote:
Hi, latex -ini latex.ltx gives
real 0m0.498s user 0m0.080s sys 0m0.408s
here the format generation ratio is pdftex : luatex = 11 : 15 (luatex format is larger due to lua code)
luatex -ini latex.8bit (where the latter just sets up the bytes->utf8 conversion through
\directlua0{ callback.register("process_input_buffer", function(buf) return unicode.utf8.char(unicode.latin1.byte(buf,1,-1)) end)}
    \directlua0{
      do
        local uuc = unicode.utf8.char
        local ulb = unicode.latin1.byte
        callback.register("process_input_buffer",
          function(buf) return uuc(ulb(buf,1,-1)) end)
      end
    }

return uuc(ulb(buf)) probably also works ok
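For readers unfamiliar with what this callback is meant to achieve: each input line is treated as Latin-1 bytes and re-encoded as UTF-8. A minimal Python sketch of that transformation follows; `latin1_to_utf8` is a name chosen here for illustration, and it only mirrors the intent of the Lua code, not LuaTeX's callback machinery.

```python
def latin1_to_utf8(buf: bytes) -> bytes:
    """Reinterpret every input byte as a Latin-1 code point, emit UTF-8."""
    # Latin-1 maps bytes 0x00..0xFF one-to-one onto U+0000..U+00FF,
    # so decoding can never fail; encoding then yields 1 or 2 bytes each.
    return buf.decode("latin-1").encode("utf-8")


if __name__ == "__main__":
    line = b"caf\xe9"            # "café" in Latin-1
    print(latin1_to_utf8(line))  # the 0xE9 byte becomes the pair 0xC3 0xA9
```

Bytes below 0x80 pass through unchanged, so pure ASCII input such as tex.tex should be unaffected by the conversion apart from the per-line call overhead.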
Some of it might be the difference in table sizes for the plain TeX executable. But the factor of 25 seems to fit rather well also with the LaTeX format test. Any idea where the bulk of this would be from? What would somebody wanting to use LuaTeX in a production environment do (apart from getting his head examined, I mean)?
maybe sparse tables in the tex part

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74
www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------
David Kastrup wrote:
Hi, latex -ini latex.ltx gives
real 0m0.498s user 0m0.080s sys 0m0.408s
luatex -ini latex.8bit (where the latter just sets up the bytes->utf8 conversion through
\directlua0{ callback.register("process_input_buffer", function(buf) return unicode.utf8.char(unicode.latin1.byte(buf,1,-1)) end)}
like in the last mail and then loads LaTeX) is
real 0m2.730s user 0m1.836s sys 0m0.700s
Sadly, I have no good way to see how much of this is caused by the callback, and how much is due to other LuaTeX particularities.
However, one can run tex.tex through LuaTeX with and without this translation, and just run the normal TeX engine, too.
tex tex.tex gives us real 0m1.253s user 0m0.496s sys 0m0.664s
luatex tex.tex gives us real 0m14.329s user 0m13.413s sys 0m0.872s
    pdftex -progname=context --ini *x.tex
    luatex -progname=context --ini *x.tex

with x ==

    \input plain \bye

gives

    luatex: 1.021
    pdftex: 0.941

so, actually luatex is not much slower; saving the format takes a bit more time (I remember that we did timings to determine the compress level and accepted a 30% slowdown and slightly less than maximum compression)

    luatex: 1.071
    pdftex: 1.252

when testing you need to keep an eye on the memory settings as well as (e)tex mode; luatex is always etex

Hans
participants (3)
- David Kastrup
- Hans Hagen
- Taco Hoekwater