Re: [NTG-pdftex] processing speed
On 1 July 2009 Jérome Laurens wrote:
Hahn found a bug in synctex, I fixed it and made the tests with the stable branch of pdftex
Hi Jérome, of course, if there had been a bug in synctex, it's good that it is fixed now. However, I doubt that synctex was responsible for the problem I reported. There must still be another problem. Isn't synctex very new technology? I could reproduce the problem with older versions of TeX Live. Please try: http://tug.org/~kotucha/testpgfplots.tex.gz I suppose that it's a better test file than the one I prepared first. What I've found out so far is that 1. the problem is not new, 2. luatex has the same problem, 3. the string pool is growing. Because I don't know what exactly causes the problem, I can't even provide a minimal test file. "Unfortunately" pgfplots depends on e-TeX and I don't have a "Knuth's TeX + e-TeX" binary. Maybe there exists something like this in older TL releases. Maybe even Knuth's TeX has this problem, but I don't know to prove it. Hans and Taco said that the string pool is the culprit. Yes, after adding some tiny things to the .tex file, I had to create a new format file with an increased string pool in order to get PDF files at all. It's definitely the string pool which causes trouble. I don't know anything about TeX's innards but the name "string pool" implies that it contains human readable stuff. If this is true, isn't it possible to convince TeX to dump the string pool to a file or to standard output, for instance, whenever a page is shipped out? This probably helps to find out what's going on. Anyway, please try http://tug.org/~kotucha/testpgfplots.tex.gz instead of mktestfiles.gz. Regards, Reinhard -- ---------------------------------------------------------------------------- Reinhard Kotucha Phone: +49-511-3373112 Marschnerstr. 25 D-30167 Hannover mailto:reinhard.kotucha@web.de ---------------------------------------------------------------------------- Microsoft isn't the answer. Microsoft is the question, and the answer is NO. ----------------------------------------------------------------------------
Reinhard Kotucha wrote:
3. the string pool is growing.
this is because filenames are added to the string pool and not freed; so, the more files, the larger the pool gets; i though that etex kind of fixed this but it might fail when strings are added before it can reclaim the space taken by the filename Hans ----------------------------------------------------------------- Hans Hagen | PRAGMA ADE Ridderstraat 27 | 8061 GH Hasselt | The Netherlands tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com | www.pragma-pod.nl -----------------------------------------------------------------
On 2 July 2009 Hans Hagen wrote:
Reinhard Kotucha wrote:
3. the string pool is growing.
this is because filenames are added to the string pool and not freed; so, the more files, the larger the pool gets; i though that etex kind of fixed this but it might fail when strings are added before it can reclaim the space taken by the filename
Hi Hans, there might be a problem with file names but please try: http://tug.org/~kotucha/testpgfplots.tex.gz It demonstrates the problem but doesn't \input any file. Regards, Reinhard -- ---------------------------------------------------------------------------- Reinhard Kotucha Phone: +49-511-3373112 Marschnerstr. 25 D-30167 Hannover mailto:reinhard.kotucha@web.de ---------------------------------------------------------------------------- Microsoft isn't the answer. Microsoft is the question, and the answer is NO. ----------------------------------------------------------------------------
Reinhard Kotucha wrote:
On 2 July 2009 Hans Hagen wrote:
Reinhard Kotucha wrote:
3. the string pool is growing.
this is because filenames are added to the string pool and not freed; so, the more files, the larger the pool gets; i though that etex kind of fixed this but it might fail when strings are added before it can reclaim the space taken by the filename
Hi Hans, there might be a problem with file names but please try:
http://tug.org/~kotucha/testpgfplots.tex.gz
It demonstrates the problem but doesn't \input any file.
i tried but the pgf in my texlive tree is too old (\if related messages in the log) and i can't update right now in order to solve this problem we need to know what happens at the tex end which boils down to stripping the the smallest failing example maybe a tracingall will give you a clue (lots of literals or lost of long csnames or so) unless i have a plain example that also shows it in context i fear i cannot be of much help Hans ----------------------------------------------------------------- Hans Hagen | PRAGMA ADE Ridderstraat 27 | 8061 GH Hasselt | The Netherlands tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com | www.pragma-pod.nl -----------------------------------------------------------------
On 2 July 2009 Hans Hagen wrote:
in order to solve this problem we need to know what happens at the tex end which boils down to stripping the the smallest failing example
Hi Hans, the problem is that I don't know how to strip it down. The first test file I provided was such an attempt, but it turned out that it was misleading.
maybe a tracingall will give you a clue (lots of literals or lost of long csnames or so)
Yes, this is a good idea, I'll try it tomorrow.
unless i have a plain example that also shows it in context i fear i cannot be of much help
I just provided one: http://tug.org/~kotucha/testpgfplotsplain.tex.gz If you are using pdftex, it will show times in scaled seconds. If you you are using luatex, the times are in seconds. But if you are using a recent version of luatex, you probably have to replace \directlua0 by \directlua in the script. Regards, Reinhard -- ---------------------------------------------------------------------------- Reinhard Kotucha Phone: +49-511-3373112 Marschnerstr. 25 D-30167 Hannover mailto:reinhard.kotucha@web.de ---------------------------------------------------------------------------- Microsoft isn't the answer. Microsoft is the question, and the answer is NO. ----------------------------------------------------------------------------
On Thu, Jul 02, 2009 at 12:31:20AM +0200, Reinhard Kotucha wrote:
On 1 July 2009 Jérome Laurens wrote:
Hahn found a bug in synctex, I fixed it and made the tests with the stable branch of pdftex
Hi Jérome, of course, if there had been a bug in synctex, it's good that it is fixed now.
that is certainly a bug: memory was allocated and deallocated with inconsistent size (allocated with size=4memory_word and deallocated with size=2memory_word). This didn't cause much visible harm because: - the number of this occurence is low (1 per each paragraph) - the memory was deallocated with smaller size (the other way around would have caused weird and hard-to-catch problems). After running Reinhard's test for a while, the variable-size memory is defragmented with a lot of nodes with size=2, which made memory allocation slower and slower. It's questionable whether synctex or pdftex should fix this. I asked Jerome for this change because it seems the simplest way to fix it (thanks to Taco's suggestion). It could be done in pdftex too, but would be more complicated and less efficient. I have verified this with gprof, running Reinhard's test file (slightly modified): ,-------- | \hbadness10000 | \vbadness10000 | \ifx\pdfoutput\undefined\else\pdfoutput=0 \fi | \catcode`\%=12 \catcode`\&=12 | | \newcount\n | \loop | \input sometext.tex | \immediate\write18{date +%s >&2} | \advance\n 1 | \ifnum\n < 100 \repeat | \bye `-------- Result for pdftex stable, without the fix: ,-------- | Each sample counts as 0.01 seconds. | % cumulative self self total | time seconds seconds calls s/call s/call name | 38.89 229.27 229.27 708325637 0.00 0.00 zgetnode | 25.85 381.66 152.39 90100100 0.00 0.00 ztrybreak | 5.85 416.17 34.51 1 34.51 587.29 maincontrol | 4.87 444.85 28.68 1253467536 0.00 0.00 zbadness | 3.42 465.00 20.15 4566038 0.00 0.00 hlistout | 3.40 485.03 20.03 4566038 0.00 0.00 zhpack | 3.22 503.99 18.96 100000 0.00 0.00 zlinebreak | 2.58 519.22 15.23 140432176 0.00 0.00 zmovement | 2.09 531.52 12.30 464226970 0.00 0.00 getnext | 1.58 540.82 9.30 354818 0.00 0.00 zflushnodelist | 1.41 549.14 8.32 708325636 0.00 0.00 zfreenode | ... `-------- Result for pdftex stable plus the fix: ,-------- | Each sample counts as 0.01 seconds. | % cumulative self self total | time seconds seconds calls s/call s/call name | 38.97 130.20 130.20 90100100 0.00 0.00 ztrybreak | 9.12 160.66 30.46 1 30.46 331.61 maincontrol | 8.04 187.51 26.85 708325637 0.00 0.00 zgetnode | 4.83 203.65 16.14 4566038 0.00 0.00 zhpack | 4.79 219.66 16.01 1253467536 0.00 0.00 zbadness | 4.74 235.51 15.85 4566038 0.00 0.00 hlistout | 4.15 249.39 13.88 100000 0.00 0.00 zlinebreak | 3.76 261.94 12.55 140432176 0.00 0.00 zmovement | 3.34 273.09 11.16 464226970 0.00 0.00 getnext | 3.22 283.86 10.77 708325636 0.00 0.00 zfreenode | ... `-------- Result for tex: ,-------- | Each sample counts as 0.01 seconds. | % cumulative self self total | time seconds seconds calls s/call s/call name | 37.44 108.71 108.71 90100100 0.00 0.00 ztrybreak | 9.11 135.16 26.45 1 26.45 289.93 maincontrol | 8.86 160.90 25.73 708325637 0.00 0.00 zgetnode | 8.37 185.20 24.30 1253467536 0.00 0.00 zbadness | 5.30 200.60 15.40 4566038 0.00 0.00 zhpack | 4.83 214.63 14.04 4566038 0.00 0.00 hlistout | 4.12 226.58 11.95 100000 0.00 0.00 zlinebreak | 3.91 237.93 11.35 140432176 0.00 0.00 zmovement | 3.62 248.46 10.53 464226970 0.00 0.00 getnext | 2.75 256.44 7.98 708325636 0.00 0.00 zfreenode `-------- without the fix, pdftex spends a lot of time of zgetnode(). With the fix, the result looks rather similar to tex.
However, I doubt that synctex was responsible for the problem I reported. There must still be another problem. Isn't synctex very new technology? I could reproduce the problem with older versions of TeX Live.
yes it's quite possible that there is another problem.
Please try:
http://tug.org/~kotucha/testpgfplots.tex.gz
I suppose that it's a better test file than the one I prepared first.
it seems it shows a different problem. I let it run for a while and see that the time processing each plot is slowly increasing, but sometime it goes down as well. Here is what I get after 100 plots: ,-------- | ===851968=== | ===786432=== | ===851968=== | ===917504=== | ===917504=== | ===983040=== | ===1048576=== | ===2097152=== | ... | ===917504=== | ===851968=== | ===917504=== | ===851968=== | ===917504=== | ===851968=== `-------- (tested with pdftex stable, without the above fix).
What I've found out so far is that
1. the problem is not new,
2. luatex has the same problem,
3. the string pool is growing.
Because I don't know what exactly causes the problem, I can't even provide a minimal test file. "Unfortunately" pgfplots depends on e-TeX and I don't have a "Knuth's TeX + e-TeX" binary. Maybe there exists something like this in older TL releases.
Maybe even Knuth's TeX has this problem, but I don't know to prove it.
Hans and Taco said that the string pool is the culprit. Yes, after adding some tiny things to the .tex file, I had to create a new format file with an increased string pool in order to get PDF files at all. It's definitely the string pool which causes trouble.
I don't know anything about TeX's innards but the name "string pool" implies that it contains human readable stuff. If this is true, isn't it possible to convince TeX to dump the string pool to a file or to standard output, for instance, whenever a page is shipped out? This probably helps to find out what's going on.
yes it can be done. However I suggest we make clear first what the problem is. From your 2nd test file, it seems the processing time is not slower that much. Anyway, I will rerun it with a higher number of plots to see if it makes any difference. Regards, Thanh PS: fwiw, the gprof report for testpgfplots.tex looks like this (after making 100 plots): ,-------- | Each sample counts as 0.01 seconds. | % cumulative self self total | time seconds seconds calls s/call s/call name | 32.39 251.25 251.25 6082692346 0.00 0.00 getnext | 15.36 370.37 119.12 1064097580 0.00 0.00 zflushlist | 8.43 435.77 65.40 3911156736 0.00 0.00 getavail | 6.47 485.99 50.22 376907688 0.00 0.00 macrocall | 5.93 531.95 45.96 54922728 0.00 0.00 thetoks | 3.24 557.07 25.12 1404732324 0.00 0.00 endtokenlist | 2.65 577.61 20.55 156920995 0.00 0.00 zscantoks | 2.57 597.55 19.94 543147090 0.00 0.00 expand | 2.31 615.50 17.96 4801962525 0.00 0.00 gettoken | 1.61 627.97 12.47 1426527970 0.00 0.00 getxtoken | 1.61 640.44 12.47 899138347 0.00 0.00 zbegintokenlist | 1.52 652.26 11.82 342016689 0.00 0.00 prefixedcommand | 1.50 663.87 11.61 219961927 0.00 0.00 conditional | 1.34 674.27 10.40 8580 0.00 0.00 checkoutervalidity | 1.16 683.26 9.00 217157551 0.00 0.00 passtext | 1.10 691.76 8.50 221130467 0.00 0.00 zgetnode | 1.00 699.54 7.78 505593977 0.00 0.00 backinput | 0.88 706.33 6.79 1 6.79 770.07 maincontrol | ... `--------
participants (3)
-
Hans Hagen
-
Reinhard Kotucha
-
The Thanh Han