Reinhard Kotucha wrote:
Does the string pool contain the hash for control sequences? This would explain the behavior. From texmf.cnf:
% Max number of characters in all strings, including all error messages,
% help texts, font names, control sequences. These values apply to TeX and MP.
pool_size = 1250000
Maybe PGF creates a lot of control sequences at runtime, using \csname and \endcsname in macros. This would increase the control sequence hash and then it takes more time to find a particular macro.
But if they are created dynamically at runtime, they are created within a group (\begin{tikzpicture}...\end{tikzpicture}) and I expect that everything created within a particular group is removed from the hash after \endgroup.
I no longer remember if that is the case, but DEK's words that I cited earlier were extracted from a much longer message which I now repeat /verbatim/ below; it certainly makes references to the impact of control sequences on the string pool.
** Phil.
--------
File name overflow of string pool
[ Since this report, I have seen a couple of other reports on this topic in the electronic discussion lists, mostly from Europe. While not a bug, it can certainly be a serious inconvenience. A couple of the reports have mentioned building nonstandard versions of TeX with a separate pool of file names; not good for compatibility. ]
Date: Fri, 12 Jul 91 19:06 +0200
From: "Johannes L. Braams"
Subject: Bug/misfeature in TeX?

We have run into a problem with TeX. We have an application where we would like to \input about 2400 files. We can't do that because TeX runs out of string pool space. This application is rather important because it concerns the reports the lab has to make each quarter of a year.
When I studied TeX the program to find out what happens when a file is being \input I found that the name of the file is stored in string pool. AND it never gets removed from the string pool (as far as I could find out). What I don't understand is why filenames are written to string pool in the first place. Isn't it possible to use some kind of stack or array mechanism to store filenames? It should then be possible to free the memory used to store a filename when the file gets closed and the filename is no longer needed.
Do you know the answer or someone who does? Or is this a bug? I would rather call it a design flaw actually.
Regards,
Johannes Braams
PTT Research Neher Laboratorium,
P.O. box 421, 2260 AK Leidschendam, The Netherlands.
Phone : +31 70 3325051
E-mail : JL_Braams@pttrnl.nl
Fax : +31 70 3326477
-------
Date: Mon, 15 Jul 91 01:59:22 BST
From: Chris Thompson
Subject: Re: Bug/misfeature in TeX?

I agree that it's a design flaw, not a bug. People do keep falling over it from time to time, though, so maybe Don could be asked to think about it again. I suspect, however, that there is no easy fix, for reasons I will explain below.
Johannes asks why the names go in the string pool in the first place: the answer to that is "why not?"... it is the convenient place to keep more or less arbitrarily long strings. The space occupied by things added to the string pool can be reclaimed, provided it is done straight away, before other parts of TeX have been exercised that may add other strings (especially, control sequence names) to the pool. There are two types of file name to think about (neither of which are reclaimed at the moment, with one partial---and wrong---exception):
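The "reclaim only straight away" constraint can be illustrated with a toy model. The Python sketch below is not TeX's code; the class and method names (StringPool, make_string, flush_if_top) are invented for illustration. It assumes only what the message describes: strings sit back to back in one array, so space can be handed back only for the string currently on top.

```python
class StringPool:
    """Toy model of a TeX-style string pool (illustrative, not tex.web)."""

    def __init__(self, pool_size):
        self.pool_size = pool_size
        self.pool = []      # characters of all strings, stored back to back
        self.start = [0]    # start[i] = offset of string i; last entry = next free offset

    def make_string(self, text):
        """Append text at the top of the pool and return its string number."""
        if len(self.pool) + len(text) > self.pool_size:
            raise MemoryError("pool size exceeded")
        self.pool.extend(text)
        self.start.append(len(self.pool))
        return len(self.start) - 2

    def flush_if_top(self, s):
        """Reclaim string s, but only if it is the most recently made string."""
        if s != len(self.start) - 2:
            return False            # something was added on top; space is stuck
        self.start.pop()
        del self.pool[self.start[-1]:]
        return True

    def used(self):
        return len(self.pool)
```

If a file name is flushed immediately after use, the space comes back. If a new control sequence name has been added in between, flush_if_top refuses and the file name stays in the pool.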
1. The 1, 2 or 3 strings generated by |scan_file_name|. Usually these are used in some implementation-dependent way to open a file, and maybe then as arguments to |*_make_name_string|, and are then never needed again; and all this would usually happen straight away. Exception: deferred (non-\immediate) \openout's.
2. The string generated by |*_make_name_string|. For things like the log and DVI files, this has to be kept for ever (printing them is almost the last thing TeX does). The interesting case, however, is \input. The string is printed (immediately), and then stored in the |name_field| of the current input stack entry. *Almost* the only thing TeX uses it for thereafter is as a number > 17 (to distinguish the case of an input level being an \input file, as opposed to terminal input or a \read level). The sole exception is in section 84 where it is used to deal with the "E" response to the error prompt: in distribution TeX as part of a message, but in practice as input to the implementation-dependent way of invoking an editor.
(BEGIN ASIDE
The ``partial and wrong exception'' is the code in section 537 introduced by change 283. |start_input| reclaims the space occupied by the result of |a_make_name_string|, if that is still the top string in the pool, and replaces it by the `name' part of the results of |scan_file_name|. I have had to undo this "fix" in my implementations: the *only* thing that the ``file name'' is needed for is as an argument to the editor, and it is an unwarranted assumption that
a. The values of the `area' and `extension' parts of the name are irrelevant to that purpose, and
b. The output of |a_make_name_string| doesn't contain extra information, available as a result of the opening process, that may also be relevant.
END ASIDE)
In theory the contents of the strings of type 2 for \input files could be kept on some sort of separate stack, as Johannes suggests (parallel to the |input_file| and |line_stack| arrays), but this would be quite convoluted and involve a lot of duplication of code. More plausible would be an attempt to reclaim them if they are still the top string in the pool when the file is closed (in |end_file_reading|); this isn't so unlikely in cases like Johannes'... presumably not all 2400 files can use never-before-encountered control sequences, or he will be running out of other things besides the string pool!
The strings of type 1 create a difficulty, however, unless they can be got rid of just after the call of |a_make_name_string| (a certain amount of permuting of the string pool would be required to do that). If they, also, are to be got rid of when the file is closed, again subject to the condition that they are at the top of the pool, one will have to (at least) remember how many of them there were.
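A rough Python sketch of the "reclaim at close if still on top" idea follows, covering both string types by recording the pool position before and after the open. All names here (InputPool, open_input, close_input, saved) are hypothetical, and the per-level record is an assumption about how "remember how many of them there were" might be done; this is a model of the scheme, not a change file for TeX.

```python
class InputPool:
    """Toy pool that reclaims file-name strings when an input level closes."""

    def __init__(self, max_in_open=15):      # nesting limit is an assumed value
        self.pool = []                        # characters, back to back
        self.start = [0]                      # string start offsets
        self.in_open = 0                      # current \input nesting depth
        self.saved = [None] * (max_in_open + 1)   # hypothetical per-level record

    def str_ptr(self):
        return len(self.start) - 1            # number of the next string to be made

    def make_string(self, text):
        self.pool.extend(text)
        self.start.append(len(self.pool))

    def open_input(self, area, name, ext):
        self.in_open += 1
        before = self.str_ptr()
        for part in (area, name, ext):        # the 1, 2 or 3 strings of type 1
            if part:
                self.make_string(part)
        self.make_string(area + name + ext)   # the type 2 string
        self.saved[self.in_open] = (before, self.str_ptr())

    def close_input(self):
        before, after = self.saved[self.in_open]
        self.in_open -= 1
        if self.str_ptr() != after:
            return False                      # other strings were added on top
        del self.start[before + 1:]           # drop type 1 and type 2 together
        del self.pool[self.start[before]:]
        return True
```

In this model, opening and closing 2400 files that define no new control sequences leaves the pool empty. A single new control sequence name created while a file is open pins that file's name strings in the pool for good.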
Some of this would, in fact, be rather easier in METAFONT than TeX. METAFONT's string pool entries have a use count, and reclaiming space consists of purging consecutive entries at the top of the pool whose use counts have all fallen to zero. One could easily arrange that the strings of type 1 had use counts of zero after the opening process was over, and that the strings of type 2 for "input" files had a use count of 1 which was decremented to 0 at close time; then the right things would happen more or less automatically. However, TeX *doesn't* have such use counts, and I don't really suppose Don is going to introduce them in order to solve this problem.
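The use-count scheme described above can be modelled in a few lines of Python. This is only a sketch of the behaviour as the message describes it, not METAFONT's actual code; CountedPool, release and the list-of-pairs layout are invented for the example.

```python
class CountedPool:
    """Toy string pool with use counts, purged from the top downwards."""

    def __init__(self):
        self.strings = []                 # each entry is [text, use_count]

    def make_string(self, text):
        self.strings.append([text, 1])
        return len(self.strings) - 1

    def release(self, s):
        """Drop one reference to string s, then purge dead strings on top."""
        assert self.strings[s][1] > 0
        self.strings[s][1] -= 1
        while self.strings and self.strings[-1][1] == 0:
            self.strings.pop()

    def used(self):
        return sum(len(text) for text, _ in self.strings)


pool = CountedPool()
name = pool.make_string("report1")        # type 1 strings
ext = pool.make_string(".mf")
full = pool.make_string("report1.mf")     # type 2 string, held while the file is open
pool.release(name)                        # counts fall to zero after opening,
pool.release(ext)                         # but the live type 2 string is on top
held_while_open = pool.used()
pool.release(full)                        # closing the file frees all three at once
```

The type 1 strings are dead as soon as the file is open, yet nothing is freed until the type 2 string above them is released; at that point the purge loop removes the whole run with no extra bookkeeping.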
Chris Thompson
-------
[ dek: I think the strings are also needed for font file names. For ordinary input files I put the special code into \S537 [which CET1 disabled] so that the Math Reviews could input lots of files. Of course there's a workaround (using the operating system to concatenate files!) but otherwise all I can suggest is a local change-file routine that tries to reclaim string space when closing files if the unneeded strings are still at the end of the string pool. You could introduce a new array indexed by 1..max_in_open to keep relevant status information if it isn't already present (see \S304). ]