David@lola.quinscape.zz wrote:
Taco Hoekwater writes:

> [...] This is discussed in the reference manual, so if you have not
> looked at that yet, please do so before replying to this message.
Dangerous advice since this gives me ideas...
Here is something I find worth giving a different API:
\subsubsection{\callback{token_filter}}
This callback allows you to change the fetching and preprocessing of any lexical token that enters \LUATEX, before \LUATEX\ executes or expands the associated command.
\startfunctioncall
function ()
    return table <token>
end
\stopfunctioncall
The calling convention for this callback is a bit more complicated than for most other callbacks. The function should either return a Lua table representing a valid to-be-processed token or token list, or something else like nil or an empty table.

If your Lua function does not return a table representing a valid token, it will be immediately called again, until it eventually does return a useful token or token list (or until you reset the callback value to nil). See the description of \callbacklib{token} for some handy functions to be used in conjunction with this callback.

If your function returns a single usable token, then that token will be processed by \LUATEX\ immediately. If the function returns a token list (a table consisting of a list of consecutive token tables), then that list will be pushed to the input stack as a completely new token list level, with its token type set to `inserted'. In either case, the returned token(s) will not be fed back into the callback function.
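For concreteness, a trivial pass-through filter under this existing interface might look as follows (a sketch only; it assumes the usual callback.register mechanism and the token library's \verb|get_next|, per the manual's pointer to \callbacklib{token}):

-- Minimal sketch of the existing interface: fetch one token and
-- return it unchanged, so the callback is a pure pass-through.
callback.register("token_filter", function()
  return token.get_next()
end)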
I think that I would like to propose a much more luatic solution:
If token_filter is set, it is called with one argument \verb|get_next|, the function that would originally have been used to fetch the next token.
token_filter should then call this function as often as it needs to (possibly zero times) and return one token to the caller.
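Under this convention, the trivial 1:1 case is just (a sketch of the proposed interface, not of anything that exists today):

-- Proposed convention (sketch): the filter receives the engine's
-- original fetch function and hands back exactly one token.
local function identity_filter(get_next)
  return get_next()
end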
If you need to read ahead and buffer tokens (as when simulating OTPs), the easiest way is to use something like the following as the filter function:
coroutine.wrap(function(get_next)
  while true do
    local token1 = get_next()
    if token1.cmd ~= "^" then
      get_next = coroutine.yield(token1)
    else
      local token2 = get_next()
      if token2.cmd ~= "^" then
        coroutine.yield(token1)
        get_next = coroutine.yield(token2)
      else
        local token3 = get_next()
        if token3.cmd ... then
          get_next = coroutine.yield(something)
        else
          coroutine.yield(token1)
          coroutine.yield(token2)
          get_next = coroutine.yield(token3)
        end
      end
    end
  end
end)
Ok, the code itself is nonsensical, but it should illustrate the working principle: if the filtering is not 1:1, one can use a coroutine for analysing the input, buffering, and producing the tokens. This approach also has the advantage that one can stack filter functions easily.
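For instance, stacking reduces to ordinary closure composition; a sketch, with hypothetical helper names:

-- Compose two filters under the proposed convention: "inner" reads
-- closest to the raw input, and "outer" sees inner's output as if
-- it were the engine's own fetch function.
local function stack_filters(outer, inner)
  return function(get_next)
    return outer(function() return inner(get_next) end)
  end
end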
The existing interface makes that much harder: I actually have no good idea how one would go about it.
One problem with this approach is that the lookahead kept internally within a coroutine will get lost when one switches the filter function out (not that the current approach fares better here). One solution might be to pass an artificial "EOF" token to the filter function as the last act before removing it from token_filter, and to accept a list of lookahead tokens as the return value.
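A sketch of that handoff, with a hypothetical is_eof marker on the artificial token:

-- Sketch (hypothetical protocol): on the artificial EOF token, the
-- filter surrenders its unconsumed lookahead as a list instead of a
-- single token, so it can be reinjected ahead of a replacement filter.
local filter = coroutine.wrap(function(get_next)
  local pending = {}            -- buffered lookahead tokens
  while true do
    local t = get_next()
    if t.is_eof then
      get_next = coroutine.yield(pending)  -- return leftovers as a list
      pending = {}
    else
      table.insert(pending, t)
      -- normal case: release the oldest buffered token
      get_next = coroutine.yield(table.remove(pending, 1))
    end
  end
end)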
The problem is that this is really slow, which renders it rather unusable; even the current implementation is already on the edge of acceptable. Why do you want to handle the ^'s? You can do that using the input line callback.

Hans

--
-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74
www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------