[Dev-luatex] token_filter semantics

Hans Hagen pragma at wxs.nl
Tue Mar 27 16:20:59 CEST 2007

David at lola.quinscape.zz wrote:
> Taco Hoekwater <taco at elvenkind.com> writes:
>> [...] This is discussed in the reference manual, so if you have not
>> looked at that yet, please do so before replying to this message.
> Dangerous advice since this gives me ideas...
> Here is something I find worth giving a different API:
>     \subsubsection{\callback{token_filter}}
>     This callback allows you to fetch and preprocess any
>     lexical token that enters \LUATEX, before \LUATEX\ executes or expands
>     the associated command.
>     \startfunctioncall
>          function ()
>             return table <token>
>          end
>     \stopfunctioncall
>     The calling convention for this callback is a bit more complicated than
>     for most other callbacks.  The function should either return a lua
>     table representing a valid to-be-processed token or tokenlist, or
>     something else like nil or an empty table.
>     If your lua function does not return a table representing a valid
>     token, it will be immediately called again, until it eventually does
>     return a useful token or tokenlist (or until you reset the callback
>     value to nil). See the description of \callbacklib{token} for some
>     handy functions to be used in conjunction with this callback.
>     If your function returns a single usable token, then that token will
>     be processed by \LUATEX\ immediately. If the function returns a token
>     list (a table consisting of a list of consecutive token tables), then
>     that list will be pushed to the input stack as a completely new token
>     list level, with its token type set to `inserted'. In either case,
>     the returned token(s) will not be fed back into the callback function.
> I think that I would like to propose a much more luatic solution:
> If token_filter is set, it is called with one argument
> \verb|get_next|, the function originally supposed to get the next
> token.
> token_filter should then call this function as often as it needs to
> (possibly zero times) and return one token to the caller.
> If you need to readahead and buffer tokens (like when simulating
> OTPs), the easiest way to do this is using something like the
> following for the filter function:
> coroutine.wrap(function(get_token)
>   -- each resume passes in get_token and yields exactly one token
>   while true do
>     local token1 = get_token()
>     if token1.cmd ~= "^" then
>       get_token = coroutine.yield(token1)
>     else
>       -- first "^" seen: look one token ahead
>       local token2 = get_token()
>       if token2.cmd ~= "^" then
>         coroutine.yield(token1)
>         get_token = coroutine.yield(token2)
>       else
>         -- "^^" seen: look at the token after it
>         local token3 = get_token()
>         if token3.cmd ... then
>           get_token = coroutine.yield(something)
>         else
>           -- no match: release the buffered tokens one at a time
>           coroutine.yield(token1)
>           coroutine.yield(token2)
>           get_token = coroutine.yield(token3)
>         end
>       end
>     end
>   end
> end)
> Ok, the code itself is nonsensical, but it should illustrate the
> working principle: if the filtering is not 1:1, one can use a
> coroutine for analysing the input, buffering and producing the tokens.
> This approach also has the advantage that one can stack filter
> functions easily.
> The existing interface makes that much harder: I actually have no good
> idea how one would go about it.
> One problem with this approach is that the lookahead kept internally
> within a coroutine will get lost when one switches the filter function
> out (not that the current approach fares better here).  One solution
> might be to pass an artificial "EOF" token to the filter function as
> the last act before removing it from token_filter, and accepting a
> list of lookahead tokens as the return value.
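
A minimal sketch of the calling convention proposed above, written in plain
Lua so that it can be tried outside LuaTeX. Nothing in it is LuaTeX API: the
single-character token tables and the helpers make_source, make_caret_filter,
bind, upcase_filter and collect are all hypothetical names chosen for
illustration. It only shows the working principle: a filter is called with
get_next, returns one token, keeps its lookahead inside a coroutine, and can
be stacked on top of another filter.

```lua
-- Hypothetical token source: a get_next function over a string that
-- returns one single-character "token" table per call, then nil.
local function make_source(s)
  local i = 0
  return function()
    i = i + 1
    if i > #s then return nil end
    return { chr = s:sub(i, i) }
  end
end

-- Filter in the proposed style: called with get_next, returns one token.
-- Two consecutive "^" tokens are merged into one "^^" token; the coroutine
-- holds the one-token lookahead that this needs.
local function make_caret_filter()
  return coroutine.wrap(function(get_next)
    while true do
      local t1 = get_next()
      if t1 == nil or t1.chr ~= "^" then
        get_next = coroutine.yield(t1)
      else
        local t2 = get_next()
        if t2 ~= nil and t2.chr == "^" then
          get_next = coroutine.yield({ chr = "^^" })
        else
          -- not a pair: hand back both buffered tokens, one per call
          get_next = coroutine.yield(t1)
          get_next = coroutine.yield(t2)
        end
      end
    end
  end)
end

-- Stacking: a filter bound to a source is itself a get_next function.
local function bind(filter, get_next)
  return function() return filter(get_next) end
end

-- A stateless 1:1 filter needs no coroutine at all.
local function upcase_filter(get_next)
  local t = get_next()
  if t then return { chr = t.chr:upper() } end
  return nil
end

-- Drain a get_next function and join the token texts with spaces.
local function collect(get_next)
  local out = {}
  for t in get_next do out[#out + 1] = t.chr end
  return table.concat(out, " ")
end

local source = make_source("a^^b^c")
local result = collect(bind(upcase_filter, bind(make_caret_filter(), source)))
print(result)
```

Every token passes through one Lua function call per stacked filter plus a
coroutine switch for the buffering one, which is the per-token cost that the
reply below is concerned about.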
the problem is that this is really slow, which renders it rather unusable;
even the current implementation is already on the edge of acceptable

why do you want to handle the ^'s?

you can do that using the input line callback
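
A rough sketch of the line-level alternative, assuming that the goal is
something like decoding ^^hh sequences and that the "input line callback"
meant here is process_input_buffer. The helper name decode_carets is
hypothetical, and the registration is guarded so that the function can also
be tried in a plain Lua interpreter.

```lua
-- Replace each ^^hh (two lowercase hex digits) in one input line by the
-- byte with that code. Values of 128 and above yield a single raw byte,
-- not a UTF-8 sequence, so this sketch is only safe for the ASCII range.
local function decode_carets(line)
  -- the extra parentheses drop gsub's second return value (the match count)
  return (line:gsub("%^%^([0-9a-f][0-9a-f])", function(hex)
    return string.char(tonumber(hex, 16))
  end))
end

-- Inside LuaTeX the callback table exists; elsewhere this step is skipped.
if callback and callback.register then
  callback.register("process_input_buffer", decode_carets)
end
```

Working on whole lines means one Lua call per input line instead of one per
token, at the price of seeing characters rather than tokens.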



                                          Hans Hagen | PRAGMA ADE
              Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                             | www.pragma-pod.nl
