David@lola.quinscape.zz wrote:
Taco Hoekwater writes:

> [...] This is discussed in the reference manual, so if you have not
> looked at that yet, please do so before replying to this message.
Dangerous advice since this gives me ideas...
Here is something I find worth giving a different API:
\subsubsection{\callback{token_filter}}
This callback allows you to change the fetching and preprocessing of any lexical token that enters \LUATEX, before \LUATEX\ executes or expands the associated command.
\startfunctioncall
function ()
    return table <token>
end
\stopfunctioncall
The calling convention for this callback is a bit more complicated than for most other callbacks. The function should either return a Lua table representing a valid to-be-processed token or token list, or something else like nil or an empty table.

If your Lua function does not return a table representing a valid token, it will be immediately called again, until it eventually does return a useful token or token list (or until you reset the callback value to nil). See the description of \callbacklib{token} for some handy functions to be used in conjunction with this callback.

If your function returns a single usable token, then that token will be processed by \LUATEX\ immediately. If the function returns a token list (a table consisting of a list of consecutive token tables), then that list will be pushed to the input stack as a completely new token list level, with its token type set to `inserted'. In either case, the returned token(s) will not be fed back into the callback function.
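For concreteness, a trivial pass-through filter under this existing interface might look as follows (a sketch only; it assumes the usual callback.register mechanism and the token library's \verb|get_next|, per the manual's pointer to \callbacklib{token}):

-- Minimal sketch of the existing interface: fetch one token and
-- return it unchanged, so the callback is a pure pass-through.
callback.register("token_filter", function()
  return token.get_next()
end)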
I think that I would like to propose a much more luatic solution:
If token_filter is set, it is called with one argument \verb|get_next|, the function that would originally have been used to fetch the next token.
token_filter should then call this function as often as it needs to (possibly zero times) and return one token to the caller.
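Under this convention, the trivial 1:1 case is just (a sketch of the proposed interface, not of anything that exists today):

-- Proposed convention (sketch): the filter receives the engine's
-- original fetch function and hands back exactly one token.
local function identity_filter(get_next)
  return get_next()
end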
If you need to read ahead and buffer tokens (as when simulating OTPs), the easiest way is to use something like the following as the filter function:
coroutine.wrap(function(get_next)
  while true do
    local token1 = get_next()
    if token1.cmd ~= "^" then
      get_next = coroutine.yield(token1)
    else
      local token2 = get_next()
      if token2.cmd ~= "^" then
        coroutine.yield(token1)
        get_next = coroutine.yield(token2)
      else
        local token3 = get_next()
        if token3.cmd ... then
          get_next = coroutine.yield(something)
        else
          coroutine.yield(token1)
          coroutine.yield(token2)
          get_next = coroutine.yield(token3)
        end
      end
    end
  end
end)
Ok, the code itself is nonsensical, but it should illustrate the working principle: if the filtering is not 1:1, one can use a coroutine for analysing the input, buffering, and producing the tokens. This approach also has the advantage that one can stack filter functions easily.
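For instance, stacking reduces to ordinary closure composition; a sketch, with hypothetical helper names:

-- Compose two filters under the proposed convention: "inner" reads
-- closest to the raw input, and "outer" sees inner's output as if
-- it were the engine's own fetch function.
local function stack_filters(outer, inner)
  return function(get_next)
    return outer(function() return inner(get_next) end)
  end
end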
The existing interface makes that much harder: I actually have no good idea how one would go about it.
One problem with this approach is that the lookahead kept internally within a coroutine will get lost when one switches the filter function out (not that the current approach fares better here). One solution might be to pass an artificial "EOF" token to the filter function as the last act before removing it from token_filter, and to accept a list of lookahead tokens as the return value.
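A sketch of that handoff, with a hypothetical is_eof marker on the artificial token:

-- Sketch (hypothetical protocol): on the artificial EOF token, the
-- filter surrenders its unconsumed lookahead as a list instead of a
-- single token, so it can be reinjected ahead of a replacement filter.
local filter = coroutine.wrap(function(get_next)
  local pending = {}            -- buffered lookahead tokens
  while true do
    local t = get_next()
    if t.is_eof then
      get_next = coroutine.yield(pending)  -- return leftovers as a list
      pending = {}
    else
      table.insert(pending, t)
      -- normal case: release the oldest buffered token
      get_next = coroutine.yield(table.remove(pending, 1))
    end
  end
end)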
The problem is that this is really slow, which renders it rather unusable; even the current implementation is already on the edge of acceptable. Why do you want to handle the ^'s? You can do that using the input line callback.

Hans

--
-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74
www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------