Hello,

some notes I collected while testing LuaTeX beta 0.11.2:

- It should be noted in the manual that \luaescapestring completely expands its parameter.
- When setting a \toks register in Lua using tex.toks['foo'] = 'bar' et al., what catcodetable is used? From looking inside luatex.web, I would guess that, since str_toks is used, all characters get catcode 12, except for spaces, which get catcode 10. A quick test confirms this. In case this is neither a bug nor a temporary hack: this IMO severely limits the usefulness of the tex.toks table, since the toks registers written in Lua later have to be re-scanned in TeX. IMO, tex.settoks should be passed an optional parameter to set the catcode table to use. When using array addressing, the current catcode table should be used (just like tex.print/tex.sprint).
- What exactly is meant by "Lua strings are converted to token lists using \the\toks style expansion" (pg. 23)?

Jonathan
Jonathan Sauer wrote:
Hello,
some notes I collected while testing LuaTeX beta 0.11.2:
- It should be noted in the manual that \luaescapestring completely expands its parameter
Thanks, added a note to that effect.
- When setting a \toks register in Lua using tex.toks['foo'] = 'bar' et al., what catcodetable is used? From looking inside luatex.web, I would guess that, since str_toks is used, all characters get catcode 12, except for spaces, which get catcode 10.
Correct. It is one of the first things I wrote, and it definitely needs updating. I now believe the toks array should accept and return token list tables (instead of Lua strings), and there should be helper functions to_string() and to_tokenlist() for going back and forth. That needs a bit of (not-yet-done) programming, but it is not hard at all.
- What exactly is meant by "Lua strings are converted to token lists using \the\toks style expansion" (pg. 23)?
Precisely the above str_toks reference. I've changed the wording in the manual a bit.

Cheers, Taco
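The str_toks behaviour discussed above can be illustrated with a small pure-Lua model. This is only a sketch under stated assumptions: it does not call any LuaTeX API, it treats the input as single-byte characters, and to_tokenlist() and to_string() are the helper names proposed in this thread, implemented here hypothetically as plain { catcode, character } pairs rather than as LuaTeX's real token representation.

```lua
-- Pure-Lua model of the str_toks conversion described in this thread.
-- Assumption: a token is modelled as { catcode, character }; this is
-- NOT how LuaTeX stores tokens internally.

local CAT_SPACE = 10  -- catcode given to spaces
local CAT_OTHER = 12  -- catcode given to every other character

-- Hypothetical helper: convert a string to a token list table.
local function to_tokenlist(s)
  local tokens = {}
  for i = 1, #s do
    local ch = s:sub(i, i)
    tokens[i] = { ch == " " and CAT_SPACE or CAT_OTHER, ch }
  end
  return tokens
end

-- Hypothetical helper: convert a token list table back to a string.
local function to_string(tokens)
  local chars = {}
  for i, tok in ipairs(tokens) do
    chars[i] = tok[2]
  end
  return table.concat(chars)
end

-- The backslash becomes an "other" character, so "\foo" is eight
-- separate character tokens here and not a control sequence.
local toks = to_tokenlist("\\foo bar")
for _, tok in ipairs(toks) do
  print(tok[1], tok[2])
end
```

In this model the backslash and the letters all end up with catcode 12, which is why a register written this way has to be re-scanned in TeX before \foo is seen as a control sequence.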
Hello,
- When setting a \toks register in Lua using tex.toks['foo'] = 'bar' et al., what catcodetable is used? From looking inside luatex.web, I would guess that, since str_toks is used, all characters get catcode 12, except for spaces, which get catcode 10.
Correct. It is one of the first things I wrote, and it definitely needs updating. I now believe the toks array should accept and return token list tables (instead of Lua strings), and there should be helper functions to_string() and to_tokenlist() for going back and forth. That needs a bit of (not-yet-done) programming, but it is not hard at all.
I'm wondering how this would affect performance. How costly is the conversion from a string to a table? And what about tex.print? Since it creates/writes tokens as well, will it be changed to accept token list tables, too? (at any rate, I think both should accept the same parameter type)
Cheers, Taco
Jonathan
Jonathan Sauer wrote:
Hello,
- When setting a \toks register in Lua using tex.toks['foo'] = 'bar' et al., what catcodetable is used? From looking inside luatex.web, I would guess that, since str_toks is used, all characters get catcode 12, except for spaces, which get catcode 10.
Correct. It is one of the first things I wrote, and it definitely needs updating. I now believe the toks array should accept and return token list tables (instead of Lua strings), and there should be helper functions to_string() and to_tokenlist() for going back and forth. That needs a bit of (not-yet-done) programming, but it is not hard at all.
I'm wondering how this would affect performance. How costly is the conversion from a string to a table?
Costly. Compare the normal path:
input -> texscanner -> internal tex structure
with the path through a callback or toks or ...:
input -> texscanner -> callback that gets the token as a table -> returns a table (or not) -> internal tex structure
So in the token table case there are two conversions (plus allocation and garbage collection for the token table), which is way slower.
And what about tex.print? Since it creates/writes tokens as well, will it be changed to accept token list tables, too? (at any rate, I think both should accept the same parameter type)
No, that will be string based. Pushing strings into TeX's scanner is currently pretty efficient, and the need to convert them to tokens first would make the interface clumsy. So tex.tprint (token print) would be a natural candidate then (one argument, a table of tokens). In practice, tex.sprint is used more often than pushing token tables into TeX.

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------
Jonathan Sauer wrote:
I'm wondering how this would affect performance. How costly is the conversion from a string to a table?
Pretty bad. But would you normally assign lots of stuff to a toks register after building it up by hand? I imagined that it would more likely be something previously returned from a different token register, or the replacement text of a macro, or something similar. In Lua it is actually possible to accept both types at the same time, so there is no need to decide either way, but token list tables do fit the internal structure better.
And what about tex.print? Since it creates/writes tokens as well, will it be changed to accept token list tables, too? (at any rate, I think both should accept the same parameter type)
tex.print writes input strings (one-line files, if you will). The concept and effect is somewhat different from a token list. The differences are subtle, but important. A special function like tex.tprint is not a bad idea, but we have not really needed it yet.

Cheers, Taco
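The remark that Lua can handle both argument types at once can be sketched as a setter that dispatches on type(). This is a hypothetical pure-Lua model, not the LuaTeX implementation: the registers table, the model_settoks name, and the { catcode, character } token shape are all assumptions made for illustration.

```lua
-- Hypothetical model: a toks setter that accepts either a string or a
-- token list table. Nothing here calls the real tex.toks interface.

-- Assumed string conversion, following the str_toks rule from this
-- thread: spaces get catcode 10, everything else catcode 12.
local function to_tokenlist(s)
  local tokens = {}
  for i = 1, #s do
    local ch = s:sub(i, i)
    tokens[i] = { ch == " " and 10 or 12, ch }
  end
  return tokens
end

local registers = {}  -- stand-in for the \toks registers

local function model_settoks(name, value)
  if type(value) == "string" then
    -- String input pays for a conversion (one small table per character).
    registers[name] = to_tokenlist(value)
  elseif type(value) == "table" then
    -- Token list tables are stored as they are, with no conversion.
    registers[name] = value
  else
    error("model_settoks: expected string or table, got " .. type(value))
  end
end

model_settoks("foo", "a b")            -- string form
model_settoks("bar", registers.foo)    -- token list form, reused as is
```

The second call reuses a token list that came out of another register, which is the case described above as the more likely one in practice.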
participants (3)
- Hans Hagen
- Jonathan Sauer
- Taco Hoekwater