On 7/3/2020 2:55 PM, Marcel Fabian Krüger wrote:
Hi,
I recently noticed some cases where luametatex behaved in unexpected ways:
- The "Extra \fi" error isn't triggered, instead an extra `\fi` freezes luametatex. (Can be reproduced by compiling a document which only consists of a single \fi)
i already fixed that here (noticed it when documenting some conditionals)
- token.new can only create some `data` tokens, but it doesn't apply bounds checking on its arguments:
there is no checking yet, there is an upper limit of 0x1FFFFF, so i'll add a check for that
Also, for all other commands LuaTeX seems to apply range checks to ensure that such overflows don't happen, even if invalid values are passed as first argument.
indeed, but i hadn't yet done that for data; it also needs a stricter check at the tex end (i'm still not sure if i'll make a slightly different implementation of it, but i can add the test anyway)
- There is token.primitives(). My assumption is that the returned table is meant to indicate the command id, mode and name corresponding to every primitive. (I think it is awesome that such a table is made available in luametatex.) But especially the mode field sometimes has values which do not correspond to the mode of the actual primitives:
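A minimal sketch of the kind of range check discussed here, written in plain Lua. The helper name `check_data_value` and the lower bound of 0 are assumptions for illustration; only the upper limit of 0x1FFFFF comes from the message above, and this is not how the engine implements it.

```lua
-- Hypothetical helper, not part of the LuaMetaTeX API.
-- Upper limit as mentioned above; the lower bound of 0 is an assumption.
local DATA_MAX = 0x1FFFFF

local function check_data_value(value)
  -- Reject non-numbers and numbers with a fractional part.
  if type(value) ~= "number" or value ~= math.floor(value) then
    return nil, "value must be an integer"
  end
  -- Reject anything outside the assumed range 0 .. DATA_MAX.
  if value < 0 or value > DATA_MAX then
    return nil, "value " .. tostring(value) .. " is out of range"
  end
  return value
end
```

A caller would test the first return value and report the second one as the error message when the check fails.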
indeed.
I tried running the following in an almost iniTeX setting where all primitives aside from \shipout and \Umathcodenum have their default definitions:
```
\catcode`\%=12
\catcode`\~=12
\directlua{
  local sorted = token.primitives()
  table.sort(sorted, function(a,b) return a[1]
```
This indicates that there are two kinds of differences: For some command codes, there are multiple primitives whose second entry in the token.primitives table is zero even though their mode is not zero. This especially affects the commands `above`, `after_something`, `make_box`, `un_vbox`, `set_specification` and `car_ret`. E.g. for after_something, all of \atendofgrouped, \afterassigned and \aftergrouped have a zero as second entry in token.primitives.
some tokens are more complex in the sense that they are combinations (have a follow up) and i'm not sure to what extent i want to block that ... all a matter of experimenting and time, so the 'mode' field will be dropped, but for now i kept it. some, like after_something, i need to check (i just didn't update their ranges yet after adding some more primitives that use them); maybe some others need an offset added, but i'll check it
The other difference is that all the internal_... commands have a fixed offset in their mode field, and this offset differs between commands.
IMO the difference for the internal_... commands makes sense because it gives numbers that are easier to use, but having multiple primitives indicating mode 0 for the other commands seems to make this table significantly less useful, because it can't be used to get a unique description of a primitive.
(I may have completely misinterpreted the table of course, but given that for other primitives the values match I do not think so)
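The uniqueness concern can be illustrated with a small plain-Lua sketch that groups entries by their (command id, mode) pair and reports the pairs shared by more than one primitive. The entry layout `{ command id, mode, name }` follows the description above; the function name `find_ambiguous` and the numeric values in `sample` are made up for illustration, and in luametatex the list would come from token.primitives() instead.

```lua
-- Hypothetical helper: collect names that share the same (cmd, mode) pair.
local function find_ambiguous(primitives)
  local seen, clashes = {}, {}
  for _, entry in ipairs(primitives) do
    local cmd, mode, name = entry[1], entry[2], entry[3]
    local key = cmd .. ":" .. mode
    if seen[key] then
      -- Second or later primitive with this pair: record all names.
      clashes[key] = clashes[key] or { seen[key] }
      table.insert(clashes[key], name)
    else
      seen[key] = name
    end
  end
  return clashes
end

-- Made-up sample data; the command ids and modes are not the real engine values.
local sample = {
  { 1, 0, "atendofgrouped" },
  { 1, 0, "afterassigned" },
  { 1, 0, "aftergrouped" },
  { 2, 0, "relax" },
  { 2, 1, "norelax" },
}

for key, names in pairs(find_ambiguous(sample)) do
  print(key, table.concat(names, ", "))
end
```

An empty result would mean that every (command id, mode) pair identifies exactly one primitive.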
it's a bit of a work in progress, as there are some exceptions that use special chr codes (for instance, in conditionals several cmd codes need to have exclusive codes), so adapting it is a stepwise process; one decision i need to make there is how close to stay to the original tex codes. eventually i want all to have reasonable ranges in the token interface (not per se the same as in the engine itself, but that's a black box anyway), which involves some offsetting ... i do that stepwise in order to keep a working engine (the token interface is not used in context that much)

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------