[NTG-context] Multiple cases of unexpected behaviour in luametatex
Marcel Fabian Krüger
tex at 2krueger.de
Fri Jul 3 14:55:45 CEST 2020
Hi,
I recently noticed some cases where luametatex behaved in unexpected
ways:
- The "Extra \fi" error isn't triggered; instead, an extra `\fi`
freezes luametatex. (This can be reproduced by compiling a document
which consists only of a single \fi.)
- token.new can only create some `data` tokens, but it doesn't apply
bounds checking to its arguments:
Take
```
\directlua{
t = token.new(0x200000, token.command_id'data')
print(t.cmdname, t.command, t.mode)
}
```
which prints
register 102 0
The issue does not seem to be that such tokens do not exist, because
```
\letdatacode\somedata="200000
\directlua{
local t = token.create'somedata'
print(t.cmdname, t.command, t.index)
}
```
does print
data 101 2097152
Also, for all other commands LuaTeX seems to apply range checks to
ensure that such overflows don't happen, even if invalid values are
passed as the first argument.
- There is token.primitives(). My assumption is that the returned
table is meant to indicate the command id, mode and name
corresponding to every primitive. (I think it is awesome that such a
table is made available in luametatex.) But the mode field in
particular sometimes has values which do not correspond to the mode
of the actual primitives:
I tried running the following in an almost iniTeX setting where all
primitives aside from \shipout and \Umathcodenum have their default
definitions:
```
\catcode`\%=12
\catcode`\~=12
\directlua{
local sorted = token.primitives()
table.sort(sorted, function(a,b) return a[1]<b[1] or a[1]==b[1] and a[2]<b[2]end)
for _,info in ipairs(sorted) do
local t = token.create(info[3])
local rc, rm = t.command, t.mode
if rc==info[1] and rm ~= info[2] then
if info[2] == 0 then
print(string.format('MODE MISMATCH, expected zero: \string\\%s: real: %i, command: %i', info[3], rm, rc))
else
print(string.format('MODE MISMATCH: \string\\%s: offset: %i, command: %i', info[3], rm-info[2], rc))
end
elseif rc~=info[1] then print(t.csname)
end
end
}
```
This indicates that there are two kinds of differences:
For some command codes, there are multiple primitives whose second
entry in the token.primitives table is zero even though their mode
is not zero. This especially affects the commands `above`,
`after_something`, `make_box`, `un_vbox`, `set_specification` and
`car_ret`.
E.g. for after_something, all of \atendofgrouped, \afterassigned and
\aftergrouped have a zero as second entry in token.primitives.
The other difference is that all the internal_... commands have a
fixed offset in their mode field, and this offset differs between
commands.
IMO the difference for the internal_... commands makes sense because
it makes for easier-to-use numbers, but having multiple primitives
indicating mode 0 for the other commands seems to make this table
significantly less useful because it can't be used to get a unique
description of a primitive.
(I may have completely misinterpreted the table, of course, but given
that the values match for other primitives, I do not think so.)
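To make the uniqueness problem concrete, here is a minimal sketch that
lists every (command, mode) pair claimed by more than one primitive.
It assumes that each entry of token.primitives() has the shape
{ command, mode, name }, as used in the loop above. The helper name
find_collisions and the fallback sample table (including its command
code 40) are made up for illustration; the fallback only exists so the
chunk also runs in a plain Lua interpreter.

```lua
-- Group entries by their (command, mode) pair and return only the
-- pairs that are shared by more than one primitive name.
-- Each entry is assumed to look like { command, mode, name }.
local function find_collisions(entries)
    local groups = {}
    for _, info in ipairs(entries) do
        local key = info[1] .. ":" .. info[2]
        local names = groups[key]
        if not names then
            names = {}
            groups[key] = names
        end
        names[#names + 1] = info[3]
    end
    local collisions = {}
    for key, names in pairs(groups) do
        if #names > 1 then
            table.sort(names)
            collisions[key] = names
        end
    end
    return collisions
end

-- Inside luametatex use the real table; elsewhere fall back to a
-- hypothetical sample (the command code 40 is invented).
local entries
if token and token.primitives then
    entries = token.primitives()
else
    entries = {
        { 40, 0, "atendofgrouped" },
        { 40, 0, "afterassigned" },
        { 40, 0, "aftergrouped" },
    }
end

for key, names in pairs(find_collisions(entries)) do
    print(key, table.concat(names, " "))
end
```

If the table did describe every primitive uniquely, this loop would
print nothing.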
Best regards,
Marcel