[Dev-luatex] lastnodetype, again

Hans Hagen j.hagen at xs4all.nl
Sun Aug 22 11:49:58 CEST 2021


On 8/22/2021 1:17 AM, Robert wrote:
> On 19.08.21 18:30, Hans Hagen wrote:
>> On 8/19/2021 5:00 PM, Robert wrote:
>>> Hans has replied off-list, saying that this is basically expected
>>> behaviour and that checking for ligatures by means of \lastnodetype is
>>> inherently unreliable in luatex. In that case I would suggest to change
>>> the wording in the manual, which quite unequivocally claims the 
>>> opposite:
>>>
>>> | The \lastnodetype primitive is 𝜀-TEX compliant. The valid range is
>>> | still [−1, 15] and glyph nodes (formerly known as char nodes) have
>>> | number 0 while ligature nodes are mapped to 7. That way macro packages
>>> | can use the same symbolic names as in traditional 𝜀-TEX.
>>> (p.123)
>>
>> this is correct .. it doesn't say that the nodelist is the same and in
>> fact luatex does report the right node in etex speak ..
> 
> Hm, just saying, numbers are in the same range, but they actually may be
> totally different, is not really what I would call "compliant"...

from the (rest of the) manual it's clear that luatex (1) has different 
nodes, (2) has split the interwoven "read input, handle fonts, hyphenate 
when needed" approach and that (3) one can kick in font handler functions

so, this etex lastnodetype command just looks back at the moment it is 
invoked and *that* is what you then get back (with the luatex glyph node 
number mapped to zero, which in luatex itself is the hlist node number)

>> in the case of
>> luatex there is no ligature node because the nodelist isn't processed
>> and even then it could as well be a disc node
> 
> Well yes, something comparable (I guess) happens in etex/pdftex: without
> the \relax after the ligature, they also just report a glyph node --
> with the \relax, however, they do report a ligature (or disc) node. But
> with luatex it doesn't make a difference whether there's a \relax after
> the ligature or not. That's kind of the crux of my report, I suppose.

because, as said, the list is handled after it has been completely 
constructed ... (if you don't believe this, just compare the pdftex 
source with luatex source)

> Also, luatex does get the node type right when the ligature is wrapped
> in a box first:
> \setbox0\hbox{--}
> \unhbox0 \the\lastnodetype % OK
> 
> So deep down luatex seems to know better...

compare that with

 > \setbox0\hbox{--\the\lastnodetype}

again, whole list read, then treatment (and that can be anything, even 
removing these -)

you either immediately look at the last node (of the list currently being 
constructed) or you look at it after the list has been 'typeset'

(if lastnodetype could be negative here you would even get different 
results, because the expanded minus sign would then give you three 
hyphens in a row)
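
The two moments of looking can be put side by side in one small test 
file. This is only a sketch: it assumes plain TeX, and it uses 
\immediate\write16 to report the value, which ends the ligature lookahead 
in the classic engines much like the \relax mentioned earlier in this 
thread. No particular numbers are claimed here; the point is to compare 
the log of luatex with that of pdftex.

```tex
% plain TeX; run once with luatex and once with pdftex, then compare logs
% case 1: look at the last node while the list is still being built
\setbox0\hbox{--\immediate\write16{inside box: \the\lastnodetype}}
% case 2: finish the box first, then unbox it and look again
\setbox2\hbox{--}
\noindent\unhbox2 \immediate\write16{after unhbox: \the\lastnodetype}
\bye
```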

>> just don't assume that luatex, pdftex, xetex produce the same node lists
> 
> Not even if there's no opentype font involved? And just for the record,
> xetex does report the same as pdftex.

indeed, split read/hyphenate/lig/kerning (unless overloaded which can be 
done)

>> and don't assume that f + i is a ligature in each font (or script /
>> language) either because it can as well be some kerning between f (either
>> or not substituted) and i (either or not substituted)
> 
> I have no idea why you would think that I assume that f+i is a ligature
> in every font (I don't), and furthermore, I have no idea what this has
> to do with \lastnodetype not returning the expected value (my example
> didn't even contain "fi").

because you test for the (etex) last node type and expect a char or 
ligature node type (you explicitly point to codes 0 and 7 being 
different in the engines), and in luatex you cannot predict what comes 
out

glyph and ligature are subtypes of one node, not different nodes, so 
looking at the last node type in order to see what one gets is 
unreliable wrt this detail: in luatex there is no guarantee that the lig 
subtype is set, so etex-number-7 quite often might not show up when you 
look at the end of an unboxed list

>> in luatex when you want to mess around at that level you have to use a
>> callback (or preprocess the input)
> 
> Not quite sure how preprocessing the /input/ could tell me whether a
> /font/ has a specific ligature. Also I'm a bit baffled that expecting a
> luatex command to be compatible with etex/pdftex (as per the manual)
> should be tantamount to "messing around".

if you want to know that, you can best write a callback that looks at the 
list after it has gone through the font handler
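
A minimal sketch of such a callback, assuming plain luatex with the 
built-in (base mode) ligature handling and no other package that already 
owns this callback. The function name report_ligatures is made up for 
the example, and checking the components field is just one possible way 
to recognize a glyph that was built as a ligature.

```tex
% plain luatex; illustrative only
\directlua{
  local glyph_id = node.id("glyph")
  % called with the paragraph list after hyphenation, ligaturing and kerning
  local function report_ligatures(head)
    for n in node.traverse_id(glyph_id, head) do
      if n.components then
        texio.write_nl("ligature glyph, char " .. n.char)
      end
    end
    return true % leave the list unchanged
  end
  callback.register("pre_linebreak_filter", report_ligatures)
}
fi -- ffl
\bye
```

Note that node.traverse_id only walks the top level of the list, so 
glyphs inside nested boxes are not reported by this sketch.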

the messing around refers to 'looking at a specific node when processing 
input using lastnodetype and acting on that'

btw, in general the only lastnodetypes that are sort of reliable are 
those testing for penalties, kerns and glue (inserts and marks can 
travel, whatsits can be anything).
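
As an illustration of such a test, here is a small sketch that assumes 
the etex numbering (11 for glue, 12 for kern, 13 for penalty). The macro 
name \dropfinalglue is hypothetical and only serves the example.

```tex
% plain TeX with etex extensions (pdftex, xetex, luatex)
% remove a trailing glue item, but only when the last node really is glue
\def\dropfinalglue{\ifnum\lastnodetype=11 \unskip\fi}
\setbox0\hbox{some text \dropfinalglue}
\immediate\write16{width of box: \the\wd0}
\bye
```

The space before \dropfinalglue adds glue to the list, so the test 
succeeds and the glue is removed before the box is finished.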

(there are more differences like this: {} doesn't break a ligature for 
instance and reprocessing of an unboxed list can also have side effects, 
depending on what callbacks kick in; there are also subtle differences, 
some under mode control, wrt successive hyphens, because these are 
handled at a different time in luatex)

Hans

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
        tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------

