On Sun, 27 Nov 2022 at 14:59, Marcel Fabian Krüger <tex@2krueger.de> wrote:
Hi,

in LuaTeX, node.ligaturing should return the head and the tail of the
list after ligatures have been applied. But if the tail of the new
list is a discretionary, the current code returns the passed-in tail
instead, assuming that it didn't change. This breaks if the
discretionary node wasn't the tail before the ligature pass.

Take for example

\directlua{
  local b = token.scan_list()
  local h = b.head
  local t = node.tail(h)
  local post_h, post_t = node.ligaturing(h, node.tail(t))
  print(node.tail(post_h), post_t)
}\hpack{a\discretionary{}{f}{f}f}
\bye

Here the `f`s in the discretionary form ligatures with the `f` after the
discretionary, so after node.ligaturing the list is effectively
`a\discretionary{}{ff}{ff}` and therefore `post_t` should point to the
discretionary node. But as the output shows, `post_t` is instead a glyph
node, while node.tail(post_h) is the expected disc node.

This is caused by `handle_lig_word` in luafont.c. `handle_lig_word` is
used in `handle_ligaturing` as

    while (cur != null) {
        if (type(cur) == glyph_node || (type(cur) == boundary_node)) {
            cur = handle_lig_word(cur);
        }
        prev = cur;
        cur = vlink(cur);
    }

so it is expected that `handle_lig_word` returns the last node already handled,
so that afterwards the loop can continue at vlink(cur). But if the handled
segment ends with a discretionary, `handle_lig_word` returns the node after
the last already handled node instead. This is fine if that following node is a
non-glyph/non-boundary node, but if the discretionary ends the segment passed
to `handle_ligaturing`, then `handle_lig_word` returns null,
therefore setting `prev` to `null` instead of tracking the tail.
This also relies on `vlink(null)` being null, causing issues if that
ever gets corrupted (see my next mail for an example...).
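
To make the failure mode easier to see in isolation, here is a minimal
stand-alone sketch of the loop above. It is not the LuaTeX code: `struct node`,
`word_end_overshooting`, `word_end_last_handled` and `track_tail` are
hypothetical names, and the "word" scan is reduced to skipping over glyph and
disc nodes. It only models the return-value contract described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for LuaTeX's node types. */
enum kind { GLYPH, DISC, OTHER };

struct node {
    enum kind kind;
    struct node *next; /* plays the role of vlink() */
};

/* Models the behaviour described above: walk over a glyph/disc "word";
   if the word ends in a disc, return the node after it (possibly NULL). */
struct node *word_end_overshooting(struct node *cur)
{
    struct node *last;
    assert(cur != NULL);
    last = cur;
    while (cur != NULL && (cur->kind == GLYPH || cur->kind == DISC)) {
        last = cur;
        cur = cur->next;
    }
    return last->kind == DISC ? cur : last;
}

/* Models the proposed contract: always return the last node handled. */
struct node *word_end_last_handled(struct node *cur)
{
    struct node *last;
    assert(cur != NULL);
    last = cur;
    while (cur != NULL && (cur->kind == GLYPH || cur->kind == DISC)) {
        last = cur;
        cur = cur->next;
    }
    return last;
}

/* Driver shaped like the loop in handle_ligaturing; returns the tracked tail. */
struct node *track_tail(struct node *head,
                        struct node *(*word_end)(struct node *))
{
    struct node *prev = NULL;
    struct node *cur = head;
    while (cur != NULL) {
        if (cur->kind == GLYPH) {
            cur = word_end(cur);
        }
        prev = cur;
        /* The real loop relies on vlink(null) being null; plain C has to
           check explicitly before following the link. */
        cur = (cur != NULL) ? cur->next : NULL;
    }
    return prev;
}
```

For a two-node list consisting of a glyph followed by a disc, `track_tail`
with `word_end_overshooting` ends up with a null tail, while the same call with
`word_end_last_handled` ends up at the disc node.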

It can be fixed by making sure that handle_lig_word always returns the
last handled node, e.g. using

diff --git a/source/texk/web2c/luatexdir/font/luafont.c b/source/texk/web2c/luatexdir/font/luafont.c
index 2e08facb1..c33037244 100644
--- a/source/texk/web2c/luatexdir/font/luafont.c
+++ b/source/texk/web2c/luatexdir/font/luafont.c
@@ -2242,6 +2242,7 @@ static halfword handle_lig_nest(halfword root, halfword cur)
 static halfword handle_lig_word(halfword cur)
 {
     halfword right = null;
+    halfword last = null;
     if (type(cur) == boundary_node) {
         halfword prev = alink(cur);
         halfword fwd = vlink(cur);
@@ -2481,9 +2482,10 @@ static halfword handle_lig_word(halfword cur)

         } else {
             /*tex We have glyph nor disc. */
-            return cur;
+            return last;
         }
         /*tex Goto the next node, where |\par| allows |vlink(cur)| to be NULL. */
+        last = cur;
         cur = vlink(cur);
     }
     return cur;
--

Sorry for the delay, I will check it this evening.

--
luigi