[NTG-context] Count (and limit) glyphs per line?
mseven at telus.net
Mon Jun 27 00:32:51 CEST 2022
On 2022-06-26 9:59 a.m., Benjamin Buchmuller wrote:
> Hi Max,
> Thank you so much for your help and pointing me to the documents; always a lot of things to learn in TeX!
No problem :)
> I'm afraid that including the hyphen width doesn't solve the issue yet. It seems to move the problem to other parts of the text.
Ah, too bad. My next step would have been to insert \penalty10000's
(prevent breaks) at the potential breaks before/after the "selected"
break, but Hans provided a _much_ better solution that you should use
instead.
> My guess is that one could equivalently have said "local max_length = 111", right?
Not really; the hyphen is added to the accumulated width, not the
accumulated character count.
> I made the following MWE (reproducible also online) to illustrate what I see:
My code assumes that each line has _roughly_ "max_length" characters
before it runs. These lines each only have ~60 characters, so I'm not
entirely surprised that there are issues. Just use Hans's solution,
which is much less of a total hack than this is :)
> * Running with hsize only makes the problem worse in itemizations, so I think localhsize is the way to go. My guess, localhsize is the width of the "text" part of a paragraph, for example, excluding the symbols in the itemization.
I forgot about \leftskip. Replace "tex.hsize" with "tex.hsize -
tex.leftskip.width" and everything should work properly. Using
localhsize would also work, whenever it's non-zero.
> I'm wondering if I do understand the second while loop correctly:
Here's how it works: the outer loop goes through each "node" in the
paragraph. If it is a glyph or glue, then we increment the character
counter by one and the width by the node's width.
If we have exceeded the maximum character count or maximum width, then
we switch directions and start going backwards through each node,
starting at the character that was too long. If one of the previous 5
characters is glue, then force a break there; otherwise, we force a
break at the nearest glue or hyphen.
Now we reset the width and character counters and return to the outer
loop, which continues until the end of the paragraph.
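The loop described above can be modelled outside TeX. The following is a minimal Python sketch, not the actual LuaTeX callback: the Node class, the paragraph() helper, and forced_breaks() are hypothetical names, nodes are plain records rather than engine nodes, and a "break" is just a recorded index instead of an inserted penalty. It also assumes the backward scan stops at the previous forced break.

```python
from dataclasses import dataclass


@dataclass
class Node:
    kind: str           # "glyph", "glue", or "disc" (a potential hyphen)
    width: float = 0.0


def paragraph(text, glyph_width=1.0, glue_width=1.0):
    """Toy node list: ' ' becomes glue, '-' a discretionary, the rest glyphs."""
    nodes = []
    for ch in text:
        if ch == " ":
            nodes.append(Node("glue", glue_width))
        elif ch == "-":
            nodes.append(Node("disc"))
        else:
            nodes.append(Node("glyph", glyph_width))
    return nodes


def forced_breaks(nodes, max_chars, max_width, lookback=5):
    """Return the indices of the nodes where a break would be forced."""
    breaks = []
    chars, width = 0, 0.0
    last_break = -1
    for i, n in enumerate(nodes):
        # Outer loop: count glyphs and glue, accumulate their widths.
        if n.kind in ("glyph", "glue"):
            chars += 1
            width += n.width
        if chars <= max_chars and width <= max_width:
            continue

        # Limit exceeded: walk backwards from the offending node.
        end_disc = None
        back_chars = 0
        chosen = None
        for j in range(i, last_break, -1):
            m = nodes[j]
            if m.kind == "glue":
                chosen = j          # a space always wins
                break
            if end_disc is None and m.kind == "disc":
                end_disc = j        # remember the nearest potential hyphen
            if end_disc is not None and back_chars >= lookback:
                chosen = end_disc   # no space close enough: use the hyphen
                break
            if m.kind == "glyph":
                back_chars += 1

        if chosen is not None:
            breaks.append(chosen)
            last_break = chosen
        # Reset the counters and return to the outer loop.
        chars, width = 0, 0.0
    return breaks
```

With these toy inputs, forced_breaks(paragraph("aaaaaa bb-cc"), 10, 100.0) picks the space at index 6, while forced_breaks(paragraph("abcde-fghijklm"), 12, 100.0) has no space to use and falls back to the discretionary at index 5.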
> for disc we add hyphen.width, do we?)
Once we're trying to make a break, it's too late to add anything to the
width. Instead, I'm just adding the width of a hyphen unconditionally at
the very beginning. (Of course, this code actually has a bug: I add the
hyphen width at the very beginning, but then I reset the total width to
zero each loop. That's probably why this wasn't working before.)
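To illustrate the reset bug just described, here is a small Python sketch under simplifying assumptions: glyphs_per_line() is a hypothetical helper, every item is treated as one glyph with a given width, and a new line starts as soon as the next glyph would not fit. It is not the original callback, only a model of how the width counter behaves.

```python
def glyphs_per_line(widths, max_width, hyphen_width, buggy):
    """Count how many glyphs land on each line.

    The width counter starts at hyphen_width to reserve room for a
    possible hyphen. With buggy=True it is reset to zero after each
    line, so only the first line keeps that reservation.
    """
    counts = []
    total = hyphen_width
    count = 0
    for w in widths:
        if total + w > max_width:
            counts.append(count)
            total = 0.0 if buggy else hyphen_width
            count = 0
        total += w
        count += 1
    counts.append(count)
    return counts
```

With thirty glyphs of width 1.0, max_width 10.0 and hyphen_width 1.0, the buggy variant yields [9, 10, 10, 1]: every line after the first is one glyph too long to leave room for a hyphen. Resetting to hyphen_width instead yields [9, 9, 9, 3].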
> * The other cases still seem a bit obscure to me, and I tried to trace where each of them might be triggered:
> if n.id == glue_id then
> local penalty = node.new "penalty"
> penalty.penalty = -10000
> node.insertbefore(head, n, penalty)
When going backwards, if we find any glue, break there, since breaking
at a space is always preferred to breaking at a hyphen.
> if not end_disc and n.id == disc_id then
> end_disc = n
Save the location of the potential hyphen closest to the maximum length,
just in case we need it later.
> if end_disc and back_chars >= 5 then
> end_disc.penalty = -10000
We've already gone back 5 characters from the maximum length and we
haven't found any spaces to break at; if we have already found a
potential hyphen, let's force a break there.
> if n.id == glyph_id then
> back_chars = back_chars + 1
Count how many characters we've gone backwards by.
(Oh, and be really careful when using "context()" inside Lua engine
callbacks. If you had done something like "context.vbox('some text')",
you would have triggered the paragraph builder while inside the
paragraph builder, which could lead to an infinite loop.)
> I'm maybe doing this wrong, but I see these conditions triggered more often than probably expected for a 25 line document?
Without running the code below, I'd guess that either "glue" or "end"
should trigger for every line, "disc" should come before "end" whenever
"end" is triggered, and "glyph" is probably triggered about 3 times per
line.
> Many thanks again!
(self-promotion warning) If these Lua callbacks interest you, I use
quite a few of them in my "lua-widow-control" module.
There are lots of comments in the code, so hopefully it's not quite as
cryptic as my character limiting code.
Let me know if you have any other questions.