[NTG-context] Count (and limit) glyphs per line?

Max Chernoff mseven at telus.net
Mon Jun 27 00:32:51 CEST 2022


On 2022-06-26 9:59 a.m., Benjamin Buchmuller wrote:
> Hi Max,
> 
> Thank you so much for your help and pointing me to the documents; always a lot of things to learn in TeX!

No problem :)

> I'm afraid that including the hyphen width doesn't solve the issue yet. It seems to move the problem to other parts of the text.

Ah, too bad. My next step would have been to insert \penalty10000's 
(prevent breaks) at the potential breaks before/after the "selected" 
break, but Hans provided a _much_ better solution that you should use 
instead.

> My guess is that one could equivalently have said "local max_length = 111", right?

Not really; the hyphen is added to the accumulated width, not the 
accumulated character count.

> I made the following MWE (reproducible also online) to illustrate what I see:

My code assumes that each line has _roughly_ "max_length" characters 
before it runs. These lines each only have ~60 characters, so I'm not 
entirely surprised that there are issues. Just use Hans's solution, 
which is much less of a total hack than this is :)

> * Running with hsize only makes the problem worse in itemizations, so I think localhsize is the way to go. My guess, localhsize is the width of the "text" part of a paragraph, for example, excluding the symbols in the itemization.

I forgot about \leftskip. Replace "tex.hsize" with "tex.hsize - 
tex.leftskip.width" and everything should work properly. Using 
localhsize would also work, whenever it's non-zero.
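
As a rough sketch of that width calculation (not the code from earlier
in this thread), here it is written against a plain table rather than
the real "tex" interface, so that it can run outside ConTeXt. The helper
name "available_width", the "fake_tex" table, and passing localhsize in
as an argument are all assumptions made for illustration.

```lua
-- Hypothetical helper: `t` stands in for the engine's "tex" table and
-- `localhsize` for the current value of \localhsize (both assumptions).
local function available_width(t, localhsize)
    -- Prefer localhsize whenever it is set (non-zero) ...
    if localhsize and localhsize ~= 0 then
        return localhsize
    end
    -- ... otherwise take hsize minus the left indentation.
    return t.hsize - t.leftskip.width
end

-- Stand-in values in scaled points (65536 sp = 1 pt).
local fake_tex = { hsize = 400 * 65536, leftskip = { width = 20 * 65536 } }
print(available_width(fake_tex, 0))
print(available_width(fake_tex, 300 * 65536))
```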

> I'm wondering if I do understand the second while loop correctly:

The outer loop goes through each "node" in the paragraph. If the node
is a glyph or glue, then we increment the character counter by one and
add the node's width to the accumulated width.

If we have exceeded the maximum character count or maximum width, then 
we switch directions and start going backwards through each node, 
starting at the character that was too long. If one of the previous 5 
characters is glue, then force a break there; otherwise, we force a 
break at the nearest glue or hyphen.

Now we reset the width and character counters and return to the outer 
loop, which continues until the end of the paragraph.
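
The steps above can be sketched in plain Lua, as a simulation only. This
is not the callback code from this thread: nodes are plain tables with
string ids instead of real engine nodes, the widths are arbitrary, and
the names "to_nodes" and "find_breaks" are made up for illustration. Two
further assumptions: the sketch resumes counting right after the chosen
break, and it falls back to the remembered hyphen if it reaches the
start of the line without finding glue. Where the sketch records an
index, the real code would force the break with a penalty of -10000.

```lua
-- Build a fake node list: " " becomes glue, "|" a discretionary
-- (potential hyphen), anything else a glyph. Widths are arbitrary.
local function to_nodes(text)
    local nodes = {}
    for c in text:gmatch(".") do
        if c == " " then
            nodes[#nodes + 1] = { id = "glue", width = 3 }
        elseif c == "|" then
            nodes[#nodes + 1] = { id = "disc", width = 0 }
        else
            nodes[#nodes + 1] = { id = "glyph", width = 5 }
        end
    end
    return nodes
end

-- Return the indices of the nodes at which a break would be forced.
local function find_breaks(nodes, max_chars, max_width, lookback)
    local breaks = {}
    local chars, width = 0, 0
    local line_start = 1
    local i = 1
    while i <= #nodes do
        local n = nodes[i]
        -- Outer loop: glyphs and glue count towards both limits.
        if n.id == "glyph" or n.id == "glue" then
            chars = chars + 1
            width = width + n.width
        end
        if chars > max_chars or width > max_width then
            -- Inner loop: walk backwards from the node that overflowed.
            local break_at, end_disc, back_chars = nil, nil, 0
            for j = i, line_start, -1 do
                local m = nodes[j]
                if m.id == "glue" then
                    break_at = j          -- a space always wins
                    break
                end
                if not end_disc and m.id == "disc" then
                    end_disc = j          -- remember the nearest hyphen
                end
                if end_disc and back_chars >= lookback then
                    break_at = end_disc   -- no space nearby: use the hyphen
                    break
                end
                if m.id == "glyph" then
                    back_chars = back_chars + 1
                end
            end
            break_at = break_at or end_disc  -- assumption: fallback at line start
            if break_at then
                breaks[#breaks + 1] = break_at
                i = break_at              -- assumption: resume after the break
            end
            chars, width = 0, 0           -- reset the counters
            line_start = i + 1
        end
        i = i + 1
    end
    return breaks
end

print(table.concat(find_breaks(to_nodes("aaaa bbbb cccc"), 10, 1000, 5), ","))
```

With "aaaa bb|bbbb" and a limit of 9 characters, the backwards walk
passes the potential hyphen but reaches the space before it has gone
back 5 characters, so the sketch breaks at the space instead.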

> for disc we add hyphen.width, do we?)

Once we're trying to make a break, it's too late to add anything to the 
width. Instead, I'm just adding the width of a hyphen unconditionally at 
the very beginning. (Of course, this code actually has a bug: I add the 
hyphen width at the very beginning, but then I reset the total width to 
zero each loop. That's probably why this wasn't working before).
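
To make that bug concrete, here is a small stand-alone sketch (again not
the code from this thread). It only tracks widths; the function name
"overflow_points", the "reserve" parameter (standing in for the hyphen
width), and the numbers are all made up for illustration. For
simplicity, the width of the overflowing item is not carried over to
the next line.

```lua
-- Return the indices at which the accumulated width overflows.
-- `reserve` is the space kept free for a possible hyphen.
local function overflow_points(widths, max_width, reserve, reset_to_reserve)
    local points = {}
    local total = reserve                 -- reserved once at the start
    for i, w in ipairs(widths) do
        total = total + w
        if total > max_width then
            points[#points + 1] = i
            -- The bug: resetting to 0 drops the reserve on later lines.
            total = reset_to_reserve and reserve or 0
        end
    end
    return points
end

local widths = { 5, 5, 5, 5, 5, 5, 5, 5, 5, 5 }
print(table.concat(overflow_points(widths, 20, 5, false), ","))  -- reset to zero
print(table.concat(overflow_points(widths, 20, 5, true), ","))   -- reset to reserve
```

When the total is reset to zero, only the first line leaves room for a
hyphen; resetting to the reserve keeps that room on every line.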

> * The other cases still seem a bit obscure to me, and I tried to trace where each of them might be triggered:
> 
>                         if n.id == glue_id then
>                             local penalty = node.new "penalty"
>                             penalty.penalty = -10000
>                             node.insertbefore(head, n, penalty)
>                             context.inrightmargin("glue")
>                             break
>                         end

When going backwards, if we find any glue, break there, since breaking 
at a space is always preferred to breaking at a hyphen.

> 
>                         if not end_disc and n.id == disc_id then
>                             context.inrightmargin("disc")
>                             end_disc = n
>                         end

Save the location of the potential hyphen closest to the maximum length, 
just in case we need it later.

> 
>                         if end_disc and back_chars >= 5 then
>                             context.inrightmargin("end")
>                             end_disc.penalty = -10000
>                             break
>                         end

We've already gone back 5 characters from the maximum length and we 
haven't found any spaces to break at; if we have already found a 
potential hyphen, let's force a break there.

> 
>                         if n.id == glyph_id then
>                             context.inrightmargin("glyph")
>                             back_chars = back_chars + 1
>                         end

Count how many characters we've gone backwards by.

(Oh, and be really careful when using "context()" inside Lua engine 
callbacks. If you had done something like "context.vbox('some text')", 
you would have triggered the paragraph builder while inside the 
paragraph builder, which could lead to an infinite loop)

> I'm maybe doing this wrong, but I see these conditions triggered more often than probably expected for a 25 line document?

Without running the code, I'd guess that either "glue" or "end" 
should trigger for every line, "disc" should come before whenever "end" 
is triggered, and "glyph" is probably triggered about 3 times per line.

> Many thanks again!

(self-promotion warning) If these Lua callbacks interest you, I use 
quite a few of them in my "lua-widow-control" module.

 
https://github.com/gucci-on-fleek/lua-widow-control/blob/master/source/lua-widow-control.lua

There are lots of comments in the code, so hopefully it's not quite as 
cryptic as my character limiting code.

Let me know if you have any other questions.

-- Max

