On 2022-06-26 9:59 a.m., Benjamin Buchmuller wrote:
Hi Max,
Thank you so much for your help and pointing me to the documents; always a lot of things to learn in TeX!
No problem :)
I'm afraid that including the hyphen width doesn't solve the issue yet. It seems to move the problem to other parts of the text.
Ah, too bad. My next step would have been to insert \penalty10000's (prevent breaks) at the potential breaks before/after the "selected" break, but Hans provided a _much_ better solution that you should use instead.
My guess is that one could equivalently have said "local max_length = 111", right?
Not really; the hyphen is added to the accumulated width, not the accumulated character count.
I made the following MWE (reproducible also online) to illustrate what I see:
My code assumes that each line has _roughly_ "max_length" characters before it runs. These lines each only have ~60 characters, so I'm not entirely surprised that there are issues. Just use Hans's solution, which is much less of a total hack than this is :)
* Running with hsize only makes the problem worse in itemizations, so I think localhsize is the way to go. My guess is that localhsize is the width of the "text" part of a paragraph, for example, excluding the symbols in the itemization.
I forgot about \leftskip. Replace "tex.hsize" with "tex.hsize - tex.leftskip.width" and everything should work properly. Using localhsize would also work, whenever it's non-zero.
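As a rough sketch of that width calculation (the helper name and its arguments are made up for illustration; inside the callback the values would presumably come from tex.hsize, tex.leftskip.width and the local hsize, and preferring localhsize whenever it is non-zero is an assumption):

```lua
-- Hypothetical helper, not part of the original code.
-- All arguments are plain numbers in the same unit (e.g. scaled points).
local function available_width(hsize, leftskip_width, localhsize)
    -- Assumption: a non-zero localhsize already describes the text width.
    if localhsize and localhsize > 0 then
        return localhsize
    end
    -- Otherwise subtract the left skip (e.g. the itemize indentation).
    return hsize - leftskip_width
end

print(available_width(400, 20, 0))   -- falls back to hsize - leftskip
print(available_width(400, 20, 350)) -- uses localhsize
```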
I'm wondering if I do understand the second while loop correctly:
The outer loop goes through each "node" in the paragraph. If the node is a glyph or glue, then we increment the character counter by one and the width by the node's width. If we have exceeded the maximum character count or maximum width, then we switch directions and start going backwards through each node, starting at the character that pushed the line over the limit. If we find glue within the previous 5 characters, then we force a break there; otherwise, we force a break at the nearest glue or hyphen. Then we reset the width and character counters and return to the outer loop, which continues until the end of the paragraph.
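To make the two loops easier to follow, here is a minimal sketch in plain Lua that runs outside of TeX. It is not the actual callback: the nodes are mock tables, the names (from_string, choose_break, mark_breaks) are hypothetical, and a "forced" field stands in for the -10000 penalty. Returning the saved hyphen when the start of the list is reached is also an assumption.

```lua
-- Mock node ids; the real callback would compare against node.id(...) values.
local GLYPH, GLUE, DISC = "glyph", "glue", "disc"

-- Build a doubly linked mock node list from a string:
-- " " becomes glue, "-" a discretionary, anything else a glyph.
local function from_string(s)
    local nodes = {}
    for c in s:gmatch(".") do
        local id = (c == " " and GLUE) or (c == "-" and DISC) or GLYPH
        nodes[#nodes + 1] = { id = id, width = (id == DISC) and 0 or 1 }
    end
    for i, n in ipairs(nodes) do
        n.prev, n.next = nodes[i - 1], nodes[i + 1]
    end
    return nodes
end

-- Inner loop: walk backwards from the node that exceeded the limit.
local function choose_break(n, lookback)
    local back_chars, end_disc = 0, nil
    while n do
        if n.id == GLUE then
            return n                -- a space is always preferred
        end
        if not end_disc and n.id == DISC then
            end_disc = n            -- remember the nearest potential hyphen
        end
        if end_disc and back_chars >= lookback then
            return end_disc         -- no space nearby: break at the hyphen
        end
        if n.id == GLYPH then
            back_chars = back_chars + 1
        end
        n = n.prev
    end
    return end_disc                 -- assumption; may be nil
end

-- Outer loop: count characters and width, mark a break on overflow.
local function mark_breaks(head, max_chars, max_width, lookback)
    local chars, width = 0, 0
    local n = head
    while n do
        if n.id == GLYPH or n.id == GLUE then
            chars = chars + 1
            width = width + n.width
        end
        if chars > max_chars or width > max_width then
            local b = choose_break(n, lookback)
            if b then b.forced = true end  -- stands in for penalty -10000
            chars, width = 0, 0
        end
        n = n.next
    end
end

local nodes = from_string("aaa bbb ccc ddd")
mark_breaks(nodes[1], 6, math.huge, 5)
for i, n in ipairs(nodes) do
    if n.forced then print("forced break at node " .. i) end
end
```

In this sketch the counters restart at the node where the limit was exceeded, not at the chosen break, so the resulting lines only roughly respect the limit.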
(For disc we add hyphen.width, do we?)
Once we're trying to make a break, it's too late to add anything to the width. Instead, I'm just adding the width of a hyphen unconditionally at the very beginning. (Of course, this code actually has a bug: I add the hyphen width at the very beginning, but then I reset the total width to zero each loop. That's probably why this wasn't working before).
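A small illustration of that bug, using plain numbers instead of real node widths. The function name and the fix (resetting the accumulator to the hyphen width instead of zero) are assumptions for the sake of the example, not the actual code:

```lua
-- Hypothetical sketch: returns the indices at which the accumulated
-- width (including room reserved for a hyphen) exceeds the limit.
local function overflow_points(widths, limit, hyphen_width, keep_reserve)
    local total = hyphen_width      -- reserve room for a hyphen up front
    local points = {}
    for i, w in ipairs(widths) do
        total = total + w
        if total > limit then
            points[#points + 1] = i
            if keep_reserve then
                total = hyphen_width  -- assumed fix: keep the reserve
            else
                total = 0             -- buggy variant: the reserve is lost
            end
        end
    end
    return points
end

local widths = {}
for i = 1, 12 do widths[i] = 1 end

print(table.concat(overflow_points(widths, 4, 1, false), " "))
print(table.concat(overflow_points(widths, 4, 1, true), " "))
```

With the buggy reset, only the first line leaves room for the hyphen; every later line is allowed to grow one unit wider.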
* The other cases still seem a bit obscure to me, and I tried to trace where each of them might be triggered:
if n.id == glue_id then
    local penalty = node.new "penalty"
    penalty.penalty = -10000
    node.insertbefore(head, n, penalty)
    context.inrightmargin("glue")
    break
end
When going backwards, if we find any glue, break there, since breaking at a space is always preferred to breaking at a hyphen.
if not end_disc and n.id == disc_id then
    context.inrightmargin("disc")
    end_disc = n
end
Save the location of the potential hyphen closest to the maximum length, just in case we need it later.
if end_disc and back_chars >= 5 then
    context.inrightmargin("end")
    end_disc.penalty = -10000
    break
end
We've already gone back 5 characters from the maximum length and we haven't found any spaces to break at; if we have already found a potential hyphen, let's force a break there.
if n.id == glyph_id then
    context.inrightmargin("glyph")
    back_chars = back_chars + 1
end
Count how many characters we've gone backwards by. (Oh, and be really careful when using "context()" inside Lua engine callbacks. If you had done something like "context.vbox('some text')", you would have triggered the paragraph builder while inside the paragraph builder, which could lead to an infinite loop.)
I'm maybe doing this wrong, but I see these conditions triggered more often than probably expected for a 25 line document?
Without running the code below, I'd guess that either "glue" or "end" should trigger for every line, "disc" should come before whenever "end" is triggered, and "glyph" is probably triggered about 3 times per line.
Many thanks again!
(self-promotion warning) If these Lua callbacks interest you, I use quite a few of them in my "lua-widow-control" module.
https://github.com/gucci-on-fleek/lua-widow-control/blob/master/source/lua-w...
There are lots of comments in the code, so hopefully it's not quite as cryptic as my character limiting code. Let me know if you have any other questions.

-- Max