Count (and limit) glyphs per line?
Dear list,
I've been confronted with the following 'intriguing' formatting requirement for a document:
"
• Type density: Must be no more than 15 characters per linear inch (including characters and spaces).
• Line spacing: Must be no more than six lines per vertical inch.
"
While the line spacing resolves in ConTeXt to
\setupinterlinespace[line=\dimexpr(1in / 6)]
I was wondering if one can limit "type density" as the number of glyphs per inch in TeX too? I thought, it is more convenient to rephrase this request (for a 7 in textwidth) to limit the number of glyphs per line to 112. (Font must be sans or serif, of course ...)
I've tried
\setuplayout[width=112\averagecharwidth]
which, however, results in ~120–130 characters and spaces per line. Pragmatically, I'm narrowing the text width to empirically match the requirement, but I'm nevertheless curious if there is a Lua/TeX solution to this "problem"?
Thank you!
Benjamin
I've been confronted with the following 'intriguing' formatting requirement for a document:
"Intriguing" is definitely right here. I suspect these guidelines were made for typewriters and haven't been updated since.
to limit the number of glyphs per line to 112.
112 characters per line sounds much too long anyways. From "The Elements of Typographic Style":
Anything from 45 to 75 characters is widely regarded as a satisfactory length of line for a single-column page set in a serifed text face in a text size. The 66-character line (counting both letters and spaces) is widely regarded as ideal. For multiple-column work, a better average is 40 to 50 characters.
If the type is well set and printed, lines of 85 or 90 characters will pose no problem in discontinuous texts, such as bibliographies, or, with generous leading, in footnotes. But even with generous leading, a line that averages more than 75 or so characters is likely to be too long for continuous reading.
If you use something like \setuplayout[width=80\averagecharwidth] then your lines will for sure have fewer than 112 characters and will probably be more readable too.
I'm nevertheless curious if there is a Lua/TeX solution to this "problem"?
Option 1: Use a monospaced font. Then 112 characters per line <=> page width = 112em.
Option 2: A hacky Lua solution
\startluacode
local max_length = 112
local glyph_id = node.id "glyph"
local disc_id = node.id "disc"
local glue_id = node.id "glue"
function userdata.limiter(head)
    language.hyphenate(head)
    local chars = 0
    local width = 0
    local n = head
    while n do
        if n.id == glyph_id or n.id == glue_id then
            chars = chars + 1
            width = width + n.width - (n.shrink or 0)
        end
        if chars >= max_length or width > tex.hsize then
            local back_chars = 0
            local end_disc = nil
            while n do
                if n.id == glue_id then
                    local penalty = node.new "penalty"
                    penalty.penalty = -10000
                    node.insertbefore(head, n, penalty)
                    break
                end
                if not end_disc and n.id == disc_id then
                    end_disc = n
                end
                if end_disc and back_chars >= 5 then
                    end_disc.penalty = -10000
                    break
                end
                if n.id == glyph_id then
                    back_chars = back_chars + 1
                end
                n = n.prev
            end
            width = 0
            chars = 0
        end
        n = n.next
    end
    return head
end
nodes.tasks.appendaction(
    "processors",
    "before",
    "userdata.limiter"
)
\stopluacode
\setuppapersize[landscape,letter]
\showframe
\starttext
\setupalign[flushleft]
\setupbodyfont[14pt]
\samplefile{knuth}
\setupbodyfont[12pt]
\samplefile{knuth}
\setupbodyfont[10pt]
\samplefile{knuth}
\page
\setupalign[normal]
\setupbodyfont[14pt]
\samplefile{knuth}
\setupbodyfont[12pt]
\samplefile{knuth}
\setupbodyfont[10pt]
\samplefile{knuth}
\stoptext
This code will ensure that no line ever exceeds "max_length" characters. It uses a greedy algorithm instead of the standard TeX algorithm for line breaking, but it still produces mostly decent results.
-- Max
On 24.06.22 at 05:15, Benjamin Buchmuller via ntg-context wrote:
• Type density: Must be no more than 15 characters per linear inch (including characters and spaces).
This talks about "type density", not characters per line. This depends mostly on the font (and letterspacing). I.e. you should not use a narrow font.
Hraban
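To make the arithmetic behind this point concrete: read literally, the requirement fixes a ratio (characters per inch), so a per-line cap only follows once the text width is known. A minimal Python sketch, assuming the 15 characters per inch from the quoted requirement; the function names are illustrative and not part of ConTeXt:

```python
MAX_CHARS_PER_INCH = 15  # from the quoted requirement


def max_chars_per_line(text_width_in: float) -> int:
    """Largest character count (spaces included) that stays within the limit."""
    return int(MAX_CHARS_PER_INCH * text_width_in)


def density(chars_on_line: int, text_width_in: float) -> float:
    """Characters per linear inch for a line of the given width."""
    return chars_on_line / text_width_in


print(max_chars_per_line(7.0))  # 105
print(max_chars_per_line(7.5))  # 112
print(density(112, 7.0))        # 16.0
```

Under this reading, a 7 in line allows at most 105 characters, and a cap of 112 characters corresponds to a 7.5 in line.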
On 6/24/2022 5:15 AM, Benjamin Buchmuller via ntg-context wrote:
I've tried
\setuplayout[width=112\averagecharwidth]
which, however, results in ~120–130 characters and spaces per line. Pragmatically, I'm narrowing the text width to empirically match the requirement, but I'm nevertheless curious if there is a Lua/TeX solution to this "problem"?
Just assume the worst case and take the narrowest character:
\showframe
\setupbodyfont[modern] % we need to set the font
\normalexpanded {
    \setuplayout
      [textwidth=\the\dimexpr112\fontcharwd\font`.\relax]
}
\starttext
    \input tufte
\stoptext
-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------
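Why the worst-case width gives a safe bound can be shown with integer arithmetic in scaled points (1pt = 65536sp), the unit TeX uses for dimensions. A minimal Python sketch: the glyph widths are made-up illustrative values, not measurements of any real font, and the bound assumes that nothing on the line (including a shrunken space) is narrower than the chosen character.

```python
SP_PER_PT = 65536  # TeX scaled points per point


def max_glyphs_that_fit(text_width_sp: int, glyph_width_sp: int) -> int:
    """How many glyphs of the given width fit into the text width."""
    return text_width_sp // glyph_width_sp


narrowest = 3 * SP_PER_PT  # hypothetical width of the narrowest glyph (3pt)
average = 5 * SP_PER_PT    # hypothetical average glyph width (5pt)
text_width = 112 * narrowest

print(max_glyphs_that_fit(text_width, narrowest))  # 112
print(max_glyphs_that_fit(text_width, average))    # 67
```

With these numbers the limit of 112 can never be exceeded, but typical lines end up far shorter, which is the price of planning for the worst case.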
Wow, that works like a charm! Thank you, Max!
It's also a very insightful example of how to use and inject Lua code in the TeX output routine. Do you mind if I add it to the wiki? (Probably under "Wrapping".)
Many thanks again!
Benjamin
Dear list,
A brief follow-up for (1) itemizations [resolved; but question on ConTeXt hsize defaults] and (2) hyphenation [troubles].
(1) To deal with itemizations and other situations where texts are indented such as:
\setuppapersize[landscape,letter]
\showframe
\starttext
\samplefile{knuth}
\ctxlua{context(tex.dimen["textwidth"])} % 37213340
\ctxlua{context(tex.dimen["localhsize"])} % 0
\startitemize[width=5em]
\item \samplefile{knuth}
\ctxlua{context(tex.dimen["textwidth"])} % 37213340
\ctxlua{context(tex.dimen["localhsize"])} % 33283340
\stopitemize
\stoptext
The following part in the script must be adapted to the local horizontal size, I guess:
if chars >= max_length or width > tex.hsize then
However, tex.localhsize (or tex.dimen["localhsize"]) is 0 when the document is initialized. (Maybe a more sensible default would be textwidth rather than 0?) So, I added:
local localhsize = tex.dimen["textwidth"]
if tex.dimen["localhsize"] > 0 then
    localhsize = tex.dimen["localhsize"]
end
if chars >= max_length or width > localhsize then
Maybe someone finds this useful in the future.
(2) I'm (now?) running into trouble with hyphenation. With the example above, I get
" The separation of any of these four components would have hurt TEX significantly. If I had not partic- i- pated fully in all these activities, literally hundreds of improvements would never have been made, "
In my own document, I also get lines with only a single character or hboxed group. I assume this is because the hyphen is not counted and pushes the remainder to a new line, where the intended breakpoint again starts another one.
Unfortunately, I don't know what to change; I know a bit about "glyph" and "glue", but what is "disc" and would it help here?
Thank you!
Benjamin
It's also a very insightful example of how to use and inject Lua code in the TeX output routine.
This is injecting Lua code before the paragraph builder, not in the output routine. Something like https://tex.stackexchange.com/a/644613/270600 or my module "lua-widow-control" would be an example of Lua code in the output routine.
Do you mind if I add it to the wiki? (Probably under "Wrapping".)
Sure
However, tex.localhsize (or tex.dimen["localhsize"]) is 0 when the document is initialized. (Maybe a more sensible default would be textwidth rather than 0?)
So, I added:
local localhsize = tex.dimen["textwidth"]
if tex.dimen["localhsize"] > 0 then localhsize = tex.dimen["localhsize"] end
if chars >= max_length or width > localhsize then
I don't think that's necessary. \hsize is a primitive TeX parameter that sets the width of the paragraph. It may be zero at the start of the document, but it is definitely non-zero by the end of every paragraph. The Lua function gets the current value of \hsize at the end of every paragraph, so it should be using the exact same value that TeX's paragraph builder uses, meaning that it should account for itemizations and such. I'm not really sure what \localhsize is, but it's probably similar to \hsize.
(2) I'm (now?) running into trouble with hyphenation.
In my own document, I also get lines with only a single character or hboxed group. I assume, this is because the hyphen is not counted and pushes the remainder to a new line where the intended breakpoint again starts another one.
Try this:
\startluacode
local max_length = 112
local glyph_id = node.id "glyph"
local disc_id = node.id "disc"
local glue_id = node.id "glue"
function userdata.limiter(head)
    language.hyphenate(head)
    local hyphen = node.new "glyph"
    hyphen.char = language.prehyphenchar(0)
    hyphen.font = font.current()
    local width = hyphen.width
    node.free(hyphen)
    local chars = 0
    local n = head
    while n do
        if n.id == glyph_id or n.id == glue_id then
            chars = chars + 1
            width = width + n.width - (n.shrink or 0)
        end
        if chars >= max_length or width > tex.hsize then
            local back_chars = 0
            local end_disc = nil
            while n do
                if n.id == glue_id then
                    local penalty = node.new "penalty"
                    penalty.penalty = -10000
                    node.insertbefore(head, n, penalty)
                    break
                end
                if not end_disc and n.id == disc_id then
                    end_disc = n
                end
                if end_disc and back_chars >= 5 then
                    end_disc.penalty = -10000
                    break
                end
                if n.id == glyph_id then
                    back_chars = back_chars + 1
                end
                n = n.prev
            end
            width = 0
            chars = 0
        end
        n = n.next
    end
    return head
end
nodes.tasks.appendaction(
    "processors",
    "before",
    "userdata.limiter"
)
\stopluacode
I've just added the width of a hyphen to the accumulated width. Let me know if this works; if not, there's a more complex fix that I can try.
Unfortunately, I don't know what to change; I know a bit about "glyph" and "glue", but what is "disc" and would it help here?
"disc" nodes are "discretionaries", which are usually potential hyphens. See "The TeXbook" (page 95) or "TeX by Topic" (https://texdoc.org/serve/texbytopic/0#subsection.19.3.1) for details on the TeX side, or the LuaMetaTeX manual (https://www.pragma-ade.com/general/manuals/luametatex.pdf#%231205) for details on the Lua side. -- Max
On 6/25/2022 5:38 PM, Benjamin Buchmuller via ntg-context wrote:
Wow, that works like a charm! Thank you, Max!
It's also a very insightful example of how to use and inject Lua code in the TeX output routine. Do you mind if I add it to the wiki? (Probably under "Wrapping".)
I didn't check it, but it's not the output routine one kicks code in, but likely some other spot. Be aware that messing around with node lists
- can have a performance hit
- can interfere with other mechanisms
- should use the user callback hooks (before/after), not the system ones
so:
- I ignore complaints about performance
- I won't spend time on issues related to it
so, when you do such things, best work with a 'frozen' install and check your stuff after an update:
- play safe
Hans
Hi Max,
Thank you so much for your help and pointing me to the documents; always a lot of things to learn in TeX!
I'm afraid that including the hyphen width doesn't solve the issue yet. It seems to move the problem to other parts of the text. My guess is that one could equivalently have said "local max_length = 111", right?
I made the following MWE (reproducible also online) to illustrate what I see:
* Here, instead of a breaking point, the trouble is caused by not being able to break it. This causes the next line to be underfull. (I get a lot of these, but also some with hyphenated breakpoints, in my own document. Maybe the insertion point of the penalty/breaking bonus needs to move up?)
* Running with hsize only makes the problem worse in itemizations, so I think localhsize is the way to go. My guess, localhsize is the width of the "text" part of a paragraph, for example, excluding the symbols in the itemization.
(More thoughts below)
\startluacode
local max_length = 112
local glyph_id = node.id "glyph"
local disc_id = node.id "disc"
local glue_id = node.id "glue"
function userdata.limiter(head)
    head = language.hyphenate(head)
    local hyphen = node.new "glyph"
    hyphen.char = language.prehyphenchar(0)
    hyphen.font = font.current()
    local width = hyphen.width
    node.free(hyphen)
    local chars = 0
    local n = head
    while n do
        if n.id == glyph_id or n.id == glue_id then
            chars = chars + 1
            width = width + n.width - (n.shrink or 0)
        end
        local localhsize = tex.dimen["textwidth"]
        if tex.dimen["localhsize"] > 0 then
            localhsize = tex.dimen["localhsize"]
        end
        if chars >= max_length or width > localhsize then
            local back_chars = 0
            local end_disc = nil
            while n do
                if n.id == glue_id then
                    local penalty = node.new "penalty"
                    penalty.penalty = -10000
                    node.insertbefore(head, n, penalty)
                    break
                end
                if not end_disc and n.id == disc_id then
                    end_disc = n
                end
                if end_disc and back_chars >= 5 then
                    end_disc.penalty = -10000
                    break
                end
                if n.id == glyph_id then
                    back_chars = back_chars + 1
                end
                n = n.prev
            end
            width = 0
            chars = 0
        end
        n = n.next
    end
    return head
end
nodes.tasks.appendaction(
    "processors",
    "before",
    "userdata.limiter"
)
\stopluacode
\setuppapersize[A5]
\showframe
\starttext
This is text width: \ctxlua{context(tex.dimen["textwidth"])}
This is hsize: \ctxlua{context(tex.dimen["hsize"])}
This is localhsize: \ctxlua{context(tex.dimen["localhsize"])}
\startitemize[width=5em]
\item Thus, I came to the conclusion that the \hbox{designer} of a new system must not only be the implementer and first large--scale user; the de signer should also write the first user manual.
\item \samplefile{knuth}
This is text width: \ctxlua{context(tex.dimen["textwidth"])}
This is hsize: \ctxlua{context(tex.dimen["hsize"])}
This is localhsize: \ctxlua{context(tex.dimen["localhsize"])}
\stopitemize
\stoptext
I'm wondering if I do understand the second while loop correctly:
* Once we find the node that exceeds either the character limit or the (local-)hsize (glyphs and glues summed, for disc we add hyphen.width, do we?), then we insert an incredibly good breaking point for a new line. And exit the loop.
* The other cases still seem a bit obscure to me, and I tried to trace where each of them might be triggered:
if n.id == glue_id then
    local penalty = node.new "penalty"
    penalty.penalty = -10000
    node.insertbefore(head, n, penalty)
    context.inrightmargin("glue")
    break
end
if not end_disc and n.id == disc_id then
    context.inrightmargin("disc")
    end_disc = n
end
if end_disc and back_chars >= 5 then
    context.inrightmargin("end")
    end_disc.penalty = -10000
    break
end
if n.id == glyph_id then
    context.inrightmargin("glyph")
    back_chars = back_chars + 1
end
I'm maybe doing this wrong, but I see these conditions triggered more often than probably expected for a 25 line document?
local count_me = 0
...
if chars >= max_length or width > localhsize then
    local back_chars = 0
    local end_disc = nil
    while n do
        local check = "glyph"
        count_me = count_me + 1
        if n.id == glue_id then
            local penalty = node.new "penalty"
            penalty.penalty = -10000
            node.insertbefore(head, n, penalty)
            context.inrightmargin("\\color[red]{" .. string.rep("_", count_me) .. count_me .. "}")
            break
        end
        if not end_disc and n.id == disc_id then
            end_disc = n
        end
        --
        if end_disc and back_chars >= 5 then
            context.inrightmargin("\\color[blue]{" .. string.rep("_", count_me) .. count_me .. "}")
            end_disc.penalty = -10000
            break
        end
        if n.id == glyph_id then
            context.inrightmargin("\\color[black]{" .. string.rep("_", count_me) .. count_me .. "}")
            back_chars = back_chars + 1
        end
        n = n.prev
    end
Many thanks again!
Benjamin
On 6/25/2022 10:25 PM, Benjamin Buchmuller via ntg-context wrote:
(1) To deal with itemizations and other situation where texts are indented such as:
\setuppapersize[landscape,letter]
\showframe
\starttext
\samplefile{knuth}
\ctxlua{context(tex.dimen["textwidth"])} % 37213340
\ctxlua{context(tex.dimen["localhsize"])} % 0
\startitemize[width=5em]
\item \samplefile{knuth}
\ctxlua{context(tex.dimen["textwidth"])} % 37213340
\ctxlua{context(tex.dimen["localhsize"])} % 33283340
\stopitemize
\stoptext
The problem with these things is that there is more involved than just counting, like font features, hyphenation, current paragraph properties, etc., and you don't want interference with other features. You also want the paragraphs to look somewhat ok.
Folks who enforce such demands on authors never wonder where the tools to do that come from (publishers and probably most designers are not interested in that anyway: thinking probably stops at the number '120').
Attached a proof of concept that gives an idea. No upload as first we need to do some wrapping up of math. Not that we needed something in the engine other than the linebreak helper to accept direct modes (no need to go back and forth then and I only want to write this crap once).
Could be a module although it's only some 70 lines of code in the end. Maybe it makes a nice (lmtx) demo for the ctx meeting too.
Here we work per paragraph, not per line, which looks better on average. One could mess with parshapes but why bother.
(The todo in the name refers to the fact that it might go into the low level paragraph manual.)
(No more time now but we can add later; remind me if I forget.)
Hans
On 2022-06-26 9:59 a.m., Benjamin Buchmuller wrote:
Hi Max,
Thank you so much for your help and pointing me to the documents; always a lot of things to learn in TeX!
No problem :)
I'm afraid that including the hyphen width doesn't solve the issue yet. It seems to move the problem to other parts of the text.
Ah, too bad. My next step would have been to insert \penalty10000's (prevent breaks) at the potential breaks before/after the "selected" break, but Hans provided a _much_ better solution that you should use instead.
My guess is that one could equivalently have said "local max_length = 111", right?
Not really; the hyphen is added to the accumulated width, not the accumulated character count.
I made the following MWE (reproducible also online) to illustrate what I see:
My code assumes that each line has _roughly_ "max_length" characters before it runs. These lines each only have ~60 characters, so I'm not entirely surprised that there are issues. Just use Hans's solution, which is much less of a total hack than this is :)
* Running with hsize only makes the problem worse in itemizations, so I think localhsize is the way to go. My guess, localhsize is the width of the "text" part of a paragraph, for example, excluding the symbols in the itemization.
I forgot about \leftskip. Replace "tex.hsize" with "tex.hsize - tex.leftskip.width" and everything should work properly. Using localhsize would also work, whenever it's non-zero.
I'm wondering if I do understand the second while loop correctly:
So how it works is the outer loop goes through each "node" in the paragraph. If it is a glyph or glue, then we increment the character counter by one and the width by the node's width. If we have exceeded the maximum character count or maximum width, then we switch directions and start going backwards through each node, starting at the character that was too long. If one of the previous 5 characters is glue, then force a break there; otherwise, we force a break at the nearest glue or hyphen. Now we reset the width and character counters and return to the outer loop, which continues until the end of the paragraph.
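The two loops described above can be modelled outside TeX. The following Python sketch is a simplified model, not a port of the Lua code: nodes are plain (kind, width) tuples, every character has width 1, and the function returns the indices where a break would be forced instead of inserting penalties. The names greedy_breaks, find_break and to_items are hypothetical, and the "no progress" guard is an addition that the Lua code does not have.

```python
from typing import List, Optional, Tuple

Item = Tuple[str, int]  # (kind, width); kind is "glyph", "glue" or "disc"


def find_break(items: List[Item], start: int,
               lookback: int) -> Optional[Tuple[int, int]]:
    """Walk backwards from start; return (break index, index to resume from)."""
    back_chars = 0
    end_disc = None
    j = start
    while j >= 0:
        kind = items[j][0]
        if kind == "glue":
            return j, j          # first space met going backwards: break here
        if end_disc is None and kind == "disc":
            end_disc = j         # remember the hyphen point nearest the limit
        if end_disc is not None and back_chars >= lookback:
            return end_disc, j   # went back far enough: break at the hyphen
        if kind == "glyph":
            back_chars += 1
        j -= 1
    return None                  # ran off the start of the paragraph


def greedy_breaks(items: List[Item], max_chars: int, max_width: int,
                  lookback: int = 5) -> List[int]:
    """Return the indices of the items at which a line break is forced."""
    breaks: List[int] = []
    chars = width = 0
    i = 0
    while i < len(items):
        kind, w = items[i]
        if kind in ("glyph", "glue"):
            chars += 1
            width += w
        if chars >= max_chars or width > max_width:
            found = find_break(items, i, lookback)
            # Guard (not in the Lua code): skip a break that makes no progress.
            if found is not None and (not breaks or found[0] > breaks[-1]):
                breaks.append(found[0])
                i = found[1]     # resume counting from the break position
            chars = width = 0
        i += 1
    return breaks


def to_items(text: str) -> List[Item]:
    """Turn a string into items: spaces become glue, the rest glyphs."""
    return [("glue", 1) if c == " " else ("glyph", 1) for c in text]


print(greedy_breaks(to_items("aaa bbb ccc ddd"), max_chars=8, max_width=100))  # [7]
print(greedy_breaks(to_items("aaa bbb ccc ddd"), max_chars=6, max_width=100))  # [3, 7, 11]
```

Because each break is fixed as soon as a limit is hit, the model never reconsiders earlier lines, which is what distinguishes this greedy approach from TeX's paragraph-wide optimisation.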
for disc we add hyphen.width, do we?)
Once we're trying to make a break, it's too late to add anything to the width. Instead, I'm just adding the width of a hyphen unconditionally at the very beginning. (Of course, this code actually has a bug: I add the hyphen width at the very beginning, but then I reset the total width to zero each loop. That's probably why this wasn't working before).
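The effect of that bug can be illustrated with a toy accumulator. This Python sketch is an assumption-laden model rather than the Lua code itself: widths are plain integers, "reserve" stands in for the hyphen width, and the function name widest_line is hypothetical. It compares resetting the running total to zero with resetting it to the reserve.

```python
from typing import List


def widest_line(widths: List[int], limit: int, reserve: int,
                keep_reserve_after_reset: bool) -> int:
    """Fill lines greedily and return the widest line content produced."""
    total = reserve      # running total starts with room kept for a hyphen
    content = 0          # width of the actual content on the current line
    widest = 0
    for w in widths:
        if total + w > limit:
            widest = max(widest, content)
            # The bug: resetting to 0 drops the reserve for all later lines.
            total = reserve if keep_reserve_after_reset else 0
            content = 0
        total += w
        content += w
    return max(widest, content)


widths = [1] * 30
print(widest_line(widths, limit=10, reserve=2, keep_reserve_after_reset=False))  # 10
print(widest_line(widths, limit=10, reserve=2, keep_reserve_after_reset=True))   # 8
```

With the reset to zero, only the first line keeps room for a hyphen; every later line can fill the whole limit, so an added hyphen would overflow it.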
* The other cases still seem a bit obscure to me, and I tried to trace where each of them might be triggered:
if n.id == glue_id then local penalty = node.new "penalty" penalty.penalty = -10000 node.insertbefore(head, n, penalty) context.inrightmargin("glue") break end
When going backwards, if we find any glue, break there, since breaking at a space is always preferred to breaking at a hyphen.
if not end_disc and n.id == disc_id then context.inrightmargin("disc") end_disc = n end
Save the location of the potential hyphen closest to the maximum length, just in case we need it later.
if end_disc and back_chars >= 5 then context.inrightmargin("end") end_disc.penalty = -10000 break end
We've already gone back 5 characters from the maximum length and we haven't found any spaces to break at; if we have already found a potential hyphen, let's force a break there.
if n.id == glyph_id then context.inrightmargin("glyph") back_chars = back_chars + 1 end
Count how many characters we've gone backwards by. (Oh, and be really careful when using "context()" inside Lua engine callbacks. If you had done something like "context.vbox('some text')", you would have triggered the paragraph builder while inside the paragraph builder, which could lead to an infinite loop.)
I'm maybe doing this wrong, but I see these conditions triggered more often than probably expected for a 25 line document?
Without running the code below, I'd guess that either "glue" or "end" should trigger for every line, "disc" should come before whenever "end" is triggered, and "glyph" is probably triggered about 3 times per line.
Many thanks again!
(self-promotion warning) If these Lua callbacks interest you, I use quite a few of them in my "lua-widow-control" module.
https://github.com/gucci-on-fleek/lua-widow-control/blob/master/source/lua-w...
There are lots of comments in the code, so hopefully it's not quite as cryptic as my character limiting code. Let me know if you have any other questions.
-- Max
On 6/27/2022 12:32 AM, Max Chernoff via ntg-context wrote:
On 2022-06-26 9:59 a.m., Benjamin Buchmuller wrote:
I forgot about \leftskip. Replace "tex.hsize" with "tex.hsize - tex.leftskip.width" and everything should work properly. Using localhsize would also work, whenever it's non-zero.
You should probably also compensate for hangindent and maybe even indent.
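As a rough illustration of that compensation, here is a plain-Lua sketch. It is an assumption-laden model, not ConTeXt code: `usable_width` and the fields of the `par` table are hypothetical names, all values are plain numbers (in TeX they would be scaled points), and the hanging indent is subtracted from every line, which is conservative because in TeX it only applies to some lines.

```lua
-- Estimate the width available for text on a line.
local function usable_width(par)
    local width = par.hsize - (par.leftskip or 0)
    -- A negative hanging indent sits on the right-hand side;
    -- either way only its magnitude narrows the line.
    width = width - math.abs(par.hangindent or 0)
    -- The paragraph indent only affects the first line.
    if par.first_line then
        width = width - (par.parindent or 0)
    end
    return width
end

local first = usable_width {
    hsize = 1000, leftskip = 100, hangindent = -50,
    parindent = 20, first_line = true,
}
local other = usable_width {
    hsize = 1000, leftskip = 100, hangindent = -50,
    parindent = 20, first_line = false,
}
print(first, other)  --> 830  850
```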
(Oh, and be really careful when using "context()" inside Lua engine callbacks. If you had done something like "context.vbox('some text')", you would have triggered the paragraph builder while inside the paragraph builder, which could lead to an infinite loop)
Maybe you can add a level counter, and only run when level = 1.

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------
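One way to read the level-counter suggestion, sketched in plain Lua: wrap the callback so that only the outermost invocation does any work. The wrapper name `guarded` and the demo function are hypothetical, and how such a wrapper would be registered with the engine is left out on purpose.

```lua
local level = 0

-- Wrap `fn` so that nested (re-entrant) calls are skipped.
local function guarded(fn)
    return function(...)
        level = level + 1
        if level > 1 then
            level = level - 1
            return nil                  -- nested call: do nothing
        end
        local ok, result = pcall(fn, ...)
        level = level - 1               -- always restore the counter
        if not ok then
            error(result, 0)            -- re-raise after cleanup
        end
        return result
    end
end

-- Demo: a function that tries to re-enter itself, like a paragraph
-- builder that typesets something while building a paragraph.
local runs = 0
local process
process = guarded(function(depth)
    runs = runs + 1
    if depth < 3 then
        process(depth + 1)              -- simulated re-entry, skipped
    end
    return runs
end)

process(1)
print(runs)  --> 1
```

The `pcall` is there so that an error inside the callback cannot leave the counter stuck above zero, which would silently disable the callback afterwards.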
Dear Hans,

This is the friendly reminder you requested for the "crappyspecs" parbuilder as per your example in early July.

With

    ConTeXt ver: 2022.07.06 21:42 LMTX fmt: 2022.7.8

I get

    tex error > tex error on line 12 in file ./test-wrapping2.tex: Undefined control sequence \crappyspeccount

potentially as there is no crappyspec parbuilder yet?

    \defineparbuilder [crappyspec] % implemented in the builder namespace
    \defineparbuilder [default] % implemented in the builder namespace

    \setmainparbuilder[crappyspec]

    \setuptolerance[verytolerant,stretch]
    \dontcomplain

    \protected\def\CrappyTraced
      {\par
       \strut
       \rlap \bgroup\infofont
         (\enspace max = \the\crappyspeccount
          \quad step = \the\crappyspecstep
          \quad hsize = \the\hsize
          \quad used = \the\crappyspecdimen
          \enspace )
       \egroup
       \par}

    \starttext

    \crappyspeccount60

    \samplefile{tufte} \CrappyTraced \par

    \crappyspeccount40

    \samplefile{tufte} \CrappyTraced \par

    % \crappyspecstep 2pt

    \samplefile{tufte} \CrappyTraced \par
    \samplefile{tufte} \CrappyTraced

    \startitemize
      \startitem \samplefile{tufte} \CrappyTraced \stopitem
      \startitem \samplefile{ward} \CrappyTraced \stopitem
    \stopitemize

    \startnarrower[6*left,right]
      \samplefile{tufte} \CrappyTraced
    \stopnarrower

    \starthanging [distance=4em,n=2] {test}
      \samplefile{tufte} \CrappyTraced
    \stophanging

    \page

    % \stoptext

    \setuppapersize[landscape,letter]

    \samplefile{knuth} \CrappyTraced

    \startitemize[width=5em]
      \startitem \samplefile{knuth} \CrappyTraced \stopitem
      \startitem {\smallcaps \darkblue \samplefile{knuth}} \CrappyTraced \stopitem
    \stopitemize

    \crappyspeccount60

    \startitemize[width=5em]
      \startitem \samplefile{knuth} \CrappyTraced \stopitem
      \startitem {\smallcaps \darkgreen \samplefile{knuth}} \CrappyTraced \stopitem
    \stopitemize

    \page

Thank you once again for your help!

Benjamin

--

The problem with these things is that there is more involved than just counting, like font features, hyphenation, current paragraph properties, etc., and you don't want interference with other features. You also want the paragraphs to look somewhat ok. Folks who enforce such demands on authors never wonder where the tools to do that come from (publishers and probably most designers are not interested in that anyway: thinking probably stops at the number '120').

Attached a proof of concept that gives an idea. No upload as first we need to do some wrapping up of math. Not that we needed something in the engine other than the linebreak helper to accept direct modes (no need to go back and forth then and I only want to write this crap once). Could be a module although it's only some 70 lines of code in the end. Maybe it makes a nice (lmtx) demo for the ctx meeting too.

Here we work per paragraph not per line, which looks better on the average. One could mess with parshapes but why bother.

(The todo in the name refers to the fact that it might go into the low-level paragraph manual.)

(No more time now but we can add later; remind me if I forget.)

Hans
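The attachment itself is not part of this thread, so the following is only a guess at what "per paragraph, not per line" could look like, written as a plain-Lua toy and not as Hans's implementation. The assumption, suggested by the max/step/hsize/used values in the trace macro, is that the whole paragraph is rebroken at a width narrowed in steps until no line exceeds the character limit. The width table and all function names are made up for the example.

```lua
-- Toy font metrics: narrow, wide and normal characters.
local function char_width(ch)
    if ch == "i" or ch == "l" then return 3 end
    if ch == "m" or ch == "w" then return 9 end
    return 6
end
local SPACE = 3

local function word_width(word)
    local w = 0
    for ch in word:gmatch(".") do w = w + char_width(ch) end
    return w
end

-- Greedy line breaking at a given width (no hyphenation).
local function break_lines(text, hsize)
    local lines, line, width = {}, "", 0
    for word in text:gmatch("%S+") do
        local ww = word_width(word)
        if line == "" then
            line, width = word, ww
        elseif width + SPACE + ww > hsize then
            lines[#lines + 1] = line
            line, width = word, ww
        else
            line, width = line .. " " .. word, width + SPACE + ww
        end
    end
    if line ~= "" then lines[#lines + 1] = line end
    return lines
end

-- Character count (letters and spaces) of the longest line.
local function longest(lines)
    local n = 0
    for _, l in ipairs(lines) do
        if #l > n then n = #l end
    end
    return n
end

-- Narrow the whole paragraph until every line obeys the limit.
local function fit_paragraph(text, hsize, max_chars, step)
    local used = hsize
    local lines = break_lines(text, used)
    while longest(lines) > max_chars and used > step do
        used = used - step
        lines = break_lines(text, used)
    end
    return lines, used
end

local text = string.rep("illicit lilt ", 8) .. string.rep("mammoth wow ", 4)
local lines, used = fit_paragraph(text, 120, 20, 6)
print(#lines, used, longest(lines))
```

Because all lines of the paragraph share the same narrowed width, the right margin stays even, which may be what "looks better on the average" refers to.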
On 7/18/2022 11:24 PM, Benjamin Buchmuller via ntg-context wrote:
This is the friendly reminder you requested for the "crappyspecs" parbuilder as per your example in early July.
m-crappyspec is in the next upload
Hans
participants (5)
- Benjamin Buchmuller
- Hans Hagen
- Hans Hagen
- Henning Hraban Ramm
- Max Chernoff