[Tex-nl] Font-dependent unwanted extra blank line after minus sign

Hans Hagen pragma at wxs.nl
Sun Feb 26 17:16:13 CET 2012


On 25-2-2012 23:25, Piet van Oostrum wrote:
> When I run the example with lualatex everything is normal, i.e. minipage and vbox give the same result. Could there be an oddity in the line-breaking algorithm (emergency pass) that was taken out in luatex? You see the same thing with plain tex and vboxes with/without emergencystretch. Ordinary tex gives the extra lines and luatex does not. Apparently, when an emergencystretch is set, the original line-breaking algorithm finds a break with an extra empty line after a - sign that it considers "more optimal" than one without that extra line.
>
> Perhaps Taco can shed some light on this?

Hi Piet,

Hyphenation works differently in luatex. Apart from some details in the
patterns, the hyphenation phase is now a separate one and is no longer
integrated into the par builder. At some point the whole node list is
provided with hyphenation points and

In addition, there are a few more settings: [pre|post] [ex] hyphenchar,
etc.
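
The shorthand above can be spelled out as LuaTeX primitive names. What follows is a minimal sketch, not part of the original message: the values are chosen to mirror the defaults as I read them in the luatex manual, and are only there to show the syntax.

```latex
% Sketch for plain LuaTeX (run with `luatex`); illustrative values only.
% Characters used around an automatically inserted (pattern-based) break:
\prehyphenchar=`\-      % shown at the end of the line before the break
\posthyphenchar=0       % 0: nothing at the start of the next line
% Characters used around a break after an explicit hyphen in the input:
\preexhyphenchar=0      % 0: nothing extra before the break
\postexhyphenchar=0     % 0: nothing extra after the break
% Input character after which an explicit discretionary is inserted:
\exhyphenchar=`\-
```

According to the manual excerpt quoted in this message, \exhyphenchar is a global parameter and not a language-specific one, so it can be changed per document structure independently of the text language.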

As for the 'processing', a few things have been changed (the excerpt
below gives an indication).

The parbuilder code is somewhat complex because, of course, a mix of l2r
and r2l is possible, math can be present, hz and protrusion play a role,
etc. Furthermore, the three passes are interwoven in a series of loops.

==== from the luatex manual ====

\chapter[languages]{Languages and characters, fonts and glyphs}

....

In \LUATEX, the situation is quite different. The characters you
type are always converted into \type{glyph_node} records with a
special subtype to identify them as being intended as linguistic
characters. \LUATEX\ stores the needed language information in those
records, but does not do any font|-|related processing at the time of
node creation. It only stores the index of the font.

When it becomes necessary to typeset a paragraph, \LUATEX\ first
inserts all hyphenation points right into the whole node list.
Next, it processes all the font information in the whole list
(creating ligatures and adjusting kerning), and finally it adjusts
all the subtype identifiers so that the records are \quote{glyph
nodes} from now on.

....

\section{The main control loop}

In \LUATEX's main loop, almost all input characters that are to be
typeset are converted into \type{glyph_node} records with subtype
\quote{character}, but there are a few small exceptions.

....

Fourth, automatic discretionaries are handled differently. \TEX82
inserts an empty discretionary after sensing an input character that
matches the \tex{hyphenchar} in the current font. This test is wrong,
in our opinion: whether or not hyphenation takes place should not
depend on the current font, it is a language property.

In \LUATEX, it works like this: if \LUATEX\ senses a string of input
characters that matches the value of the new integer parameter
\tex{exhyphenchar}, it will insert an explicit discretionary after that
series of nodes. Initex sets the \tex{exhyphenchar=`\-}.
Incidentally, this is a global parameter instead of a
language-specific one because it may be useful to change the value
depending on the document structure instead of the text language.

Note: as of \LUATEX\ 0.63.0, the insertion of discretionaries after
a sequence of explicit hyphens happens at the same time as the other
hyphenation processing, {\it not\/} inside the main control loop.

The only use \LUATEX\ has for \tex{hyphenchar} is at the check
whether a word should be considered for hyphenation at all. If the
\tex{hyphenchar} of the font attached to the first character node in a
word is negative, then hyphenation of that word is abandoned
immediately. {\bf This behavior is added for backward
compatibility only, and the use of \type{\hyphenchar=-1} as a means of
preventing hyphenation should not be used in new \LUATEX\ documents.}

....

Finally, there is no longer a \type{main_loop} label in the
code. Remember that \TEX82 did quite a lot of processing while adding
\type{char_nodes} to the horizontal list? For speed reasons, it handled
that processing code outside of the \quote{main control} loop, and only the
first character of any \quote{word} was handled by that \quote{main 
control} loop.
In \LUATEX, there is no longer a need for that (all hard work is done
later), and the (now very small) bits of character-handling code have
been moved back inline. When \tex{tracingcommands} is on, this is
visible because the full word is reported, instead of just the initial
character.

etc

====
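
To compare the two engines on a case like the one discussed in this thread, a small plain TeX file can be run through both `tex` and `luatex`. The sketch below is hypothetical: the file name, the text and the dimensions are made up, and it is not claimed to reproduce the reported extra line. It only shows one way to make the line count of a paragraph in a vbox visible on the terminal.

```latex
% compare.tex (hypothetical name); run as `tex compare` and `luatex compare`
\hsize=4cm                % narrow measure so that breaks are hard to find
\emergencystretch=3em     % nonzero, so the emergency pass is available
\setbox0=\vbox{\noindent
  A made-up sentence with explicit hyphens, as in x-y, followed by a
  long compound such as alpha-beta-gamma-delta-epsilon to force breaks.
  \par
  % after \par, \prevgraf holds the number of lines of the paragraph
  \message{lines in paragraph: \the\prevgraf}}
\box0
\bye
```

If the two runs report different line counts, the difference comes from hyphenation or line breaking and not from the surrounding minipage or vbox machinery.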

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------


