On 25-2-2012 23:25, Piet van Oostrum wrote:
When I run the example with lualatex, everything is normal, i.e. minipage and vbox give the same result. Could there be an oddity in the line-breaking algorithm (emergency pass) after all, one that has been removed in luatex? You see the same thing with plain tex and vboxes with/without emergencystretch: ordinary tex gives the extra lines and luatex does not. Apparently, when an emergencystretch is set, the original algorithm finds a break that is "more optimal" with an extra empty line after a - character than without that extra line.
Maybe Taco can shed some light on this?
Hi Piet,

Hyphenation works differently in luatex. Apart from some details in the patterns, the hyphenation phase is now a separate one and is no longer integrated into the par builder. At some point the whole node list is provided with hyphenation points. In addition, there are some more settings: [pre|post][ex]hyphenchar etc. As for the 'processing', a few things have been changed (the excerpt below gives an indication). The par builder code is somewhat complex because a mix of l2r and r2l is possible, math can be present, hz and protrusion play a role, etc. In addition, the three passes are interwoven in a series of loops.

==== from the luatex manual ====

\chapter[languages]{Languages and characters, fonts and glyphs}

....

In \LUATEX, the situation is quite different. The characters you type are always converted into \type{glyph_node} records with a special subtype to identify them as being intended as linguistic characters. \LUATEX\ stores the needed language information in those records, but does not do any font|-|related processing at the time of node creation. It only stores the index of the font.

When it becomes necessary to typeset a paragraph, \LUATEX\ first inserts all hyphenation points right into the whole node list. Next, it processes all the font information in the whole list (creating ligatures and adjusting kerning), and finally it adjusts all the subtype identifiers so that the records are \quote{glyph nodes} from now on.

....

\section{The main control loop}

In \LUATEX's main loop, almost all input characters that are to be typeset are converted into \type{glyph_node} records with subtype \quote{character}, but there are a few small exceptions.

....

Fourth, automatic discretionaries are handled differently. \TEX82 inserts an empty discretionary after sensing an input character that matches the \tex{hyphenchar} in the current font. This test is wrong, in our opinion: whether or not hyphenation takes place should not depend on the current font, it is a language property.
In \LUATEX, it works like this: if \LUATEX\ senses a string of input characters that matches the value of the new integer parameter \tex{exhyphenchar}, it will insert an explicit discretionary after that series of nodes. Initex sets the \tex{exhyphenchar=`\-}. Incidentally, this is a global parameter instead of a language-specific one because it may be useful to change the value depending on the document structure instead of the text language.

Note: as of \LUATEX\ 0.63.0, the insertion of discretionaries after a sequence of explicit hyphens happens at the same time as the other hyphenation processing, {\it not\/} inside the main control loop.

The only use \LUATEX\ has for \tex{hyphenchar} is at the check whether a word should be considered for hyphenation at all. If the \tex{hyphenchar} of the font attached to the first character node in a word is negative, then hyphenation of that word is abandoned immediately. {\bf This behavior is added for backward compatibility only, and the use of \type{\hyphenchar=-1} as a means of preventing hyphenation should not be used in new \LUATEX\ documents.}

....

Finally, there is no longer a \type{main_loop} label in the code. Remember that \TEX82 did quite a lot of processing while adding \type{char_nodes} to the horizontal list? For speed reasons, it handled that processing code outside of the \quote{main control} loop, and only the first character of any \quote{word} was handled by that \quote{main control} loop. In \LUATEX, there is no longer a need for that (all hard work is done later), and the (now very small) bits of character-handling code have been moved back inline. When \tex{tracingcommands} is on, this is visible because the full word is reported, instead of just the initial character.
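To make the \exhyphenchar description above a bit more concrete, here is a minimal sketch of a test file for plain LuaTeX. It is not taken from the manual: the sample words, the narrow \hsize, and the choice of a slash as an alternative explicit hyphen character are assumptions made purely for illustration. If the description above applies to your LuaTeX version, the second paragraph should be allowed to break after a slash, but that is something to check on your own installation.

```tex
% Minimal sketch for plain LuaTeX; run with: luatex <file>.tex
% All sample text and dimensions below are made up for illustration.
\hsize=2cm \parindent=0pt \tolerance=10000

% Initex default according to the manual excerpt: an explicit
% hyphen in the input becomes an explicit discretionary.
\exhyphenchar=`\-
input-output input-output input-output \par

% The parameter is not language-specific, so it can be changed
% for a part of the document. Here a slash is (hypothetically)
% used as the explicit hyphen character. The \par is kept inside
% the group because hyphenation processing happens when the
% paragraph is typeset.
\begingroup
\exhyphenchar=`\/
input/output input/output input/output \par
\endgroup

\bye
```

Running the same file through both tex and luatex (tex will stop at the unknown \exhyphenchar, so that line has to be commented out there) is one way to compare the two engines on the explicit-hyphen case discussed in this thread.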
etc.

====

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
| www.pragma-pod.nl
-----------------------------------------------------------------