[Dev-luatex] generalized hyphenation

Taco Hoekwater taco at elvenkind.com
Thu Mar 6 16:43:34 CET 2008

David Kastrup wrote:
> Arthur Reutenauer <arthur.reutenauer at normalesup.org> writes:
>>> 70 is the fourth value (index 3) in the list of 5 penalty values.
>>   Actually no, it's the fifth of 6 values, but I figured it was the way
>> it should be read :-)
>>>                               this depends on a hyphenation list with
>>> prioritized breakpoints (the printed Duden lexicon shows such
>>> breakpoints, good and emergency ones, so I presume that there might be
>>> some database somewhere).
>>   You're lucky, then; I'm not aware of any such list for French (and I
>> doubt it would be possible for specialists to agree on a single list,
>> but that's another problem).
> Here at the DANTE conference I just learnt that Werner Lemberg is
> creating a large corpus of two separate "all hyphenations" and "main
> hyphenations" lists (about 400000 words IIRC) for German.  So indeed it
> would appear that if LuaTeX offered hyphenation according to prioritized
> patterns, the data to make it typeset better documents in German would
> be reasonably well available.

If there are two 'hyphenation levels', wouldn't it be easier if luatex
supported running through two (or even more) separate pattern sets, and
stored the 'hitcount' in the discretionary? A breakpoint that appears in
both sets of patterns would then get an internal priority value of 2
instead of 1.

Main advantage: no need for a patched or postprocessed patgen.
Disadvantage: wastes a few CPU cycles because of multiple passes.
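
To make the idea concrete, here is a minimal sketch in Python. It is not
luatex code: the function names and the two toy pattern sets are invented
for illustration (they are not patgen output), and it ignores
\lefthyphenmin / \righthyphenmin. Each pattern set is applied in its own
pass using the usual "odd level allows a break" rule, and the passes that
allow a given position are counted.

```python
import re


def parse_pattern(pat):
    """Split a Liang-style pattern such as 'n1h' into ('nh', [0, 1, 0])."""
    letters = re.sub(r"\d", "", pat)
    levels = [0] * (len(letters) + 1)
    pos = 0
    for ch in pat:
        if ch.isdigit():
            levels[pos] = int(ch)  # level for the slot before letters[pos]
        else:
            pos += 1
    return letters, levels


def breakpoints(word, patterns):
    """Positions i where one pattern set allows word[:i] | word[i:]."""
    text = "." + word + "."
    scores = [0] * (len(text) + 1)  # scores[j] is the slot before text[j]
    for pat in patterns:
        letters, levels = parse_pattern(pat)
        start = text.find(letters)
        while start != -1:
            for k, lvl in enumerate(levels):
                scores[start + k] = max(scores[start + k], lvl)
            start = text.find(letters, start + 1)
    # word[i] is text[i + 1]; an odd level means a break is allowed there
    return {i for i in range(1, len(word)) if scores[i + 1] % 2 == 1}


def hit_counts(word, pattern_sets):
    """Map each break position to the number of pattern sets allowing it."""
    counts = {}
    for patterns in pattern_sets:  # one pass per pattern set
        for i in breakpoints(word, patterns):
            counts[i] = counts.get(i, 0) + 1
    return counts


# Toy data, assumed for this example only.
ALL_BREAKS = ["n1h", "f1s"]   # bahn-hof-strasse
MAIN_BREAKS = ["f1s"]         # bahnhof-strasse

priorities = hit_counts("bahnhofstrasse", [ALL_BREAKS, MAIN_BREAKS])
```

With these toy sets, the break after "bahnhof" is found in both passes and
gets priority 2, while the break after "bahn" gets priority 1. A real
implementation would presumably store that count in the discretionary
rather than in a separate table.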
