David Kastrup wrote:
Arthur Reutenauer writes:
70 is the fourth value (index 3) in the list of 5 penalty values.
Actually no, it's the fifth of 6 values, but I figured it was the way it should be read :-)
this depends on a hyphenation list with prioritized breakpoints (the printed Duden lexicon shows such breakpoints, good and emergency ones, so I presume that there might be some database somewhere).
You're lucky, then; I'm not aware of any such list for French (and I doubt it would be possible for specialists to agree on a single list, but that's another problem).
Here at the DANTE conference I just learnt that Werner Lemberg is creating a large corpus of two separate "all hyphenations" and "main hyphenations" lists (about 400,000 words IIRC) for German. So indeed it would appear that if LuaTeX offered hyphenation according to prioritized patterns, the data to make it typeset better documents in German would be reasonably well available.
If there are two 'hyphenation levels', wouldn't it be easier if LuaTeX supported running through two (or even more) separate pattern sets, and added the 'hitcount' to the discretionary? So breakpoints that appear in both sets of patterns would get an internal priority value of 2 instead of 1? Main advantage: no need for a patched or postprocessed patgen. Disadvantage: wastes a few CPU cycles because of multiple passes.
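To make the 'hitcount' idea concrete, here is a rough sketch in Python rather than in LuaTeX itself. It is only an illustration under assumptions: each pattern set is stood in for by a plain lookup table from a word to its break positions, the names `prioritized_breaks` and `render` are made up for this mail, and the sample word is a toy entry, not taken from any real hyphenation list. A real implementation would run the actual pattern matcher once per set and store the count on the discretionary.

```python
from collections import Counter

# Hypothetical stand-ins for two pattern sets. A position i means
# "a break is allowed before the character at index i".
ALL_BREAKS = {"Silbentrennung": {3, 6, 10}}   # Sil-ben-tren-nung
MAIN_BREAKS = {"Silbentrennung": {6}}         # Silben-trennung


def prioritized_breaks(word, pattern_sets):
    """Run one pass per pattern set and count how often each break is hit."""
    hits = Counter()
    for table in pattern_sets:
        # One pass per set; unknown words simply contribute no breaks.
        hits.update(table.get(word, ()))
    return dict(hits)


def render(word, priorities):
    """Show priority-1 breaks as '-' and breaks hit twice or more as '='."""
    out = []
    for i, ch in enumerate(word):
        if i in priorities:
            out.append("=" if priorities[i] >= 2 else "-")
        out.append(ch)
    return "".join(out)


priorities = prioritized_breaks("Silbentrennung", [ALL_BREAKS, MAIN_BREAKS])
print(render("Silbentrennung", priorities))  # Sil-ben=tren-nung
```

Adding a third pattern set would just mean a third table in the list, and the cost is one extra pass per set, which is the CPU trade-off mentioned above.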