[Dev-luatex] Libhyphen

Stephan Hennig mailing_list at arcor.de
Wed Sep 21 01:00:08 CEST 2011


Taco Hoekwater wrote:
> On 09/15/2011 10:56 PM, Khaled Hosny wrote:
>>
>> I just came across this library:
>> http://sourceforge.net/projects/hunspell/files/Hyphen/
>
> Hyphenation in luatex is in fact an adaptation of a (slightly
> earlier) version of libhnj. At that time, it did not do compound word
> stuff yet, so I have to check that out. It did then already have
> non-standard hyphenation.
> 
> However, that was implemented as such a hack that I decided to leave
> it out in the new luatex code, and instead opted for non-standard 
> hyphenation in the exceptions instead of in the patterns proper.
> (what libhnj did at that time was disguising dictionary exceptions
> as patterns, so the non-standard hyphenation 'pattern rules' were in
> fact complete words with a single non-standard hyphenation in it
> somewhere.)

As Taco already pointed out, libhnj mixes up regular patterns and
non-standard hyphenation patterns.  I sent a proposal about compound
word hyphenation to Taco a while ago that clearly separates patterns
with different semantics.

In this context, different semantics means different hyphenation
penalties.  That is, provide a separate set of hyphenation patterns for
each required hyphenation penalty, i.e., patterns

  * for compound word hyphenation,
  * for prefix and suffix hyphenation,
  * for suppressing aesthetically unpleasant hyphenations,
  * etc.

For the German language, I think even more than five different penalty
classes could be desirable.  All these sets of patterns can be applied
to a word in parallel and the penalties are chosen according to which
pattern set matches a spot.
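
A minimal sketch of the parallel approach in Python, to make the idea
concrete.  It is not taken from LuaTeX or libhnj.  The function names, the
toy patterns, and the penalty values 0 and 50 are all made up for
illustration.  It also assumes that when several pattern sets permit a
break at the same spot, the set listed first wins; the proposal itself
does not fix such a rule.

```python
import re

def parse_pattern(pat):
    """Split a Liang-style pattern such as 'a1b' into ('ab', [0, 1, 0])."""
    letters = re.sub(r"\d", "", pat)
    weights = [0] * (len(letters) + 1)
    pos = 0
    for ch in pat:
        if ch.isdigit():
            weights[pos] = int(ch)   # weight of the slot before letter `pos`
        else:
            pos += 1
    return letters, weights

def break_points(word, patterns):
    """Return indices i such that a break is allowed before word[i]."""
    table = dict(parse_pattern(p) for p in patterns)
    text = "." + word + "."          # dots mark the word boundaries
    levels = [0] * (len(text) + 1)   # levels[j] is the slot before text[j]
    for start in range(len(text)):
        for end in range(start + 1, len(text) + 1):
            weights = table.get(text[start:end])
            if weights:
                for k, w in enumerate(weights):
                    levels[start + k] = max(levels[start + k], w)
    # Odd levels permit a break, even levels forbid it.
    return {i for i in range(1, len(word)) if levels[i + 1] % 2 == 1}

def penalties(word, pattern_sets):
    """Apply every pattern set to the same word and map each break
    position to the penalty of the first set that permits it.

    pattern_sets is a list of (penalty, patterns) in priority order;
    this priority rule is an assumption of the sketch."""
    result = {}
    for penalty, patterns in pattern_sets:
        for i in break_points(word, patterns):
            result.setdefault(i, penalty)
    return result

# Hypothetical toy data: one set for compound boundaries, one for
# ordinary syllable breaks.
pattern_sets = [
    (0,  ["n1h"]),   # compound word hyphenation (garten-haus)
    (50, ["r1t"]),   # regular hyphenation (gar-ten)
]
print(penalties("gartenhaus", pattern_sets))
```

Each set is matched against the unmodified word, so no set sees the
result of another; only the final choice of penalty combines them.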

The same pattern approach can be used to handle non-standard
hyphenation, ligaturing, round-s recognition, etc.  (I think there are
use-cases in Arabic script as well.)

I don't know what Taco's current plan is, though.  The corresponding
tracker item reads "multi-pass hyphenation",
<URL:http://tracker.luatex.org/view.php?id=168>, whereas my proposal is
about applying patterns in parallel rather than in multiple passes.

Best regards,
Stephan Hennig

