[NTG-context] Hyphenation patterns

Denis Maier denismaier at mailbox.org
Fri Oct 9 09:01:41 CEST 2020


On 09.10.2020 at 08:57, Taco Hoekwater wrote:
>
>> On 9 Oct 2020, at 08:52, Denis Maier <denismaier at mailbox.org> wrote:
>>
>> On 08.10.2020 at 19:05, Henning Hraban Ramm wrote:
>>> \starttext
>>>
>>> {EN: \en\hyphenatedcoloredword{applicable}}
>>>
>>> {DE: \de\hyphenatedcoloredword{applicable}}
>>>
>>> \stoptext
>>>
>> Wow, that's super helpful. The English pattern seems to be "ap-plic-a-ble".
>> According to Merriam-Webster it should just be "ap·pli·ca·ble".
>>
>> {EN: \en\hyphenatedcoloredword{obligate}} gives me "ob-lig-ate".
>> According to Merriam-Webster it should be "ob·li·gate".
>>
>> I've had a look at the files mentioned by Tomáš, but since these are not plain word lists, I cannot really tell what is happening.
>>
>> So, is that a bug?
> Not really. Hyphenation patterns are a bit like applying JPEG compression to
> a dictionary. They make the data size smaller by recognising patterns while
> ignoring outliers.
>
> Occasional errors are to be expected, which is why \hyphenation exists.
>
>
I see. I've noticed lang-us.lua has a list of exceptions in it:
  ["exceptions"]={
   ["characters"]="abcdefghijlmnoprstuyz",
   ["data"]="as-so-ciate as-so-ciates dec-li-na-tion oblig-a-tory 
phil-an-thropic present presents project projects reci-procity 
re-cog-ni-zance ref-or-ma-tion ret-ri-bu-tion ta-ble",
   ["length"]=168,
   ["n"]=14,
  },

Would it be possible to add more exceptions to that list as they come 
up? Or is that inappropriate?
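
In the meantime, I suppose a document-level workaround for the two words above would look roughly like the following. This is only an untested sketch: it assumes that \hyphenation registers exceptions for the language that is active when it is called (hence the \en before it), and that \hyphenatedcoloredword takes such exceptions into account when showing the break points.

```tex
\starttext

% Assumption: exceptions are stored for the currently active language,
% so switch to English before declaring them.
\en
\hyphenation{ap-pli-ca-ble ob-li-gate}

% Same test words as earlier in the thread; if the exceptions are picked up,
% the break points should follow the hyphens given above.
{EN: \en\hyphenatedcoloredword{applicable}}

{EN: \en\hyphenatedcoloredword{obligate}}

\stoptext
```

That would only fix things for my own documents, of course, not for other users of the English patterns.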

Denis

