[NTG-context] Hyphenation patterns
Denis Maier
denismaier at mailbox.org
Fri Oct 9 09:01:41 CEST 2020
On 09.10.2020 at 08:57, Taco Hoekwater wrote:
>
>> On 9 Oct 2020, at 08:52, Denis Maier <denismaier at mailbox.org> wrote:
>>
>> On 08.10.2020 at 19:05, Henning Hraban Ramm wrote:
>>> \starttext
>>>
>>> {EN: \en\hyphenatedcoloredword{applicable}}
>>>
>>> {DE: \de\hyphenatedcoloredword{applicable}}
>>>
>>> \stoptext
>>>
>> Wow, that's super helpful. The English pattern seems to be "ap-plic-a-ble".
>> According to Merriam-Webster it should just be "ap·pli·ca·ble".
>>
>> {EN: \en\hyphenatedcoloredword{obligate}} gives me "ob-lig-ate".
>> According to Merriam-Webster it should be "ob·li·gate".
>>
>> I've had a look at the files mentioned by Tomáš, but as these are not just word lists I cannot really tell what is happening.
>>
>> So, is that a bug?
> Not really. Hyphenation patterns are a bit like applying JPEG compression to
> a dictionary. They make the data size smaller by recognising patterns while
> ignoring outliers.
>
> Occasional errors are to be expected, which is why \hyphenation exists.
>
>
I see. I've noticed lang-us.lua has a list of exceptions in it:
["exceptions"]={
  ["characters"]="abcdefghijlmnoprstuyz",
  ["data"]="as-so-ciate as-so-ciates dec-li-na-tion oblig-a-tory phil-an-thropic present presents project projects reci-procity re-cog-ni-zance ref-or-ma-tion ret-ri-bu-tion ta-ble",
  ["length"]=168,
  ["n"]=14,
},
Would it be possible to add more exceptions to that list as they come
up? Or is that inappropriate?
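
In the meantime, the document-level workaround with \hyphenation would presumably look something like the sketch below. This is only a minimal sketch: it assumes that \hyphenation stores its exceptions for the language active at that point (English here), and the break points are simply the Merriam-Webster ones quoted above, not anything taken from lang-us.lua.

```tex
\mainlanguage[en]

% Assumption: these exceptions are registered for the language
% that is active here, i.e. English.
% Break points follow Merriam-Webster as quoted earlier in the thread.
\hyphenation{ap-pli-ca-ble ob-li-gate}

\starttext

% Same test macro as in Hraban's example; the words should now be
% shown with the break points given in the exception list.
{EN: \en\hyphenatedcoloredword{applicable}}

{EN: \en\hyphenatedcoloredword{obligate}}

\stoptext
```

That only fixes the words for one document, of course, which is why I am asking about the shared list.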
Denis