[NTG-context] improving kannada script

luigi scarso luigi.scarso at gmail.com
Sun Nov 4 02:12:01 CET 2018


On Wed, Oct 31, 2018 at 4:03 PM luigi scarso <luigi.scarso at gmail.com> wrote:

>
>
> On Wed, Oct 31, 2018 at 4:01 PM Hans Hagen <j.hagen at xs4all.nl> wrote:
>
>> On 10/31/2018 3:38 PM, luigi scarso wrote:
>>
>> > On Tue, Oct 30, 2018 at 11:22 AM luigi scarso <luigi.scarso at gmail.com>
>> > wrote:
>> >
>> >
>> >
>> >     On Tue, Oct 30, 2018 at 10:18 AM Ulrike Fischer <news3 at nililand.de>
>> >     wrote:
>> >
>> >         According to the xelatex output in the following example both
>> >         variants are not correct. What would be needed would be the first
>> >         two glyphs from variant 1 (knd2 script) and the third from variant 2
>> >         (knda script).
>> >
>> >
>> >     dunno, but to start with at  least I visually see what's going on:
>> >
>> >     \showotfcomposition{file:notosanskannada-regular.ttf*kannada-testA
>> >     at 48pt}{-1}{ಕ್ರ}
>> >
>> >
>> >     \showotfcomposition{file:notosanskannada-regular.ttf*kannada-testB
>> >     at 48pt}{-1}{ಕ್ರ}
>> >
>> > support for knd2 is still in progress,
>> > and  in knda
>> > glyph ಕ (U+00C95) +  glyph ್ (U+00CCD) + glyph ರ (U+00CB0)
>> > also is not ok ( at least for what I understand until now).
>> > I see where it fails ( and where hb does it right), let's see if I am
>> > able to find a patch.
>>
>
Well, not so easy (at least for me; I had never seen Devanagari before).

glyph ಕ (U+0C95) + glyph ್ (U+0CCD) + glyph ರ (U+0CB0)
are
KA + VIRAMA + RA
and this is quite a complex case (well, really it could be a moderately
complex case ...)
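
To double-check which characters are involved, a small Python sketch (not ConTeXt code; the helper name describe is made up for illustration) can list the code points and their Unicode names using the standard unicodedata module:

```python
import unicodedata

# The cluster under discussion: KA + VIRAMA + RA
cluster = "\u0c95\u0ccd\u0cb0"

def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

for codepoint, name in describe(cluster):
    print(codepoint, name)
```

This only inspects the input string; it says nothing about how a font or shaper renders it.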
For the Kannada script we can start by considering
 http://brahmi.sourceforge.net/docs/KannadaComputing.html
It's an old doc, but
1) it shows the rules for Kannada
2) it says that
"""
2.5 Rendering Rules
(Based on Microsoft Uniscribe-OpenType implementation of the UNICODE
Rendering Rules)
"""
So Uniscribe is the reference for the rendering rules of this doc, which
in turn is the base for Unicode.

Today the Unicode reference is
http://www.unicode.org/versions/Unicode11.0.0/ch12.pdf
The rules are a bit different from
http://brahmi.sourceforge.net/docs/KannadaComputing.html
but on page 500 we see
U+0C95 ಕ  ka + U+0CCD  ್  halant + U+0CB0 ರ ra →  ಕ್ರ kra
which is our case (Unicode says that the preferred name for VIRAMA is
halant, so halant is also frequently used).
What happens is explained on page 464 for Devanagari:
we should consider KA + VIRAMA + RA as
KAn + VIRAMAn + RAi, so

R1:
We have KAn + VIRAMAn , hence
KAn + VIRAMAn → KAd

R6:
We have KAd + RAi, hence
KAd + RAi → KAn + RAsub

and then
R13:
KAn + RAsub → K.RAn
and indeed ಕ್ರ is K.RAn
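
The derivation above can be replayed as a tiny token-rewriting sketch in Python. This is only an illustration under my own assumptions: the tokens are plain strings, the rule table contains just the three KA/RA instances written above (the real rules are stated for consonants in general), and the names RULES, apply_rule and derive are made up here. It is not how font-osd.lua works.

```python
# Symbolic tokens only; this does not touch real glyphs or font tables.
# Each entry: (rule name, left-hand side, right-hand side), as in the notes above.
RULES = [
    ("R1",  ("KAn", "VIRAMAn"), ("KAd",)),
    ("R6",  ("KAd", "RAi"),     ("KAn", "RAsub")),
    ("R13", ("KAn", "RAsub"),   ("K.RAn",)),
]

def apply_rule(tokens, lhs, rhs):
    """Replace the first occurrence of lhs in tokens by rhs; None if lhs is absent."""
    n = len(lhs)
    for i in range(len(tokens) - n + 1):
        if tuple(tokens[i:i + n]) == lhs:
            return tokens[:i] + list(rhs) + tokens[i + n:]
    return None

def derive(tokens):
    """Apply the rules once each, in order, recording every step that fires."""
    steps = [("start", list(tokens))]
    for name, lhs, rhs in RULES:
        result = apply_rule(tokens, lhs, rhs)
        if result is not None:
            tokens = result
            steps.append((name, list(tokens)))
    return steps

for name, tokens in derive(["KAn", "VIRAMAn", "RAi"]):
    print(f"{name:>5}: {' + '.join(tokens)}")
```

Starting from KAn + VIRAMAn + RAi, the three rules fire in sequence and the last step leaves the single token K.RAn.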

All these rules are explained again at
https://docs.microsoft.com/en-us/typography/script-development/kannada#introduction
and font-osd.lua is based exactly on these rules.
This makes sense, given that, as seen, they are the reference.
The functions for knda are those ending in *_one and, from what I have
seen, once the base syllable is correctly isolated, reorder_one doesn't
implement R6 and R13.
I am quite sure that by spending a bit more time I can isolate the point,
but, as Hans has said, there is a plan to review these scripts.

-- 
luigi

PS:
Take these notes cum grano salis: Devanagari is not exactly the same as
Kannada, and the MS site uses its own terms, which do not always match
those of Unicode.