Hi,

I am trying to prepare a moderately sized document in Malayalam using ConTeXt. Overall I have been successful. However, there are a few rough edges for which I need help. This is the first problem I face.

When I use the font RIT-Rachana from https://rachana.org.in/, my preferred font, one set of conjuncts is not formed correctly, while the rest are formed properly. The conjuncts that are not formed correctly are those where the second component is ര U+0D30, followed by a symbol that is shown on the right side of the conjunct, viz. ാ U+0D3E, ി U+0D3F, ീ U+0D40, ു U+0D41, ൂ U+0D42. If no symbols follow, or symbols follow on the left or on both sides, the conjunct is well formed. In the minimal working example given below, 5 is well formed, while in 6 the first conjunct is well formed and the second is ill formed.

The conjuncts are well formed when used in LibreOffice with the same font. Also, other fonts, for example those from smc.org, don't have this problem. Unfortunately, they don't support all symbols. One of the developers of the font thinks that this is probably a bug in ConTeXt. Hence the reason for posting here.

\definefontfamily [malayalam] [rm] [RIT-Rachana] [features=malayalam-two]
\setupbodyfont [malayalam]
\starttext
1. ശ്രീ
2. അശ്രു
3. ശുശ്രൂഷ
4. പ്രാസം
5. പ്രേയസി
6. പ്രോഗ്രാം
\stoptext

I am working on Debian 11. The font version is the latest, 1.3. The command context --version gives

mtx-context | ConTeXt Process Management 1.04
mtx-context |
mtx-context | main context file: /home/ajith/Downloads/context-linux-64/tex/texmf-context/tex/context/base/mkiv/context.mkiv
mtx-context | current version: 2021.12.25 00:58
mtx-context | main context file: /home/ajith/Downloads/context-linux-64/tex/texmf-context/tex/context/base/mkxl/context.mkxl
mtx-context | current version: 2021.12.25 00:58

Thanks,
ajith
On Sat, 01 Jan 2022 09:48:53 +0530
kauśika
On Friday, December 31, 2021 6:22:15 PM IST Ajith R via ntg-context wrote:
The conjuncts that are not formed correctly are those where the second component is ര U+0D30, followed by a symbol that is shown on the right side of the conjunct viz ാ U+0D3E, ി U+0D3F, ീ U+0D40, ു U+0D41, ൂ U+0D42. If no symbols follow, or symbols follow on the left or on both sides, the conjunct is well formed. In the minimum working example given 5 is well formed, while the first conjunct in 6 is well formed and second ill formed conjunct.
I have been using ConTeXt to typeset documents in several Indic languages and have run into similar issues (in many languages). Please see this for a similar issue in some conjuncts for Devanagari: https://www.mail-archive.com/ntg-context@ntg.nl/msg99691.html
For what it's worth, some fonts have given me no trouble, while the issues persist with others. Some of these issues can be worked around, as I pointed out in the posting above.
In almost all cases I encountered no issues while using Xe(La)TeX. Based on some advice from Hans and on reading about these OTF features and their implementations in Indic fonts, I think these issues might be due to differences in implementation. [Not entirely sure, since I am a novice.] My guess is that HarfBuzz (which is what Xe(La)TeX uses by default) uses some heuristics to work out these conjuncts (?!).
To answer your specific question regarding the conjuncts in the given words: you have to use some Unicode hacking to get what you want in ConTeXt.
In each of the following, ZWS refers to the Unicode character zero-width space (U+200B):
1. ശ്രീ This is typeset correctly by writing ശ്ര + ZWS (U+200B) + ീ
2. അശ്രു Typeset correctly with അശ്ര + ZWS (U+200B) + ു
3. ശുശ്രൂഷ Typeset correctly with ശുശ്ര + ZWS (U+200B) + ൂ + ഷ
4. പ്രാസം Typeset correctly with പ്ര + ZWS (U+200B) + ാ + സം
5. പ്രേയസി (rendered correctly as entered; no hacks necessary)
6. പ്രോഗ്രാം Typeset correctly with പ്രോ + ഗ്ര + ZWS (U+200B) + ാ + ം, where the last character is the Malayalam anusvara.
Consider yet another example: സാന്ദ്രാനന്ദാഅവബൊധാത്മകമ്
Here the 'ന്ദ്രാ' conjunct is not typeset in ConTeXt. To fix this I do ന്ദ്ര + ZWS (U+200B) + ാ
This is what I have been doing to ensure correct typesetting of Malayalam and other Indic languages in ConTeXt. Honestly, it is inconvenient, since the .tex files containing Unicode are no longer sanitary. However, ConTeXt has so many remarkable features that the very thought of having to go back to (Xe)LaTeX (just for HarfBuzz rendering) causes me immense pain. As far as I am concerned, in every other way ConTeXt simply has no match in the (Xe)LaTeX world. In my usage of ConTeXt for my academic work (in English, with lots of mathematics) I have encountered no issues. Even when I did, there was always some legitimate (non-hacky) fix. For me personally, the rendering of Indic languages is the only pain point with ConTeXt.
So I am willing to live with the drawbacks until they are, hopefully, fixed. Anyway, I hope you can use these fixes temporarily. For example, if your editor supports it, you can replace every affected sequence with the corresponding recipe involving ZWS.
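If the editor cannot do this, the same replacement can be scripted. What follows is only a rough sketch in Python, not something tested against ConTeXt output. It assumes that the affected sequences are exactly the ones described in the original report (virama + ര U+0D30 followed by one of the right-side signs U+0D3E to U+0D42), and the names insert_zws and add_zws.py are made up for illustration.

```python
import re
import sys

ZWS = "\u200b"     # ZERO WIDTH SPACE, the character used in the recipes above
VIRAMA = "\u0d4d"  # MALAYALAM SIGN VIRAMA
RA = "\u0d30"      # MALAYALAM LETTER RA

# Assumption: the affected sequences are exactly virama + RA followed by one
# of the right-side vowel signs U+0D3E..U+0D42 listed in the original report.
AFFECTED = re.compile(f"({VIRAMA}{RA})([\u0d3e-\u0d42])")


def insert_zws(text: str) -> str:
    """Insert ZWS between a conjunct-final RA and a right-side vowel sign."""
    return AFFECTED.sub(lambda m: m.group(1) + ZWS + m.group(2), text)


if __name__ == "__main__":
    # Hypothetical filter usage: python3 add_zws.py < input.tex > output.tex
    # (assumes the terminal/locale encoding is UTF-8)
    sys.stdout.write(insert_zws(sys.stdin.read()))
```

Running the text through the filter a second time changes nothing, because a ZWS already sitting between ര and the vowel sign stops the pattern from matching. Left-side and two-sided signs such as േ and ോ are not touched, which matches items 5 and 6 above.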
Dear Hans and other developers of ConTeXt and LuaTeX,

If you happen to see this, please look into the font system (where it concerns Indic scripts). The present issue is very similar to the one I posted about earlier: https://www.mail-archive.com/ntg-context@ntg.nl/msg99691.html

I have described the issue and the hacks to fix it as best I can. If there is any other information that I can provide, please let me know.
Best, kauśika
Hi Kausika,

Your workaround seems to help, though it entails adding the ZWS throughout. Anyway, I agree with you that even with this anomaly, ConTeXt is worth sticking to.

Dear Hans and other developers of ConTeXt and LuaTeX,

If a novice can help in any way to resolve this issue, please guide us on how we can contribute.

Thanks,
ajith