LuaTeX does not mirror characters that have the Bidi_Mirrored property when the text direction is set to RTL (TRT in Aleph). According to http://unicode.org/reports/tr9/#Mirroring, the different types of parentheses that have the Bidi_Mirrored property should be mirrored in RTL mode, but this isn't what I get. Is this a bug, a feature, or am I missing something?

Thanks,
Khaled
--
Khaled Hosny
Arabic localizer and member of Arabeyes.org team
Hello,
> Is this a bug, feature, or am I missing some thing?

You're not missing anything; it's kind of a misfeature, because the Bidi_Mirrored property is not taken into account by ConTeXt (yet). Happily enough, it's one of the things I'm sponsored by Google to implement as part of my Summer of Code project (http://code.google.com/soc/2008/tex/appinfo.html?csaid=8BC22C657B7F0D0E). Of course the project is more general and there is a part on bidirectional behaviour, but I haven't got to it yet.

In this case, I guess the mirroring property should be added to char-def.lua and handled accordingly; contrary to XeTeX, LuaTeX doesn't know it intrinsically, so it has to be dealt with at the ConTeXt level.

Incidentally, you may note that the guillemets are not mirrored (although I observed some inconsistent behaviour with the Geeza font on the Mac).

Arthur
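Editorial sketch: to see which characters carry the Bidi_Mirrored property discussed above, one can query a Unicode database directly. The snippet below uses Python's standard `unicodedata` module, which is an assumption of this note and not something used in the thread. The `MIRROR_PAIRS` table and the `mirror_for_rtl` helper are hypothetical names covering only a handful of ASCII brackets; they do not reflect how char-def.lua or ConTeXt stores or applies the property.

```python
import unicodedata


def is_mirrored(ch: str) -> bool:
    """Return True if ch has the Unicode Bidi_Mirrored property."""
    return unicodedata.mirrored(ch) == 1


# Illustrative subset of mirror pairs (hypothetical table; a complete one
# would be derived from the Unicode data files).
MIRROR_PAIRS = {
    "(": ")", ")": "(",
    "[": "]", "]": "[",
    "<": ">", ">": "<",
    "{": "}", "}": "{",
}


def mirror_for_rtl(text: str) -> str:
    """Swap each mirrored character that has a known counterpart."""
    return "".join(
        MIRROR_PAIRS.get(ch, ch) if is_mirrored(ch) else ch for ch in text
    )


if __name__ == "__main__":
    for ch in "()[]<>a":
        print(f"U+{ord(ch):04X} mirrored={is_mirrored(ch)}")
    print(mirror_for_rtl("(abc) [x]"))
```

The property check only says that a character should be mirrored in a right-to-left run; deciding which runs are right-to-left is a separate problem, which is what the rest of the thread is about.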
Arthur Reutenauer wrote:
> In this case, I guess the mirroring property should be added to char-def.lua and handled accordingly; contrary to XeTeX, LuaTeX doesn't know it intrinsically, so it has to be dealt with at the ConTeXt level.

this morning i added that info to the main table, but i'm still pondering how to use it properly ... in tex there are normally explicit mode switches (in the source of the document) which give ultimate control; there's also the issue of these pardir etc changes that then need to be injected into the node list;

anyhow, i'll look into it, but whatever solution i come up with, it has to be under user control; i don't want hard coded automatisms that are then hard to bypass (especially implicit properties can result in messy situations, unicode math is a candidate for that)

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------
On Mon, Jun 09, 2008 at 05:20:23PM +0200, Hans Hagen wrote:
> anyhow, i'll look into it, but whatever solution i come up with, it has to be under user control; i don't want hard coded automatisms that then are hard to bypass

Giving the user more control is a good idea, as long as the correct behaviour is one of the options.

Regards,
Khaled
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : https://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________
Khaled Hosny wrote:
> Is this a bug, feature, or am I missing some thing?

experimental in the beta

\setcharactermirroring[1]

no high level interface yet, i need to think of how to do such things as efficiently as possible and prevent interference with font features and such (currently it's an attribute handler that pops in quite early)

Hans Hagen
On Mon, Jun 09, 2008 at 11:08:09PM +0200, Hans Hagen wrote:
> experimental in the beta
>
> \setcharactermirroring[1]

It works perfectly with unidirectional text (RTL or LTR), but when mixing bidirectional text, like Arabic text between brackets inside an English line, the closing bracket takes the direction of the embedded text, not that of the main line. See the attached example.

Regards,
Khaled
Hi Khaled,

On Fri, 13 Jun 2008 15:52:27 -0600, Khaled Hosny wrote:
> It does work perfectly with unidirectional texts (RTL or LTR), but when mixing bi-directional text, like Arabic text between brackets inside English line, the closing bracket takes the direction of the embedded text not the main line. See the attached example.

I just worked with this, and made some symmetrical definitions: see the attached modified files.

\setcharactermirroring[0,1] works in both uni- and bi-directional text -- note that each bracket pair is inside its respective directional context.

However, for the FIRST occurrence of a bracket pair in a bidi paragraph it does not behave as expected.

It may be that the second invocation of \setcharactermirroring does not immediately override the last. Maybe we need a \flush or \clearsetcharactermirroring command?

Best wishes
Idris
--
Professor Idris Samawi Hamid, Editor-in-Chief
International Journal of Shi`i Studies
Department of Philosophy
Colorado State University
Fort Collins, CO 80523
On Fri, Jun 13, 2008 at 07:34:03PM -0600, Idris Samawi Hamid wrote:
> I just worked with this, and made some symmetrical definitions: see the attached modified files.
>
> \setcharactermirroring[0,1] works in both uni- and bi-directional text -- note that each bracketpair is inside of its respective directional context.

Thanks for the fix, this makes it more organized indeed.

> However, for the FIRST occurrence of a bracket-pair in a bidi paragraph it does not behave as expected.
>
> It may be that the second invocation of \setcharactermirroring does not immediately override the last. Maybe we need a \flush or \clearsetcharactermirroring command?

I'm new to TeX/ConTeXt; all I can do is confirm that I'm getting the same output as you.

Regards,
Khaled
Idris Samawi Hamid wrote:
> \setcharactermirroring[0,1] works in both uni- and bi-directional text -- note that each bracketpair is inside of its respective directional context.

i wonder what gives you the impression that you can use 0.1 as argument

currently there is no reset command but you can say

\setcharactermirroring[-1]

since -1 resets an attribute

Hans Hagen
On Sun, 15 Jun 2008 14:32:40 -0600, Hans Hagen wrote:
>> \setcharactermirroring[0,1] works in both uni- and bi-directional text -- note that each bracketpair is inside of its respective directional context.
>
> i wonder what gives you the impression that you can use 0.1 as argument

No, this was an abbreviation for "both \setcharactermirroring[0] and \setcharactermirroring[1] work in..." ;-)

Best wishes
Idris
Idris Samawi Hamid wrote:
> No, this was an abbreviation for "both \setcharactermirroring[0] and \setcharactermirroring[1] work in..."

ah ... anyhow, you're lucky that 0 is not implementing something then -)

Hans
On Sun, 15 Jun 2008 14:32:40 -0600, Hans Hagen wrote:
> currently there is no reset command but you can say \setcharactermirroring[-1] since -1 resets an attribute

I tried that, but the result is the same: the first occurrence of an RL open-paren in an LR paragraph is mis-mirrored. Subsequent occurrences are fine.

The enclosed test file can serve as a benchmark for now.

Best
Idris

=============mirroring.tex==============
% engine=luatex

% OpenType features needed for Arabtype font
\definefontfeature
  [arab]
  [mode=node,language=dflt,script=arab,
   init=yes,medi=yes,fina=yes,isol=yes,
   liga=yes,dlig=yes,rlig=yes,clig=yes,
   mark=yes,mkmk=yes,kern=yes,curs=yes]

\font\Arabic = arabtype*arab at 19pt

\def\ArabicParDir{\textdir TRT\pardir TRT}
\def\ArabicTextDir{\textdir TRT}
\def\LatinParDir{\textdir TLT\pardir TLT}
\def\LatinTextDir{\textdir TLT}

\definestartstop
  [arabicpar]
  [commands=%
   {\Arabic
    \ArabicParDir
    \setcharactermirroring[1]%
   }]

\definestartstop
  [arabictext]
  [commands=%
   {\Arabic\setcharactermirroring[1]
    \ArabicTextDir
    %
   }]

\definestartstop
  [latinpar]
  [commands=%
   {\Arabic\LatinParDir
    \setcharactermirroring[-1]%
   }]

\definestartstop
  [latintext]
  [commands={\LatinTextDir
    \setcharactermirroring[-1]%
   }]

\setupwhitespace[big]
\showframe[text]

\starttext
\startarabicpar
سلام (قوس) وقوس <قوس> و [قوس]
وهذا قوس حول نص غير عربي \startlatintext(Latin)\stoplatintext\ ثم عربي \crlf
وهذا قوس حول نص غير عربي \startlatintext(Latin)\stoplatintext\ ثم عربي \crlf
وهذا قوس حول نص غير عربي \startlatintext(Latin)\stoplatintext\ ثم عربي
\stoparabicpar
\blank
\startlatinpar
Peace (paren) \& [paren] \& <paren>
Here is some mixed Arabic {\startarabictext(عربي)\stoparabictext} \ and Latin script. \crlf
Here is some mixed Arabic {\startarabictext(عربي)\stoparabictext} \ and Latin script. \crlf
Here is some mixed Arabic {\startarabictext(عربي)\stoparabictext} \ and Latin script.
As you can see, \LUATEX\ does a very good job mixing LR \startarabictext(يسار-يمين)\stoparabictext \ and RL \startarabictext(يمين-يسار)\stoparabictext \ texts.
\crlf
As you can see, \LUATEX\ does a very good job mixing LR \startarabictext(يسار-يمين)\stoparabictext \ and RL \startarabictext(يمين-يسار)\stoparabictext \ texts. \crlf
As you can see, \LUATEX\ does a very good job mixing LR \startarabictext(يسار-يمين)\stoparabictext \ and RL \startarabictext(يمين-يسار)\stoparabictext \ texts.
% \hfill\break temporary workaround for luatex bug
\LUATEX\ even does a great job breaking Arabic phrases \startarabictext(و هنا جملة منقطعة في وسط قرينة لاتينية)\stoparabictext \ across lines. \hfill\break
\LUATEX\ even does a great job breaking Arabic phrases \startarabictext(و هنا جملة منقطعة في وسط قرينة لاتينية)\stoparabictext \ across lines. \hfill\break
\LUATEX\ even does a great job breaking Arabic phrases \startarabictext(و هنا جملة منقطعة في وسط قرينة لاتينية)\stoparabictext \ across lines.
(bracket) and <bracket> and [bracket]
\stoplatinpar
\stoptext
========================================
On Sun, Jun 15, 2008 at 03:42:30PM -0600, Idris Samawi Hamid wrote:
> I tried that, but the result is the same: first occurrence of an RL open-paren in an LR paragraph is mis-mirrored. Next occurrences are fine.
>
> The enclosed test file can serve as a benchmark for now.

I just updated my minimals installation and this example is rendered correctly now. I think I should try to make a sort of testbed based on this example.

Regards,
Khaled
Idris Samawi Hamid wrote:
> I just worked with this, and made some symmetrical definitions: see the attached modified files.

i need to look into the luatex source to see what kind of direction nodes are possible (i just check for one now) or maybe tac has such a list at hand

Hans
Khaled Hosny wrote:
> It does work perfectly with unidirectional texts (RTL or LTR), but when mixing bi-directional text, like Arabic text between brackets inside English line, the closing bracket takes the direction of the embedded text not the main line. See the attached example.

i need to figure out what the subtypes are of the direction nodes

i suggest that you and idris cook up a set of test files that can serve as benchmark, preferably something organized (arabtest-whatever) and such; eventually these can go into the regression test machinery

Hans
Hi Khaled,

On Sun, 08 Jun 2008 19:21:28 -0600, Khaled Hosny wrote:
> Is this a bug, feature, or am I missing some thing?

In addition to the comments by Arthur and Hans:

This issue needs to be looked at in the larger context of bidi in LuaTeX. For example, we need a flexible model to implement Bidirectional Character Types. We also need an option that will recognize certain character types as RL and switch directions accordingly -- that is, automatically switch \textdir, within a given paragraph, in accordance with the directionality of the word-string. This needs a well thought-out model so that we can map the Unicode specification to our \textdir and \pardir functionality...

So the whole directionality issue needs to be looked at carefully, in addition to temporary patches like

\setcharactermirroring[1]

OTOH: SC Unipad gets this right and is perhaps the ideal to strive for... Indeed, it implements Unicode better than any text application of which I am aware...

Also, we need not be too slavish: The Yudit author has pointed out areas where the bidi algorithm makes no sense or is deficient:

http://www.yudit.org/bidi/surprise.html

So this all needs careful thought. On my list...

Best wishes
Idris
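Editorial sketch: the idea of switching \textdir automatically from the character types can be illustrated outside TeX. The Python function below splits a string into runs labelled TRT or TLT according to the strong bidi type of each character, as reported by the standard `unicodedata` module. It is a deliberate simplification and not the Unicode Bidirectional Algorithm: neutral and weak characters simply stay in the current run, and the function name `direction_runs` is hypothetical.

```python
import unicodedata

STRONG_RTL = {"R", "AL"}


def direction_runs(text):
    """Split text into (direction, substring) runs.

    Only strong types (L, R, AL) decide the direction. Every other
    character joins the current run; leading neutrals join the first
    strong run. Text without any strong character yields direction None.
    """
    runs = []
    current_dir = None
    current = []
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in STRONG_RTL:
            ch_dir = "TRT"
        elif bidi == "L":
            ch_dir = "TLT"
        else:
            ch_dir = current_dir  # neutral or weak: no direction change
        if current_dir is not None and ch_dir != current_dir:
            runs.append((current_dir, "".join(current)))
            current = []
        if ch_dir is not None:
            current_dir = ch_dir
        current.append(ch)
    if current:
        runs.append((current_dir, "".join(current)))
    return runs


if __name__ == "__main__":
    for direction, chunk in direction_runs("abc \u0633\u0644\u0627\u0645 def"):
        print(direction, repr(chunk))
```

Even this toy version shows where the hard decisions are: trailing spaces and brackets end up in whichever run precedes them, which is exactly the kind of case the thread says needs a carefully designed model and user control.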
> Also, we need not be too slavish: The Yudit author has pointed out areas where the bidi algorithm makes no sense or is deficient:

To be honest, this page dates back half a dozen years and reflects Gáspár Sinai's positions at that time; there are hints that he has changed his mind at least slightly, as he started to make quite a scandal about possible "security problems" in the bidi algorithm and later backed off (http://yudit.org/security/). This is of course not to say that the bidi algorithm is perfect, because it's not, but I don't think that the link you quote makes a really strong point against it. In particular, I find that some of the recommendations Sinai criticizes are amazingly close to what we do in the TeX world ("formatting codes should be inserted" => mark up the text with direction-switching commands, etc.)

Arthur
Hi,

On Mon, 09 Jun 2008 19:36:00 -0600, Arthur Reutenauer wrote:
> This is of course not to say that the bidi algorithm is perfect, because it's not, but I don't think that the link you quote makes a really strong point against it. In particular, I find that some of the recommendations Sinai criticizes are amazingly close to what we do in the TeX world ("formatting codes should be inserted" => mark up the text with direction-switching commands, etc.)

Granted. The main point is that we have to reinterpret bidi in a way that fits with TeX's/ConTeXt's needs, idiosyncrasies, etc.... I don't believe we need to treat the Unicode bidi algorithm as canonical. But I'm still studying the matter.

Best
Idris
Idris Samawi Hamid wrote:
> Granted. The main point is that we have to reinterpret bidi in way that fits with TeX's/ConTeXt's needs, idiosyncracies, etc.... I don't believe we need to treat the unicode bidi algorithm as canonical.

i think that we should stick to things that make sense; with tex we're often talking of tagged sources and anything ambiguous should be tagged; in a sense this is not even related to arab at all, take an url ... i can imagine an url-algorithm, but it could never be perfect (just see what some programs that try to do it sometimes make of it)

if for instance ( ) are officially not symbols but open/close thingies, then we need to deal with them (although i then wonder why we have no proper open/close code point for them instead of reusing the ascii () which have for users some expected visual appearance, but in that respect unicode puzzles me on a daily basis)

Hans
> if for instance ( ) are officially not symbols but open/close thingies, then we need to deal with them

Yes, they are. That's what the Bidi_Mirrored property is for.

> (although i then wonder why we have no proper open/close code point for them instead of reusing the ascii ()

If anything, for backward compatibility.

Arthur
Arthur Reutenauer wrote:
>> (although i then wonder why we have no proper open/close code point for them instead of reusing the ascii ()
>
> If anything, for backward compatibility.

sure but originally they were in ascii just ( and ) and not some generic opener and closer

Hans
> sure but originally they were in ascii just ( and ) and not some generic opener and closer

I did not mean ASCII ... in the 8-bit encodings for Arabic that were built on top of ASCII (ISO 8859-6, Windows-1256, etc.) there also was a single pair of parentheses, and I don't think they had a different meaning than in Unicode ... I couldn't find the exact reference (the ISO 8859-6 text sells for 70 Swiss Francs on the ISO site, and the only specification Microsoft provides on its site is the character table -- http://www.microsoft.com/globaldev/reference/sbcs/1256.mspx), but HTML pages I find on the Web tend to indicate that in Arabic text encoded in Windows-1256, the character '(' (byte 0x28) was really used as an opening bracket, and ')' as a closing one.

Maybe people on the list from Arabic-speaking countries, Israel or Iran can tell us more (in particular, what does one type on a standard keyboard to input an opening bracket?).

Arthur
On 10 June 08, at 23:30, Arthur Reutenauer wrote:
> […] Maybe people on the list from Arabic-speaking countries, Israel or Iran can tell us more (in particular, what do one type on a standard keyboard to input an opening bracket?).

Hi Arthur,

Sorry for answering this message so late (this thread is far beyond my knowledge and I read it just out of curiosity…). Since I am Iranian and familiar with Persian and a little bit of Arabic, I wanted to let you know that one types brackets and the like as follows:

--- when writing a Persian text enclosed in parentheses, typing on a Persian keyboard, one types a "right parenthesis", that is U+0029, followed by the text intended to be enclosed in brackets (running right to left), and then a "left parenthesis", that is U+0028. The same applies when the text is enclosed within "guillemets". (For numbers there is no difference from roman script.)

However, on my Mac's Persian or Arabic AZERTY keyboards the mapping is not correct (that is, typing a right parenthesis results in U+0028 in the text one is typing).

Best regards: OK
Hello Otared, Thanks for the answer (I should have known the results depended on the actual configuration :-) Arthur
Hi Arthur,

On Wed, 18 Jun 2008 02:29:00 -0600, Arthur Reutenauer wrote:
> Thanks for the answer (I should have known the results depended on the actual configuration :-)

Get SC Unipad and type identical logical input in Arabic and Farsi modes. The makers of Unipad are Iranian and they implement both Arabic and Farsi rules correctly, have keyboards etc. AFAIK, no other editor implements the bidi algorithm as well. I'll forward you something in this regard.

Best wishes
Idris
On Wed, Jun 18, 2008 at 09:22:58AM -0600, Idris Samawi Hamid wrote:
> Get SC unipad and type identical logical input in Arabic and Farsi modes. The makers of Unipad are Iranian and they implement both Arabic and Farsi rules correctly, have keyboards etc. AFAIK, no other editor implements the bidi alg as well.

I didn't use Unipad that much, but I think Gedit (GTK based) does a perfect job; I'm not aware of any Windows port of it though. I'd like to know what features Unipad has (regarding Arabic) that Gedit lacks.

Regards,
Khaled
On Wed, 18 Jun 2008 09:44:19 -0600, Khaled Hosny wrote:
> I didn't use Unipad that much, but I think Gedit (GTK based) does a perfect job,

No, it does not ;-)

> I'm not aware of any windows port of it though.

andLinux on W32 is wonderful, no more cygwin etc.

> I'd like to know what features in Unipad (regarding Arabic) that Gedit hasn't.

One example:

The following comes out correctly in Unipad, also in Opera's email composer (don't know how it will look in your client). It does not come out right in Windows Notepad or in gedit. Kate does get it right (and other kde apps as well), as well as OOo:

==============
European: 18%, 1--3
Arabic: ١٨٪، ١--٣
Farsi: ۱۸٪، ۱--۳
==============

See attached as well.

Unipad is designed as a complete unicode utility, and has lots of other nifty features as well, especially for editing Arabic script. Its Arabic-script editing features are truly unique and useful. Eg, I select/edit vowels horizontally not vertically.

Unipad has:

Char Info bar:
  character code value (U+0000)
  character name (if assigned)
  character category (letter, decimal digit, etc.)
  character block (script, collection, etc.)

Extended Char Info bar:
  decimal character code value
  encoded byte sequence (octals)
  bidirectional character type
  decimal digit value (only if the character represents a decimal digit)

Unipad is not a good choice for a programmer's or TeX editor, but in overall unicode editing it is unrivalled, although it has not been updated since Unicode 4.1.0 :-(

Best wishes
Idris
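Editorial sketch: one reason the three sample lines can render differently is that the digits do not share the same bidirectional character type. The snippet below, which assumes Python's standard `unicodedata` module (not something used in the thread), lists the bidi class of each character in the "18%" part of the samples. The helper name `bidi_classes` is hypothetical.

```python
import unicodedata


def bidi_classes(text):
    """Return the Unicode bidirectional class of every character."""
    return [unicodedata.bidirectional(ch) for ch in text]


SAMPLES = {
    "European": "18%",
    "Arabic": "\u0661\u0668\u066A",  # Arabic-Indic digits, Arabic percent sign
    "Farsi": "\u06F1\u06F8\u066A",   # Extended Arabic-Indic digits, Arabic percent sign
}

if __name__ == "__main__":
    for label, text in SAMPLES.items():
        print(f"{label}: {bidi_classes(text)}")
```

European and Extended Arabic-Indic (Farsi) digits are classed as European numbers (EN), while Arabic-Indic digits are Arabic numbers (AN); the percent signs are European terminators (ET). An editor that treats all three digit sets alike will therefore get at least one of the lines wrong.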
On Wed, Jun 18, 2008 at 01:50:35PM -0600, Idris Samawi Hamid wrote:
> The following comes out correctly in Unipad, also in Opera's email composer (don't know how it will look in your client). It does not come out right in Windows Notepad or in gedit. Kate does get it right (and other kde apps as well), as well as OOo:
>
> ==============
> European: 18%, 1--3
> Arabic: ١٨٪، ١--٣
> Farsi: ۱۸٪، ۱--۳
> ==============

Oh, it is a shame that even my terminal (mlterm) renders it right! I think I'm going to report this :)

> Unipad is not a good choice for a programmer's or TeX editor, but in overall unicode editing it is unrivalled, although it has not been updated since Unicode 4.1.0 :-(

Fortunately, thanks to WINE, it does run under Linux; even their web site lists WINE as one of the supported platforms.

Regards,
Khaled
Idris Samawi Hamid wrote:
> This issue needs to be looked at in the larger context of bidi in luatex. For example, we need a flexible model to implement Bidirectional Character Types. For example, we need an option that will recognize certain character-types as RL and switch directions accordingly -- that is, automatically switch \textdir, within a given paragraph, in accordance with the directionality of the word-string. This needs a well thought-out model so that we can map the Unicode specification to our \textdir \pardir functionality...

it's not hard to do but indeed we need a clean definition; it may even be that we need to change some of this \*dir stuff; (it's no problem for me to inject dir nodes in the node list); especially messy seems the number stuff

btw is there a reason why in bidi arabic r->l is tagged 'al' and not 'r'?

Hans
Arthur Reutenauer wrote:
>> btw is there a reason why in bidi arabic r->l is tagged 'al' and not 'r'?
>
> Arabic Letter. Basic right-to-left characters have type R.

do you have any idea why arab is treated specially?

Hans Hagen
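Editorial sketch: the question above is not answered in the thread. One place where the Unicode Bidirectional Algorithm (UAX #9) does distinguish AL from R is rule W2: a European number is reclassified as an Arabic number when the nearest preceding strong character is an Arabic letter. The Python snippet below illustrates only that rule; it ignores start-of-run handling, embedding levels and every other rule, and the function names are hypothetical.

```python
import unicodedata


def bidi_classes(text):
    """Return the Unicode bidirectional class of every character."""
    return [unicodedata.bidirectional(ch) for ch in text]


def apply_w2(classes):
    """Apply a simplified form of UAX #9 rule W2 to a list of bidi classes.

    EN becomes AN when the nearest preceding strong type (L, R or AL)
    is AL. Everything else is left unchanged.
    """
    result = []
    last_strong = None
    for cls in classes:
        if cls in ("L", "R", "AL"):
            last_strong = cls
        result.append("AN" if cls == "EN" and last_strong == "AL" else cls)
    return result


if __name__ == "__main__":
    arabic = "\u0627 1"   # ARABIC LETTER ALEF, space, digit one
    hebrew = "\u05D0 1"   # HEBREW LETTER ALEF, space, digit one
    print(apply_w2(bidi_classes(arabic)))
    print(apply_w2(bidi_classes(hebrew)))
```

So the same ASCII digit is treated as an Arabic number after an Arabic letter but stays a European number after a Hebrew letter, which would not be expressible if both scripts shared the single type R.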
participants (5)
- Arthur Reutenauer
- Hans Hagen
- Idris Samawi Hamid
- Khaled Hosny
- Otared Kavian