wrong behaviour with ConTeXt unicode bidi
Hi
The following document shows the wrong behaviour (the second parenthesis is mirrored incorrectly, and it should come after the number, not before it).
Assuming "test" is an RTL word, when I write (test 1363) I expect to get exactly that, but somehow I get (test (1363
One related question: is it possible to change font automatically when luatex sees a LTR word?
Thanks
Vafa Khalighi
On 9/10/2013 2:57 PM, Vafa Khalighi wrote:
Hi
The following document shows the wrong behaviour (the second parenthesis is mirrored incorrectly, and it should come after the number, not before it).
Assuming "test" is an RTL word, when I write (test 1363) I expect to get exactly that, but somehow I get (test (1363
new beta ... also with fix for issue khaled mentioned
\starttext
\setupalign[r2l]
\definefont[arabicfont][Arial*arabic at 20pt]
\enabletrackers[typesetters.directions.one]
\enabletrackers[typesetters.directions.two]
\setupdirections[bidi=global,method=default] \arabicfont این (یک آزمایش 1363) است. \par
\setupdirections[bidi=global,method=one] \arabicfont این (یک آزمایش 1363) است. \par
\setupdirections[bidi=global,method=two] \arabicfont این (یک آزمایش 1363) است. \par
\stoptext
One related question: is it possible to change font automatically when luatex sees a LTR word?
no, but you can define start/stop commands that deal with such switches
also, you can combine fonts (and there is a not yet documented auto script/language switcher .. i have no time now to explain that one)
btw, never use \textdir and \pardir directly (i might even define them as no-ops some day) but use the higher level alignment commands
Hans
-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------
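The two options mentioned above (combining fonts, and start/stop commands) could look roughly like the following. This is only a sketch under assumptions: the font DejaVu Serif, the names latinfallback, ArabicWithLatin, latinfont, \startlatin and \stoplatin, and the chosen Unicode range are all made up for illustration and do not come from this thread. It is not the undocumented automatic script/language switcher.

```tex
% Assumption: DejaVu Serif is installed; any Latin font would do.

% Option 1: combine fonts. The code points from "A" to "z" are taken
% from the fallback font, even when the base font has them (force=yes).
\definefontfallback[latinfallback][name:dejavuserif][0x0041-0x007A][force=yes]
\definefontsynonym[ArabicWithLatin][name:arial][features=arabic,fallbacks=latinfallback]
\definefont[arabicfont][ArabicWithLatin at 20pt]

% Option 2: an explicit (hypothetical) start/stop pair that only switches
% the font; direction handling is left to the bidi mechanism.
\definefont[latinfont][name:dejavuserif at 20pt]
\def\startlatin{\begingroup\latinfont}
\def\stoplatin {\endgroup}

\starttext
\setupalign[r2l]
\setupdirections[bidi=global,method=default]
% Latin letters picked up through the fallback
\arabicfont این test است. \par
% Latin word marked up explicitly
\arabicfont این \startlatin test\stoplatin{} است. \par
\stoptext
```

Option 1 needs no markup in the running text but affects every character in the given range; option 2 keeps the switch visible in the source.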
Thanks. That is now fixed.
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!
maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage : http://www.pragma-ade.nl / http://tex.aanhet.net
archive : http://foundry.supelec.fr/projects/contextrev/
wiki : http://contextgarden.net
___________________________________________________________________________________
Sorry that is not fixed. If you type two of these, the second one will be
broken.
\starttext
\setupalign[r2l]
\definefont[arabicfont][Arial*arabic at 20pt]
\setupdirections[bidi=global,method=default] \arabicfont
این (آزمایش 1363) و
یک (آزمایش 1984) خوب
است و باقی ماجرا.
\stoptext
Vafa Khalighi
I tried the latest beta; it fixes the problem I mentioned but breaks
something else:
\starttext
\setupalign[r2l]
\definefont[arabicfont][Arial*arabic at 20pt]
\setupdirections[bidi=global,method=default]
\hbox dir TRT{\arabicfont (1984)}
\stoptext
With \hbox dir TLT you get the expected result. I am not sure whether this
is a side effect of using \hbox dir TRT.
The itemize environment is broken too (perhaps it is the same issue):
\setupitemize[left=(, right=), margin=4em, stopper=]
\starttext
\setupalign[r2l]
\definefont[arabicfont][Arial*arabic at 20pt]
\setupdirections[bidi=global,method=default]
\arabicfont
\startitemize[a]
\item اولی
\item دومی
\stopitemize
\stoptext
and if I swap the right and left parentheses, I get the following error:
error: .../context/tex/texmf-context/tex/context/base/typo-dha.lua:184: attempt to index local 'current' (a nil value)
On 9/11/2013 1:34 PM, Vafa Khalighi wrote:
The itemize environment is broken too (perhaps it is the same issue):
\setupitemize[left=(, right=), margin=4em, stopper=]
\starttext
\setupalign[r2l]
\definefont[arabicfont][Arial*arabic at 20pt]
\setupdirections[bidi=global,method=default]
\arabicfont
\startitemize[a]
\item اولی
\item دومی
\stopitemize
\stoptext
and if I swap right and left parentheses, I get the following error:
thanks for testing
the problem is that sometimes it's hard to deduce the state so i've added
\righttolefthbox \rtlhbox
\lefttorighthbox \ltrhbox
\righttoleftvbox \rtlvbox
\lefttorightvbox \ltrvbox
\righttoleftvtop \rtlvtop
\lefttorightvtop \ltrvtop
\autodirhbox
\autodirvbox
\autodirvtop
some cases are (using the default parser, which is a one-pass forward scanner) tricky to determine
\hbox{\righttoleft(0001)}\par
\dontleavehmode\hbox{\righttoleft(0002)}\par
{\righttoleft(0003)\par}
{\righttoleft(0004)}\par
\dontleavehmode{\righttoleft(0005)\par}
\dontleavehmode{\righttoleft(0006)}\par
\rtlhbox{(0007)}\par
\ltrhbox{(0008)}\par
\dontleavehmode\rtlhbox{(0009)}\par
\dontleavehmode\ltrhbox{(0010)}\par
\hbox{(0011)}\par
\dontleavehmode\hbox{(0012)}\par
the other parsers do several passes and are slower but can handle some cases better
anyway, it would be nice to see where the three methods fail:
\setuplayout[middle]
\starttext
\setupalign[r2l]
\definefont[arabicfont][Arial*arabic at 20pt]
\enabletrackers[typesetters.directions.default]
\enabletrackers[typesetters.directions.one]
\enabletrackers[typesetters.directions.two]
\setupdirections[bidi=global,method=default]
% \setupdirections[bidi=global,method=one]
% \setupdirections[bidi=global,method=two]
\arabicfont \setupinterlinespace
\hbox{\righttoleft(0001)}\par
\dontleavehmode\hbox{\righttoleft(0002)}\par
{\righttoleft(0003)\par}
{\righttoleft(0004)}\par
\dontleavehmode{\righttoleft(0005)\par}
\dontleavehmode{\righttoleft(0006)}\par
\rtlhbox{(0007)}\par
\ltrhbox{(0008)}\par
\dontleavehmode\rtlhbox{(0009)}\par
\dontleavehmode\ltrhbox{(0010)}\par
\hbox{(0011)}\par
\dontleavehmode\hbox{(0012)}\par
\setupitemize[left=(,right=),distance=1em]
\startitemize[a]
\item اولی
\item دومی
\stopitemize
(1984)
این (آزمایش 1363] و یک (آزمایش 1984] خوب است و باقی ماجرا.
\blank
این (آزمایش (oeps 1363)) و یک (آزمایش 1984) خوب است و باقی ماجرا.
\blank
این (آزمایش (oeps 1363)) و یک
\blank
این (آزمایش [oeps 1363]) و یک
\blank
این (آزمایش [oeps 1363)] و یک
\blank
\stoptext
Hans
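As a sketch of how the new box commands might stand in for the explicit \hbox dir TRT from the earlier report: it assumes a beta that already contains \rtlhbox and \ltrhbox, and whether the parentheses then come out right may still depend on the selected method.

```tex
\starttext
\setupalign[r2l]
\definefont[arabicfont][Arial*arabic at 20pt]
\setupdirections[bidi=global,method=default]
% instead of: \hbox dir TRT{\arabicfont (1984)}
\dontleavehmode\rtlhbox{\arabicfont (1984)}\par
% left-to-right counterpart, for comparison
\dontleavehmode\ltrhbox{\arabicfont (1984)}\par
\stoptext
```

Using the named box commands also follows the advice earlier in the thread to avoid the low-level direction primitives.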
On Wed, Sep 11, 2013 at 02:37:35PM +0200, Hans Hagen wrote:
anyway, it would be nice to see where the three methods fail:
Of the three, method one seems to give correct results on all the given tests (I have yet to test it with my own documents).
I’m skeptical that bidi can be implemented as a one-pass algorithm. There has been a one-pass algorithm called the “Pretty Good Bidi Algorithm”, but it has its limitations (I never tested it myself).
http://web.archive.org/web/20090225171532/http://crl.nmsu.edu/~mleisher/ucda...
Regards,
Khaled
On 9/11/2013 5:24 PM, Khaled Hosny wrote:
I’m skeptical that bidi can be implemented as a one-pass algorithm. There has been a one-pass algorithm called the “Pretty Good Bidi Algorithm”, but it has its limitations (I never tested it myself).
Sure, although it can get close to okay with some backward and forward scanning, but I'm not really in the mood for that now. Anyhow, for the occasional mix of arabic and latin this (default, one-pass) method works ok. For more extreme cases method 'one' will do and method 'two' ... well, it depends on developments in unicode, as this method will be the more configurable one. (And I can probably make a faster implementation of method two when performance matters.)
Hans
There is one more problem with the default method:
\starttext
\setupalign[r2l]
\definefont[arabicfont][Arial*arabic at 20pt]
\setupdirections[bidi=global,method=default]
1.
\stoptext
From a right-to-left perspective it prints "1." (dot after the digit 1) as ".1"
(dot before the digit 1); methods one and two, however, work correctly.
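Going by the observation above, a possible workaround until the default method is fixed is to select method one for such documents. A minimal sketch that only changes the method key of the failing example:

```tex
\starttext
\setupalign[r2l]
\definefont[arabicfont][Arial*arabic at 20pt]
% method=default reportedly prints ".1" here; one and two were reported to work
\setupdirections[bidi=global,method=one]
1.
\stoptext
```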
participants (3)
- Hans Hagen
- Khaled Hosny
- Vafa Khalighi