arab (omega)

Hans Hagen

23 Jun 2006

8:59 p.m.

Hi otp lovers,

i managed to get this number stuff running (i.e. bypass the otp messing up numbers):

i'm uploading a beta

beware, there is no need to set up separators, and no need to reverse numbers!

\def\ArabicUTF
  {\ArabicDirGlobal
   \usefiltersequence[UTFArabic]%
   \switchtobodyfont[omarb]%
   \isolateseparators} % will handle separators

of course you should not expect proper kerning when isolation is used with latin

things like this actually need some advanced control (special otp's for different situations, depending on what one's dealing with, then to be hooked into the code at the right place, which is tricky since it may get lost later on etc etc)

still messed up: math display formulas, strange number reversion and swapping l/r [since omega has bugs with math i'm not sure what is the reason]

maybe duncan/idris/taco have an idea

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------


Idris Samawi Hamid

24 Jun

6:37 p.m.

Hi Hans,

Thank you very much for this; will test in the coming days (I have not updated ConTeXt since Jan 28; hope there are no deadly installation surprises-) )

On Fri, 23 Jun 2006 12:59:59 -0600, Hans Hagen wrote:

...

Hi otp lovers,

Well, `love' is much too strong a word ;-)

...

i managed to get this number stuff running (i.e. bypass the otp messing up numbers):

How does this happen? I know the otp's have it so that numerals are always typeset l-r, is there some nasty side-effect? Hmm just checked: 5792-684 and 5792{}-{}684 produce two different results; the second one is correct. The otp includes the separator as part of the number (unless it is isolated-) I will look at the responsible otp to see if I can fix this at the otp level.
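To make the difference between the two inputs concrete, here is a toy model in Python. It is not the actual uni2cuni.otp. It assumes that "reversed" means each number run is flipped in the character stream so that right-to-left layout shows its digits in reading order; the function names and the `glue` parameter are invented for illustration.

```python
import re

def reverse_number_runs(text, glue=""):
    """Flip every number run in place.

    Characters listed in `glue` (an assumption of this sketch) count as part
    of a number when they sit between digits, so the whole expression is
    flipped as one unit. With no glue, each digit run is flipped on its own.
    """
    if glue:
        pattern = r"\d+(?:[" + re.escape(glue) + r"]\d+)*"
    else:
        pattern = r"\d+"
    return re.sub(pattern, lambda m: m.group(0)[::-1], text)

def visual_order(stream):
    """Model right-to-left layout: the first stream character lands rightmost."""
    return stream[::-1]

# Separator treated as part of the number (the behaviour described above):
glued = visual_order(reverse_number_runs("5792-684", glue="+-."))
# Separator isolated, as with 5792{}-{}684:
isolated = visual_order(reverse_number_runs("5792-684"))

print(glued)     # 5792-684  (the whole expression keeps its left-to-right order)
print(isolated)  # 684-5792  (read right to left: 5792, then -, then 684)
```

In this model the only thing that changes between the two results is whether the hyphen is allowed to join two digit runs into one.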

...

i'm uploading a beta

beware, there is no need to setup separators, and no need to reverse numbers!

\def\ArabicUTF
  {\ArabicDirGlobal
   \usefiltersequence[UTFArabic]%
   \switchtobodyfont[omarb]%
   \isolateseparators} % will handle separators

Ah! you also isolate the separators; should be possible in the otp...

...

of course you should not expect proper kerning when isolation is used with latin

Why is this?

...

things like this actually need some advanced control (special otp's for different situations, depending on what one's dealing with, then to be hooked into the code at the right place, which is tricky since it may get lost later on etc etc)

still messed up: math display formulas, strange number reversion and swapping l/r [since omega has bugs with math i'm not sure what is the reason]

maybe duncan/idris/taco have an idea

I will look at it; need examples etc.

Thnx as always

Best
Idris

--
Professor Idris Samawi Hamid
Department of Philosophy
Colorado State University
Fort Collins, CO 80523

--
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/

Idris Samawi Hamid

7:25 p.m.

I found the problem:

In uni2cuni.otp the number handling does two things:

It DOES isolate non-numerals as separators within a given expression. So placing an Arabic letter between two numbers

5792ر684

processes fine; each individual number gets reversed.

But the otp makes exceptions for the following punctuation:

+ - .

If we get rid of those exceptions the separator problem will go away. But then math will be messed up. The problem is that

+ - .

are ambiguous; sometimes they have a mathematical significance; sometimes a separator significance. We need the exception for math (generally done the usual l-r way) but don't need it for separators (done in the r-l way).

What I could do is define two filter sequences: UTFArabic and UTFArabicMath. Hans, could you do a conditional that calls up one in math mode and the other everywhere else?

This is a stop-gap solution until we replace otp's with something smarter. Taco, Hans, let me know what you think before I work on this.

It is really inefficient to have to define an entire otp stack just to change one otp. There must be a better way to abstract things so we can plug in a given otp without redoing the entire stack.

Best
Idris
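The idea of one filter sequence for text and another for math can be sketched as a small dispatcher. This is a Python toy model, not ConTeXt or OTP code: the names UTFArabic and UTFArabicMath come from the message above, but the mapping of each name to a set of "glue" characters, the `$...$` splitting, and all function names are assumptions made for illustration.

```python
import re

# Assumed behaviour of the two sequences (hypothetical mapping):
# in text, + - . separate numbers; in math they stay inside the expression.
GLUE_BY_SEQUENCE = {
    "UTFArabic": "",
    "UTFArabicMath": "+-.",
}

def number_runs(text, glue):
    """Return the units that would each be handled as one number."""
    if glue:
        pattern = r"\d+(?:[" + re.escape(glue) + r"]\d+)*"
    else:
        pattern = r"\d+"
    return re.findall(pattern, text)

def process(source):
    """Split on inline math and pick a sequence per segment."""
    result = []
    for part in re.split(r"(\$[^$]*\$)", source):
        if not part:
            continue
        in_math = len(part) >= 2 and part.startswith("$") and part.endswith("$")
        sequence = "UTFArabicMath" if in_math else "UTFArabic"
        body = part[1:-1] if in_math else part
        result.append((sequence, number_runs(body, GLUE_BY_SEQUENCE[sequence])))
    return result

for sequence, runs in process("5792-684 $1.2+3$"):
    print(sequence, runs)
```

Under these assumptions the text segment yields two separate numbers and the math segment yields a single expression, which is the distinction the two sequences are meant to capture.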

Taco Hoekwater

25 Jun

10:08 a.m.

Idris Samawi Hamid wrote:

...

But the otp makes exceptions for the following punctuation:

+ - .

If we get rid of those exceptions the separator problem will go away. But then math will be messed up. The problem is that the

My guess is you could just remove the punctuation support from the current OTP. If a user really needs to say "1.2 million", (s)he can just write $1.2$ instead, which is more or less standard TeX practice anyway. It is my understanding that the contents of $$ is unaffected by OTPs (and if it is not, it should be made so. Math is a language on its own).

...

This is a stop-gap solution until we replace otp's with something smarter. Taco, Hans, let me know what you think before I work on this.

TeX will never be smart enough to understand that "see figure 1.2" is fundamentally different from "averaging 1.2 figures per page". Not until it can actually interpret English text, anyway. Greetings, Taco

Idris Samawi Hamid

7:38 p.m.

On Sun, 25 Jun 2006 02:08:37 -0600, Taco Hoekwater wrote:

...

My guess is you could just remove the punctuation support from the current OTP. If a user really needs to say "1.2 million", (s)he can just write $1.2$ instead, which is more or less standard TeX practice anyway.

It is my understanding that the contents of $$ is unaffected by OTPs (and if it is not, it should be made so. Math is a language on its own).

ok, it's done. The new ocp is here:

http://wiki.contextgarden.net/images/2/27/Uni2cuni.zip

and the instructions are here:

http://wiki.contextgarden.net/Aleph_Guide#Installing

Now, in normal text, +, -, and . are treated as separators, not plus sign, minus, and decimal point. If you want math use $$ etc. but then you will get the math font, not omarb (which is standard practice in the Arabic-script world).

I have no idea if this affects Hans' solution (have not upgraded yet); this is all experimental so things may change.

An aside: Classical Arabic is more sensical. Consider the number 3721. In classical Arabic one says, "one and twenty and seven hundred and three thousand", which makes much more sense for a r-l language. So one would write the numeral from r to l and it would look the same. How decimals would be handled in the classical case needs a bit of research.

Best
Idris

Idris Samawi Hamid

9:54 p.m.

Dear gang,

I have redone this. Now there are two filter sequences, one for main text and one for using the arabic-font digits in math mode.

http://wiki.contextgarden.net/Aleph_Guide#Installing

has been updated, get the new uni2cuni.zip. I'll update m-gamma soon so this will be part of the distro if Hans accepts-)

Here is a test file that works here. Note (see the sqrt example) that omarb apparently does not have U+066B, the Arabic decimal point. And the Arabic comma maps to the lm comma in math mode (since omarb maps the arabic comma to U+002C in the omarb virtual font-sigh). So there is further work needed to get this all just right, including recompiling the fonts etc...

Best
Idris

==================================================
% tex=aleph output=dvipdfmx

\usetypescriptfile[type-omg]
\usetypescript[OmegaArab]

\hoffset=0pt

%% Individual Filters

% Input filters (from what you type)
\definefiltersynonym [UTF8] [inutf8]

% Contextual filter
\definefiltersynonym [UniCUni]     [uni2cuni]
\definefiltersynonym [UniCUniMath] [uni2cuni-math]

% Output filters (font mapping)
\definefiltersynonym [CUniArab] [cuni2oar]

%% Filter Sequences
\definefiltersequence [UTFArabic]     [UTF8,UniCUni,CUniArab]
\definefiltersequence [UTFArabicMath] [UTF8,UniCUniMath,CUniArab]

\appendtoks
  \clearocplists
  \usefiltersequence[UTFArabicMath]
\to \everymathematics

% For global Arabic script
\def\ArabicDirGlobal{%
  \pagedir TRT\bodydir TRT\textdir TRT\pardir TRT
}%

\def\ArabicUTF{\ArabicDirGlobal\usefiltersequence[UTFArabic]
  \reversesectionnumberstrue\switchtobodyfont[omarb]}

\ArabicUTF

\starttext

5792-684

$5792-684$

${\tf 5792-684}$

2.5

$\sqrt{\tf 2.5}$

2,5

$\sqrt{\tf 2،5}$

$\sqrt{\tf 2,5}$

\stoptext
==================================================

On Sun, 25 Jun 2006 11:38:37 -0600, Idris Samawi Hamid wrote:

...

On Sun, 25 Jun 2006 02:08:37 -0600, Taco Hoekwater wrote:

...
My guess is you could just remove the punctuation support from the current OTP. If a user really needs to say "1.2 million", (s)he can just write $1.2$ instead, which is more or less standard TeX practice anyway.

It is my understanding that the contents of $$ is unaffected by OTPs (and if it is not, it should be made so. Math is a language on its own).

ok, it's done. The new ocp is here: http://wiki.contextgarden.net/images/2/27/Uni2cuni.zip

and the instructions are here:

http://wiki.contextgarden.net/Aleph_Guide#Installing

Now, in normal text, +, -, and . are treated as separators, not plus sign, minus, and decimal point. If you want math use $$ etc. but then you will get the math font not omarb (which is standard practice in the Arabic-script world).

I have no idea if this affects Hans' solution (have not upgraded yet); this is all experimental so things may change.

An aside: Classical Arabic is more sensical. Consider the number 3721. In classical Arabic one says, "one and twenty and seven hundred and three thousand", which makes much more sense for a r-l language. So one would write the numeral from r to l and it would look the same. How decimals would be handled in the classical case needs a bit of research

Best Idris


Hans Hagen

6:01 p.m.

Idris Samawi Hamid wrote:

...

I found the problem:

In uni2cuni.otp the numbering handling does two things:

It DOES isolate non-numerals as separators within a given expression. So placing an Arabic letter between two numbers

5792ر684

processes fine; each individual number gets reversed.

But the otp makes exceptions for the following punctuation:

+ - .

If we get rid of those exceptions the separator problem will go away. But then math will be messed up. The problem is that the

+ - .

are ambiguous; sometimes they have a mathematical significance; sometimes a separator significance. We need the exception for math (generally done the usual l-r way) but don't need it for separators (done in the r-l way).

isn't it simpler then to disable otp's in math mode?

...

What I could do is define two filter sequences: UTFArabic and UTFArabicMath. Hans, could you do a conditional that calls up one in math mode and the other everywhere else?

\appendtoks <reset otps> <initialize other sequence of them> \to \everymathematics

...

This is a stop-gap solution until we replace otp's with something smarter. Taco, Hans, let me know what you think before I work on this.

It is really inefficient to have to define an entire otp stack just to change one otp. There must be a better way to abstract things so we can plug in a given otp without redoing the entire stack.

i suppose that switching is fast, so using a different stack was what came to my mind first

Hans

Idris Samawi Hamid

8:34 p.m.

On Sun, 25 Jun 2006 10:01:03 -0600, Hans Hagen wrote:

...

isn't it simpler then to disable otp's in math mode?

...
What I could do is define two filter sequences: UTFArabic and UTFArabicMath. Hans, could you do a conditional that calls up one in math mode and the other everywhere else?

\appendtoks <reset otps> <initialize other sequence of them> \to \everymathematics

Sometimes someone may want to use the numerals from omarb in a mathematical context, so this may still be useful. Will work on it...

Best
Idris

Taco Hoekwater

8:37 a.m.

Idris Samawi Hamid wrote:

...

...
of course you should not expect proper kerning when isolation is used with latin

Why is this?

A figure like "1.2" can have kerning between the 1 and the dot, and between the dot and the 2. But when it is coded as "1{.}2", it won't.
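The effect of the braces can be illustrated with a small Python model. It only captures the one idea that kerns (and ligatures) are considered between characters that directly follow each other, and that a brace in between interrupts that; it is not TeX's actual algorithm, it ignores fonts entirely, and the function name is invented.

```python
def adjacent_pairs(source):
    """Return the character pairs with no brace between them.

    Each returned pair is a place where a font's kerning table could
    apply in this simplified model.
    """
    pairs = []
    previous = None
    for ch in source:
        if ch in "{}":
            previous = None  # a group boundary interrupts the character run
            continue
        if previous is not None:
            pairs.append((previous, ch))
        previous = ch
    return pairs

print(adjacent_pairs("1.2"))    # [('1', '.'), ('.', '2')]
print(adjacent_pairs("1{.}2"))  # []
```

With the separator isolated, no pair is left for a kern to attach to, which matches the explanation above.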

...

...
still messed up: math display formulas, strange number reversion and swapping l/r [since omega has bugs with math i'm not sure what is the reason]

maybe duncan/idris/taco have an idea

It looks like the displays in aleph are a bit broken if you mix RL text with LR math. I'll have a closer look. Taco

Hans Hagen

5:57 p.m.

Idris Samawi Hamid wrote:

...

Why is this?

tex itself does things, like with fi not being the same as f{}i [no lig building], and since otp's may insert things as well, abc may differ from a{}b{}c (i must admit that i don't know when an otp stops scanning, i suppose at an unexpandable thingie; taco knows where what matters -)

Hans
