Hi otp lovers,

i managed to get this number stuff running (i.e. bypass the otp messing up numbers): i'm uploading a beta

beware, there is no need to set up separators, and no need to reverse numbers!

\def\ArabicUTF
  {\ArabicDirGlobal
   \usefiltersequence[UTFArabic]%
   \switchtobodyfont[omarb]%
   \isolateseparators} % will handle separators

of course you should not expect proper kerning when isolation is used with latin

things like this actually need some advanced control (special otp's for different situations, depending on what one's dealing with, then to be hooked into the code at the right place, which is tricky since it may get lost later on etc etc)

still messed up: math display formulas, strange number reversion and swapping l/r [since omega has bugs with math i'm not sure what the reason is]

maybe duncan/idris/taco have an idea

Hans
-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------
Hi Hans,
Thank you very much for this; will test in the coming days (I have not
updated ConTeXt since Jan 28; hope there are no deadly installation
surprises-) )
On Fri, 23 Jun 2006 12:59:59 -0600, Hans Hagen wrote:
Hi otp lovers,
Well, `love' is much too strong a word ;-)
i managed to get this number stuff running (i.e. bypass the otp messing up numbers):
How does this happen? I know the otp's have it so that numerals are always typeset l-r; is there some nasty side-effect?

Hmm, just checked: 5792-684 and 5792{}-{}684 produce two different results; the second one is correct. The otp includes the separator as part of the number (unless it is isolated-)

I will look at the responsible otp to see if I can fix this at the otp level.
i'm uploading a beta
beware, there is no need to setup separators, and no need to reverse numbers!
\def\ArabicUTF
  {\ArabicDirGlobal
   \usefiltersequence[UTFArabic]%
   \switchtobodyfont[omarb]%
   \isolateseparators} % will handle separators
Ah! you also isolate the separators; should be possible in the otp...
of course you should not expect proper kerning when isolation is used with latin
Why is this?
things like this actually need some advanced control (special otp's for different situations, depending on what one's dealing with, then to be hooked into the code at the right place, which is tricky since it may get lost later on etc etc)
still messed up: math display formulas, strange number reversion and swapping l/r [since omega has bugs with math i'm not sure what is the reason]
maybe duncan/idris/taco have an idea
I will look at it; need examples etc.

Thnx as always

Best
Idris
--
Professor Idris Samawi Hamid
Department of Philosophy
Colorado State University
Fort Collins, CO 80523
--
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
I found the problem:

In uni2cuni.otp the number handling does two things:

It DOES isolate non-numerals as separators within a given expression. So placing an Arabic letter between two numbers

5792ر684

processes fine; each individual number gets reversed.

But the otp makes exceptions for the following punctuation:

+ - .

If we get rid of those exceptions the separator problem will go away. But then math will be messed up. The problem is that

+ - .

are ambiguous: sometimes they have a mathematical significance, sometimes a separator significance. We need the exception for math (generally done the usual l-r way) but don't need it for separators (done in the r-l way).

What I could do is define two filter sequences: UTFArabic and UTFArabicMath. Hans, could you do a conditional that calls up one in math mode and the other everywhere else?

This is a stop-gap solution until we replace otp's with something smarter. Taco, Hans, let me know what you think before I work on this.

It is really inefficient to have to define an entire otp stack just to change one otp. There must be a better way to abstract things so we can plug in a given otp without redoing the entire stack.

Best
Idris
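
[Editorial sketch, not the actual uni2cuni.otp logic: a toy Python model of the ambiguity described above. It assumes that a right-to-left line is laid out by reversing the order of its tokens while each number keeps its own left-to-right digit order. The names `tokenize` and `visual_order` are hypothetical and exist only for this illustration.]

```python
import re

# The punctuation the thread says is treated as an exception.
AMBIGUOUS = "+-."

def tokenize(text, join_punctuation):
    """Split text into number tokens and single-character tokens.

    With join_punctuation=True, + - . between digits stay inside the
    number (the "math" reading); otherwise they are plain separators.
    """
    if join_punctuation:
        pattern = r"[0-9]+(?:[" + re.escape(AMBIGUOUS) + r"][0-9]+)*|."
    else:
        pattern = r"[0-9]+|."
    return re.findall(pattern, text)

def visual_order(text, join_punctuation):
    """Left-to-right page order of a right-to-left line (toy model)."""
    return "".join(reversed(tokenize(text, join_punctuation)))

print(visual_order("5792-684", True))    # 5792-684  (treated as one number)
print(visual_order("5792-684", False))   # 684-5792  (two numbers, r-l order)
print(visual_order("1.2", False))        # 2.1       (why math needs the exception)
```

In this model the separator reading gives the result reported as correct for 5792{}-{}684, while the same rule applied to a decimal such as 1.2 swaps the digits, which is the conflict between text and math described above.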
Idris Samawi Hamid wrote:
But the otp makes exceptions for the following punctuation:
+ - .
If we get rid of those exceptions the separator problem will go away. But then math will be messed up. The problem is that the
My guess is you could just remove the punctuation support from the current OTP. If a user really needs to say "1.2 million", (s)he can just write $1.2$ instead, which is more or less standard TeX practice anyway. It is my understanding that the contents of $$ is unaffected by OTPs (and if it is not, it should be made so. Math is a language on its own).
This is a stop-gap solution until we replace otp's with something smarter. Taco, Hans, let me know what you think before I work on this.
TeX will never be smart enough to understand that "see figure 1.2" is fundamentally different from "averaging 1.2 figures per page". Not until it can actually interpret English text, anyway.

Greetings, Taco
On Sun, 25 Jun 2006 02:08:37 -0600, Taco Hoekwater wrote:
My guess is you could just remove the punctuation support from the current OTP. If a user really needs to say "1.2 million", (s)he can just write $1.2$ instead, which is more or less standard TeX practice anyway.
It is my understanding that the contents of $$ is unaffected by OTPs (and if it is not, it should be made so. Math is a language on its own).
ok, it's done. The new ocp is here:

http://wiki.contextgarden.net/images/2/27/Uni2cuni.zip

and the instructions are here:

http://wiki.contextgarden.net/Aleph_Guide#Installing

Now, in normal text, +, -, and . are treated as separators, not as plus sign, minus sign, and decimal point. If you want math, use $$ etc., but then you will get the math font, not omarb (which is standard practice in the Arabic-script world).

I have no idea if this affects Hans' solution (have not upgraded yet); this is all experimental so things may change.

An aside: Classical Arabic is more sensical. Consider the number 3721. In classical Arabic one says, "one and twenty and seven hundred and three thousand", which makes much more sense for a r-l language. So one would write the numeral from r to l and it would look the same. How decimals would be handled in the classical case needs a bit of research.

Best
Idris
Dear gang,
I have redone this. Now there are two filter sequences, one for main text
and one for using the arabic-font digits in math mode.
http://wiki.contextgarden.net/Aleph_Guide#Installing
has been updated, get the new uni2cuni.zip. I'll update m-gamma soon so
this will be part of the distro if Hans accepts-)
Here is a test file that works here. Note (see the sqrt example) that
omarb apparently does not have U+066B, the Arabic decimal point. And the
Arabic comma maps to the lm comma in math mode (since omarb maps the
Arabic comma to U+002C in the omarb virtual font, sigh). So there is
further work needed to get this all just right, including recompiling
the fonts etc...
Best
Idris
==================================================
% tex=aleph output=dvipdfmx
\usetypescriptfile[type-omg]
\usetypescript[OmegaArab]
\hoffset=0pt
%% Individual Filters
% Input filters (from what you type)
\definefiltersynonym [UTF8] [inutf8]
% Contextual filter
\definefiltersynonym [UniCUni] [uni2cuni]
\definefiltersynonym [UniCUniMath] [uni2cuni-math]
% Output filters (font mapping)
\definefiltersynonym [CUniArab] [cuni2oar]
%% Filter Sequences
\definefiltersequence
[UTFArabic]
[UTF8,UniCUni,CUniArab]
\definefiltersequence
[UTFArabicMath]
[UTF8,UniCUniMath,CUniArab]
% In math mode: clear the active filters, then load the math sequence
\appendtoks
  \clearocplists
  \usefiltersequence[UTFArabicMath]
\to \everymathematics
% For global Arabic script
\def\ArabicDirGlobal{%
\pagedir TRT\bodydir TRT\textdir TRT\pardir TRT }%
\def\ArabicUTF{\ArabicDirGlobal\usefiltersequence[UTFArabic]
\reversesectionnumberstrue\switchtobodyfont[omarb]}
\ArabicUTF
\starttext
5792-684 $5792-684$ ${\tf 5792-684}$
2.5 $\sqrt{\tf 2.5}$
2,5 $\sqrt{\tf 2،5}$ $\sqrt{\tf 2,5}$
\stoptext
==================================================
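
[Editorial sketch: a toy Python model of what the two filter sequences are meant to achieve on the first test line. It is not the otp code. It assumes that text outside $...$ treats + - . as separators and is laid out right-to-left token by token, and that math content is left in its typed left-to-right order. The function names are hypothetical.]

```python
import re

def text_tokens(text):
    """Digit runs stay whole; every other character is its own token."""
    return re.findall(r"[0-9]+|.", text)

def split_modes(line):
    """Return (is_math, content) pairs; content between $...$ is math."""
    parts = line.split("$")
    return [(index % 2 == 1, part) for index, part in enumerate(parts) if part]

def page_order(line):
    """Toy left-to-right page order of a right-to-left line."""
    rendered = []
    for is_math, content in split_modes(line):
        if is_math:
            # Math: assumed to be set left-to-right exactly as typed.
            rendered.append(content)
        else:
            # Text: + - . are separators, tokens run right-to-left.
            rendered.append("".join(reversed(text_tokens(content))))
    # The segments themselves also follow the right-to-left line direction.
    return "".join(reversed(rendered))

print(page_order("5792-684 $5792-684$"))   # 5792-684 684-5792
```

Under these assumptions the text copy of 5792-684 ends up with 5792 on the right, as expected for a right-to-left range, while the math copy keeps its usual left-to-right form.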
Idris Samawi Hamid wrote:
I found the problem:
In uni2cuni.otp the numbering handling does two things:
It DOES isolate non-numerals as separators within a given expression. So placing an Arabic letter between two numbers
5792ر684
processes fine; each individual number gets reversed.
But the otp makes exceptions for the following punctuation:
+ - .
If we get rid of those exceptions the separator problem will go away. But then math will be messed up. The problem is that the
+ - .
are ambiguous; sometimes they have a mathematical significance; sometimes a separator significance. We need the exception for math (generally done the usual l-r way) but don't need it for separators (done in the r-l way).
isn't it simpler then to disable otp's in math mode?
What I could do is define two filter sequences: UTFArabic and UTFArabicMath. Hans, could you do a conditional that calls up one in math mode and the other everywhere else?
\appendtoks
  < reset otps >
  < initialize other sequence of them >
\to \everymathematics
This is a stop-gap solution until we replace otp's with something smarter. Taco, Hans, let me know what you think before I work on this.
It is really inefficient to have to define an entire otp stack just to change one otp. There must be a better way to abstract things so we can plug in a given otp without redoing the entire stack.
i suppose that switching is fast, so using a different stack was what came to my mind first

Hans
On Sun, 25 Jun 2006 10:01:03 -0600, Hans Hagen wrote:
isn't it simpler then to disable otp's in math mode?
What I could do is define two filter sequences: UTFArabic and UTFArabicMath. Hans, could you do a conditional that calls up one in math mode and the other everywhere else?
\appendtoks
  < reset otps >
  < initialize other sequence of them >
\to \everymathematics
Sometimes someone may want to use the numerals from omarb in a mathematical context, so this may still be useful. Will work on it...

Best
Idris
Idris Samawi Hamid wrote:
of course you should not expect proper kerning when isolation is used with latin
Why is this?
A figure id such as "1.2" can have kerning between the 1 and the dot, and between the dot and the 2. But when it is coded as "1{.}2", it won't.
still messed up: math display formulas, strange number reversion and swapping l/r [since omega has bugs with math i'm not sure what is the reason]
maybe duncan/idris/taco have an idea
It looks like the displays in aleph are a bit broken if you mix RL text with LR math. I'll have a closer look.

Taco
Idris Samawi Hamid wrote:
Why is this?
tex itself does things, like with fi not being the same as f{}i [no lig building], and since otp's may insert things as well, abc may be different from a{}b{}c (i must admit that i don't know when an otp stops scanning; i suppose at an unexpandable thingie; taco knows where what matters -)

Hans