Re: [NTG-context] Arabic-utf-8 (plus a sample)

5 Jun 2004

On Sat, 05 Jun 2004 22:41:39 +0200, Thomas A. Schmitz wrote:
...
Idris,
I know a bit of perl and would love to help. However, I fear that
sending us your stuff via mail will be a bit difficult because the utf-8
characters get transformed into gibberish.

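Thnx 4 such a speedy reply! I don't think you are getting gibberish 
though; you should be getting the extended ASCII representation, i.e. the 
UTF-8 bytes displayed one 8-bit character at a time. So the letter alif 
(hex 0627, which is the two bytes D8 A7 in UTF-8) should look like this: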

Ø§

Do you get a forward-slashed circle and a section symbol? If so, that's 
the extended ASCII representation I'm trying to convert to the letter `A'.
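
A minimal sketch of where those two symbols come from, assuming the editor 
reads the UTF-8 bytes as Windows-1252 (an assumption, not something stated 
above, although it matches the symbols in the list below). The helper name 
as_seen_in_cp1252 is made up for illustration; Encode is a core module in 
Perl 5.8 and later.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: show what a Unicode string looks like when its
# UTF-8 bytes are misread as Windows-1252 (assumed editor behaviour).
sub as_seen_in_cp1252 {
    my ($text) = @_;
    my $bytes = encode('UTF-8', $text);   # alif U+0627 becomes 0xD8 0xA7
    return decode('cp1252', $bytes);      # each byte is shown as one character
}

my $alif = "\x{0627}";
my $seen = as_seen_in_cp1252($alif);

# Print the code points of what the editor would display:
# U+00D8 is the slashed O, U+00A7 is the section sign.
printf "U+%04X\n", ord($_) for split //, $seen;
```

Run as is, this prints U+00D8 and U+00A7, one per line.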

Here are the codes you want:

Ø§ [0627] => A

Ø¨ [0628] => b

Ø¬ [062C] => j

Ø¯ [062F] => d

Ù‡ [0647] => h

Ùˆ [0648] => w

Ø² [0632] => z

Let me explain my situation more clearly:-)

I have a Unicode editor, Unitype Global Writer. I save a Unicode document 
as a UTF-8 *.txt file. When I open that saved file in my TeX editor 
(WinEdt), it comes out as extended ASCII (that's the "gibberish"). So what 
I wanted to do was convert the ASCII "gibberish" to my Latin 
transcription. It seems that what you are suggesting is to use the hex 
code points instead, converting the Unicode txt file directly into a Latin 
transcription file and bypassing the gibberish.
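
A minimal sketch of that direct route, using only the seven code points 
listed above. The hash %latin_for and the helper transliterate are 
hypothetical names, and the input is assumed to be already decoded into 
Perl characters (not raw UTF-8 bytes).

```perl
use strict;
use warnings;

# Mapping taken from the list above: Unicode code point => Latin letter.
my %latin_for = (
    "\x{0627}" => 'A',   # alif
    "\x{0628}" => 'b',
    "\x{062C}" => 'j',
    "\x{062F}" => 'd',
    "\x{0647}" => 'h',
    "\x{0648}" => 'w',
    "\x{0632}" => 'z',
);

# Hypothetical helper: replace every mapped character and leave
# everything else (spaces, Latin text, unmapped letters) untouched.
sub transliterate {
    my ($text) = @_;
    my $class = join '', keys %latin_for;      # the seven Arabic letters
    $text =~ s/([$class])/$latin_for{$1}/g;    # look each match up in the hash
    return $text;
}

print transliterate("\x{0627}\x{0628}\x{062C}\x{062F}"), "\n";   # prints "Abjd"
```

Keeping the table in one hash means a new letter is a one-line addition, 
rather than another s/// statement.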

As for your Perl file, can you give me an example of how to use it? I 
tried it (in Windows, with the script named utf2tex.pl and the Unicode 
text in unicode-utf.txt) and got

=========================
...
perl utf2tex.pl unicode-utf.txt
Unknown discipline class ':utf8' at C:/Perl/lib/open.pm line 18.
BEGIN failed--compilation aborted at utf2tex.pl line 4.
=========================
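
The message suggests that the installed open.pm does not recognize the 
:utf8 layer, which would be consistent with a Perl older than 5.8 (an 
inference from the wording, not confirmed here). A minimal sketch of a 
reader that names the encoding on the open() call itself, assuming Perl 
5.8 or later; read_utf8_lines is a hypothetical helper, not part of the 
script under discussion.

```perl
use strict;
use warnings;

# Hypothetical helper: read a UTF-8 text file and return its lines decoded
# into Perl characters. The layer is given on this one open() call rather
# than through the `open` pragma.
sub read_utf8_lines {
    my ($path) = @_;
    open(my $in, '<:encoding(UTF-8)', $path)
        or die "Cannot open $path: $!";
    my @lines = <$in>;
    close($in);
    return @lines;
}

# Usage: perl thisscript.pl unicode-utf.txt
if (@ARGV) {
    binmode(STDOUT, ':encoding(UTF-8)');   # unmapped letters are written as UTF-8
    for my $line (read_utf8_lines($ARGV[0])) {
        $line =~ s/\x{0627}/A/g;           # alif to `A'
        print $line;
    }
}
```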

From your script I tried, e.g.

============================
$_ =~ s/\x{0627}/\x{0041}/g;
# from alif to `A' (no /e flag: the replacement is a plain string,
# not Perl code to evaluate)
============================

Your guidance will be greatly appreciated!

Thnx a million!
Idris

-- 
Professor Idris Samawi Hamid
Department of Philosophy
Colorado State University
Fort Collins, CO 80523