[NTG-context] Arabic-utf-8 (plus a sample)
Thomas A. Schmitz
Sat Jun 5 22:48:18 CEST 2004
Just a quick reply (it's bedtime over here): there may be 2 problems. 1
is that the mail program put in an unwanted linebreak after the =~
part, just remove it; it should all be one line. And then: you'll need a
fairly recent version of perl for it to work, what do you get when you
I guess for utf to work, it should be at least 5.8.0. Your basic idea of
the usage is right (I'm not a windows person, but I assume it should be
the same): save the script as utf2tex.pl, make it executable and call it
as utf2tex.pl FILENAME.txt.
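
To make the usage concrete, here is a rough sketch of what such a filter could look like. It is not the script from this thread: the sub name utf2latin is made up for illustration, and the table holds only the seven letters from your list. The "use 5.008" line is there because the :utf8 layer needs perl 5.8.0 or later; an older perl would stop right there with a version message.

```perl
#!/usr/bin/perl
# Hypothetical sketch of a utf2tex.pl-style filter; not the script from this thread.
use 5.008;            # PerlIO layers such as :utf8 need perl 5.8.0 or later
use strict;
use warnings;

# Arabic letter => Latin transcription (only the seven letters listed in this thread)
my %latin = (
    "\x{0627}" => 'A',   # alif
    "\x{0628}" => 'b',   # ba
    "\x{062C}" => 'j',   # jim
    "\x{062F}" => 'd',   # dal
    "\x{0647}" => 'h',   # ha
    "\x{0648}" => 'w',   # waw
    "\x{0632}" => 'z',   # zay
);

# Replace every mapped character; leave everything else untouched.
sub utf2latin {
    my ($text) = @_;
    $text =~ s/(.)/exists $latin{$1} ? $latin{$1} : $1/ges;
    return $text;
}

# Act as a filter only when run as a script: utf2tex.pl FILENAME.txt
unless (caller) {
    binmode STDOUT, ':utf8';
    for my $file (@ARGV) {
        open my $in, '<:utf8', $file or die "Cannot open $file: $!";
        print utf2latin($_) while <$in>;
        close $in;
    }
}
```

Called as "perl utf2tex.pl unicode-utf.txt > latin.txt", this would write the transcription to latin.txt. Reading the file through the :utf8 layer is what lets the table use the hex code points directly instead of the two-byte "gibberish".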
I guess it would be easiest to convert the utf to ascii directly - that
would mean you could later convert it back. I have a set of scripts that
do just that -- convert babel Greek into utf-8 and back.
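
A direct mapping can be converted back because it is one-to-one: reversing the table gives the inverse. A small sketch of that idea, with a made-up sub name (convert) and a shortened three-letter table; these are not my Greek scripts:

```perl
use strict;
use warnings;

# Hypothetical, shortened table: Arabic code point => Latin letter
my %to_latin = (
    "\x{0627}" => 'A',   # alif
    "\x{0628}" => 'b',   # ba
    "\x{062C}" => 'j',   # jim
);

# The mapping is one-to-one, so swapping keys and values gives the way back.
my %to_arabic = reverse %to_latin;

# Map each character through the given table; unmapped characters pass through.
sub convert {
    my ($text, $table) = @_;
    return join '', map { exists $table->{$_} ? $table->{$_} : $_ } split //, $text;
}

my $arabic = "\x{0627}\x{0628}\x{062C}";
my $latin  = convert($arabic, \%to_latin);     # Latin transcription
my $back   = convert($latin,  \%to_arabic);    # original text again
```

The round trip is only safe if the source text does not already contain the Latin letters used in the table; otherwise those would be turned into Arabic on the way back.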
If you need more help, I'll look into it tomorrow!
On Sat, 2004-06-05 at 23:33, Idris Samawi Hamid wrote:
> On Sat, 05 Jun 2004 22:41:39 +0200, Thomas A. Schmitz
> <email@example.com> wrote:
> > Idris,
> > I know a bit of perl and would love to help. However, I fear that
> > sending us your stuff via mail will be a bit difficult because the utf-8
> > characters get transformed into gibberish.
> Thnx 4 such a speedy reply! I don't think you are getting gibberish
> though; you should be getting the extended ascii representation. So the
> letter alif (hex 0627) should look like this: Ø§
> Do you get a forward-slashed circle and a section symbol? If so, that's
> the ascii representation I'm trying to convert to the letter `A'.
> Here are the codes you want:
> Ø§ [0627] => A
> Ø¨ [0628] => b
> Ø¬ [062C] => j
> Ø¯ [062F] => d
> Ù‡ [0647] => h
> Ùˆ [0648] => w
> Ø² [0632] => z
> Let me explain my situation more clearly:-)
> I have a unicode editor, Unitype Global Writer. I save a unicode document
> as a utf *.txt file. When I open that saved file in my TeX editor
> (WinEdt), it comes out as extended ascii (that's the "gibberish"). So what
> I wanted to do was convert the ascii "gibberish" to my Latin
> transcription. It seems that what you are suggesting is to use the hex
> representation and convert the unicode txt file into a Latin transcription
> file directly and bypass the gibberish.
> On your perl file, can you give me an example of how to use it? I tried
> (in Windows, with name utf2tex.pl and unicode text in unicode-utf.txt)
> and get
> > perl utf2tex.pl unicode-utf.txt
> Unknown discipline class ':utf8' at C:/Perl/lib/open.pm line 18.
> BEGIN failed--compilation aborted at utf2tex.pl line 4.
> from your script I tried, e.g.
> $_ =~
> # from alif to `A'
> Your guidance will be greatly appreciated!
> Thnx a million!