On Sat, 05 Jun 2004 22:41:39 +0200, Thomas A. Schmitz wrote:
Idris,
I know a bit of Perl and would love to help. However, I fear that sending us your stuff via mail will be a bit difficult because the UTF-8 characters get transformed into gibberish.
Thnx 4 such a speedy reply! I don't think you are getting gibberish though; you should be getting the extended ASCII representation. So the letter alif (hex 0627) should look like this:

ا

Do you get a forward-slashed circle and a section symbol? If so, that's the ASCII representation I'm trying to convert to the letter `A'. Here are the codes you want:

ا [0627] => A
ب [0628] => b
ج [062C] => j
د [062F] => d
Ù [0647] => h
Ù [0648] => w
ز [0632] => z

Let me explain my situation more clearly :-) I have a Unicode editor, Unitype Global Writer. I save a Unicode document as a UTF-8 *.txt file. When I open that saved file in my TeX editor (WinEdt), it comes out as extended ASCII (that's the "gibberish"). So what I wanted to do was convert the ASCII "gibberish" to my Latin transcription. It seems that what you are suggesting is to use the hex representation, convert the Unicode txt file into a Latin transcription file directly, and bypass the gibberish.

On your Perl file, can you give me an example of how to use it? I tried (in Windows, with the script named utf2tex.pl and the Unicode text in unicode-utf.txt) and get
=========================
perl utf2tex.pl unicode-utf.txt
Unknown discipline class ':utf8' at C:/Perl/lib/open.pm line 18.
BEGIN failed--compilation aborted at utf2tex.pl line 4.
=========================
From your script I tried, e.g.
============================
$_ =~ s/\x{0627}/\x{0041}/esg; # from alif to `A'
============================
Your guidance will be greatly appreciated! Thnx a million!

Idris
--
Professor Idris Samawi Hamid
Department of Philosophy
Colorado State University
Fort Collins, CO 80523
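
P.S. To pin down the conversion I am after, here is a rough sketch that works on the raw bytes of the file rather than on Unicode characters, so it does not need the ':utf8' layer at all. It assumes the saved file really is UTF-8, in which case each of the seven letters above is stored as two bytes (alif as D8 A7, which is exactly the slashed circle plus section symbol). The helper name transliterate and the sample string are made up for illustration; only the seven mappings come from the list above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Keys are the two-byte UTF-8 encodings of the Arabic letters,
# values are the Latin transcriptions from the list in this message.
my %translit = (
    "\xD8\xA7" => 'A',   # alif, U+0627
    "\xD8\xA8" => 'b',   # ba,   U+0628
    "\xD8\xAC" => 'j',   # jim,  U+062C
    "\xD8\xAF" => 'd',   # dal,  U+062F
    "\xD9\x87" => 'h',   # ha,   U+0647
    "\xD9\x88" => 'w',   # waw,  U+0648
    "\xD8\xB2" => 'z',   # zay,  U+0632
);

# Hypothetical helper: replace every known two-byte sequence with its
# Latin letter and leave everything else (including unmapped letters) alone.
sub transliterate {
    my ($bytes) = @_;
    # Only lead bytes D8 and D9 occur in the table, so only those are examined.
    $bytes =~ s/([\xD8\xD9][\x80-\xBF])/exists $translit{$1} ? $translit{$1} : $1/ge;
    return $bytes;
}

# Sample input: alif, ba, a space, jim (as raw UTF-8 bytes).
my $sample = "\xD8\xA7\xD8\xA8 \xD8\xAC";
print transliterate($sample), "\n";   # prints "Ab j"
```

For a whole file, the same helper could be called on each line read from <> and the result printed, which would give the Latin transcription file directly.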