Hi Duncan,

Duncan Hothersall wrote:
Hi all.
I have ConTeXt set up to output Chinese using \usemodule[chinese]; all fonts, encodings and maps are installed, and the sample file works well.
Now I have a whole load of Chinese text in utf-8 encoding. Can ConTeXt process this, or do I have to convert it to another encoding? I tried \enableregime[utf] and \useencoding[uc] but it just produced black blobs instead of Chinese characters.
I hope ConTeXt can do it? :-)
Thanks,
Duncan
I prepared a small Perl script that converts Chinese UTF-8 encoded TeX files to GBK-encoded TeX files. I call it right before using texexec.pl to create a PDF from the resulting TeX file. It has the advantage that you can use both simplified and traditional characters in one file, provided you have font files with full GBK coverage (all the Chinese ht*.ttf files). You can easily see all Chinese characters on screen with any Unicode-enabled editor (e.g. SciTE).

Here you are:

utf82gbk.pl
-----------------------------
#!/usr/bin/perl -w

use strict;
use utf8;
use Encode::HanConvert;

our ($filename, $recoded);

defined $ARGV[0] or die "usage: utf82gbk.pl filename.tex\n";

$filename = $ARGV[0];
$filename =~ s/\.tex$//io;

undef $/;    # slurp mode: read each file in one go

# Pass 1: read the UTF-8 source and write it out GBK-encoded.
if (open(INP, "<:utf8", "$filename.tex")) {
    print "processing file $filename.tex\n";
    $_ = <INP>;
    close(INP);
    simp_to_gb($_);
    use bytes;
    if (open(OUT, ">", "$filename-gbk.tex")) {
        print OUT $_;
        close(OUT);
    }
} else {
    print "invalid filename\n";
}

if (-e "$filename-gbk.tex") { print "created file $filename-gbk.tex\n" }

# Replace a two-byte glyph by \uc{lead}{trail} when its second byte is
# ASCII but not a letter or digit (for example \, { or }).
sub unirecode {
    my ($lead, $trail) = @_;
    if ((ord($trail) < 0x80) && ($trail !~ /[a-zA-Z0-9]/)) {
        print "$trail";
        ++$recoded;
        return "\\uc\{" . ord($lead) . "\}\{" . ord($trail) . "\}";
    } else {
        return "$lead$trail";
    }
}

# Pass 2: recode the glyphs in the GBK file that would confuse TeX.
if (open(INP, "<", "$filename-gbk.tex")) {
    $recoded = 0;
    print "processing file $filename-gbk.tex ";
    $_ = <INP>;
    close(INP);
    s/([\x80-\xFF])(.)/unirecode($1,$2)/mgoe;
    # write to a temporary file first, then replace the GBK file with it
    if (($recoded) && (open(OUT, ">", "$filename-gbk.tmp"))) {
        print OUT $_;
        close(OUT);
        unlink "$filename-gbk.tex";
        rename "$filename-gbk.tmp", "$filename-gbk.tex";
    }
    if ($recoded) {
        print " - $recoded glyphs recoded\n";
    } else {
        print "- no glyphs recoded\n";
    }
} else {
    print "invalid filename\n";
}
-----------------------------

usage:

utf82gbk.pl filename.tex
texexec filename-gbk.tex

It's a combination of Hans Hagen's tex2uc.pl, which converts codes that include TeX-related characters (\, {, } ...) into \unicodeglyph commands, and a simple UTF-8 to GBK converter. It needs the module Encode::HanConvert.
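To show what the second pass of the script does, here is a minimal standalone sketch of the same substitution. The function names recode_pair and recode_gbk and the sample bytes are made up for this illustration (the byte pair 0x81 0x5C is constructed by hand, not taken from a real file); only the rule itself comes from the script: a two-byte glyph whose second byte is ASCII but not a letter or digit is replaced by \uc{lead}{trail}.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: decide what to do with one lead/trail byte pair.
# If the trail byte is ASCII and not alphanumeric (e.g. \, {, }),
# TeX would misread it, so emit \uc{lead}{trail} with decimal byte values.
sub recode_pair {
    my ($lead, $trail) = @_;
    if (ord($trail) < 0x80 && $trail !~ /[a-zA-Z0-9]/) {
        return '\\uc{' . ord($lead) . '}{' . ord($trail) . '}';
    }
    return $lead . $trail;
}

# Hypothetical wrapper: apply the rule to every pair that starts
# with a byte in the range 0x80-0xFF.
sub recode_gbk {
    my ($bytes) = @_;
    $bytes =~ s/([\x80-\xFF])(.)/recode_pair($1, $2)/ge;
    return $bytes;
}

# Constructed sample: lead byte 0x81 followed by 0x5C (a backslash).
print recode_gbk("\x81\x5C"), "\n";    # prints \uc{129}{92}
```

A pair such as 0xD6 0xD0, where both bytes are above 0x7F, passes through unchanged, so most of the text stays as plain GBK bytes and only the troublesome glyphs are rewritten.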
I created two new menu entries in my SciTE editor: "Create gbk texfile", which creates filename-gbk.tex, and "Process gbk texfile", which runs texexec on this new file. It works very well for me. I hope this helps a bit until pdfTeX can handle Unicode.

Greetings from Potsdam, Germany
Lutz

P.S. Excuse my bad English.