On Sat, Nov 30, 2002 at 09:15:45PM +0100, Gour wrote:
Simon Pepping (spepping@scaprea.hobby.nl) wrote:
I would like to know that too :-) I have not yet found the time to find out how ConTeXt deals with encodings. I only have a note that says one should use \useXMLfilter [utf], and that I should have a look at xtag-utf (which is input by that command) or the enco files.
As far as I can see ConTeXt does not understand utf-8 encoding.
Where did you find this note mentioning utf?
On my computer :-) I collected remarks made on this list in that document.
Some time ago I saw a post on the DocBook list from Sebastian Rahtz, who is considering rewriting PassiveTeX with ConTeXt support instead of LaTeX.
That would be very good; much better than just doing DocBook. Sometimes I think I would do better to spend my time on such an effort, but I am afraid it is a huge task.
The question remains: how to do it with a multilingual document encoded in utf-8?
Any hint?
As is so often the case in open source: do it yourself. Hans has not taken part in this discussion, so I think he does not feel like embarking on an effort in this area.
The basic mechanism to make TeX work with encodings is to declare all characters above 127 active, and map them to a suitable control sequence. But that only works with single-byte encodings.
xmltex, David Carlisle's XML parser in TeX, which is used by PassiveTeX, can swallow and interpret utf-8 encoding. I think he applies the utf-8 rules to the sequences of single bytes. It should be easy to transfer this to ConTeXt, because it should not depend on the macro package.
The other options are to use an input filter, like the program that was mentioned in this thread, or to use NTS, the Java-based TeX implementation. Currently NTS does not deal with multibyte encodings because it is artificially restricted to 256 characters (if I remember correctly) and because there are no input encoding macro packages for higher character codes.
Sebastian's PassiveTeX has long mapping tables from Unicode to LaTeX control sequences. These can be translated to ConTeXt. (And they could be made to work with NTS.)
While I am writing this, I am beginning to think that copying xmltex's algorithm to ConTeXt is the best way to go.
Regards, Simon
-- 
Simon Pepping
email: spepping@scaprea.hobby.nl
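To make the idea of "applying the utf-8 rules to the sequences of single bytes" concrete, here is a minimal Python sketch of an input filter along those lines. It is not xmltex's code and not anything ConTeXt ships. The function names and the tiny mapping table are hypothetical and only for illustration; a real table, like the PassiveTeX ones, would be far longer. The decoder is simplified: it does not reject every malformed sequence (overlong 3- and 4-byte forms and surrogates pass through).

```python
# Hypothetical two-entry table from Unicode code points to TeX control
# sequences (plain TeX accent commands). Illustration only.
CONTROL_SEQUENCES = {
    0xE9: r"\'e",   # e with acute accent
    0xFC: r'\"u',   # u with diaeresis
}


def decode_utf8(data: bytes) -> list:
    """Apply the utf-8 rules to single bytes, returning code points."""
    code_points = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:                 # 0xxxxxxx: ASCII, one byte
            needed, value = 0, lead
        elif 0xC2 <= lead <= 0xDF:      # 110xxxxx: two-byte sequence
            needed, value = 1, lead & 0x1F
        elif 0xE0 <= lead <= 0xEF:      # 1110xxxx: three-byte sequence
            needed, value = 2, lead & 0x0F
        elif 0xF0 <= lead <= 0xF4:      # 11110xxx: four-byte sequence
            needed, value = 3, lead & 0x07
        else:
            raise ValueError("invalid lead byte at offset %d" % i)
        if i + needed >= len(data):
            raise ValueError("truncated sequence at offset %d" % i)
        for k in range(1, needed + 1):
            byte = data[i + k]
            if byte & 0xC0 != 0x80:     # continuation bytes are 10xxxxxx
                raise ValueError("bad continuation byte at offset %d" % (i + k))
            value = (value << 6) | (byte & 0x3F)
        code_points.append(value)
        i += needed + 1
    return code_points


def to_tex(data: bytes) -> str:
    """Replace each non-ASCII character by its mapped control sequence."""
    out = []
    for cp in decode_utf8(data):
        if cp < 0x80:
            out.append(chr(cp))
        else:
            # Raises KeyError for characters missing from the table.
            out.append(CONTROL_SEQUENCES[cp])
    return "".join(out)
```

Under these assumptions, to_tex(b"caf\xc3\xa9") gives caf\'e. In TeX itself the same rules would have to be implemented with active lead bytes that read their continuation bytes as arguments, which is where the work of porting would be.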