Some time ago there was a discussion about extending support for
different regimes in ConTeXt. The list of (to-be-)supported regimes
probably depends strongly on the implementation (ruby+iconv?). I
collected a preliminary list of candidate regimes and possible synonyms
(some synonyms are listed there for backward compatibility and have to
remain there), leaving out most of the eastern encodings (not because they
shouldn't be on the list, but because I'm completely ignorant about them).
Hans suggested posting this to the mailing list first to get some useful
comments and suggestions.
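To make the synonym question a bit more concrete, here is a minimal sketch
(in Python, purely for illustration) of how a regime name and its synonyms
could be folded onto one canonical name. The table entries and the names
REGIME_SYNONYMS, build_lookup and canonical_regime are my own assumptions;
this is not how ConTeXt does or will do it, and the entries are not the
proposed final list.

```python
# Hypothetical synonym table: canonical regime name -> accepted synonyms.
# The entries are examples only, not the final list.
REGIME_SYNONYMS = {
    "iso-8859-2": ("latin2", "l2"),
    "cp1250": ("windows-1250",),
    "cp866": ("ibm866",),
}


def _fold(name):
    """Lower-case and drop separators, so 'ISO_8859-2' matches 'iso-8859-2'."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def build_lookup(table):
    """Build a folded-name -> canonical-name map and reject ambiguous synonyms."""
    lookup = {}
    for canonical, synonyms in table.items():
        for name in (canonical, *synonyms):
            key = _fold(name)
            if key in lookup and lookup[key] != canonical:
                raise ValueError(f"synonym {name!r} is ambiguous")
            lookup[key] = canonical
    return lookup


LOOKUP = build_lookup(REGIME_SYNONYMS)


def canonical_regime(name):
    """Return the canonical regime for a name or synonym, or raise ValueError."""
    try:
        return LOOKUP[_fold(name)]
    except KeyError:
        raise ValueError(f"unknown regime: {name!r}") from None
```

With a table like this, a synonym kept for backward compatibility costs one
extra entry, and a synonym that would point to two different regimes is
caught when the table is built rather than at use.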
The following question should probably go in a separate thread, but the
topic is closely related. In July 2006 Ljubljana will host people from
around 85 countries. One of the very ambitious organizers has been
dreaming for a couple of years now of printing the participants' names
(on honourable mentions, for example) in both Latin transcription
and in their original script (on the assumption that the names
are properly entered in a UTF-8 database). This is probably not possible
for every single obscure language, but in general, does it sound like:
a) Good luck (I wouldn't want to be in your place)!
b) Take a good (commercial) program
c) If you're ready to invest the rest of your time (forget about
hobbies!), it's probably doable in LaTeX or ConTeXt by then
č) Forget about TeX - it will be possible to solve this problem one day
with Unicode & one of the new TeX engines. But until then, it's not
worth the effort, because any effort you invest will become obsolete
in a couple of years.
To be honest, even some of the people who will translate the materials into
their native languages will probably do that with paper, pencil & scanner.
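Before choosing between a) and č) it would help to know which scripts
actually occur in the database. A rough sketch of such an inventory, in
Python and under the assumption that the names are available as UTF-8
strings, could look like this. The function names are made up, and the
first word of the Unicode character name is only a heuristic for the
script, not the real Unicode Script property.

```python
import unicodedata
from collections import Counter


def scripts_in(name):
    """Return rough script labels ('LATIN', 'CYRILLIC', ...) found in a name.

    The label is the first word of each letter's Unicode character name.
    This is a heuristic, not the Unicode Script property.
    """
    found = set()
    for ch in name:
        if not ch.isalpha():
            continue  # skip spaces, hyphens, apostrophes, digits, marks
        found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found


def script_inventory(names):
    """Count, per script label, how many names contain that script."""
    counts = Counter()
    for name in names:
        counts.update(scripts_in(name))
    return counts


if __name__ == "__main__":
    # Made-up sample names, standing in for rows of the participant database.
    sample = ["Janez Novak", "Иван Петров", "Νίκος Παππάς"]
    for script, count in sorted(script_inventory(sample).items()):
        print(script, count)
```

The resulting list of scripts says which fonts and hyphenation or
directionality support would be needed at all, which is probably the
quickest way to find out whether the project is realistic.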
And here are the encodings:
ISO-8859-2 Central European
ISO-8859-3 South European
ISO-8859-8 Hebrew Visual
ISO-8859-8-I Hebrew (???) What is that?
% I'm not sure that anyone needs these:
% backward compatibility
(recode also recognises "arabic", "greek", "cyrillic", "hebrew" as
aliases for those encodings: I don't know if this is a good idea, as there
are other charsets covering the same language groups as well)
% CentEur, CentralEurope or CentralEuropean? or all of them?
(I also need some help here: sometimes Mac encodings are named using
adjectives, sometimes using nouns, like Ukraine/Ukrainian. Should only
one of them (which?) be used, or both? On the Unicode page, Mac
encodings appear twice. The second time under Microsoft/Apple,
containing MacCyrillic, MacGreek, MacIceland, MacLatin2, MacRoman,
MacTurkish. I didn't really get the point of that.)
% essentially the same as under Microsoft, with some minor changes
(to be processed manually, if these are to be supported)
EBCDIC % plenty of them are missing on the web
cp866 Cyrillic - Russian
cp874 Thai (repeated for some unknown reason)
cp936 PRC GBK
cp1250 Central European
% backward compatibility
% there are some other possibilities:
% ms-ee, ms-cyrl, ms-ansi, ms-greek, ms-turk, ms-hebr, ms-arab, ...
% does anyone think that they are needed?
% It is not among the online Unicode mappings, but it already exists somewhere:
#### Some very confusing part (I should leave it out) ####
# MISC (? probably none of them to be processed)
NextStep (What's that???)
% Missing in Unicode mapping (online)
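
To illustrate why a bare alias like "cyrillic" worries me: the same letter
has a different byte value in each of the Cyrillic charsets, so a regime
name that only names the language group does not say how to decode the
input. A small Python sketch (the codec names are those of Python's
standard library; the helper names are made up for this example):

```python
# Four Cyrillic charsets, using the codec names of Python's standard library.
CYRILLIC_CODECS = ["iso8859_5", "cp1251", "koi8_r", "cp866"]


def encode_everywhere(text, codec_names=CYRILLIC_CODECS):
    """Encode the same text in every listed charset."""
    return {name: text.encode(name) for name in codec_names}


def decode_as(raw, codec_name):
    """Decode bytes with a given charset, replacing undecodable bytes."""
    return raw.decode(codec_name, errors="replace")


if __name__ == "__main__":
    for name, raw in encode_everywhere("Ж").items():
        print(f"{name:10s} {raw.hex()}")
    # Bytes written as cp1251 but read as koi8_r give a different letter.
    print(decode_as("Ж".encode("cp1251"), "koi8_r"))
```

So if such aliases are kept, each of them would have to be pinned to one
specific charset, and which one is exactly the open question.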