2007/12/5, Mojca Miklavec
Hello,
I have noticed that ConTeXt uses "gr" for Greek, but the ISO code seems to be "el". Less problematic: should agr be grc instead? (OpenType uses PGR, but I'm not sure if that's the same thing.)
What do the Greek experts say?
Well, English is a story on its own. "us" and "uk" don't have their own codes as a separate language, and even worse: uk should stand for Ukrainian!!!
"Norwegian" (which is not a language at all) should be patched at some point (according to an old user request).
A similar problem exists with:
- Chinese (cn instead of zh)
- Czech (cz instead of cs)
- Vietnamese (vn instead of vi)
- Ukrainian (ua instead of uk!!!)
A case where I have no opinion:
- deo
Some languages have already changed their codes in the past:
- Spanish: sp -> es
- German: du -> de
- Slovenian: si -> sl (no trace left, I hope :)
My proposal would be to change:
- gr -> el
- agr -> grc
- cz -> cs
- vn -> vi
- deo -> ? (if at all)
    gmh - German, Middle High (ca. 1050-1500)
    goh - German, Old High (ca. 750-1050)
- cn -> zh (with *lots of care*)
And to keep all the needed synonyms. (Besides that: to issue a warning if possible.)
I have no idea what to do with Ukrainian and UK though.
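The "keep the synonyms" part of the proposal could be sketched as follows. This is only a minimal sketch: it assumes that the existing definitions stay installed under gr, agr, cz and vn, and that \installlanguage with a plain tag (not a key=value list) as its second argument registers a synonym, the same way \installlanguage [german] [de] does. It makes the ISO codes available without renaming anything, and it does not issue the warning mentioned above.

```tex
% Sketch only: register the ISO 639 codes as synonyms of the tags
% that are (assumed to be) currently installed.
\installlanguage [el]  [gr]   % Greek
\installlanguage [grc] [agr]  % Ancient Greek
\installlanguage [cs]  [cz]   % Czech
\installlanguage [vi]  [vn]   % Vietnamese

% cn -> zh and the uk/ua clash are deliberately left out,
% since they need more care than a plain synonym.

\mainlanguage [el]

\starttext
Greek text goes here.
\stoptext
```

A real change would go the other way round (el as the primary tag, gr as the synonym), but the mechanism would be the same.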
-------------
Another issue: some languages need small modifications or alternatives:
1.) In German, Slovenian, Croatian (and maybe in other languages as well), one can use two types of quotes:
- „" U+201E/U+201C & ‚' U+201A/U+2018 (sorry, a bug in gmail reencodes them)
- »« U+00BB/U+00AB & ›‹ U+203A/U+2039
It is also common to write «text ‹text› text» in German.
It might make sense to be able to say something similar to
\mainlanguage [german] [quotes | quotationmarks | quotationstyle = guillemots | guillemets or comma | ninesix]
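Until something like that exists, the guillemet variant can probably be approximated with the existing per-language quote settings. A minimal sketch, assuming that \setuplanguage accepts the keys leftquote, rightquote, leftquotation and rightquotation for an installed language and that the glyph commands \leftguillemot, \rightguillemot, \leftsubguillemot and \rightsubguillemot are available as in the stock definitions. The inward-pointing order (»text«) is chosen here; swap left and right for the other style.

```tex
\mainlanguage [de]

% Sketch only: replace the default German quotes by guillemets.
\setuplanguage
  [de]
  [leftquotation=\rightguillemot,   % »
   rightquotation=\leftguillemot,   % «
   leftquote=\rightsubguillemot,    % ›
   rightquote=\leftsubguillemot]    % ‹

\starttext
\quotation{Text \quote{Text} Text}
\stoptext
```

The drawback is exactly the one described above: every document has to repeat the four settings instead of naming a style once.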
2.) I could imagine a Serbian user asking to be able to typeset in two scripts (Latin or Cyrillic). That means:
- different labels
- loading different hyphenation patterns (even though transcription in either direction can be done on the fly. I can confirm that a user has already asked me whether I know how to input text in Cyrillic and get output in Latin: as he wasn't fluent in reading Cyrillic, he wanted to misuse ConTeXt to help him read texts from the web)
So I could imagine making Cyrillic the default script, but still letting one use
\mainlanguage [serbian] [script=latin,
    % or even (if any user would be enthusiastic enough to provide code)
    transliteration=on]
and get Latin labels and hyphenation patterns.
3.) Solve the problem with English in a more elegant way:
\mainlanguage [english] [alternative=us]
or
\mainlanguage[en][US] % as in "en_US.UTF-8"
\mainlanguage[en][GB]
\mainlanguage[en][AU]
\mainlanguage[de][AT] % if one ever figures out that "German from Germany" isn't good enough
Then, [us] should be kept as a synonym for \mainlanguage[en][US].
(The examples above could also be called via \mainlanguage[de][alternative=guillemets] or \mainlanguage[sr][alternative=latin].)
As mentioned above, this won't work.
4.) deo
\mainlanguage[de][alternative=old] ??? (no idea what that is about)
The old rules shouldn't be used any longer :-)
Note that 1.) could be combined (should be "combinable") with this one.
Any thoughts?
I think we should keep the current syntax in mkii and allow better control in the mkiv code.
Wolfgang