Re: [NTG-context] Greek: GR or EL? Czech: CZ or CS? UK: Ukrainian or ...

5 Dec 2007


2007/12/5, Mojca Miklavec:
...
Hello,
I have noticed that ConTeXt uses "gr" for Greek, but the ISO code
seems to be "el". Less problematic: should agr be grc instead?
(OpenType uses PGR, but I'm not sure if that's the same thing.)
What do the Greek experts say?
Well, English is a story on its own. "us" and "uk" don't have their
own codes as a separate language, and even worse: uk should stand for
Ukrainian!!!
"Norwegian" (which is not a language at all) should be patched at some
point (according to an old user request).
A similar problem exists with:
- Chinese (cn instead of zh)
- Czech (cz instead of cs)
- Vietnamese (vn instead of vi)
- Ukrainian (ua instead of uk!!!)
A case where I have no opinion:
- deo
Some languages have already changed their codes in the past:
- Spanish: sp -> es
- German: du -> de
- Slovenian: si -> sl (no trace left, I hope :)
My proposal would be to change:
- gr -> el
- agr -> grc
- cz -> cs
- vn -> vi
- deo -> ? (if at all)
     gmh - German, Middle High (ca.1050-1500)
     goh - German, Old High (ca.750-1050)
- cn -> zh (with *lots of care*)
And to keep all the needed synonyms. (Besides that: to issue a warning
if possible.)
I have no idea what to do with Ukrainian and UK though.
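To make the "keep the synonyms, but warn" idea concrete, here is a minimal sketch in Python rather than in ConTeXt itself. It only illustrates the lookup logic; the table name LEGACY_TO_ISO and the function resolve_language are made up for this example and are not part of ConTeXt. The codes cn, ua and uk are left out on purpose, since they need more care.

```python
import warnings

# Old ConTeXt code -> ISO 639 code, as proposed above.
# Hypothetical table for illustration; not an actual ConTeXt data structure.
LEGACY_TO_ISO = {
    "gr": "el",    # Greek
    "agr": "grc",  # Ancient Greek
    "cz": "cs",    # Czech
    "vn": "vi",    # Vietnamese
}


def resolve_language(code: str) -> str:
    """Return the ISO code for `code`, warning if an old synonym was used."""
    key = code.lower()
    if key in LEGACY_TO_ISO:
        iso = LEGACY_TO_ISO[key]
        warnings.warn(
            f"language code '{key}' is deprecated, use '{iso}' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return iso
    # Anything else is passed through unchanged.
    return key
```

With such a table, old documents that say gr keep working, while the user is nudged towards el.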
-------------
Another issue: some languages need some little modifications or alternatives:
1.) In German, Slovenian, Croatian, (maybe in other languages as well)
... one can use two types of quotes:
- „" U+201E/U+201C & ‚' U+201A/U+2018 (sorry, a bug in gmail reencodes them)
- »« U+00BB/U+00AB & ›‹ U+203A/U+2039
It is also common to write «text ‹text› text» in German.
...
It might make sense to be able to say something similar to
\mainlanguage
    [german]
    [quotes | quotationmarks | quotationstyle =
        guillemots | guillemets   or   comma | ninesix]
2.) I could imagine a Serbian user asking to be able to typeset in
two scripts (Latin or Cyrillic). That means:
- different labels
- loading different hyphenation patterns (even though transcription in
either direction can be made on the fly - I can confirm that a user
has already asked me if I know how to input text in Cyrillic and get
output in Latin - as he wasn't fluent in reading Cyrillic, he wanted
to misuse ConTeXt to help him read texts from the web)
So I could imagine making Cyrillic the default script, but still
letting one use
\mainlanguage
    [serbian]
    [script=latin,
     % or even (if any user would be enthusiastic enough to provide code)
     transliteration=on]
and get Latin labels and hyphenation patterns.
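As a rough idea of what such on-the-fly transliteration involves, here is a small Python sketch (not ConTeXt code) that maps Serbian Cyrillic to Latin letter by letter. The names CYR_TO_LAT and to_latin are invented for this example. It capitalizes digraphs as "Lj", "Nj", "Dž", which is right at the start of a word but not for words written entirely in capitals.

```python
# Serbian Cyrillic -> Latin, lowercase letters only; uppercase is derived.
# Hypothetical helper for illustration, not part of ConTeXt.
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ", "е": "e",
    "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k", "л": "l", "љ": "lj",
    "м": "m", "н": "n", "њ": "nj", "о": "o", "п": "p", "р": "r", "с": "s",
    "т": "t", "ћ": "ć", "у": "u", "ф": "f", "х": "h", "ц": "c", "ч": "č",
    "џ": "dž", "ш": "š",
}


def to_latin(text: str) -> str:
    """Transliterate Serbian Cyrillic letters; leave everything else as is."""
    out = []
    for ch in text:
        low = ch.lower()
        lat = CYR_TO_LAT.get(low)
        if lat is None:
            out.append(ch)                # not a Serbian Cyrillic letter
        elif ch != low:
            out.append(lat.capitalize())  # uppercase input, e.g. Љ -> Lj
        else:
            out.append(lat)
    return "".join(out)
```

For example, to_latin("Београд") gives "Beograd". The opposite direction is harder, because "lj", "nj" and "dž" are not always digraphs in Latin text.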
3.) Solve the problem with English in a more elegant way:
\mainlanguage
    [english]
    [alternative=us]
or
\mainlanguage[en][US] % as in "en_US.UTF-8"
\mainlanguage[en][GB]
\mainlanguage[en][AU]
\mainlanguage[de][AT] % if one ever figures out that "German from Germany" isn't good enough
Then, [us] should be kept as a synonym for \mainlanguage[en][US].
(The examples above could also be called via
\mainlanguage[de][alternative=guillemets] or
As mentioned above this won't work.
...
\mainlanguage[sr][alternative=latin].)
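The language/region split could be sketched as follows, again in Python and only to illustrate the logic. The function parse_language and the table REGION_SYNONYMS are hypothetical names; only us is listed as a synonym here, because uk is exactly the code whose meaning is disputed above.

```python
from typing import Optional, Tuple

# Old one-part codes that really mean "language + region".
# Hypothetical table for illustration only.
REGION_SYNONYMS = {
    "us": ("en", "US"),
}


def parse_language(tag: str) -> Tuple[str, Optional[str]]:
    """Split a tag such as 'en_US.UTF-8' into (language, region)."""
    tag = tag.split(".", 1)[0]            # drop an encoding suffix like .UTF-8
    if tag.lower() in REGION_SYNONYMS:
        return REGION_SYNONYMS[tag.lower()]
    lang, _, region = tag.replace("-", "_").partition("_")
    return lang.lower(), (region.upper() or None)
```

So parse_language("en_US.UTF-8") and parse_language("us") both give ("en", "US"), while parse_language("de") gives ("de", None).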
4.) deo
\mainlanguage[de][alternative=old] ??? (no idea what that is about)
The old rules shouldn't be used any longer :-)
...
Note that 1.) could be combined (should be "combinable") with this one.
Any thoughts?
I think we should keep the current syntax with mkii and allow better control
in the mkiv code.

Wolfgang

Wolfgang Schuster