New subject: Greek: GR or EL? Czech: CZ or CS? UK: Ukrainian or ...

5 Dec 2007

      Hello,

I have noticed that ConTeXt uses "gr" for Greek, but the ISO code
seems to be "el". Less problematic: should agr be grc instead?
(OpenType uses PGR, but I'm not sure if that's the same thing.)

What do the Greek experts say?

Well, English is a story of its own. "us" and "uk" are not codes of
separate languages, and even worse: "uk" should stand for
Ukrainian!!!

"Norwegian" (which is not a language at all) should be patched at
some point (according to an old user request).

A similar problem exists with:
- Chinese (cn instead of zh)
- Czech (cz instead of cs)
- Vietnamese (vn instead of vi)
- Ukrainian (ua instead of uk!!!)

A case where I have no opinion:
- deo

Some languages have already changed their codes in the past:
- Spanish: sp -> es
- German: du -> de
- Slovenian: si -> sl (no trace left, I hope :)

My proposal would be to change:
- gr -> el
- agr -> grc
- cz -> cs
- vn -> vi
- deo -> ? (if at all)
     gmh - German, Middle High (ca.1050-1500)
     goh - German, Old High (ca.750-1050)
- cn -> zh (with *lots of care*)

And to keep all the needed synonyms. (Besides that: to issue a warning
if possible.)

I have no idea what to do with Ukrainian and UK though.
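
To illustrate the "keep the synonyms, but warn" idea, here is a minimal sketch in Python (not ConTeXt code). The table LEGACY_TO_ISO and the function resolve_language are names made up for this example; the mappings themselves are the ones proposed above. cn is left out because it needs extra care, and uk is left out because its meaning is still open.

```python
import warnings

# Legacy ConTeXt tag -> ISO 639 code, as proposed in the list above.
# Hypothetical table for illustration; "cn" and "uk" are deliberately absent.
LEGACY_TO_ISO = {
    "gr": "el",    # Greek
    "agr": "grc",  # Ancient Greek
    "cz": "cs",    # Czech
    "vn": "vi",    # Vietnamese
}


def resolve_language(tag):
    """Return the ISO code for a tag, warning when a legacy synonym is used."""
    tag = tag.lower()
    if tag in LEGACY_TO_ISO:
        iso = LEGACY_TO_ISO[tag]
        warnings.warn(
            f"language tag '{tag}' is a legacy synonym, please use '{iso}'",
            DeprecationWarning,
            stacklevel=2,
        )
        return iso
    # Anything else is passed through unchanged.
    return tag
```

Old documents using [gr] would keep working and get a hint, while [el] resolves silently.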

-------------

Another issue: some languages need small modifications or alternatives:

1.) In German, Slovenian, Croatian (and maybe in other languages as
well) one can use two types of quotes:
- „“ U+201E/U+201C & ‚‘ U+201A/U+2018
- »« U+00BB/U+00AB & ›‹ U+203A/U+2039

It might make sense to be able to say something similar to
\mainlanguage
    [german]
    [quotes | quotationmarks | quotationstyle =
        guillemots | guillemets   or   comma | ninesix]
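
A rough sketch of what such a switch would select, written in Python only for illustration. The table QUOTE_STYLES, the alias table and the function quote are hypothetical names; the code points are the ones listed above, and the style names are the ones floated in the pseudo-syntax.

```python
# Style name -> ((outer open, outer close), (inner open, inner close)).
# Hypothetical table; code points are taken from the list above.
QUOTE_STYLES = {
    "comma": (("\u201E", "\u201C"), ("\u201A", "\u2018")),
    "guillemets": (("\u00BB", "\u00AB"), ("\u203A", "\u2039")),
}

# Alternative spellings suggested above map onto the same two styles.
ALIASES = {"ninesix": "comma", "guillemots": "guillemets"}


def quote(text, style="comma", level=1):
    """Wrap text in outer (level=1) or inner (level=2) quotes of a style."""
    style = ALIASES.get(style, style)
    opening, closing = QUOTE_STYLES[style][level - 1]
    return f"{opening}{text}{closing}"
```

For example, quote("Hallo", "guillemets") wraps the word in U+00BB and U+00AB, and level=2 picks the single variants.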

2.) I could imagine a Serbian user asking to typeset in
two scripts (Latin or Cyrillic). That means:
- different labels
- loading different hyphenation patterns (even though transliteration in
either direction can be done on the fly; a user has in fact already
asked me how to input text in Cyrillic and get output in Latin: he
wasn't fluent in reading Cyrillic and wanted to misuse ConTeXt to help
him read texts from the web)

So I could imagine making Cyrillic the default script, but still
letting one use

\mainlanguage
    [serbian]
    [script=latin,
     % or even (if any user would be enthusiastic enough to provide code)
     transliteration=on]

and get Latin labels and hyphenation patterns.
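
To show how small the on-the-fly transliteration could be, here is a minimal Python sketch of the Cyrillic-to-Latin direction. It is only an illustration of the idea, not ConTeXt code; CYR_TO_LAT and to_latin are names invented for this example, and the table is the standard letter-by-letter correspondence between the two Serbian alphabets.

```python
# Serbian Cyrillic -> Latin, lowercase letters only (30 letters).
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ", "е": "e",
    "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k", "л": "l", "љ": "lj",
    "м": "m", "н": "n", "њ": "nj", "о": "o", "п": "p", "р": "r", "с": "s",
    "т": "t", "ћ": "ć", "у": "u", "ф": "f", "х": "h", "ц": "c", "ч": "č",
    "џ": "dž", "ш": "š",
}


def to_latin(text):
    """Transliterate Serbian Cyrillic to Latin, leaving other characters alone."""
    out = []
    for ch in text:
        lower = ch.lower()
        if lower not in CYR_TO_LAT:
            out.append(ch)                       # not a Serbian Cyrillic letter
        elif ch == lower:
            out.append(CYR_TO_LAT[lower])        # lowercase letter
        else:
            # Uppercase letter: digraphs become "Lj", "Nj", "Dž".
            out.append(CYR_TO_LAT[lower].capitalize())
    return "".join(out)
```

One known limitation of this sketch: in words written entirely in capitals the digraphs come out as "Lj" rather than "LJ". The reverse direction is harder, because Latin "nj" or "lj" does not always stand for a single Cyrillic letter.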

3.) Solve the problem with English in a more elegant way:

\mainlanguage
    [english]
    [alternative=us]

or

\mainlanguage[en][US] % as in "en_US.UTF-8"
\mainlanguage[en][GB]
\mainlanguage[en][AU]
\mainlanguage[de][AT] % if one ever figures out that "German from
                      % Germany" isn't good enough

Then, [us] should be kept as a synonym for \mainlanguage[en][US].

(The examples above could also be called via
\mainlanguage[de][alternative=guillemets] or
\mainlanguage[sr][alternative=latin].)
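
The language/region split borrowed from locale names such as "en_US.UTF-8" can be sketched as follows. This is Python used purely for illustration; parse_language and the SHORTHAND table are hypothetical names, and only the "us" shorthand mentioned above is included.

```python
import re

# Legacy shorthand -> (language, region). Only "us" is named in the text above.
SHORTHAND = {"us": ("en", "US")}

# language, optional region after "_" or "-", optional ".encoding" suffix
_LOCALE = re.compile(r"^([A-Za-z]{2,3})(?:[_-]([A-Za-z]{2}))?(?:\..*)?$")


def parse_language(spec):
    """Split 'en_US.UTF-8', 'de-AT' or 'de' into a (language, region) pair."""
    key = spec.lower()
    if key in SHORTHAND:
        return SHORTHAND[key]
    match = _LOCALE.match(spec)
    if match is None:
        raise ValueError(f"unrecognised language specification: {spec!r}")
    language = match.group(1).lower()
    region = match.group(2)
    return (language, region.upper() if region else None)
```

With this split, [us] and [en][US] end up as the same pair, and a plain [de] simply carries no region.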

4.) deo
\mainlanguage[de][alternative=old] ??? (no idea what that is about)
Note that 1.) could be combined (should be "combinable") with this one.

Any thoughts?

Mojca
