[dev-context] Improved support for Norwegian in ConTeXt

Hans Hagen pragma at wxs.nl
Sun Feb 4 22:40:54 CET 2007


Hi Mojca

an impressive summary ... just provide the patches needed,  take a look 
at 'de' and 'deo' ... we can clone languages  so definitions can be shared

concerning conversions ... there are some language specific things, take 
a look at chinese (s-chi*)
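
A minimal sketch of what such cloning could look like, assuming the
'default' key of \installlanguage lets one language inherit the settings
of another. The tags 'nb' and 'nn' and the values shown are the proposal
from this thread, not existing definitions:

```tex
% Assumption: 'default' makes nn fall back to nb for anything it
% does not set itself, so shared values are written only once.
\installlanguage
  [nb]
  [spacing=packed,
   lefthyphenmin=2,
   righthyphenmin=2,
   leftquotation=\leftguillemot,
   rightquotation=\rightguillemot,
   state=stop]

\installlanguage
  [nn]
  [default=nb,
   state=stop]
```
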
> I would suggest you to post some of the questions to the ntg-mailing
> list, where more Norwegian users can comment on it. When doing some
> changes, the 100% backward compatibility might need to be sacrificed a
> bit, but some changes are worth doing so if others agree and if it's a
> contribution towards a better quality. (I CC-ed to two users who seem
> to have contributed or complained a bit ;)
>
> On 2/4/07, Karl Ove Hufthammer wrote:
>   
>> I'm writing this to suggest improvements in ConTeXt's support for the
>> Norwegian languages. ConTeXt already has rudimentary support for Norwegian,
>> but with some problems.
>>
>>
>> Language codes
>> --------------
>>
>> The main problem is that ConTeXt uses the language code 'no' for Norwegian.
>> There actually *is* no written language called 'Norwegian'; Norway has two
>> official written languages, Norwegian Bokmål (ISO 639 language code 'nb') and
>> Norwegian Nynorsk (ISO 639 language code 'nn'). The current definitions
>> for 'no' in ConTeXt are for Norwegian Bokmål. (There is an ISO 639 language
>> code 'no' for Norwegian, but this should usually be used for spoken
>> Norwegian, or perhaps for transcriptions of spoken language.)
>>
>> The language code 'no' should be removed, and be replaced by the two language
>> codes 'nb' and 'nn'.
>>     
>
> Although I don't know the exact situation, a few remarks:
>
> - You should probably also provide the correct definitions for calling
> the language (so that one can say \mainlanguage[norwegian], but
> perhaps with what you consider to be the proper language tags). It's
> currently
>
> \installlanguage [norwegian]   [\s!no]
> \installlanguage [norsk]       [\s!no] % bonus switch
>
> You need to fix the two and perhaps add
> \installlanguage [???]       [\s!nb]
> \installlanguage [???]       [\s!nn]
>
>
> - If you remove [no], older documents might break. I don't know much
> about the situation and the number of users, but can you say which of
> the two language variants [no] should default to? Since the current
> definitions probably point to "nb" (at first glance) - would it
> make sense to use "nb" when one says \mainlanguage[no]?
>
> Perhaps one can issue a warning when the language "no" is selected
> (stating something like "language 'no' is deprecated, please use 'nb'
> for Bokmål or 'nn' for Nynorsk instead")
>
> I also asked to replace "si" by "sl" for Slovenian some time ago, but
> that was when there was no support for Slovenian yet and "si" stands
> for Singhalese (whatever that is).
>
> For Norwegian the situation might be slightly different since "no"
> still means Norwegian, but I don't know how "offensive"/"ignorant" it
> sounds to you if that one is used.
>
> Removing it probably doesn't affect the rest, so if other Norwegian
> users agree to remove it completely, it can still be done, but I would
> suggest you to ask the author of the original translations and the
> rest of users on the ntg-context mailing list first. Otherwise it can
> still default to one of the two variants (or to a new one if you
> provide also the third alternative for the "spoken language").
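>
> A sketch of how the old names could keep working, assuming the
> two-argument synonym form of \installlanguage quoted above also
> accepts 'no' as a name. 'bokmal' and 'nynorsk' are only placeholder
> names here, and 'nb' and 'nn' must already be installed:
>
> ```tex
> % Hypothetical names; the synonym form is the one currently used
> % for 'norwegian' and 'norsk'.
> \installlanguage [bokmal]    [nb]
> \installlanguage [nynorsk]   [nn]
> \installlanguage [norwegian] [nb] % old name, now an alias for nb
> \installlanguage [norsk]     [nb]
> \installlanguage [no]        [nb] % keeps \mainlanguage[no] working
> ```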
>
>   
>> See http://en.wikipedia.org/wiki/Norwegian_language for a (not too good)
>> article on the Norwegian languages.
>>
>> For the record, the language names used in LaTeX/Babel are
>> (unfortunately) 'Norwegian' and 'norsk' for Norwegian Bokmål, and 'nynorsk'
>> for Norwegian Nynorsk, instead of 'bokmal'/'bokmål' and 'nynorsk'. Norwegian
>> Bokmål support was added first, and used up the 'Norwegian' name.
>>
>>
>> Hyphenation
>> -----------
>>
>> The two written languages are quite similar, and the current hyphenation
>> dictionary (nohyphbx) was made to support both. But there are (at least) two
>> words which are put in the hyphenation exceptions for this dictionary because
>> they would have different hyphenation (because of different meaning) in
>> Norwegian Nynorsk and Norwegian Bokmål. These are:
>>
>> attende -- nb: at-ten-de ('eighteenth'),       nn: att-en-de ('back')
>> betre   -- nb: be-tre ('enter'/'set foot on'), nn: bet-re ('better')
>>
>> Would it be possible to have two different hyphenation dictionaries for 'nb'
>> and 'nn', which would only differ in the hyphenation exceptions used for
>> these two words?
>>     
>
> This can be done. Hans was complaining about the mess of (naming of)
> Norwegian hyphenation patterns one month ago anyway, I guess that "he
> won't mind" adding yet another fix to the scripts ;)
>
>   
>> Language setup
>> --------------
>>
>> Here is an improved/correct version of the language setup for Norwegian. The
>> setup for 'no' should be removed.
>>
>> \installlanguage
>>   [nn]
>>   [spacing=packed,
>>    lefthyphenmin=2,
>>    righthyphenmin=2,
>>    leftsentence=---,
>>    rightsentence=---,
>>    leftsubsentence=---,
>>    rightsubsentence=---,
>>    leftquote=\upperleftsinglesixquote,
>>    rightquote=\upperrightsingleninequote,
>>    leftquotation=\leftguillemot,
>>    rightquotation=\rightguillemot,
>>    date={day,{.},\ ,month,\ ,year},
>>    state=stop]
>>
>> This is for Norwegian Nynorsk ('nn'), but the same setup is used for Norwegian
>> Bokmål (the values used for 'day' differ, though -- see below).
>>
>> But I am not sure I understand what the four *sentence commands are used for.
>> We usually don't use em-dashes in Norwegian, so the entries look incorrect.
>> If you can explain what the commands are used for, I can supply the correct
>> Norwegian definitions.
>>
>> I also noticed that the Italian definitions use leftspeech, middlespeech and
>> rightspeech commands. What are these used for?
>>
>>
>> Other language-specific settings
>> --------------------------------
>>
>> Norwegian (Bokmål and Nynorsk) differs typographically from English in several
>> other ways. Here are three of them:
>>
>> We don't (usually) use bullets for the first level of unnumbered lists; we use
>> en-dashes.
>>
>> -- Item 1
>> -- Item 2
>> -- Item 3
>>
>> Bullets are commonly seen in documents created by word processors of US origin,
>> and in the documents created by people without proper typographic training,
>> though. It would be nice if ConTeXt could use en-dashes by default for lists
>> in Norwegian text.
>>     
>
> The default is to use
>    bullet, dash, star, triangle
> for the four levels of itemization.
>
> If you want to change the behaviour in your document only, all you need to do is
>     \definesymbol[1][\endash]
> but I guess that it could be adapted, so that Norwegian documents will
> all use endash by default.
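>
> A small test file for trying this out (a sketch; it assumes \endash
> is available with the font setup in use):
>
> ```tex
> \definesymbol[1][\endash] % first-level item symbol becomes an en-dash
>
> \starttext
> \startitemize
>   \item Item 1
>   \item Item 2
>   \item Item 3
> \stopitemize
> \stoptext
> ```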
>
> Similar support has already been implemented for Slovenian (to use
> a different set of characters when itemize uses characters).
>
> There are two questions:
> - do other Norwegian users agree to change the default set?
> - what should be the order then? (ie: what character should be used
> for the second level of itemization?)
>
>   
>> We don't use full stops in numbered lists. In other words, instead of
>>
>> 1. Item 1
>> 2. Item 2
>> 3. Item 3
>>
>> we write
>>
>> 1  Item 1
>> 2  Item 2
>> 3  Item 3
>>     
>
> That's a matter of
> \setupitemize[stopper=]
>
> I don't know how to set that in a language-specific way, but it sounds
> reasonable to me to add it.
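>
> For a single document, a minimal sketch (assuming the setting applies
> to every itemize level when no level is given):
>
> ```tex
> \setupitemize[stopper=] % no full stop after the item number
>
> \starttext
> \startitemize[n]
>   \item Item 1
>   \item Item 2
>   \item Item 3
> \stopitemize
> \stoptext
> ```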
>
>   
>> The same holds for numbered headings, both in the main text and in the TOC.
>>     
>
> But sections already start with
>    1 Section name
> rather than
>    1. Section name
> by default. (Support for the second case might be improved in the
> future. Or rather: I hope that it will be.)
>
>   
>> Would it be possible to support this by default in ConTeXt?
>>
>> We also use the comma in decimal numbers (3,14 instead of 3.14).
>>     
>
> We too. In text this is no problem anyway. Math can be set up in that
> way, but I doubt that it's set up for any language (although it could
> be). This means that you had better write $3{,}14$ instead of
> $3,14$. I don't know about any other consequences, since TeX almost
> never writes out any calculated floats in the resulting document.
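>
> The difference is easy to see in a short test. In math mode a bare
> comma is treated as punctuation and gets a thin space after it; the
> braces turn it into an ordinary symbol:
>
> ```tex
> \starttext
> $3,14$   % comma as punctuation: small gap after the comma
>
> $3{,}14$ % comma as ordinary symbol: no extra gap
> \stoptext
> ```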
>
>   
>> Norwegian labels
>> ----------------
>>
>> Here are labels for Norwegian (Bokmål and Nynorsk). The old 'no' labels should
>> be removed. The 'nb' ones are taken from the 'no' ones, but with some
>> corrections.
>>
>> Some comments: We don't usually capitalise the first letter in
>> crossreferences. Where one would in English write
>>
>> See Figure 5.22 ...
>>
>> we would write
>>
>> Se figur 5.22 ... (Bokmål)
>> Sjå figur 5.22 ... (Nynorsk)
>>     
>
> But when you crossreference, you only get 5.22, you have to write
> "figur" manually (you can set up that perhaps, so that you get
> "figure" attached to the number, but in any case you need to do that
> manually).
>
> "Figur 5.22" will only be used under the actual image. When
> crossreferencing, we use lowercase too, but under the figure itself I
> think that uppercase is OK, at least for our language (since it's
> caption of the figure anyway).
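>
> A sketch of both cases, assuming the usual \in{text}[reference] form;
> the reference name fig:demo is made up for the example:
>
> ```tex
> \starttext
> \placefigure[here][fig:demo]
>   {A caption}
>   {\framed[width=4cm,height=2cm]{demo}}
>
> Se \in{figur}[fig:demo] ...    % lowercase word typed by hand
>
> \in{Figur}[fig:demo] viser ... % capital at the start of a sentence
> \stoptext
> ```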
>
>   
>> But we would of course write
>>
>> Figur 5.22 viser ...
>> (Figure 5.22 shows ...)
>>
>> The definitions below use a capital first letter. Will this be a problem?
>>
>> I was also unsure about what the 'lines' label should be. The plural of 'line'
>> ('linje') in Norwegian (both 'nb' and 'nn') is 'linjer', but we do not use
>> the plural when referencing more than one line. Where one would write
>>
>> The discussion on lines 5--13 ...
>>
>> in English, we would write
>>
>> Drøftinga på linje 5--13 ...
>>
>> in Norwegian. In other words, we use the singular instead of the plural. The
>> same holds for the other cross-referencing terms ('Figure', 'Table' &c.).
>>
>> Feel free to change the 'lines' label to 'linje' if this makes it work better.
>>     
>
> I don't know where exactly this is used, but I assume that it's for
> "List of Figures", "List of Tables". But I don't know exactly, I never
> use those. (I have just translated some of them and I hoped that the
> first one who will consider them wrong will complain ;)
>
> Mojca


-- 

-----------------------------------------------------------------
                                          Hans Hagen | PRAGMA ADE
              Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                             | www.pragma-pod.nl
-----------------------------------------------------------------


