Ahoi,

usually, uppercase index entries are sorted before all lowercase entries. Is there a simple setup to change that?

I.e. I need the sorting
    sum - Sun - sunny
instead of
    Sun - sum - sunny

I know I can influence sort order for single entries, but I’d like a general setting.
(Diacritics are handled as wanted.)

    \starttext
    \strut
    \index{Sun}\index{sun}\index{Suomi}\index{suave}
    \index{sunny}\index{sum}\index{Sumatra}\index{summon}
    \index{sample}\index{super}
    \index{şample}\index{südlich}\index{súper}
    \index{şun}\index{sün}\index{şüñ}
    \completeindex
    \stoptext

Greetlings,
Hraban
---
https://www.fiee.net
http://wiki.contextgarden.net
https://www.dreiviertelhaus.de
GPG Key ID 1C9B22FD
On 06/10/2018 11:49 AM, Henning Hraban Ramm wrote:
> Ahoi,
> usually, uppercase index entries are sorted before all lowercase entries. Is there a simple setup to change that?
> i.e. I need the sorting sum - Sun - sunny instead of Sun - sum - sunny
> I know I can influence sort order for single entries, but I’d like a general setting.
Hi Hraban,

I think this may achieve what you want:

    \setupregister[index][method={zm,zc}]

I hope it helps,

Pablo
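For reference, a minimal sketch that combines the test file from the original question with the setting suggested in this reply. Placing \setupregister before \starttext is an assumption about where the setup belongs, and the result has not been verified here.

```tex
% Method chain as suggested in this reply: zero mapping, then zero casing.
\setupregister[index][method={zm,zc}]

\starttext
\strut
\index{Sun}\index{sun}\index{Suomi}\index{suave}
\index{sunny}\index{sum}\index{Sumatra}\index{summon}
\index{sample}\index{super}
\index{şample}\index{südlich}\index{súper}
\index{şun}\index{sün}\index{şüñ}
\completeindex
\stoptext
```

If the setting behaves as intended, "sum" should be listed before "Sun", which is the order asked for in the question.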
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://context.aanhet.net
archive  : https://bitbucket.org/phg/context-mirror/commits/
wiki     : http://contextgarden.net
___________________________________________________________________________________
On 2018-06-10 at 12:42, Pablo Rodriguez wrote:
> Hi Hraban,
> I think this may achieve what you want:
> \setupregister[index][method={zm,zc}]
> I hope it helps,
Thank you very much!

I overlooked that there is indeed some documentation:
http://wiki.contextgarden.net/Command/keyword:method
(Source: http://repo.or.cz/w/context.git/blob/HEAD:/tex/context/base/sort-ini.lua)

Greetlings,
Hraban
On 2018-06-10 at 13:07, Henning Hraban Ramm wrote:
> Thank you very much!
> I overlooked that there is indeed some documentation: http://wiki.contextgarden.net/Command/keyword:method
But reading through that and the source I’m still confused.

Source snippet:

    local validmethods = tohash {
        "ch", -- raw character (for tracing)
        "mm", -- minus mapping
        "zm", -- zero mapping
        "pm", -- plus mapping
        "mc", -- lower case - 1
        "zc", -- lower case
        "pc", -- lower case + 1
        "uc", -- unicode
    }

    local predefinedmethods = {
        [v_default] = "zc,pc,zm,pm,uc",
        [v_before]  = "mm,mc,uc",
        [v_after]   = "pm,mc,uc",
        [v_first]   = "pc,mm,uc",
        [v_last]    = "pc,mm,uc",
    }

I’d like to write a proper explanation for the wiki (and my book).

Is there anywhere documentation about the meaning/goal of the presets or algorithms?
Are these codes translatable into something like "ignore diacritics", "ignore upper/lowercase" etc.?

Greetlings,
Hraban
On 2018-06-10 at 14:11, Henning Hraban Ramm wrote:
> But reading through that and the source I’m still confused.
> I’d like to write a proper explanation for the wiki (and my book).
> Is there anywhere documentation about the meaning/goal of the presets or algorithms?
> Are these codes translatable into something like "ignore diacritics", "ignore upper/lowercase" etc.?
Ok, I think I got it...

For a proper sorting, you first apply a "mapping", then a "casing" and finally "unicode".

Presets:
default = upper like lowercase, diacritics separate
before  = upper before lower, diacritics ignored
after   = lower before upper, diacritics ignored
first   = lower before upper, diacritics separate
last    = upper before lower, diacritics separate

* If you don’t set the sorting method, the preset "first" is used (and not "default").
* There’s no preset for the (in my eyes most meaningful) combination "upper like lowercase, diacritics ignored" (zm,zc,uc).
* Aren’t language specific sorting rules possible at the current state? Or does "unicode" handle that?
  E.g.
  -- DIN 5007-1 (German default sorting) is like zm,zc,uc, but ß should be sorted like ss.
  -- DIN 5007-2 (German phonebook sorting) would additionally require umlauts to be sorted as ä = ae etc.
  -- Austrian phonebook sorting sorts umlauts after base vowels, i.e. a, ä, o, ö, u, ü, s, ß.
  -- Danish and Norwegian: x, y, z, æ, ø, å
  -- Finnish and Swedish: x, y = ü, z, æ, ä, ö, ø, å (until 2006 v = w)
  -- etc.
  (according to https://de.wikipedia.org/wiki/Alphabetische_Sortierung)

If nobody objects I’ll add this to the wiki.

Greetlings,
Hraban
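The combination without a preset can be requested by spelling out the method chain. A minimal sketch, assuming the codes are listed in the order mapping, casing, unicode as described above, and assuming that the preset names from sort-ini.lua are also accepted as values of the method key (untested here):

```tex
% "upper like lowercase, diacritics ignored":
% zero mapping, zero casing, then unicode
\setupregister[index][method={zm,zc,uc}]

% Assumed alternative: select a preset by name instead of a chain
% \setupregister[index][method=before]

\starttext
\strut
\index{Sun}\index{sun}\index{sum}\index{sün}
\completeindex
\stoptext
```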
On 06/10/2018 06:16 PM, Henning Hraban Ramm wrote:
> [...]
> * Aren’t language specific sorting rules possible at the current state? Or does "unicode" handle that?
> E.g.
> -- DIN 5007-1 (German default sorting) is like zm,zc,uc, but ß should be sorted like ss.
> -- DIN 5007-2 (German phonebook sorting) would additionally require umlauts to be sorted as ä = ae etc.
> -- Austrian phonebook sorting sorts umlauts after base vowels, i.e. a, ä, o, ö, u, ü, s, ß.
> -- Danish and Norwegian: x, y, z, æ, ø, å
> -- Finnish and Swedish: x, y = ü, z, æ, ä, ö, ø, å (until 2006 v = w)
> -- etc.
> (according to https://de.wikipedia.org/wiki/Alphabetische_Sortierung)
sort-lan.lua contains different language definitions.

Among others: DIN 5007-1, DIN 5007-2, Duden.

Languages "de-AT", "no", "da" and "sv" are ordered as you explain.

Swedish doesn’t contain ø (according to https://sv.wikipedia.org/wiki/Ø, it is replaced with ö) or ü (it is a foreign letter to their alphabet). And "fi" seems to order the way you describe. And v is different from w.

I would say, Finnish isn’t included in sort-lan.lua.

> If nobody objects I’ll add this to the wiki.

Please, it would be extremely helpful (I remember thinking that registers didn’t make any sense in ConTeXt, before someone helped me).

Pablo
--
http://www.ousia.tk
On 2018-06-10 at 20:09, Pablo Rodriguez wrote:
> sort-lan.lua contains different language definitions.

Ah, thanks for the hint. But how can I employ these definitions with index/list ordering?
Setting mainlanguage and method "*,uc" doesn’t seem to do the trick.

> Among others: DIN 5007-1, DIN 5007-2, Duden.

Hans, please add the replacement { "ß", "ss" } to definitions['DIN 5007-1'] and definitions['DIN 5007-2']. Thank you!

> Swedish doesn’t contain ø (according to https://sv.wikipedia.org/wiki/Ø, it is replaced with ö) or ü (it is a foreign letter to their alphabet). and "fi" seems to order the way you describe. And v is different from w.

I wouldn’t touch it then, who knows how accurate German Wikipedia is...

> I would say, Finnish isn’t included in sort-lan.lua.

Yes it is. (http://source.contextgarden.net/tex/context/base/mkiv/sort-lan.lua)

> Please, it would be extremely helpful (I remember thinking that registers didn’t make any sense in ConTeXt, before someone helped me).

As soon as I understand how the language dependent sorting works...

Greetlings,
Hraban
On 6/10/2018 8:50 PM, Henning Hraban Ramm wrote:
> Hans, please add the replacement { "ß", "ss" } to definitions['DIN 5007-1'] and definitions['DIN 5007-2']

Wolfgang provided these, so he has to give his blessing.

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------
On 10 June 2018 at 21:07, Hans Hagen (mailto:j.hagen@xs4all.nl) wrote:
> wolfgang provided these so he has to give his blessing
You can add them. AFAIK the ß replacement was added later (no idea who sent it) because it wasn’t in the patch I found in my mail archive.

Wolfgang
On 2018-06-10 at 20:50, Henning Hraban Ramm wrote:
> Ah, thanks for the hint. But how can I employ these definitions with index/list ordering? Setting mainlanguage and method "*,uc" doesn’t seem to do the trick.
Sorry, found it. In my test file there was still "language=cz" in the setup, and the language key is not documented.
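A minimal sketch of the kind of setup this refers to, assuming the language key of \setupregister takes a tag defined in sort-lan.lua, such as the "de-AT" mentioned earlier in the thread. Whether \mainlanguage is needed as well is not clear from the thread, so it is only included as in the earlier attempt. Untested here.

```tex
\mainlanguage[de]

% language selects the sorting definition
% (assumption: same tags as the definitions in sort-lan.lua)
\setupregister[index][language=de-AT]

\starttext
\strut
\index{Sun}\index{sün}\index{sum}\index{Suomi}
\completeindex
\stoptext
```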
> Hans, please add the replacement { "ß", "ss" } to definitions['DIN 5007-1'] and definitions['DIN 5007-2'] Thank you!

Wikified:
http://wiki.contextgarden.net/Command/setupregister
http://wiki.contextgarden.net/Command/keyword:method

Greetlings,
Hraban
participants (4)
- Hans Hagen
- Henning Hraban Ramm
- Pablo Rodriguez
- Wolfgang Schuster