Hello,

I've tried to find something relevant about the terrible Czech sorting :-)

The first thing to note is that there is a standard (from 1970 or so) that is in fact not implementable: it requires orderings such as Karel IV < Karel III, because numerals are supposed to be sorted as if they were written out in words (ctvrty, treti) :-) So in practice there are only more-or-less accurate approximations. A quite good intro is http://www.vitsoft.info/sortkit.htm

A sort is considered very reasonable if it conforms to the order given in http://www.fi.muni.cz/~adelton/l10n/cssort/cssort.table . Characters on a single line of the table are considered equivalent. Note the `ch' character, which sorts between h and i. The table also contains accented letters that are not used in Czech (like crossed l, z with dot above). IMHO it should also be completely OK for Slovak (as they, I hope, inherited the standard). I think it would be completely OK to sort according to that table, taking the characters on a single line as equivalent.

The module the table comes from implements a four-pass sorting algorithm that reflects those pretty damn rules; see http://www.fi.muni.cz/~adelton/l10n/cssort/csort.c . An example of sorted sequences is http://www.fi.muni.cz/~adelton/l10n/cssort/sort.tab .

The question is whether it is reasonable to implement this internally in ConTeXt or to use an external module. An external Perl module was once prepared by Tom Hudec (he even modified the sorting table: he preferred all letters with `hacek (\v{})' to sort after those without \v).

If you consider an internal ConTeXt implementation feasible, I'd be happy if you commented the sorting macros a bit, so that I could contact native Czech users and fine-tune it. I'd like to consult our Czech TeX friends about it; I don't consider myself a sorting expert (it's quite tricky, isn't it).

Thanks,
D.A.

--
Early to rise, early to bed, makes a man healthy, wealthy and dead.
-- Terry Pratchett, "The Light Fantastic"
David Antos