On Oct 5, 2010, at 2:15 PM, Philipp Gesang wrote:
Hi Thomas and others,
Technically speaking, the problem is solved by ISO 14651.[1]
In practice, multilingual sorting depends on local rules, of which “one index per script or language” seems to be the most common.
Yes, that's what I was trying to say. In practice, hardly anyone will want an individual index for Spanish if they have just two Spanish words in an English book. And someone (me) might say that they want three Greek terms in their German index at logical places.
Some time ago I made an LPeg grammar from the BNF in [1]. It matches the collation rules from [2], but as I couldn’t figure out how to map them onto ConTeXt’s sorting mechanism, I never got around to actually capturing the information. As I won’t have the time to try it with the new structure of sort-lan, I guess I’ll just attach the PEG grammar for anyone to use as a starting point. Unicode collation would be great to have in ConTeXt.
transliteration. The problem with polytonic Greek is that so many different unicode characters need to have the same sort entry. If
Isn’t that just what the Greek rules in sort-lan.lua do? If not then it would be a bug.
Oh yes, you're right, I missed that. Thanks for pointing that out!
Thomas
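
To illustrate the polytonic Greek point raised above (many different Unicode characters needing the same sort entry), here is a minimal Python sketch. It is not ConTeXt's mechanism and is independent of whatever sort-lan.lua does; it only assumes that stripping combining marks after canonical decomposition is an acceptable approximation, which is much cruder than full ISO 14651 or Unicode collation. The name `base_sort_key` is made up for this example.

```python
import unicodedata

def base_sort_key(word: str) -> str:
    """Hypothetical helper: reduce a word to unaccented, case-folded letters."""
    # Canonical decomposition splits e.g. U+1F04 into alpha plus breathing and accent marks.
    decomposed = unicodedata.normalize("NFD", word)
    # Drop every combining mark (accents, breathings, iota subscript, diaeresis).
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Case folding also maps final sigma to the regular sigma.
    return stripped.casefold()

# Several polytonic variants of alpha collapse to the same key.
alpha_variants = ["α", "ά", "ἀ", "ἁ", "ᾶ", "ᾳ", "ᾷ"]
alpha_keys = {base_sort_key(ch) for ch in alpha_variants}

# Sorting by raw code point puts the extended Greek block after basic Greek;
# sorting by the reduced key gives the alphabetical order one would expect.
terms = ["βίος", "ἀρχή", "ὥρα", "λόγος"]
by_codepoint = sorted(terms)
by_key = sorted(terms, key=base_sort_key)
print(by_codepoint)
print(by_key)
```

A real index would still need a tie-breaker for words that differ only in accents, since this key treats them as identical.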