The issues of indexing, &c., probably fall into two questions: (a) Is it something European-derivative in reference to a work, or (b) is it something entirely for "native-speaking" use and expectations?

I've been on the Ivritex list for quite a while, and there have been some long-running issues on how to deal with mixed versus pure texts and what people ought to expect. I have seen considerable variance in Hebrew materials from the latter nineteenth century to today in how they, for example, consider the ex-height in relation to superdiacritica and subdiacritica, from nikkud to cantillation. They have had to tackle the issues of handling mixed-language versus non-mixed-language text. It's nontrivial.

Just from an historical perspective, at one time Latin and other languages concatenated the articles to the words, for example in the nomenclature Alcoran for the Qur'an. Today in indexing (I have used Cindex to do quite a few book indices) one generally drops the definite and indefinite articles of most languages. Even in contents and chapter headings, one avoids articles except in informal literature for entertainment consumption. That may be language-dependent, for in German and Greek one does have to use articles more than in English. Still, I have seldom seen an index with arthrous forms in any language.

CPS

On Fri, 2008-06-20 at 19:02 +0300, Khaled Hosny wrote:
On Fri, Jun 20, 2008 at 09:34:40AM +0200, Hans Hagen wrote:
Idris Samawi Hamid wrote:
On Thu, 19 Jun 2008 18:23:05 -0600, Khaled Hosny wrote:
Arabic index entries are all listed under "unknown" instead of under their respective Arabic letters. I'm not sure if this is a bug or a misconfiguration on my side. See the attached example.
We need to include arabic-farsi-urdu etc. databases in the distro. If Hans can tell us what file to emulate/edit etc....
first we need to discuss the logic ... say that we have a sequence of chars ... do we need to erase the vowels? etc
Erase vowels as in not counting them? Then yes, we should only respect full letters. We might also need to strip the Arabic definite article "ال", but this will be tricky since there are words that start with it. Maybe we had better have syntax like \index[a]{entry}, where this entry will be filed under "a", or do we already have this?
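To make the logic under discussion concrete, here is a rough sketch in Python rather than in ConTeXt itself. It is not how ConTeXt sorts registers; the names strip_marks, sort_key, index_letter, and the article_exceptions and override parameters are all hypothetical. It assumes that "vowels" means the Unicode nonspacing marks (category Mn), that the article is a plain alef followed by lam, and that words whose initial alef-lam is part of the word are supplied by hand.

```python
import unicodedata

ARTICLE = "\u0627\u0644"   # alef + lam, the Arabic definite article
TATWEEL = "\u0640"         # elongation character, carries no meaning


def strip_marks(text):
    """Remove vowel signs and other nonspacing marks, plus tatweel."""
    return "".join(
        ch for ch in text
        if unicodedata.category(ch) != "Mn" and ch != TATWEEL
    )


def sort_key(entry, article_exceptions=frozenset()):
    """Reduce an entry to full letters and drop a leading article.

    article_exceptions (an assumption of this sketch) lists unvocalized
    words whose initial alef-lam belongs to the word and must be kept.
    """
    bare = strip_marks(entry).strip()
    if (bare.startswith(ARTICLE)
            and len(bare) > len(ARTICLE)
            and bare not in article_exceptions):
        bare = bare[len(ARTICLE):]
    return bare


def index_letter(entry, override=None, article_exceptions=frozenset()):
    """Pick the heading letter; an explicit override always wins."""
    if override:
        return override
    key = sort_key(entry, article_exceptions)
    return key[0] if key else ""
```

The override argument plays the role of the bracketed key in the \index[a]{entry} idea: when automatic stripping would guess wrong, the author states the heading directly. Sorting the resulting keys by code point would only approximate alphabetical order, especially for Farsi and Urdu letters, so a real implementation would presumably need per-language ordering tables.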
Regards, Khaled