Hello Hans,

On 8/24/07, Hans Hagen wrote:
Hi,
I uploaded a new version of mkiv (regular zip).
Thanks a lot!
- case changing using attributes and node processing
simple test file for spacing and casing:
I'm attaching a not-so-trivial test file for "casing", just to see how well it works for Croatian. A few observations:

- LM doesn't have any ǉ, ǌ, ǳ, ǆ, ... (probably another request for the Polish guys)
- It would be great if MK IV did the transformation from digraphs to normal letters in case those digraphs are not present in the font itself (for ĳ, ǉ, ǌ, ǳ, ǆ, ...), just as it would be great if ccaron was automatically composed out of c and caron if the letter wasn't present in that font.

Visually there is probably no difference in plain text, except in exactly the cases for which you're sending the tests (that's casing and spacing). See http://en.wikipedia.org/wiki/Gaj's_Latin_alphabet for how the word "MJENJAČNICA" is split into letters. Normal people still type n+j in text, not the digraph "ǌ" (nj), but in case you get some text with those digraphs, which are valid Unicode letters, it would be nice if they were processed ...
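To make the fallback idea concrete, here is a rough sketch in Python (not MkIV code, and not how Hans implements anything). It only illustrates the Unicode side: the digraph characters carry compatibility decompositions, and ccaron has a canonical one. The function name `fallback` and the `has_glyph` callback are hypothetical stand-ins for whatever the font layer would provide.

```python
import unicodedata

def fallback(ch, has_glyph):
    """Return ch, or a Unicode decomposition of it that the font can render.

    has_glyph is a hypothetical callback: it takes one character and
    returns True if the font contains a glyph for it.
    """
    if has_glyph(ch):
        return ch
    # NFKC expands the digraph characters (U+01C9 -> "lj", U+01C6 -> "d" + zcaron);
    # NFD splits accented letters (ccaron -> "c" + combining caron U+030C).
    for form in ("NFKC", "NFD"):
        candidate = unicodedata.normalize(form, ch)
        if candidate != ch and all(has_glyph(c) for c in candidate):
            return candidate
    return ch  # nothing usable: leave the character alone

# A pretend font with plain letters, zcaron and a combining caron only.
glyphs = set("abcdefghijklmnopqrstuvwxyz") | {"\u017e", "\u030c"}
print(fallback("\u01c9", glyphs.__contains__))  # the lj digraph
print(fallback("\u010d", glyphs.__contains__))  # ccaron
```

In a real engine the composed ccaron would of course need proper accent placement, which this sketch says nothing about.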
\starttext
test: oeps {\setcharacterspacing[frenchpunctuation] x: xx \bfd x: xx} oeps: test
test \WORD{test TEST \TeX} test
test \word{test TEST \TeX} test
test \Word{test TEST \TeX} test
Another few observations:

- \word doesn't work in XeTeX
- What exactly is \Words supposed to do (with non-first letters in a word)?
- ConTeXt with XeTeX outputs dozens of empty lines to the console.

An extra challenge would be to get this to work (but unless some Croats ask you for that or unless you have too much time left, don't bother about that). It needs slightly more than only the lccode and uccode of a letter, since there are three forms: one for lowercase [ljubljana -> ǉ], one for all-uppercase words [LJUBLJANA -> Ǉ] and one for the first letter of a word starting with an uppercase [Ljubljana -> ǈ].

In Unicode:

\word{ǉubǉana} -> ǉubǉana
\Word{ǉubǉana} -> ǈubǉana
\WORD{ǉubǉana} -> ǇUBǇANA
\word{ǈubǉana} -> ǉubǉana
\Word{ǈubǉana} -> ǈubǉana
\WORD{ǈubǉana} -> ǇUBǇANA
\word{ǇUBǇANA} -> ǉubǉana
\Word{ǇUBǇANA} -> ǈubǉana
\WORD{ǇUBǇANA} -> ǇUBǇANA

In Latin transcript (in case you have problems seeing some Unicode letters):

\word{ljubljana} -> ljubljana
\Word{ljubljana} -> Ljubljana
\WORD{ljubljana} -> LJUBLJANA
\word{Ljubljana} -> ljubljana
\Word{Ljubljana} -> Ljubljana
\WORD{Ljubljana} -> LJUBLJANA
\word{LJUBLJANA} -> ljubljana
\Word{LJUBLJANA} -> Ljubljana
\WORD{LJUBLJANA} -> LJUBLJANA

See also:

http://unicode.org/cldr/data/common/collation/hr.xml
http://en.wikipedia.org/wiki/Gaj's_Latin_alphabet
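For what it's worth, the three forms are already in the Unicode character database as separate lowercase, titlecase and uppercase mappings (U+01C9, U+01C8, U+01C7 for lj), so a pair of lccode/uccode values cannot express them. A small Python illustration of the expected behaviour follows; the function names simply mirror the ConTeXt macros and are not meant to suggest how MkIV does it.

```python
def word(s):
    """All lowercase, like \\word."""
    return s.lower()

def Word(s):
    """Titlecase the first character, lowercase the rest, like \\Word."""
    # str.title() on one character uses the Unicode titlecase mapping,
    # which differs from the uppercase mapping for the digraph letters.
    return s[:1].title() + s[1:].lower()

def WORD(s):
    """All uppercase, like \\WORD."""
    return s.upper()

# The three spellings of "ljubljana" written with the digraph characters.
lower = "\u01c9ub\u01c9ana"   # lowercase digraph U+01C9
title = "\u01c8ub\u01c9ana"   # titlecase digraph U+01C8 at the start
upper = "\u01c7UB\u01c7ANA"   # uppercase digraph U+01C7

for source in (lower, title, upper):
    print(word(source), Word(source), WORD(source))
```

The same functions give the usual results when the text is typed as separate l+j, since then only the ordinary lowercase and uppercase mappings are involved.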
{\setcharacterkerning[extrakerning]\input zapf\endgraf }
(That could be "backported" to XeTeX. I think it enables a similar feature now, but I should check.)

Mojca