Mojca Miklavec wrote:
Hello Hans,
On 8/24/07, Hans Hagen wrote:
Hi,
I uploaded a new version of mkiv (regular zip).
Thanks a lot!
- case changing using attributes and node processing
simple test file for spacing and casing:
I'm attaching a not-so-trivial test file for "casing", just to see how well it works for Croatian.
A few observations:
- LM doesn't have any lj, nj, dz, dž, ... (probably another request for the Polish guys)
hm, just write a small proposal ... however, non-present chars have to be dealt with anyway
- It would be great if MK IV did the transformation from digraphs to normal letters in case those digraphs are not present in the font itself (for ij, lj, nj, dz, dž, ... just as it would be great if ccaron was automatically composed out of c and caron if the letter wasn't present in that font).
\definefontfeature [test][mode=node,language=dflt,script=latn,complement=yes]
{\font\test = lmtypewriter8-regular*test at 12.3pt \test ljubljana Ljubljana LJUBLJANA }
currently the complement only replaces LATIN/compat combinations (see char-def.lua)
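The fallback requested above can be illustrated outside TeX. The following is only a sketch in Python of the general idea, not what MkIV or char-def.lua actually does: `complement` and `available` are hypothetical names, and `available` stands in for the set of characters a font provides. A missing character is replaced by its Unicode compatibility form (NFKC) if the font has all the parts, and otherwise by its full decomposition (NFKD).

```python
import unicodedata

def complement(text, available):
    """Replace characters missing from `available` by a Unicode
    (compatibility) decomposition, if all of its parts are available.

    `available` is a hypothetical stand-in for the font's character set.
    """
    out = []
    for ch in text:
        if ch in available:
            out.append(ch)
            continue
        # NFKC first (U+01C9 -> "lj", U+01C6 -> "d" + U+017E),
        # then NFKD (U+010D -> "c" + combining caron U+030C).
        for form in ("NFKC", "NFKD"):
            candidate = unicodedata.normalize(form, ch)
            if candidate != ch and all(c in available for c in candidate):
                out.append(candidate)
                break
        else:
            # No usable replacement: keep the character as it is.
            out.append(ch)
    return "".join(out)

# A pretend font with ASCII letters and a combining caron only.
ascii_letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
font = set(ascii_letters) | {"\u030c"}
print(complement("\u01c9ub\u01c9ana", font))
```

Whether a real implementation should emit a combining caron or build a virtual glyph is a separate design question; the sketch only shows where the replacement data can come from.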
Visually there is probably no difference in plain text, except in exactly the cases for which you're sending the tests (that's casing and spacing). See http://en.wikipedia.org/wiki/Gaj's_Latin_alphabet how the word "MJENJACNICA" is split into letters. Normal people still type n+j in text, not the digraph "ǌ" (nj), but in case you get some text with those digraphs which are valid Unicode letters, it would be nice if they were processed ...
dealing with n+j in text is too dangerous to catch, unless we start implementing complex language dependent replacements, and even then it's messy (what to do when one really wants a nj (two char)) ... so, those old docs can best be converted to proper utf then
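A one-off conversion of old documents, done outside TeX, could look roughly like the Python sketch below. It is an illustration only: `to_digraphs` and its `keep` parameter are hypothetical names, and the exception word used in the example ("injekcija") is just an assumed case where n and j are meant as two separate letters. The need for such an exception list is exactly the messy part mentioned above.

```python
import re

# Two-letter sequences and the Unicode digraph characters they map to.
DIGRAPHS = {
    "D\u017d": "\u01c4", "D\u017e": "\u01c5", "d\u017e": "\u01c6",  # DŽ Dž dž
    "LJ": "\u01c7", "Lj": "\u01c8", "lj": "\u01c9",
    "NJ": "\u01ca", "Nj": "\u01cb", "nj": "\u01cc",
}
PATTERN = re.compile("|".join(DIGRAPHS))

def to_digraphs(text, keep=frozenset()):
    """Replace letter pairs by digraph characters, word by word.

    Words whose lowercase form is in `keep` (a hypothetical exception
    list supplied by the user) are left untouched.
    """
    def convert(word_match):
        word = word_match.group(0)
        if word.lower() in keep:
            return word
        return PATTERN.sub(lambda m: DIGRAPHS[m.group(0)], word)
    return re.sub(r"\w+", convert, text)

print(to_digraphs("Ljubljana injekcija", keep={"injekcija"}))
```

Without the exception list every n+j pair is converted, which is why this belongs in a supervised conversion step rather than in the typesetting run.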
\starttext
test: oeps {\setcharacterspacing[frenchpunctuation] x: xx \bfd x: xx} oeps: test
test \WORD{test TEST \TeX} test
test \word{test TEST \TeX} test
test \Word{test TEST \TeX} test
Another few observations:
- \word doesn't work in XeTeX
no, neither in pdftex i think; new
- What exactly is \Words supposed to do (with non-first letters in a word)?
make first chars uppercase but only when the next is a char; (i changed it a bit, defs were not seen (overloaded later by macros))
- ConTeXt with XeTeX outputs dozens of empty lines to the console.
indeed, has to do with the fact that i need to test if a font is present on the system (file vs name stuff) and the empty lines are a side effect of entering/exiting batchmode
An extra challenge would be to get this to work (but unless some Croats ask you for that or unless you have too much time left, don't bother about that - it needs slightly more than only lccode and uccode of a letter since there are three forms: one for lowercase [ljubljana -> lj], one for all-uppercase words [LJUBLJANA -> LJ] and one for the first letter of a word starting with an uppercase [Ljubljana -> Lj]):
In Unicode:
\word{ǉubǉana} -> ǉubǉana \Word{ǉubǉana} -> ǈubǉana \WORD{ǉubǉana} -> ǇUBǇANA
\word{ǈubǉana} -> ǉubǉana \Word{ǈubǉana} -> ǈubǉana \WORD{ǈubǉana} -> ǇUBǇANA
\word{ǇUBǇANA} -> ǉubǉana \Word{ǇUBǇANA} -> ǈubǉana \WORD{ǇUBǇANA} -> ǇUBǇANA
as long as we have utf it's already taken care of
In Latin transcript (in case you have problems seeing some Unicode letters):
\word{ljubljana} -> ljubljana \Word{ljubljana} -> Ljubljana \WORD{ljubljana} -> LJUBLJANA
\word{Ljubljana} -> ljubljana \Word{Ljubljana} -> Ljubljana \WORD{Ljubljana} -> LJUBLJANA
\word{LJUBLJANA} -> ljubljana \Word{LJUBLJANA} -> Ljubljana \WORD{LJUBLJANA} -> LJUBLJANA
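For the Unicode digraph characters, the three forms are part of the Unicode character data: each digraph has a lowercase, a titlecase and an uppercase code point (U+01C9, U+01C8 and U+01C7 for lj). As an illustration outside TeX, the Python sketch below reproduces the expected results of the Unicode table with the built-in string methods. The helper names `word`, `Word` and `WORD` only mimic the macros and are not ConTeXt code.

```python
# lj digraph: lowercase U+01C9, titlecase U+01C8, uppercase U+01C7
LOWER = "\u01c9ub\u01c9ana"   # "ljubljana" written with digraphs
TITLE = "\u01c8ub\u01c9ana"   # "Ljubljana"
UPPER = "\u01c7UB\u01c7ANA"   # "LJUBLJANA"

def word(s):
    """All lowercase, like \\word in the table above."""
    return s.lower()

def Word(s):
    """Titlecase the first letter of each word, lowercase the rest."""
    return s.title()

def WORD(s):
    """All uppercase, like \\WORD in the table above."""
    return s.upper()

for source in (LOWER, TITLE, UPPER):
    print(word(source), Word(source), WORD(source))
```

`str.title()` uses the titlecase mapping rather than the uppercase one, which is what makes the middle column come out with U+01C8 and not U+01C7.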
See also:
http://unicode.org/cldr/data/common/collation/hr.xml
http://en.wikipedia.org/wiki/Gaj's_Latin_alphabet
{\setcharacterkerning[extrakerning]\input zapf\endgraf }
(That could be "backported" to XeTeX. I think it enables a similar feature now, but I should check.)
hm, i'm not going to backport everything; keep in mind that these features are not font related; actually future mkiv versions will also do dynamic feature change so ... anyhow, ... new upload to play with

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74
www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------