[NTG-context] unic-xxx.tex glyph lists: minor bugs, questions
Philipp Reichmuth
reichmuth at web.de
Sun Nov 5 02:24:54 CET 2006
Hi,
I've been writing a script that sifts through the unic-xxx.tex files to
get a readable mapping of which Unicode characters are supported using
\Amacron-style names.
In the process I found one bug and something that might be another bug:
- the Cyrillic block (unic-004.tex) is missing an \unknownchar line for
U+04CF, so that the remaining (few) glyphs are off by one
- the Hebrew block (unic-005.tex) starts with a \numexpr line indicating
an offset of 224 = E0; however, the first character in the list is
U+05D0. So either the whole block is off by 16, starting at 0x04F0
instead of 0x0500, or the 224 should be a 208 (=D0) instead. BTW
unic-005.tex is the only file with Macintosh line endings. Are the
unic-xxx files automatically generated or maintained by hand?
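The arithmetic behind the Hebrew observation, and the line-ending check, can be sketched in a few lines of Python. This is only an illustration: the constants are the values quoted above, and the helper names (implied_first_codepoint, expected_offset, line_ending_style) are made up for this example and do not come from ConTeXt or from my script.

```python
# Values as reported above for unic-005.tex (block U+0500..U+05FF).
BLOCK_BASE = 0x0500       # start of the block the file is supposed to cover
DECLARED_OFFSET = 224     # offset found in the \numexpr line (0xE0)
FIRST_LISTED = 0x05D0     # first character that actually appears in the list


def implied_first_codepoint(block_base: int, offset: int) -> int:
    """Code point the first list entry would get with the declared offset."""
    return block_base + offset


def expected_offset(block_base: int, first_codepoint: int) -> int:
    """Offset that would make the first list entry land on first_codepoint."""
    return first_codepoint - block_base


def line_ending_style(data: bytes) -> str:
    """Classify raw file contents as 'dos' (CRLF), 'mac' (CR), 'unix' (LF) or 'none'."""
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf   # bare CR, not part of a CRLF pair
    lf = data.count(b"\n") - crlf   # bare LF, not part of a CRLF pair
    counts = {"dos": crlf, "mac": cr, "unix": lf}
    if not any(counts.values()):
        return "none"
    return max(counts, key=counts.get)


if __name__ == "__main__":
    implied = implied_first_codepoint(BLOCK_BASE, DECLARED_OFFSET)
    print("declared offset puts first glyph at U+%04X" % implied)
    print("offset needed for U+%04X: %d" % (FIRST_LISTED,
          expected_offset(BLOCK_BASE, FIRST_LISTED)))
    print("mismatch: %d positions" % (implied - FIRST_LISTED))
```

Running line_ending_style over the raw bytes of each unic-xxx.tex file (opened in binary mode) is one way to spot a file whose line endings differ from the others.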
Incidentally, it would be trivial now to put the list of ConTeXt glyphs
on the Wiki, if anyone's interested.
I wanted to use this to work towards better support for the whole range
of ConTeXt glyphs with OpenType fonts under XeTeX, by reading what
ConTeXt glyphs are available in a font and building a list of
"\catcode`ā=\active \def ā {\amacron}"-style definitions for the rest.
(Unfortunately this kind of list would be font-specific, but the generic
alternative would be a huge list of active characters with an
\ifnum\XeTeXcharglyph"....>0 macro behind it, and that would probably be
quite slow.) I wonder if there is a more intelligent way to achieve
this goal; since part of the logic for mapping code points into glyph
macros exists already, it would be easier if there were a way to reuse that.
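To make the font-specific variant concrete, here is a rough Python sketch of the generation step. It assumes two inputs that would have to come from elsewhere: a mapping from code points to ConTeXt glyph macro names (as extracted from the unic-xxx.tex files) and the set of code points the font actually covers. The function name active_char_definitions and the sample data are hypothetical, chosen only for this illustration.

```python
def active_char_definitions(glyph_macros, font_codepoints):
    """Build TeX lines that make uncovered characters active.

    glyph_macros:    dict mapping code point -> ConTeXt macro name (no backslash)
    font_codepoints: set of code points the font has a glyph for

    Characters the font already covers are left alone; every other
    character is made active and defined to expand to its glyph macro.
    """
    lines = []
    for codepoint in sorted(glyph_macros):
        if codepoint in font_codepoints:
            continue  # the font can render this one directly
        char = chr(codepoint)
        # No space between the active character and the brace, so that
        # the space does not end up in the macro's parameter text.
        lines.append("\\catcode`%s=\\active \\def %s{\\%s}"
                     % (char, char, glyph_macros[codepoint]))
    return lines


if __name__ == "__main__":
    # Hypothetical sample data: two macros, and a font that has only U+0100.
    sample_macros = {0x0100: "Amacron", 0x0101: "amacron"}
    sample_font = {0x0100}
    for line in active_char_definitions(sample_macros, sample_font):
        print(line)
```

The output would be written to a UTF-8 file and \input per font, which is exactly the drawback mentioned above: the list has to be regenerated for every font.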
The best way out would be if I could enable ConTeXt's UTF-8 regime while
running XeTeX in \XeTeXinputencoding=bytes mode, but I haven't gotten
that to work yet.
Philipp