[NTG-context] accessing glyphs in the private area

Hans Hagen j.hagen at xs4all.nl
Tue Oct 2 11:29:46 CEST 2018


On 10/2/2018 9:29 AM, Ulrike Fischer wrote:
> Am Tue, 2 Oct 2018 06:55:02 +0200 schrieb luigi scarso:
> 
>>> For what do you reserve the space in the PUA?
> 
>>   http://www.pragma-ade.nl/general/manuals/fonts-mkiv.pdf
>> page 32 of the document :
>   
>> As we already mentioned in a previous chapter, in ConTeXt we use
>> Unicode internally. This also means that fonts are organized this
>> way. By default the glyph representation of a Unicode character
>> sits in the same slot in the glyph table. All additional glyphs,
>> like ligatures or alternates are pushed in the private unicode
>> space. This is why in the lists shown in the figures the
>> ligatures have a private Unicode number.
> 
> Hm. To clarify. In xetex there is a clear distinction between the slot
> and unicode. \XeTeXglyph (slot) and \char (unicode) give different
> output and \char actively uses the tounicode mapping of the font.
> 
> \font\test="[lmroman10-regular.otf]"
> \test
> \XeTeXglyph"7A
> \char"7A
> \bye
>  
> In luatex \char and \Uchar don't really care about unicode, even if
> the font has tounicode=1 and tounicode entries, they access the char
> by the hashed integer number.

they access the char in the characters table (where each character has
an index field, so one can write a simple function that accesses it by
index; also, i assume that in xetex \char gives the character as known
to tex, so if one inputs non-unicode one gets that)

> So to get "unicode" the font loader has to sort the glyphs, index
> unicode glyphs by their unicode code point, and assign "non-unicode"
> glyphs numbers that don't interfere.
> 
> Did I get that right?

indeed, and we use the private space for those with no unicode (which 
can be a lot, also think for instance of the snippets that make up math 
extensibles)
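
The renumbering described above can be sketched roughly as follows. This
is only an illustration in Python, not the actual ConTeXt font loader:
the function name assign_slots, the (name, code point) input format and
the choice to skip private slots the font already occupies are all
assumptions made for the example.

```python
PUA_START = 0xE000  # first code point of the BMP Private Use Area


def assign_slots(glyphs, private_start=PUA_START):
    """Map glyphs to character slots (hypothetical helper).

    glyphs: list of (name, code point or None) pairs in font order.
    Glyphs with a code point keep it; all others get the next private
    slot that is not already taken.
    """
    characters = {}
    unmapped = []
    for name, code in glyphs:
        if code is not None and code not in characters:
            characters[code] = name
        else:
            # no unicode, or a second glyph claiming the same code point
            unmapped.append(name)

    slot = private_start
    for name in unmapped:
        while slot in characters:  # do not interfere with existing entries
            slot += 1
        characters[slot] = name
        slot += 1
    return characters


table = assign_slots([("z", 0x7A), ("f_i", None), ("z.alt", None)])
for slot in sorted(table):
    print(f"U+{slot:04X} {table[slot]}")
```

Here the ligature and the alternate end up in private slots while "z"
stays at its own code point.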

> Then I do understand that you need some free numbers to push
> glyphs. But I do not understand why to achieve this you remove
> glyphs from their unicode points. The PUA is not some non-unicode
> wilderness. The code points there are as valid as in the other code
> blocks. You wouldn't move away the greek block to get the place, so
> why do you think it is okay to throw out of the PUA block what SIL
> and other font designers encoded there?  Can't you check for a free
> range instead?

sure, but then i also lose some functionality in context (unless i go
for ugly solutions) ... as all glyphs are supposed to have a name, access
by name is a pretty good alternative

the main issue is that there are fonts that use private > 0xFFFF space,
which then would mean a lot of extra memory for names ... so the question
is: are there fonts that use that range
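
Both questions (does a font use the private space above 0xFFFF, and is
there a free range to push glyphs into) can be checked mechanically once
the code points of a font are known. A minimal sketch in Python, assuming
the code points have already been read from the font's cmap by some other
means; the function names are hypothetical, the ranges are the three
Private Use Areas defined by Unicode.

```python
BMP_PUA = range(0xE000, 0xF8FF + 1)          # Private Use Area
PLANE15_PUA = range(0xF0000, 0xFFFFD + 1)    # Supplementary PUA-A
PLANE16_PUA = range(0x100000, 0x10FFFD + 1)  # Supplementary PUA-B


def uses_supplementary_pua(codepoints):
    """True if any code point lies in the private space above 0xFFFF."""
    return any(cp in PLANE15_PUA or cp in PLANE16_PUA for cp in codepoints)


def first_free_run(codepoints, length, block=BMP_PUA):
    """Return the start of the first run of `length` unused slots in
    `block`, or None if the block has no such run."""
    used = set(codepoints)
    run_start = None
    for cp in block:
        if cp in used:
            run_start = None
            continue
        if run_start is None:
            run_start = cp
        if cp - run_start + 1 == length:
            return run_start
    return None


# Made-up example: a font that encodes two glyphs at the start of the PUA.
font_codepoints = [0x41, 0x7A, 0xE000, 0xE001]
print(uses_supplementary_pua(font_codepoints))
print(first_free_run(font_codepoints, 256))
```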

Hans


-- 

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
        tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------

