[NTG-pdftex] Error with \pdfglyphtounicode when surrogates are involved.

Akira Kakuto kakuto at fuk.kindai.ac.jp
Wed May 31 06:59:18 CEST 2017


Hi Karl,

> I realize you're reporting a separate bug, that the value gets
> misinterpreted

>> \pdfglyphtounicode{Z}{D835DC81}
>>
>> <5A> <36E537DC81>

I confirmed that Ross's
\pdfglyphtounicode{Z}{D835 DC81}
with a space works ok.

In the case of
\pdfglyphtounicode{Z}{D835DC81}
without a space, I hit an assertion failure: in my build
sizeof(long) = 4, so long code = 0XD835DC81 is negative and
assert(code >= 0) fails.

Ross got the erroneous values vh = 0X36E537, vl = 0XDC81
because, with sizeof(long) = 8, long code = 0XD835DC81 is positive,
so it passes the assertion and goes through the surrogate arithmetic.
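
To make the 8-byte case concrete, here is a minimal sketch of the same arithmetic on a fixed 64-bit type. The helper name split_unchecked is hypothetical and is not part of tounicode.c; it only mirrors the else branch of utf16be_str.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper for illustration; not part of tounicode.c.
   Applies the surrogate arithmetic of utf16be_str to a 64-bit value
   with no upper-bound check, as happens when sizeof(long) = 8. */
static void split_unchecked(int64_t code, unsigned *vh, unsigned *vl)
{
    int64_t v;

    assert(code >= 0x10000);    /* only the surrogate branch is modelled */

    v = code - 0x10000;
    *vh = (unsigned) (v / 0x400 + 0xD800);
    *vl = (unsigned) (v % 0x400 + 0xDC00);
}
```

For code = 0XD835DC81 this gives v = 0XD834DC81 and v / 0x400 = 0X360D37, so vh = 0X36E537 and vl = 0X81 + 0XDC00 = 0XDC81, which is the <36E537DC81> that Ross saw.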

Would assert(code >= 0 && code <= 0X10FFFF) be acceptable or not?


(from tounicode.c)
static char *utf16be_str(long code)
{
    static char buf[SMALL_BUF_SIZE];
    long v;
    unsigned vh, vl;

    assert(code >= 0);

    if (code <= 0xFFFF)
        sprintf(buf, "%04lX", code);
    else {
        v = code - 0x10000;
        vh = v / 0x400 + 0xD800;
        vl = v % 0x400 + 0xDC00;
        sprintf(buf, "%04X%04X", vh, vl);
    }
    return buf;
}
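
As one possible alternative to an assert, the range test could be returned as an error so that the caller can report the bad value. The following is only a sketch under that assumption: the name utf16be_checked, the caller-supplied buffer, and the -1 return value are all hypothetical and do not come from the pdfTeX sources.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical variant for illustration; not the pdfTeX code.
   Writes the UTF-16BE hex string for code into buf and returns 0,
   or returns -1 if code lies outside 0..0x10FFFF. */
static int utf16be_checked(long long code, char *buf, size_t size)
{
    assert(buf != NULL && size >= 9);   /* 8 hex digits plus NUL */

    if (code < 0 || code > 0x10FFFF)
        return -1;                      /* e.g. 0xD835DC81 is rejected */

    if (code <= 0xFFFF)
        snprintf(buf, size, "%04llX", (unsigned long long) code);
    else {
        long long v = code - 0x10000;
        unsigned vh = (unsigned) (v / 0x400 + 0xD800);
        unsigned vl = (unsigned) (v % 0x400 + 0xDC00);
        snprintf(buf, size, "%04X%04X", vh, vl);
    }
    return 0;
}
```

With this sketch, 0x1D481 yields "D835DC81" (the pair Ross intended), while 0xD835DC81 is rejected regardless of the width of long, because the argument type is at least 64 bits.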

Best,
Akira


