Hi Akira,

On May 31, 2017, at 2:59 PM, Akira Kakuto <kakuto@fuk.kindai.ac.jp> wrote:

Hi Karl,

I realize you're reporting a separate bug, that the value gets misinterpreted:

\pdfglyphtounicode{Z}{D835DC81}

<5A> <36E537DC81>

I confirmed that Ross's
\pdfglyphtounicode{Z}{D835 DC81}
with a space works ok.

In the case of
\pdfglyphtounicode{Z}{D835DC81}
I encountered an assertion error because
long code = 0XD835DC81 < 0 in my case, where
sizeof(long) = 4.

Ross obtained erroneously vh = 0X36E537, vl = 0XDC81
because long code = 0XD835DC81 > 0, if sizeof(long) = 8.
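
To make the two outcomes concrete, here is a small standalone sketch that repeats the utf16be_str arithmetic with the width of long made explicit. It is not pdfTeX code: the helper names utf16be_str64 and bits_as_long32 are made up for illustration, and it assumes the eight hex digits end up in the long bit for bit.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: the utf16be_str arithmetic with a 64-bit "long". */
static const char *utf16be_str64(int64_t code)
{
    static char buf[32];

    assert(code >= 0);
    if (code <= 0xFFFF) {
        snprintf(buf, sizeof buf, "%04" PRIX64, (uint64_t) code);
    } else {
        int64_t v = code - 0x10000;
        unsigned vh = (unsigned) (v / 0x400 + 0xD800);
        unsigned vl = (unsigned) (v % 0x400 + 0xDC00);
        snprintf(buf, sizeof buf, "%04X%04X", vh, vl);
    }
    return buf;
}

/* Hypothetical helper: reinterpret 32 bits as a 32-bit signed "long". */
static int32_t bits_as_long32(uint32_t bits)
{
    int32_t code;

    memcpy(&code, &bits, sizeof code);
    return code;
}
```

With these, utf16be_str64(0xD835DC81) returns "36E537DC81", which matches the <5A> <36E537DC81> entry above, while bits_as_long32(0xD835DC81) is negative and would trip assert(code >= 0). The code point that was presumably intended, 0x1D481, gives "D835DC81".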

Is assert(code >= 0 && code <= 0X10FFFF) OK or not OK?

I'm thinking this is OK, *provided* spaces are used to separate the codes,
when multiple glyphs are required.

Otherwise there should be just a single Unicode code point, and the
allowable range for this is <= 0X10FFFF.
Indeed the top end of this (0X100000 upwards) is for “Private Use” only.
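
As a rough sketch of what such a range check could look like for the single-value case: the function names below are hypothetical and this is not the existing check_unicode_value. Rejecting lone surrogate halves (D800 to DFFF) is an extra assumption of mine, on the grounds that they only make sense in pairs.

```c
#include <assert.h>

/* Hypothetical check: returns 1 when code can stand alone as a single
   Unicode code point, 0 otherwise. */
static int is_single_code_point(long long code)
{
    if (code < 0 || code > 0x10FFFF)
        return 0;               /* outside the Unicode range */
    if (code >= 0xD800 && code <= 0xDFFF)
        return 0;               /* assumption: reject lone surrogate halves */
    return 1;
}

/* Hypothetical stand-in for the assert under discussion. */
static void require_single_code_point(long long code)
{
    assert(is_single_code_point(code));
}
```

Under this check 0x1D481 passes, whereas 0xD835DC81 read as one number fails on either long size, so the unspaced form would be rejected consistently instead of behaving differently on 32-bit and 64-bit builds.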


My understanding of these pieces of code:

    for (i = 0; i < l; i++) {
        if (p[i] == ' ')
            valid_unistr = 2;   /* if a space occurs we treat this entry as a string */


    if (valid_unistr == 2) {    /* a string with space(s) */
        /* copy p to buf2, ignoring spaces */
        for (q = buf2; *p != 0; p++)
            if (*p != ' ')
                *q++ = *p;
        *q = 0;
        gu->code = UNI_STRING;
        gu->unicode_seq = xstrdup(buf2);

  … is that blocks of 4-6 hex digits are just copied verbatim, without calling check_unicode_value,
so that the assert is never actually encountered.
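
For illustration, here is a self-contained version of that copy loop. The function copy_without_spaces and its buffer-size parameter are hypothetical, not taken from tounicode.c; it only shows that the hex digits pass through untouched and no numeric value is ever computed on this path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical standalone version of the "copy p to buf2, ignoring spaces"
   loop. Writes at most outsize - 1 characters plus a terminating 0.
   Returns 1 if p contained a space (the valid_unistr == 2 case), else 0. */
static int copy_without_spaces(const char *p, char *out, size_t outsize)
{
    int saw_space = 0;
    size_t n = 0;

    assert(outsize > 0);
    for (; *p != 0; p++) {
        if (*p == ' ')
            saw_space = 1;      /* entry is treated as a string */
        else if (n + 1 < outsize)
            out[n++] = *p;      /* digits are copied verbatim */
    }
    out[n] = 0;
    return saw_space;
}
```

Given "D835 DC81" this returns 1 and leaves "D835DC81" in the output buffer, which is the string that would be stored in unicode_seq. Given "D835DC81" it returns 0, which is the case that goes through the numeric path instead.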

Do you agree with this interpretation?



(from tounicode.c)
static char *utf16be_str(long code)
{
  static char buf[SMALL_BUF_SIZE];
  long v;
  unsigned vh, vl;

  assert(code >= 0);

  if (code <= 0xFFFF)
      sprintf(buf, "%04lX", code);
  else {
      v = code - 0x10000;
      vh = v / 0x400 + 0xD800;
      vl = v % 0x400 + 0xDC00;
      sprintf(buf, "%04X%04X", vh, vl);
  }
  return buf;
}

Best,
Akira


Cheers,

Ross


Dr Ross Moore
Mathematics Dept | 12 Wally’s Walk, 734
Macquarie University, NSW 2109, Australia
T: +61 2 9850 8955  |  F: +61 2 9850 8114
M:+61 407 288 255  |  E: ross.moore@mq.edu.au

http://www.maths.mq.edu.au




