Hi all.

I’ve just discovered a problem with \pdfglyphtounicode
when you try to map a glyph to a Plane-1 code point.

Here is a minimal working example that shows the issue.

%%%%%  cut here for test file %%%%%%
\pdfcompresslevel=0
\pdfgentounicode=1
\input glyphtounicode.tex
\pdfglyphtounicode{Z}{D835DC81}   % MATH bold-italic-Z U+1D481 (U+D835 U+DC81)

Z $Z$

\bye

%%%%%  end cut here for test file %%%%%%

Using:
 This is pdfTeX, Version 3.14159265-2.6-1.40.18 (TeX Live 2017) (preloaded format=pdftex)

Both fonts (one for the text Z, one for the math Z) get a bad entry in their /ToUnicode CMap resource, viz.

<5A> <36E537DC81>

instead of the intended:

<5A> <D835DC81>


The hex string <36E537DC81> is not just wrong; it is actually invalid for a CMap entry,
which is supposed to have a multiple of 4 hex digits, not 10 of them.
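
For anyone wanting to check the arithmetic, here is a minimal Python sketch. It derives the intended UTF-16BE hex digits for U+1D481 and tests whether a hex string consists of whole 16-bit units. The function names are made up for illustration; nothing here is part of pdfTeX.

```python
def utf16be_hex(code_point: int) -> str:
    """Return the UTF-16BE encoding of a code point as uppercase hex digits."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if code_point < 0x10000:
        # BMP code points are a single 16-bit unit.
        return f"{code_point:04X}"
    # Code points beyond the BMP (Plane 1 and up) need a surrogate pair.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return f"{high:04X}{low:04X}"


def is_well_formed_utf16be_hex(hex_digits: str) -> bool:
    """True if the string is whole 16-bit units that decode as UTF-16BE."""
    if not hex_digits or len(hex_digits) % 4 != 0:
        return False
    try:
        bytes.fromhex(hex_digits).decode("utf-16-be")
    except ValueError:  # covers bad hex digits and UnicodeDecodeError
        return False
    return True


print(utf16be_hex(0x1D481))                      # D835DC81
print(is_well_formed_utf16be_hex("D835DC81"))    # True
print(is_well_formed_utf16be_hex("36E537DC81"))  # False (10 hex digits)
```

So the value passed to \pdfglyphtounicode in the test file is the correct surrogate pair, and the string that pdfTeX writes fails the length check.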

A cut&paste of the `Z`s in the PDF output produces Chinese glyphs,
which is usually a sign that some Unicode encoding sequence has been garbled.
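
One plausible explanation for the Chinese glyphs, sketched in Python: if a viewer reads the leading bytes of the bad string as UTF-16BE and ignores the odd trailing byte, the result is two code points in the CJK Extension A block. How any particular PDF viewer actually handles the malformed entry is an assumption here, not something I have verified.

```python
bad = bytes.fromhex("36E537DC81")       # 5 bytes: not a whole number of 16-bit units
whole_units = bad[: len(bad) // 2 * 2]  # keep the first 4 bytes, drop the odd one
text = whole_units.decode("utf-16-be")

code_points = [f"U+{ord(ch):04X}" for ch in text]
print(code_points)  # ['U+36E5', 'U+37DC']

# CJK Unified Ideographs Extension A spans U+3400..U+4DBF.
in_cjk_ext_a = all(0x3400 <= ord(ch) <= 0x4DBF for ch in text)
print(in_cjk_ext_a)  # True
```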



Of course I don’t really want to map all `Z`s into Plane-1.
This is just an easy way to illustrate the problem that I discovered
when trying to support proper Cut/Paste of exotic characters in
the LinLibertine and LinBiolinum fonts.



Cheers

Ross


Dr Ross Moore
Mathematics Dept | 12 Wally’s Walk, 734
Macquarie University, NSW 2109, Australia
T: +61 2 9850 8955  |  F: +61 2 9850 8114
M:+61 407 288 255  |  E: ross.moore@mq.edu.au

http://www.maths.mq.edu.au




