[NTG-pdftex] Error with \pdfglyphtounicode when surrogates are involved.

Ross Moore ross.moore at mq.edu.au
Tue May 30 08:03:04 CEST 2017


Hi all.

I’ve just discovered a problem with \pdfglyphtounicode
when trying to map a character to a Plane-1 code point.

Here is a minimal working example that shows the issue.

%%%%%  cut here for test file %%%%%%
\pdfcompresslevel=0
\pdfgentounicode=1
\input glyphtounicode.tex
\pdfglyphtounicode{Z}{D835DC81}   % MATH bold-italic-Z U+1D481 (U+D835 U+DC81)

Z $Z$

\bye

%%%%%  end cut here for test file %%%%%%
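
As a cross-check on the comment in the test file, here is a small Python sketch (not part of the original test) that derives the UTF-16 surrogate pair for U+1D481 using only the standard library. The variable names are just for illustration.

```python
import unicodedata

code_point = 0x1D481

# Encode the single character as big-endian UTF-16 and show the hex digits.
utf16_hex = chr(code_point).encode("utf-16-be").hex().upper()
high, low = utf16_hex[:4], utf16_hex[4:]

print(unicodedata.name(chr(code_point)))      # MATHEMATICAL BOLD ITALIC CAPITAL Z
print(f"U+{code_point:X} -> {high} {low}")    # U+1D481 -> D835 DC81
```

So D835DC81 is the string one would expect to see as the CMap value for this character.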

Using:
 This is pdfTeX, Version 3.14159265-2.6-1.40.18 (TeX Live 2017) (preloaded format=pdftex)

Both fonts get a bad entry in their /ToUnicode CMap resource,
viz.

<5A> <36E537DC81>

instead of the intended:

<5A> <D835DC81>


The hex string <36E537DC81> is not just wrong; it is actually invalid for a CMap entry,
which is supposed to have a multiple of 4 hex digits, not 10 of them.
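
One guess at where the 10-digit string could come from (an assumption on my part, not checked against the pdfTeX source): if the 8 hex digits D835DC81 are read as a single number instead of two 16-bit units, and that number is then pushed through the usual surrogate-pair arithmetic with no range check, the result is exactly the string seen in the CMap. The function name below is hypothetical and is only a Python sketch of that arithmetic.

```python
def naive_utf16_hex(value: int) -> str:
    """Surrogate-pair arithmetic with no check that value <= 0x10FFFF (hypothetical)."""
    if value <= 0xFFFF:
        return f"{value:04X}"
    offset = value - 0x10000
    high = 0xD800 + (offset >> 10)     # overflows 16 bits when value is too large
    low = 0xDC00 + (offset & 0x3FF)
    return f"{high:04X}{low:04X}"


# Treat the whole argument as one number, as the guess above supposes.
as_one_number = int("D835DC81", 16)
bad = naive_utf16_hex(as_one_number)

print(bad)                  # 36E537DC81
print(len(bad) % 4 == 0)    # False: not a whole number of 4-digit units
```

The same arithmetic applied to 0x1D481 gives D835DC81, the intended value.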

Copying and pasting the `Z`s from the PDF output produces Chinese glyphs,
which is usually a sign that some UTF-8 sequence has got screwed up.
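
The Chinese glyphs are consistent with reading the bad hex string as big-endian UTF-16, which is how ToUnicode values are interpreted. The Python sketch below decodes the complete 2-byte units and drops the stray fifth byte; how any particular PDF viewer actually treats that leftover byte is an assumption here.

```python
bad = bytes.fromhex("36E537DC81")     # 5 bytes, so the last one is left over

# Keep only the complete 2-byte UTF-16 code units.
usable = bad[: len(bad) // 2 * 2]
text = usable.decode("utf-16-be")

for ch in text:
    # CJK Unified Ideographs Extension A covers U+3400..U+4DBF.
    in_ext_a = 0x3400 <= ord(ch) <= 0x4DBF
    print(f"U+{ord(ch):04X} CJK Extension A: {in_ext_a}")
```

Both decoded code points, U+36E5 and U+37DC, fall in the CJK Extension A block.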



Of course I don’t really want to map all `Z`s into Plane-1.
This is just an easy way to illustrate the problem that I discovered
when trying to support proper Cut/Paste of exotic characters in
the LinLibertine and LinBiolinum fonts.



Cheers

Ross


Dr Ross Moore
Mathematics Dept | 12 Wally’s Walk, 734
Macquarie University, NSW 2109, Australia
T: +61 2 9850 8955  |  F: +61 2 9850 8114
M: +61 407 288 255  |  E: ross.moore at mq.edu.au

http://www.maths.mq.edu.au

