[NTG-pdftex] [ pdftex-Patches-580 ] Patch to make ToUnicode for Type1 fonts

noreply at sarovar.org noreply at sarovar.org
Sun Jul 16 13:19:06 CEST 2006

Patches item #580, was opened at 2006-07-14 20:57
You can respond by visiting: 

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: The Thanh Han (hanthethanh)
Assigned to: Nobody (None)
Summary: Patch to make ToUnicode for Type1 fonts

Initial Comment:
This is a patch to pdftex so that it can create ToUnicode entries for
Type1 fonts. The main purpose is to make ligatures and some other
glyphs, like small-cap letters or oldstyle digits from OpenType fonts,
searchable. This patch also contains a minor fix that allows the use of
fonts without embedding, for example MinionPro or MyriadPro (which are
distributed with Acrobat Reader >= 7.0 but whose use is restricted to
Acrobat Reader only).

How to apply:
- this patch applies to the pristine pdftex-1.40.0-beta-20060213
  sources only; if you have applied any other patches to the sources,
  please discard them and start from fresh sources.

- run the following commands:

| cd /path/to/pdftex-1.40.0-beta-20060213/src
| cat /path/to/the/patch | patch -p1
| ./configure
| cd texk/web2c
| make pdfetex

If you want to be careful, try the patch with the option '--dry-run'
first to see whether it can be applied without problems.

Add the following lines to your document, somewhere near the beginning:

| \input glyphtounicode.tex
| \pdfgentounicode=1

If pdftex cannot generate the right ToUnicode value for some glyphs
(probably because the glyph name is not ``known'' to pdftex), it is
possible to add further entries so that pdftex learns how to generate
Unicode for such ``unknown'' glyphs.

The syntax is simple:




says that glyph 'A' has its unicode U+0041

The entries in glyphtounicode.tex cover the Adobe Glyph List
(glyphlist.txt version 2.0) and some additional glyphs
(texglyphlist.txt version 2.33, coming from lcdf-typetools), plus some
additional entries for

If a glyph name cannot be found, pdftex applies some simple name
transformations:

- remove any ".xxx" suffix from the glyph name, where "xxx" is a string
  consisting of alphabetic characters. For example "A.sc" => "A"

- remove a suffix like "small", "oldstyle", "inferior" or "superior"
  from the glyph name. For example "Asmall" => "A"

The resulting name is then looked up again to find a Unicode value.
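
The fallback described above can be sketched in a few lines of Python.
This is only an illustration of the two rules as stated in this message,
not the code in the patch: the function names, the order in which the
rules are tried, and the dictionary standing in for the entries of
glyphtounicode.tex are all assumptions.

```python
import re

# Suffixes named in the message; how pdftex matches them is an assumption.
STYLE_SUFFIXES = ("small", "oldstyle", "inferior", "superior")


def normalize_glyph_name(name):
    """Return a simplified glyph name to retry the lookup with."""
    # Rule 1: drop a trailing ".xxx" where xxx is purely alphabetic.
    name = re.sub(r"\.[A-Za-z]+$", "", name)
    # Rule 2: drop one known style suffix, keeping at least one character.
    for suffix in STYLE_SUFFIXES:
        if name.endswith(suffix) and len(name) > len(suffix):
            return name[: -len(suffix)]
    return name


def lookup_unicode(name, table):
    """Look up name in table; on a miss, retry with the normalized name."""
    if name in table:
        return table[name]
    return table.get(normalize_glyph_name(name))


# Hypothetical one-entry table, standing in for glyphtounicode.tex.
table = {"A": "0041"}
print(lookup_unicode("A.sc", table))
print(lookup_unicode("Asmall", table))
```

With the one-entry table shown, both "A.sc" and "Asmall" fall back to
the entry for "A".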

Ligatures require a special form of ToUnicode. Example:


here '0066' is the Unicode string for 'f'. Some ligatures have names
like 'f_f_i'; in such a case the command should be


i.e. '_' is removed from the glyph name, and then all letters are
translated to their Unicode strings.
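
As a rough illustration of that rule, the Python sketch below strips
the underscores from a ligature glyph name and translates each
remaining letter to a four-digit hexadecimal code. The function name is
hypothetical, and the sketch only computes the per-letter codes; it
does not show how pdftex itself stores or emits them.

```python
def ligature_code_points(glyph_name):
    """Translate a ligature glyph name into per-letter Unicode hex strings."""
    # Remove '_' separators, so "f_f_i" is treated like "ffi".
    letters = glyph_name.replace("_", "")
    # Each letter becomes a four-digit uppercase hex code, e.g. 'f' -> "0066".
    return [format(ord(ch), "04X") for ch in letters]


print(ligature_code_points("ff"))
print(ligature_code_points("f_f_i"))
```

For 'f_f_i' this yields the codes for 'f', 'f' and 'i' in order, that
is 0066, 0066 and 0069.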


>Comment By: The Thanh Han (hanthethanh)
Date: 2006-07-16 11:19

Logged In: YES 

Patch updated with a bug fix from Akira.

