Patches item #580, was opened at 2006-07-14 20:57
You can respond by visiting:
http://sarovar.org/tracker/?func=detail&atid=495&aid=580&group_id=106

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: The Thanh Han (hanthethanh)
Assigned to: Nobody (None)
Summary: Patch to make ToUnicode for Type1 fonts

Initial Comment:
This is a patch to pdftex so that it can create ToUnicode entries for
Type1 fonts. The main purpose is to make ligatures and some other glyphs,
like smallcap letters or oldstyle digits from OpenType fonts, searchable.

This patch also contains a minor fix that allows use of fonts without
embedding, for example MinionPro or MyriadPro (which are distributed with
Acrobat Reader >= 7.0, but whose use is restricted to Acrobat Reader only).

How to apply:
~~~~~~~~~~~~~
- this patch applies to the pristine pdftex-1.40.0-beta-20060213 sources
  only; if you have applied any other patch(es) to the sources, please
  discard them and start from fresh sources.

- how to apply:
,--------
| cd /path/to/pdftex-1.40.0-beta-20060213/src
| cat /path/to/the/patch | patch -p1
| ./configure
| cd texk/web2c
| make pdfetex
`--------

If you want to be careful, try the patch with the option '--dry-run'
first to see whether the patch can be applied without problems.

Usage:
~~~~~~
add the following lines into your document, somewhere at the beginning:
,--------
| \input glyphtounicode.tex
| \pdfgentounicode=1
`--------

Customization:
~~~~~~~~~~~~~~
If pdftex cannot generate the right ToUnicode value for some glyphs
(probably because the glyph name is not ``known'' to pdftex), it's
possible to add further entries so pdftex can learn how to generate
unicode for such ``unknown'' glyphs.
The syntax is simple:

    \pdfglyphtounicode{<glyph-name>}{<unicode-value>}

Example:

    \pdfglyphtounicode{A}{0041}

says that glyph 'A' has the unicode value U+0041.

The entries in glyphtounicode.tex cover the Adobe Glyph List (glyphlist.txt
version 2.0) and some additional glyphs (texglyphlist.txt version 2.33,
coming from lcdf-typetools), plus some additional entries for ligatures.

If some glyph name cannot be found, pdftex does some simple name
translations:

- remove any ".xxx" suffix from the glyph name, where "xxx" is a string
  consisting of alphabetic characters. For example "A.sc" => "A"

- remove a suffix like "small", "oldstyle", "inferior" or "superior" from
  the glyph name. For example "Asmall" => "A"

The resulting name is then looked up again to find a unicode value.

Ligatures require a special form of ToUnicode. Example:

    \pdfglyphtounicode{ff}{00660066}

here '0066' is the unicode string for 'f'. Some ligatures have names like
'f_f_i'; in such a case the command should be

    \pdfglyphtounicode{f_f_i}{006600660069}

i.e. '_' is removed from the glyph name, and then all letters are
translated to their unicode strings.
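
For readers who want to check an entry before adding it, the name
translation and the ligature rule described above can be sketched in a
few lines of Python. This is only an illustration, not the code in the
patch: the table GLYPH_TO_UNICODE is a made-up miniature stand-in for
glyphtounicode.tex, the helper names lookup() and ligature_entry() are
hypothetical, and the suffix list contains only the four suffixes named
above.

```python
import re

# Hypothetical miniature table; the real entries live in glyphtounicode.tex.
GLYPH_TO_UNICODE = {
    "A": "0041",
    "f": "0066",
    "i": "0069",
    "one": "0031",
}

# Only the suffixes named in the description; an assumption, the real
# list in pdftex may differ.
NAME_SUFFIXES = ("small", "oldstyle", "inferior", "superior")


def lookup(name, table=GLYPH_TO_UNICODE):
    """Return the hex ToUnicode string for a glyph name, or None."""
    if name in table:
        return table[name]
    # Step 1: drop an alphabetic ".xxx" suffix, e.g. "A.sc" -> "A".
    base = re.sub(r"\.[A-Za-z]+$", "", name)
    # Step 2: drop one descriptive suffix, e.g. "Asmall" -> "A".
    for suffix in NAME_SUFFIXES:
        if base.endswith(suffix) and len(base) > len(suffix):
            base = base[: -len(suffix)]
            break
    # Look the translated name up again.
    return table.get(base)


def ligature_entry(name):
    """Build a \\pdfglyphtounicode line for a ligature such as 'f_f_i'."""
    letters = name.replace("_", "")  # '_' is removed from the glyph name
    value = "".join(f"{ord(ch):04X}" for ch in letters)  # 4 hex digits each
    return "\\pdfglyphtounicode{%s}{%s}" % (name, value)


if __name__ == "__main__":
    print(lookup("A.sc"))
    print(lookup("Asmall"))
    print(ligature_entry("f_f_i"))
```

The output of ligature_entry() can be pasted into the document after
\input glyphtounicode.tex. The sketch assumes every letter of the
ligature name is a plain character whose code fits in four hex digits.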