[Dev-luatex] Reading tounicode from shared fonts

Khaled Hosny dr.khaled.hosny at gmail.com
Wed Nov 14 18:23:14 CET 2018


While debugging an issue with PDF /ToUnicode, I found that the code for
reading character-level tounicode values always takes them from the
first of the font instances merged in the PDF. Usually this is not an issue
since the font loader will have identical tounicode values for all
characters in the same font loaded multiple times. However, my code sets
character-level tounicode only after processing the nodes (to avoid
parsing the cmap and GSUB tables ahead of typesetting), so the same font
loaded multiple times can have different tounicodes and characters used
only in later instances of the font will not have their tounicodes in
the PDF file.

It seems to be a one-character fix to check the font the character is
actually used in instead, and it looks to me like a typo that has gone
unnoticed since the revision that introduced this code (r710). Patch attached.
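
To illustrate the behaviour described above, here is a minimal sketch. It
does not use the real LuaTeX structures: font_instance, collect_tounicode,
demo_lookup and NCHARS are hypothetical names made up for this example, and
UNI_UNDEF is only a stand-in value. It models two instances of the same font
being merged, with a character that is used only in the second instance.

```c
#include <assert.h>
#include <stddef.h>

#define NCHARS 4          /* tiny character range, enough for the sketch */
#define UNI_UNDEF (-1)    /* stand-in for "no tounicode value known" */

/* Hypothetical, simplified font instance; not LuaTeX's real structure. */
typedef struct {
    int used[NCHARS];       /* was character i typeset with this instance? */
    int tounicode[NCHARS];  /* per-character value, UNI_UNDEF if never set */
} font_instance;

/*
 * Collect tounicode values for all instances merged into one PDF font.
 * per_instance == 0: always read from fonts[0] (the behaviour described).
 * per_instance != 0: read from the instance the character was used in.
 */
static void collect_tounicode(const font_instance *fonts, size_t n,
                              int per_instance, int out[NCHARS])
{
    size_t k;
    int i;

    for (i = 0; i < NCHARS; i++)
        out[i] = UNI_UNDEF;

    for (k = 0; k < n; k++) {
        for (i = 0; i < NCHARS; i++) {
            if (fonts[k].used[i] && out[i] == UNI_UNDEF) {
                const font_instance *src =
                    per_instance ? &fonts[k] : &fonts[0];
                out[i] = src->tounicode[i];
            }
        }
    }
}

/*
 * Two instances of one font: slot 0 is used (and given a tounicode value)
 * in the first instance, slot 1 only in the second.
 */
static int demo_lookup(int per_instance, int slot)
{
    font_instance fonts[2] = {
        { {1, 0, 0, 0}, {0x61, UNI_UNDEF, UNI_UNDEF, UNI_UNDEF} },
        { {0, 1, 0, 0}, {UNI_UNDEF, 0x62, UNI_UNDEF, UNI_UNDEF} }
    };
    int out[NCHARS];

    assert(slot >= 0 && slot < NCHARS);
    collect_tounicode(fonts, 2, per_instance, out);
    return out[slot];
}
```

In this model, demo_lookup(0, 1) comes back as UNI_UNDEF because the value
for slot 1 was only ever set on the second instance, while demo_lookup(1, 1)
finds it. That is the same difference as passing k instead of f in the patch.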

-------------- next part --------------
diff --git a/source/texk/web2c/luatexdir/font/tounicode.c b/source/texk/web2c/luatexdir/font/tounicode.c
index f0449aff9..4ce912273 100644
--- a/source/texk/web2c/luatexdir/font/tounicode.c
+++ b/source/texk/web2c/luatexdir/font/tounicode.c
@@ -514,7 +514,7 @@ int write_cid_tounicode(PDF pdf, fo_entry * fo, internal_font_number f)
                 if (quick_char_exists(k, i) && char_used(k, i)) {
                     j = char_index(k, i);
                     if (gtab[j].code == UNI_UNDEF) {
-                        set_cid_glyph_unicode(i, &gtab[j], f);
+                        set_cid_glyph_unicode(i, &gtab[j], k);
