restrict pdfglyphtounicode to a tfm

With pdftex it is possible to restrict a \pdfglyphtounicode declaration to a specific tfm. So in the following example the A from cmr10 is mapped to B, but the A from cmss10 is untouched:

\pdfgentounicode=1
\pdfglyphtounicode{tfm:cmr10/A}{0042}
A \font\test=cmss10 \test A
\bye

This is useful, e.g., to set up Unicode mappings for older symbol fonts which often (mis)use standard glyph names.

With luatex this doesn't work. Only the "general" mapping is honored here:

\pdfvariable gentounicode =1
\pdfextension glyphtounicode {C}{0044}
\pdfextension glyphtounicode {tfm:cmr10/A}{0042}
C A \font\test=cmss10 \test C A
\bye

Would it be possible to extend luatex to support the tfm: syntax too? Or is there an alternative to change tounicode mappings?

--
Ulrike Fischer
http://www.troubleshooting-tex.de/

On Sat, 12 Apr 2025 at 14:57, Ulrike Fischer wrote:
Would it be possible to extend luatex to support the tfm: syntax too? Or is there an alternative to change tounicode mappings?
texlive/trunk/Build/source/texk/web2c/luatexdir/font/pdfglyphtounicode-readme.txt (at least from 2020-02-15 13:45:08)

"""
In pdftex there are more heuristics going on when determining the tounicode mapping:
- more lookups using periods
- a prefix tfm:cmr10/foo -> bar mapping
Because in luatex one can have a callback that just loads the tfm and then decorates it with tounicodes we don't do this in luatex.
HH
"""

--
luigi

On Sat, 12 Apr 2025 22:48:56 +0200, luigi scarso wrote:
On Sat, 12 Apr 2025 at 14:57, Ulrike Fischer wrote:
With pdftex it is possible to restrict a \pdfglyphtounicode declaration to a specific tfm. So in the following example the A from cmr10 is mapped to B, but the A from cmss10 is untouched: ...
Would it be possible to extend luatex to support the tfm: syntax too? Or is there an alternative to change tounicode mappings?
texlive/trunk/Build/source/texk/web2c/luatexdir/font/pdfglyphtounicode-readme.txt (at least from 2020-02-15 13:45:08)
""" In pdftex there are more heuristics going on when determining the tounicode mapping:
- more lookups using periods
- a prefix tfm:cmr10/foo -> bar mapping
Because in luatex one can have a callback that just loads the tfm and then decorates it with tounicodes we don't do this in luatex.
HH """
Pity ;-(. It also looks as if manually loading a cmap isn't an option either (if \pdfgentounicode is 1), as \pdfnobuiltintounicode is not supported either. And I haven't yet found examples or documentation on how to define a callback that "decorates" the tfm.

--
Ulrike Fischer
http://www.troubleshooting-tex.de/

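Hi Ulrike,

On Sun, 2025-04-13 at 12:15 +0200, Ulrike Fischer wrote:
And I haven't yet found examples or documentation on how to define a callback that "decorates" the tfm.

I think that this would be the "define_font" callback. You should be able to take the example from §6.3.3 and replace "v.commands = <...>" with "v.tounicode = { }" (the documentation says that this needs to be a UTF-16 string, but a table of integer codepoints works too). Luaotfload already hooks into the "define_font" callback, so you should only need to override "fonts.readers.tfm". Actually, there's already a function "fonts.mappings.addtounicode", so you could either replace that with your own function, or add a manipulator that runs before/after that.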

On Sun, 13 Apr 2025 04:51:00 -0600, Max Chernoff wrote:
Hi Ulrike,
On Sun, 2025-04-13 at 12:15 +0200, Ulrike Fischer wrote:
And I haven't yet found examples or documentation on how to define a callback that "decorates" the tfm.
I think that this would be the "define_font" callback. You should be able to take the example from §6.3.3 and replace "v.commands = <...>" with "v.tounicode = { }" (the documentation says that this needs to be a UTF-16 string, but a table of integer codepoints works too). Luaotfload already hooks into the "define_font" callback, so you should only need to override "fonts.readers.tfm". Actually, there's already a function "fonts.mappings.addtounicode", so you could either replace that with your own function, or add a manipulator that runs before/after that.
It is not really luaotfload that does that, but the font loader which is imported from context. And it is quite unclear if (and how) that can be used to overwrite the tounicode values without disturbing settings done with \pdfglyphtounicode.

The actual question came from the maintainer of the adforn package who tried to improve the tounicode values (and so accessibility) of the chars. I will tell her to ask a question on tex.sx and then you can tell her how to do it in LaTeX without disturbing other fonts and packages and in a way that can be extended to other similar fonts. I would be very happy if that is possible, as I have a number of chess fonts where the tounicode values are wrong too.

--
Ulrike Fischer
http://www.troubleshooting-tex.de/
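
For reference, the "define_font" route discussed in this thread could look roughly like the following in plain LuaTeX, where luaotfload is not loaded and no other code owns the callback. This is only a sketch under assumptions: it follows the shape of the virtual-font example in the LuaTeX manual, sets the top-level "tounicode" flag together with a per-character "tounicode" hex string, and uses hypothetical font names (\serif, \sans) and an arbitrary 11pt size so that the font is freshly defined. How this interacts with \pdfextension glyphtounicode declarations, and whether the remaining characters of the font then need their own entries, is not settled by the thread and should be checked.

```tex
% Sketch for plain LuaTeX (luatex, not lualatex): luaotfload is not loaded,
% so nothing else is registered for "define_font".
\pdfvariable gentounicode=1

\directlua{
  callback.register("define_font", function(name, size)
    local f = font.read_tfm(name, size)  % load the TFM metrics as usual
    if name == "cmr10" then              % restrict the change to one tfm
      f.tounicode = 1                    % announce per-character tounicode entries
      local a = f.characters[65]         % slot 65 is "A" in cmr10
      if a then a.tounicode = "0042" end % hex string, as in \pdfglyphtounicode
    end
    return f
  end)
}

\font\serif=cmr10 at 11pt % hypothetical name; size is arbitrary
\font\sans=cmss10         % hypothetical name; left untouched by the callback
\serif A \sans A
\bye
```

Only TeX comments (%) are used inside \directlua, because Lua "--" comments would swallow the rest of the chunk once TeX has joined the lines. With luaotfload or the LaTeX font loader active, registering "define_font" directly would replace their handler, so this sketch does not carry over to LaTeX as is.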
participants (3)
- luigi scarso
- Max Chernoff
- Ulrike Fischer