[NTG-pdftex] [PATCH v4] Allow .enc files for bitmap PK fonts

Pali Rohár pali.rohar at gmail.com
Mon Dec 18 12:40:36 CET 2017

On Monday 18 December 2017 11:17:45 Hans Hagen wrote:
> > It looks like that currently pdftex generates CMap from glyph names.
> > Theoretically it should be possible to assign fully unique glyph names
> > for every one glyph, possible fully random and then into CMap table put
> > correct mapping for all character codes (as CMap table does not use
> > glyph names) according to enc file.
> that would confuse some viewers too (i remember some thread about non
> standard ffi ligature names and resolving hard coded in some viewer and the
> request for tex related fonts to conform to that bad practice too)

The first occurrence of a duplicate can keep the originally specified
glyph name, and the second, third, ... occurrences can get a new unique
glyph name (with a proper CMap table). Yes, that would not fix the
problem for those "some" viewers, but in this situation it is better
than nothing.
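
To illustrate the renaming idea, here is a minimal Python sketch (not
pdftex code). It assumes the enc vector is available as a plain list of
glyph names. The function name and the ".dupN" suffix are made up for
this example; any scheme producing unused names would do. Repeated
.notdef entries are left alone.

```python
def uniquify_glyph_names(enc_names):
    """Keep the first occurrence of each glyph name and rename later ones.

    Hypothetical helper: the ".dupN" suffix is an illustrative choice,
    not something pdftex generates.
    """
    seen = {}                 # glyph name -> occurrences seen so far
    taken = set(enc_names)    # names that must not be reused
    result = []
    for name in enc_names:
        if name == ".notdef":
            # Multiple .notdef entries are legitimate; keep them as-is.
            result.append(name)
            continue
        count = seen.get(name, 0)
        seen[name] = count + 1
        if count == 0:
            # First occurrence keeps the name from the enc file.
            result.append(name)
            continue
        # Later occurrences get a fresh name that collides with nothing.
        n = count
        candidate = f"{name}.dup{n}"
        while candidate in taken:
            n += 1
            candidate = f"{name}.dup{n}"
        taken.add(candidate)
        result.append(candidate)
    return result


print(uniquify_glyph_names(["a", "b", "b", ".notdef", ".notdef"]))
```

The renamed glyphs still need CMap entries built from the original enc
names, otherwise copy+paste would be wrong for them.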

> > > > File test.tex:
> > > > ============
> > > > \pdfglyphtounicode{mychar}{269}
> > > > \pdfgentounicode=1
> > > > \pdfmapline{cmb10 <my.enc}
> > > > \font\cmb=cmb10
> > > > \cmb
> > > > a b
> > > > \bye
> > > > ============
> > > > 
> > > > And result PDF file would not render glyph 'a' if function
> > > > remove_duplicate_glyph_names() is disabled. There would be two glyphs 'b'.
> but still i think that the fact that there are duplicate names in my.enc
> file is the real problem: if two b's refer to different shapes then what is
> the real 'b'? And what is the right new name: b.one, b.two ?

If you have two shapes for b, then you can assign the glyph name 'b' to
only one shape in the final PDF. What you can do is create a CMap table
where both character codes are mapped to the Unicode code point for 'b'.
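
As a rough sketch of that mapping (Python, not pdftex code): the
entries of a /ToUnicode CMap are keyed by character code, so two codes
can point to the same code point whatever the glyphs are called. The
function name, the example vector and the glyph-to-Unicode table below
are assumptions for illustration, and only BMP code points are handled.

```python
def tounicode_bfchar(enc_names, glyph_to_unicode):
    """Build bfchar entries mapping character codes to Unicode values.

    Hypothetical helper: enc_names is the encoding vector as a list,
    glyph_to_unicode maps glyph names to code points (BMP only here).
    """
    lines = []
    for code, name in enumerate(enc_names):
        cp = glyph_to_unicode.get(name)
        if cp is None:
            # No known Unicode value (e.g. .notdef): emit no entry.
            continue
        # <source character code> <UTF-16BE value>, both in hex.
        lines.append(f"<{code:02X}> <{cp:04X}>")
    return lines


# Made-up vector where slots 0x61 and 0x62 are both named 'b'.
enc = [".notdef"] * 0x61 + ["b", "b"]
for line in tounicode_bfchar(enc, {"b": 0x0062}):
    print(line)
```

Both codes come out mapped to U+0062, so a viewer which honours the CMap
would copy both glyphs as 'b'.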

PDF viewers which do not use the CMap would not be able to copy+paste
properly. But this is the current situation anyway, as /ToUnicode is not
supported for Type3 fonts yet.

Anyway, exactly the same problem exists for Type 1 fonts. If you have
two different shapes for b in a Type 1 font, then only one can have the
glyph name 'b'.

> What does one expect with cut and paste?

The expected behavior for an ordinary user is simple: both glyphs which
are marked as 'b' should be copied as the character 'b'.

It can work only in PDF viewers with correct CMap support. But with the
current pdftex code it is not possible.

But you are right that this is a real problem. Some calligraphic fonts
have several glyphs for one character, and the decision which glyph to
use is based on the previous or next characters.

> If two names are the same and they refer to the
> same font program then there is no problem and the first one encountered
> when embedding should be used.
> If remove duplicates is an option in pdftex then at least make sure that
> it's off by default (better complain loudly on the console that the enc is
> broken)

Do you want this problem to be a fatal error?

> so that the user knows that enabling that option is not solving the
> problem (and in tex distributions the fixed enc should be used). Heuristics
> and fixes for bugged fonts are nice but not being able
> to bypass them is bad.

I thought it would be better to still produce the PDF file, as the enc
file itself does not change how the PDF file is rendered. It affects
only copy+paste from the PDF.

> (multiple .notdef is an exception)

A different, but maybe more interesting, question is: what happens for
other font formats if the supplied enc file contains duplicate names?

Pali Rohár
pali.rohar at gmail.com
