[NTG-pdftex] [pdftex-Bugs][2092] Ligatures and special characters in included pdf disappear in typesetted document

pdftex-bugs at sarovar.org pdftex-bugs at sarovar.org
Fri Feb 6 07:46:19 CET 2009


Bugs item #2092, was opened at 2008-09-18 10:36
Status: Open
Priority: 3
Submitted By: Jan Michael (jan)
Assigned to: Nobody (None)
Summary: Ligatures and special characters in included pdf disappear in typesetted document 
Category: PDF inclusion
Group: None
>Resolution: Wont Fix


Initial Comment:
Dear Readers,

as the subject already states, documents I create with pdflatex are losing ligatures and other special characters from included pdf documents that were typeset by other applications (Word, Excel, OmniGraffle ...).


(a) Situation
-------------

This is how the behaviour can be reproduced and how it happens in my case:

(1) Import Latin Modern fonts as otf into system wide font location
(2) Create a document with Latin Modern Roman font and use ff, fi, fl ligatures
(3) Save Document as PDF (in OS X it's just "Save as PDF ...") - the pdf still shows the ligatures
(4) Include the created pdf into a *.tex document with \includegraphics
(5) Typeset the document with pdflatex
(6) Ligatures from the included pdf document disappear, while ligatures in the surrounding text are typeset as usual.

(b) Minimal Example
-------------------
\documentclass[ngerman]{scrreprt}
\usepackage{graphicx}
\begin{document}
      \begin{figure}[htbp!]
              \centering\includegraphics[width=1.1\textwidth]{ligaturen.pdf}
      \end{figure}
\end{document}

ligaturen.pdf can be downloaded from <https://dl.getdropbox.com/u/73200/ligaturen.pdf>

(c) Problem Analysis
--------------------
This is what the German TeX usenet group de.comp.text.tex has already figured out:

- problem can be reproduced with MiKTeX 2.6 and TeX Live 2008 (Windows)
- names of ligature glyphs in included pdf (f_i, f_l, f_f) differ from names used by pdftex (fi, fl, ff)
- pdftex replaces the font from the included pdf with the local font of the same name -> glyphs f_i, ... are no longer found
- behaviour can be suppressed by using \pdfmapfile{}
- behaviour can be suppressed by removing or renaming related fonts in pdftex.map
- behaviour can be suppressed by typesetting the pdf with xe(la)tex before including it in pdf(la)tex document. XeTeX redeclares the font in ligaturen.pdf from JTSEMF+LMRoman10-Regular to ASKXQL+LMRoman10-Regular-Identity-H.
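
As a rough sketch, the \pdfmapfile{} workaround from the list above could be applied to the minimal example as shown here. This assumes that an empty argument keeps pdftex from reading its default map file, so it no longer knows a local LMRoman10-Regular to substitute. A likely side effect, not discussed in the thread, is that the document's own fonts are then no longer found as outline fonts either.

```latex
\documentclass[ngerman]{scrreprt}
\usepackage{graphicx}
% Assumption: the empty argument prevents the default map file
% (usually pdftex.map) from being loaded, so no local font is
% available to replace the one embedded in ligaturen.pdf.
\pdfmapfile{}
\begin{document}
      \begin{figure}[htbp!]
              \centering\includegraphics[width=1.1\textwidth]{ligaturen.pdf}
      \end{figure}
\end{document}
```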

The related thread, in German language, can be found here:
	
	<http://groups.google.de/group/de.comp.text.tex/browse_frm/thread/4c0245d37cd84273?>

Right now I use the workaround with XeTeX to get my documents right. What do you think?

Cheers,

	Jan

------------------------------------------------------------

p.s. Please see pdf(la)tex version information below:
(from TexLive 2007 shipped with MacTeX 2007 package on MacBook
C2D running OS X 10.5.5)

$ pdflatex -v
pdfTeX 3.141592-1.40.3-2.2 (Web2C 7.5.6)
kpathsea version 3.5.6
Copyright 2007 Peter Breitenlohner (eTeX)/Han The Thanh (pdfTeX).
Kpathsea is copyright 2007 Karl Berry and Olaf Weber.
There is NO warranty.  Redistribution of this software is
covered by the terms of both the pdfTeX copyright and
the Lesser GNU General Public License.
For more information about these matters, see the file
named COPYING and the pdfTeX source.
Primary author of pdfTeX: Peter Breitenlohner (eTeX)/Han The Thanh (pdfTeX).
Kpathsea written by Karl Berry, Olaf Weber, and others.

Compiled with libpng 1.2.15; using libpng 1.2.15
Compiled with zlib 1.2.3; using zlib 1.2.3
Compiled with xpdf version 3.01

----------------------------------------------------------------------

>Comment By: The Thanh Han (hanthethanh)
Date: 2009-02-06 06:46

Message:
(1) Re: the problem with ligatures
pdftex is doing what it is supposed to do: when it sees a font (LMRoman10-Regular) in an included pdf and a font with the same name is available on local disk, it tries to replace the font in the included pdf with the local font. The reason is to make the pdf output smaller (by avoiding duplicate fonts). In this case, however, this is not desired, since the font in the included pdf is not the same as the font on local disk (though they have the same name). Workarounds:
(a) tell pdftex to keep all fonts in included pdfs by saying: 
   \pdfinclusioncopyfonts=1
(b) or tell pdftex not to replace a particular font for pdf inclusion: 
    - open the log file
    - search for the map file(s) being loaded (usually pdftex.map)
    - search for the line containing LMRoman10-Regular, in my setup (Texlive 2008) it would be 

,--------
| cs-lmr10 LMRoman10-Regular "enclmcs ReEncodeFont" <lm-cs.enc <lmr10.pfb
| ec-lmr10 LMRoman10-Regular "enclmec ReEncodeFont" <lm-ec.enc <lmr10.pfb
| l7x-lmr10 LMRoman10-Regular "enclml7x ReEncodeFont" <lm-l7x.enc <lmr10.pfb
| qx-lmr10 LMRoman10-Regular "enclmqx ReEncodeFont" <lm-qx.enc <lmr10.pfb
| rm-lmr10 LMRoman10-Regular "enclmrm ReEncodeFont" <lm-rm.enc <lmr10.pfb
| t5-lmr10 LMRoman10-Regular "enclmt5 ReEncodeFont" <lm-t5.enc <lmr10.pfb
| texnansi-lmr10 LMRoman10-Regular "enclmtexnansi ReEncodeFont" <lm-texnansi.enc <lmr10.pfb
| ts1-lmr10 LMRoman10-Regular "enclmts1 ReEncodeFont" <lm-ts1.enc <lmr10.pfb
`--------
    - override those lines by removing the PostScript name as follows (put into tex file):
,--------
| \pdfmapline{=cs-lmr10 "enclmcs ReEncodeFont" <lm-cs.enc <lmr10.pfb}
| \pdfmapline{=ec-lmr10 "enclmec ReEncodeFont" <lm-ec.enc <lmr10.pfb}
| \pdfmapline{=l7x-lmr10 "enclml7x ReEncodeFont" <lm-l7x.enc <lmr10.pfb}
| \pdfmapline{=qx-lmr10 "enclmqx ReEncodeFont" <lm-qx.enc <lmr10.pfb}
| \pdfmapline{=rm-lmr10 "enclmrm ReEncodeFont" <lm-rm.enc <lmr10.pfb}
| \pdfmapline{=t5-lmr10 "enclmt5 ReEncodeFont" <lm-t5.enc <lmr10.pfb}
| \pdfmapline{=texnansi-lmr10 "enclmtexnansi ReEncodeFont" <lm-texnansi.enc <lmr10.pfb}
| \pdfmapline{=ts1-lmr10 "enclmts1 ReEncodeFont" <lm-ts1.enc <lmr10.pfb}
`--------
this is of course more work than (a); the advantage is that font replacement is disabled only for LMRoman10-Regular instead of for every font as in (a).
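
For comparison, workaround (a) applied to the minimal example from the initial comment might look like the sketch below. The setting is placed in the preamble so that it is in effect before any pdf is included. The \ifdefined guard is an addition for illustration only: it simply skips the setting on engines that do not define this primitive.

```latex
\documentclass[ngerman]{scrreprt}
\usepackage{graphicx}
% Workaround (a): keep the fonts embedded in included pdfs instead of
% replacing them with same-named fonts from the local installation.
% The guard below is illustrative and not part of the original advice.
\ifdefined\pdfinclusioncopyfonts
  \pdfinclusioncopyfonts=1
\fi
\begin{document}
      \begin{figure}[htbp!]
              \centering\includegraphics[width=1.1\textwidth]{ligaturen.pdf}
      \end{figure}
\end{document}
```

The trade-off is the one described above: the output may grow, because fonts from included pdfs are kept even when an identical copy exists on disk.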

(2) Re: the problem with pdf figure containing Mathematica 1 font:
this was tricky: perhaps Illustrator tried to optimize the pdf size, so it used a predefined encoding (MacRomanEncoding) to avoid having an explicit encoding, then changed the font by renaming all glyphs to conform to that encoding. So, for example, /alpha in the original font was renamed to /a, /beta to /b, and so on. pdftex is not aware of this, and when it replaced the font, the glyphs ended up with their original names again. In principle the problem is the same as in (1): pdftex replaces a font in the included pdf with a local font, but this is not what we want because the font in the included pdf differs from the local font. The workarounds for (1) apply here too.


----------------------------------------------------------------------

Comment By: The Thanh Han (hanthethanh)
Date: 2009-02-06 06:14

Message:
test files from Bruno attached.

----------------------------------------------------------------------

Comment By: Bruno Voisin (bvoisin)
Date: 2009-02-05 14:05

Message:
I've just run into a bug which seems to be a follow-up.

Imagine you've got two versions of the Mathematica 1 font, both included in and used by the Mathematica application: one in PFA format installed for TeX within texmf, and the other in TrueType format installed at the OS level.

If you use Adobe Illustrator to prepare an illustration with this font, on the Mac, Illustrator will embed and subset the TrueType font in Macintosh Roman encoding. Imagine you save the Illustrator output to EPS format, and convert it to PDF format (the same probably happens when creating the PDF file directly from within Illustrator; I just haven't tried).

Now use \includegraphics to include the illustration in a LaTeX document:

- If the Mathematica 1 font is installed in texmf, dvips will use the embedded version of the font while pdfTeX will use the version from texmf. Since the version in texmf does not have the encoding assumed by Illustrator, glyphs are missing from the pdfTeX output.

- If the Mathematica 1 font is not installed in texmf, both dvips and pdfTeX use the embedded version of the font and everything's fine.

Attached are small test files. Files with names ending with "with-math1" were produced with the Mathematica 1 font in texmf, and files with names ending with "without-math1" were produced without the Mathematica font in texmf.


----------------------------------------------------------------------

Comment By: The Thanh Han (hanthethanh)
Date: 2008-09-18 12:06

Message:
yes, this is an unfortunate situation: both the font from the included pdf and the font on disk use the same name (LMRoman10-Regular), but they differ. Workarounds have also been mentioned; in short, they fall into 2 methods:

- disable font subsetting globally, or
- change the font name in included pdf to something else, hence pdftex will not think this is the same font as the one on disk

neither is perfect, and it's also not clear how pdftex should handle such cases. Needs more thinking...

----------------------------------------------------------------------

You can respond by visiting: 
http://sarovar.org/tracker/?func=detail&atid=493&aid=2092&group_id=106

