[NTG-context] searchable PDF with MinionPro under mkiv

Mon Jan 17 14:53:01 CET 2011

>> However, it turns out that pdftotext converts to
>> fi ﬀ ﬃ ﬄ 1234567890,
>> splitting fi ligature while leaving ff, ffi and ffl intact, which is
>> strange.
>> I did not try with Adobe Reader but the pdf is searchable with Apple
>> Preview and the pasted copy is still intact:
>> fi ff ffi ffl 1234567890
> For me, it still doesn't work.  I get oldstyle numbers in the text, and
> neither in Adobe Reader nor in okular, evince or xpdf the numbers are
> searchable.  However, I figured out that it is my version of the font
> causing the wrong result.

You are right! I had not considered that. Depending on the font used, pdftotext may or may not expand (some of) the ligatures. With TeXGyre Pagella, for instance, there is no ligature expansion at all:

ﬁ ﬀ ﬃ ﬄ 1234567890

and with Cambria I get a pdf which is not searchable with Preview:

􀅩i ff f􀅩i f􀅩l 1234567890
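
To compare such outputs without relying on a particular viewer, one could post-process the text that pdftotext writes. The following is only a rough sketch in Python, not part of ConTeXt or pdftotext: the function names are made up for illustration, and the private-use code point in the usage note is an assumed stand-in for whatever a given font actually produces. It relies on Unicode NFKC normalization mapping the ligature code points U+FB00 to U+FB04 back to plain letters, and on private-use characters having the general category "Co".

```python
import unicodedata


def expand_ligatures(text: str) -> str:
    """Map compatibility ligatures (e.g. U+FB01) to their plain letters.

    NFKC normalization is used here as a simple approximation; it also
    folds other compatibility characters, which may or may not be wanted.
    """
    return unicodedata.normalize("NFKC", text)


def has_private_use(text: str) -> bool:
    """Return True if the text contains private-use code points.

    Such characters usually mean the glyph had no usable Unicode mapping,
    so the string will not be searchable as ordinary text.
    """
    return any(unicodedata.category(ch) == "Co" for ch in text)
```

With this, expand_ligatures("\ufb01 \ufb00 \ufb03 \ufb04") yields plain ASCII letters, and has_private_use("\U00100169i") reports the kind of unmapped glyph seen in the Cambria output above.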


