Ligature handling for PDF searching.

(This came up on comp.text.tex in a question about LaTeX, but it also applies to ConTeXt, and the solution proposed for LaTeX doesn't apply.)

Consider the following document:

    \starttext
    Some ligature tests: ff, fi, ffi, fl, ffl.
    \stoptext

If I process that with texexec -pdf, load it into Acrobat 5, and then copy-and-paste the text from the PDF into a text editor, the fi and fl ligatures are correctly treated as two letters, but the ff, ffi, and ffl ligatures are treated as single (unknown) characters. Similarly, searching for "f" within the document only finds the fi and fl ligatures; it doesn't find the others. Searching for "ff" finds nothing.

This is a fairly significant problem for the on-screen usability of ConTeXt-created documents.

In LaTeX, there is apparently a solution in the cmap.sty package (though it currently only works for T1 encoding):

    http://www.ctan.org/tex-archive/macros/latex/contrib/cmap/

Is there a similar solution for ConTeXt? (Has this perhaps been solved with a later version of ConTeXt than I have on my computer?)

Thanks,
- Brooks

Brooks Moses wrote:
Is there a similar solution for ConTeXt? (Has this perhaps been solved with a later version of ConTeXt than I have on my computer?)
that kind of stuff was introduced in context ages ago :-)

take a look at:

    pdfr-il2
    enco-pfr

it's rather integrated and automatic, although i didn't test it recently (probably last in fall 2000)

the only thing needed is a pdfr-ec and pdfr-texnansi

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------

Taco Hoekwater wrote:
it's hard to check with compressed files, but:

    \pdfcompresslevel=0

    \useencoding[pfr]

    \startencoding [ec]
      \usepdffontresource ec
    \stopencoding

    \usetypescript[palatino][ec]
    \setupbodyfont[palatino]

    \starttext
    fi ff ffi
    \stoptext

seems to work here; i'll add the file and definition to the distribution

Hans
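
As an illustration of the behaviour reported at the top of this thread, here is a minimal Python sketch of glyph-to-text lookup during copy or search. It is a toy model, not how Acrobat or ConTeXt actually work internally: the names `DEFAULT_MAP`, `FULL_MAP`, `extract_text`, and `PAGE` are hypothetical, and the assumption that a viewer knows only fi and fl by default is taken from the symptoms described in the original question.

```python
# Toy model (assumption, not Acrobat's real algorithm): each glyph on the
# page is looked up in a glyph-to-text table; unmapped ligature glyphs
# come out as the replacement character U+FFFD.
UNKNOWN = "\ufffd"

# Assumed viewer defaults: only the fi and fl ligatures are recognised,
# matching the symptoms reported in the question.
DEFAULT_MAP = {"fi": "fi", "fl": "fl"}

# Assumed content of a fuller mapping that also covers ff, ffi and ffl.
FULL_MAP = {**DEFAULT_MAP, "ff": "ff", "ffi": "ffi", "ffl": "ffl"}

# The test line "ff, fi, ffi, fl, ffl." as a sequence of glyphs, where
# each multi-letter entry stands for one ligature glyph.
PAGE = ["ff", ",", " ", "fi", ",", " ", "ffi", ",", " ",
        "fl", ",", " ", "ffl", "."]


def extract_text(glyphs, glyph_map):
    """Return the text a viewer would copy or search, given a glyph map."""
    pieces = []
    for glyph in glyphs:
        if len(glyph) == 1:
            # Ordinary single-character glyphs map to themselves.
            pieces.append(glyph)
        else:
            # Ligature glyphs need an explicit entry in the map.
            pieces.append(glyph_map.get(glyph, UNKNOWN))
    return "".join(pieces)


if __name__ == "__main__":
    without = extract_text(PAGE, DEFAULT_MAP)
    with_map = extract_text(PAGE, FULL_MAP)
    print(repr(without))
    print(repr(with_map))
    print("ff" in without, "ff" in with_map)
```

In this model, searching for "ff" can only succeed once the ff, ffi, and ffl glyphs have entries in the table, which is the role the per-encoding resource files mentioned above (pdfr-ec, pdfr-texnansi) are meant to play.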
Participants (4): Brooks Moses, Hans Hagen, Taco Hoekwater, Vit Zyka