
On Friday, 14 February 2025, 08:48:06 Central European Standard Time, Hans Hagen via ntg-context wrote:
On 2/14/2025 2:19 AM, Gerion Entrup wrote:
Hi,
I recently learned that Typst seems to be able to produce PDFs where a hyphenated text can be copied without the hyphenation (so all words in the copied text are not hyphenated). I seem to recall that the PDF format has an extra mode for this, where the creation program can embed some text that should only appear when copied and replace the word parts that are visible on the page.
ConTeXt, in its default mode, seems not to embed this text. When copying hyphenated words, the hyphenated word parts appear as distinct words (even without the hyphen). Is there a way to tell ConTeXt to produce a PDF where the text can be copied without hyphenated words?
This is a fuzzy area and has always depended on how PDF viewers see things. The standard has some suggestions, and one is to use soft hyphens, which is what we do (can be turned off). From your description it looks like actual text is used, and in this case, although one can make that work, to me it is not a solution: it not only pollutes the page stream, it can also interfere with other features and increases overhead.
When a viewer sees a soft hyphen, it is assumed that it looks for the next part of the word. Afaik Acrobat Reader can handle both variants. The other (open source) viewers that I use are a mixed bag (in areas like these).
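To illustrate the soft-hyphen approach described above, here is a minimal sketch of how a viewer might rejoin a broken word when copying text. It assumes the viewer has already extracted each visual line as a string and that a line broken inside a word ends with the soft hyphen character U+00AD; the function name `join_soft_hyphenated` is hypothetical and not taken from any particular viewer.

```python
SOFT_HYPHEN = "\u00ad"  # U+00AD SOFT HYPHEN


def join_soft_hyphenated(lines):
    """Join extracted text lines, merging words that were split at a soft hyphen.

    Assumption: each item in `lines` is one visual line of extracted text.
    """
    result = ""
    for line in lines:
        line = line.strip()  # U+00AD is not whitespace, so it survives strip()
        if not line:
            continue
        if result.endswith(SOFT_HYPHEN):
            # Previous line ended inside a word: drop the soft hyphen and
            # continue the word without inserting a space.
            result = result[:-1] + line
        elif result:
            result += " " + line
        else:
            result = line
    # Any remaining soft hyphens are invisible break hints; remove them.
    return result.replace(SOFT_HYPHEN, "")


if __name__ == "__main__":
    print(join_soft_hyphenated(["This is hyphen\u00ad", "ated text"]))
```

A viewer that does not apply this kind of joining would instead hand the two word parts to the clipboard as separate words, which matches the behavior reported in the original question.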
Thanks for the answer. I researched this for my default PDF viewer, Okular from KDE, and this program seems to be really special in this regard. It should be actively responsible for the behavior described in my original mail. See https://bugs.kde.org/show_bug.cgi?id=447094#c5 and https://bugs.kde.org/show_bug.cgi?id=233604.
Gerion