On Mon, Jun 7, 2021 at 7:05 PM Hans Hagen wrote:
On 6/3/2021 11:25 AM, Christoph Reller wrote:
Hi,
On Windows, we have the Consolas font (consola). Consider the following MWE:
  \starttext
  \definedfont[name:consola*default at 12 pt]
  -
  \stoptext
The output PDF is generated correctly with recent versions of ConTeXt LMTX. The hyphen is, however, mapped to a soft hyphen (U+00AD, https://unicode-table.com/en/00AD/) by the ToUnicode table, which contains:

  beginbfchar
  <015E> <00AD>
  endbfchar

Consequently, when copying the text from the PDF and pasting it into an editor or a console, the soft hyphen is pasted.

I would like to change the ToUnicode information to an ordinary hyphen-minus (U+002D, https://unicode-table.com/en/002D/):

  beginbfchar
  <015E> <002D>
  endbfchar
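To make the requested change concrete, here is a minimal Python sketch that parses a bfchar section like the one above, swaps the soft hyphen for a hyphen-minus, and writes the section back out. The function names (`parse_bfchar`, `remap`, `format_bfchar`) are hypothetical and exist only for this illustration. This is not how ConTeXt handles the mapping internally, and it operates on the plain CMap text, not on a (possibly compressed) PDF stream.

```python
import re

# Matches one "<source> <destination>" pair inside a bfchar section.
PAIR = re.compile(r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>")

def parse_bfchar(cmap_text):
    """Return {glyph_code: unicode_text} for every bfchar entry."""
    mapping = {}
    for section in re.findall(r"beginbfchar(.*?)endbfchar", cmap_text, re.S):
        for src, dst in PAIR.findall(section):
            # The destination is a UTF-16BE string written as hex digits.
            mapping[int(src, 16)] = bytes.fromhex(dst).decode("utf-16-be")
    return mapping

def remap(mapping, old, new):
    """Replace every destination equal to `old` with `new`."""
    return {code: (new if text == old else text) for code, text in mapping.items()}

def format_bfchar(mapping):
    """Serialize the mapping back into a bfchar section."""
    lines = ["beginbfchar"]
    for code, text in sorted(mapping.items()):
        dst = text.encode("utf-16-be").hex().upper()
        lines.append(f"<{code:04X}> <{dst}>")
    lines.append("endbfchar")
    return "\n".join(lines)

original = "beginbfchar <015E> <00AD> endbfchar"
patched = remap(parse_bfchar(original), "\u00ad", "-")
print(format_bfchar(patched))
```

The glyph code 015E and both code points are taken from the message above; everything else is an assumption made for the example.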
It is (as always with fonts) more complex than that (1) because different Unicode slots share the same shape and (2) we have some (already) old hyphen patching code for messy fonts (which is kind of bad anyway).
We actually want all these hyphens to have the right tounicode even if they share shapes (i already had some comment about looking into that but never ran into a font that needed it).
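The shared-shape problem can be illustrated with a small Python sketch. It assumes a hypothetical character map in which both U+002D and U+00AD point at glyph 0x015E; that assumption follows from the description above but is not verified against the actual font, and the entry for U+0041 is invented. The sketch only shows why a per-glyph reverse mapping is ambiguous. It does not reproduce the solution that went into LMTX.

```python
# Hypothetical excerpt of a font's character map: code point -> glyph index.
# That both hyphens share glyph 0x015E is an assumption for illustration.
CMAP = {0x002D: 0x015E, 0x00AD: 0x015E, 0x0041: 0x0024}

def invert_last_wins(cmap):
    """Naive inversion: a later code point overwrites an earlier one."""
    to_unicode = {}
    for codepoint in sorted(cmap):
        to_unicode[cmap[codepoint]] = codepoint
    return to_unicode

def invert_all(cmap):
    """Keep every code point that reaches a glyph, so the choice stays explicit."""
    candidates = {}
    for codepoint in sorted(cmap):
        candidates.setdefault(cmap[codepoint], []).append(codepoint)
    return candidates

naive = invert_last_wins(CMAP)
print(f"{naive[0x015E]:04X}")                              # the soft hyphen wins
print([f"{c:04X}" for c in invert_all(CMAP)[0x015E]])      # both candidates
```

With one reverse entry per glyph, whichever code point is processed last decides what a PDF viewer copies, which matches the symptom reported in this thread.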
So, after some experimenting i decided to solve that in a different way (lmtx only because there i have more control) ... i need to run some checks and then do an upload so that you can test (also other files if possible).
I finally found the time to do some extended testing on this, and it seems that for my use case the LMTX version 2021-06-09 behaves as I would expect: hyphens are now extracted as hyphens.

Thanks a lot for your implementation, Hans!

Cheers,
Christoph