pdftex - Duplicated codes ligatures in glyphtounicode.tex
Hello,

I found another problem in the glyphtounicode.tex file. There are duplicate codes for ligatures, see:

  \pdfglyphtounicode{ff}{FB00}
  \pdfglyphtounicode{ffi}{FB03}
  \pdfglyphtounicode{ffl}{FB04}
  \pdfglyphtounicode{fi}{FB01}
  ...
  \pdfglyphtounicode{fl}{FB02}
  ...
  \pdfglyphtounicode{IJ}{0132}
  ...
  \pdfglyphtounicode{ij}{0133}
  ...
  ...
  \pdfglyphtounicode{ff}{0066 0066}
  \pdfglyphtounicode{fi}{0066 0069}
  \pdfglyphtounicode{fl}{0066 006C}
  \pdfglyphtounicode{ffi}{0066 0066 0069}
  \pdfglyphtounicode{ffl}{0066 0066 006C}
  \pdfglyphtounicode{IJ}{0049 004A}
  \pdfglyphtounicode{ij}{0069 006A}

Currently pdftex uses the last defined value, so the first definitions are useless. Can you fix that file so that there is only one active definition for each glyph? I see that in other places duplicate values are simply commented out...

And why is a decomposed sequence of ASCII characters used for those ligatures instead of the Unicode character for each ligature?

--
Pali Rohár
pali.rohar@gmail.com
I found another problem in glyphtounicode.tex file. There are duplicate codes for ligatures, see: ...
I cannot find such duplicate lines in glyphtounicode.tex in TeX Live:

  % lcdf-typetools glyphtounicode.tex, Version 2.95

Best,
Akira
On Saturday 25 June 2016 15:11:09 Akira Kakuto wrote:
I found another problem in glyphtounicode.tex file. There are duplicate codes for ligatures, see: ...
I cannot find such duplicate lines in glyphtounicode.tex in TeX Live: % lcdf-typetools glyphtounicode.tex, Version 2.95
Ah right, once again I checked my TeX Live (2012) and the svn code in /trunk/... Sorry, I forgot to look into /branches/stable/, which is the *real* pdftex trunk.

Anyway, the question about not using Unicode characters for ligatures remains.

--
Pali Rohár
pali.rohar@gmail.com
Sorry I forgot to look into /branches/stable/ which is *real* pdftex

1) Yes, although glyphtounicode.tex even in branches/stable is not up to date. I will remove it, or update it, when I have a chance.

2) The current glyphtounicode.tex is in TeX Live (v 2.95, as Akira said).

3) glyphtounicode.tex is no longer maintained in pdftex (or in TeX Live). It is maintained by Eddie Kohler. Please write him, not here.

4) For the ligatures, I can't remember for sure, but I suspect that the entries predate the existence of the ligatures in Unicode. It is not clear to me that the change is desirable, because the "decomposition" into two f characters is what is actually desired for searches, etc. If you can convince Eddie, fine; it's up to him. A practical usage case is probably needed for that. The current entries have been around a long, long, time and should not be changed merely to "follow" Unicode.

5) Regarding your previous message about Delta(greek). glyphtounicode.tex generally follows Adobe's original glyphlist.txt
   http://partners.adobe.com/public/developer/en/opentype/glyphlist.txt
   which has the same Delta + Deltagreek definitions. Which are more useful in practice. I can't imagine there will or should be any changes there.

5b) I note that glyphlist.txt does have the FB0* for the f-ligatures. I don't know if we intentionally diverged from that for TeX.

6) If you want to make changes for your own purposes, you can just make your own version of the file.

Hope this helps,
Karl
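Point 6 could look roughly like the sketch below. It is only an illustration under assumptions: the file name myglyphtounicode.tex is hypothetical, and the override relies on the behaviour reported earlier in this thread, namely that pdftex uses the last value defined for a glyph name.

```tex
% myglyphtounicode.tex (hypothetical file name, not part of any distribution)
\input glyphtounicode          % load the stock mappings first
\pdfgentounicode=1             % ask pdftex to write ToUnicode CMaps

% Later definitions replace earlier ones (as reported in this thread),
% so these lines switch the f-ligatures to the Unicode ligature code points.
\pdfglyphtounicode{ff}{FB00}
\pdfglyphtounicode{fi}{FB01}
\pdfglyphtounicode{fl}{FB02}
\pdfglyphtounicode{ffi}{FB03}
\pdfglyphtounicode{ffl}{FB04}
```

A document would then \input this file instead of glyphtounicode.tex. Whether copy/paste and search still behave well afterwards depends on the PDF viewer, which is exactly the concern raised in point 4.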
Hi Karl, Pali and others
On Jun 26, 2016, at 8:34 AM, Karl Berry
On Sunday 26 June 2016 01:08:32 Ross Moore wrote:
It is not clear to me that the change is desirable, because the "decomposition" into two f characters is what is actually desired for searches, etc. If you can convince Eddie, fine; it's up to him. A practical usage case is probably needed for that. The current entries have been around a long, long, time and should not be changed merely to "follow" Unicode.
Agreed. I was looking at this kind of thing more than 10 years ago, and very few fonts supported the ligatures. So copy/paste would result in missing characters.
Or maybe after 10 years it is time for a change? :-)
You do *not* want to use \pdfglyphtounicode with glyph names that clash with those that are otherwise loaded automatically, else you could be killing the correct mapping for other fonts.
Today, while reading and patching the pdftex source code (see my other email to pdftex@tug with the subject "Encoding for metafont PK fonts"), I found out that tounicode.c understands \pdfglyphtounicode mappings per TFM file, in the format "tfm:%s/%s". So I can use a \pdfglyphtounicode mapping that applies only to a specific font, not to all fonts.

--
Pali Rohár
pali.rohar@gmail.com
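A per-font mapping of this kind might be written as in the following plain TeX sketch. The "tfm:<TFM name>/<glyph name>" form is taken from the format string quoted in the message; the choice of cmr10 and the assumption that its ligature glyph is named ff are illustrative only, and the exact lookup order should be checked against tounicode.c.

```tex
% Plain pdftex sketch; cmr10 and the glyph name "ff" are assumptions.
\input glyphtounicode
\pdfgentounicode=1

% Global entry: applies to every font that has a glyph named "ff".
\pdfglyphtounicode{ff}{0066 0066}

% Font-specific entry: intended to apply only to glyphs from cmr10.tfm.
\pdfglyphtounicode{tfm:cmr10/ff}{FB00}

\font\testfont=cmr10
\testfont off the cuff
\bye
```

Other fonts would keep the global decomposed mapping, which avoids the clash with automatically loaded glyph names mentioned in the quoted warning.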
On Sunday 26 June 2016 00:34:51 Karl Berry wrote:
3) glyphtounicode.tex is no longer maintained in pdftex (or in TeX Live).
Then it would be better to remove it from the pdftex tree, so that other people do not look in the wrong place (as I did).
4) For the ligatures, I can't remember for sure, but I suspect that the entries predate the existence of the ligatures in Unicode. It is not clear to me that the change is desirable, because the "decomposition" into two f characters is what is actually desired for searches, etc. If you can convince Eddie, fine; it's up to him. A practical usage case is probably needed for that. The current entries have been around a long, long, time and should not be changed merely to "follow" Unicode.
OK, so the problem is that PDF viewers do not support searching for Unicode ligatures, and the decomposition helps users. Understood.
5) Regarding your previous message about Delta(greek). glyphtounicode.tex generally follows Adobe's original glyphlist.txt
http://partners.adobe.com/public/developer/en/opentype/glyphlist.txt which has the same Delta + Deltagreek definitions. Which are more useful in practice. I can't imagine there will or should be any changes there.
Many mathematical fonts have the Greek Delta character under the glyph name Delta. And in mathematical text the Greek Delta is used as Delta, not as the increment character. This is the reason for my proposed update.

--
Pali Rohár
pali.rohar@gmail.com
participants (4): Akira Kakuto, Karl Berry, Pali Rohár, Ross Moore