On 05/05/2011 03:52 PM, Oliver Buerschaper wrote:
Has anyone seen this before? I wanted to ask up front before I really start digging into the issue… I might have missed something obvious.
Check the hexdump of the file. Chances are that one of them has í directly, and one a combination of <dotlessi><acuteaccent>.
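If a hexdump is not convenient, the same check can be sketched in Python. This is only an illustration: the helper name `find_combining_marks` and the sample string are made up for the example, and it assumes the file has already been read as UTF-8 text.

```python
import unicodedata

def find_combining_marks(text):
    """Return (index, code point, name) for every combining mark in text.

    Hypothetical helper, not part of ConTeXt or any library.
    """
    hits = []
    for index, char in enumerate(text):
        # unicodedata.combining() returns a non-zero class for combining marks
        if unicodedata.combining(char):
            hits.append((index, f"U+{ord(char):04X}", unicodedata.name(char)))
    return hits

# First word uses the precomposed í (U+00ED), second uses i + U+0301.
sample = "Mar\u00eda vs. Mari\u0301a"
for hit in find_combining_marks(sample):
    print(hit)
```

An empty result means the text contains no combining marks; any hit points at the position to inspect.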
Awesome hint… hits the nail on the head! The "faulty" version (i.e. the one not appearing in the PDF with Minion Pro) is <dotlessi><acuteaccent> (where <acuteaccent> appears to translate to CC81 in hex, correct?).
Yes. A useful site for finding out stuff like that without having to do the UTF-8 calculations yourself: http://www.decodeunicode.org/en/u+0301/properties. At the top right, it shows numerical values for the current character in various encodings.
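For reference, the calculation behind CC81 can be sketched by hand. The function name `utf8_two_byte` is invented for this example, and it only covers the two-byte range (U+0080 to U+07FF), which is where U+0301 falls.

```python
def utf8_two_byte(code_point):
    """Encode a code point in U+0080..U+07FF as two UTF-8 bytes.

    Hypothetical helper for illustration; str.encode("utf-8") does this for real.
    """
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("only the two-byte UTF-8 range is handled here")
    lead = 0xC0 | (code_point >> 6)      # 110xxxxx: the upper five bits
    trail = 0x80 | (code_point & 0x3F)   # 10xxxxxx: the lower six bits
    return bytes([lead, trail])

encoded = utf8_two_byte(0x0301)          # COMBINING ACUTE ACCENT
print(encoded.hex(" ").upper())          # prints "CC 81"
```

The result agrees with Python's built-in encoder, `"\u0301".encode("utf-8")`.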
I guess I need to find and replace the accent combination by the direct slot?
That would be wise for now, but I think ConTeXt should be able to trap this automatically (at least in the mode=node case).
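One way to do the find-and-replace outside ConTeXt is Unicode NFC normalization, sketched below. This is a preprocessing workaround, not what ConTeXt does internally. The helper name `to_precomposed` is hypothetical. Note one assumption: NFC composes i + U+0301 into í, but it leaves a true dotless i (U+0131) + U+0301 alone, so that pair is replaced explicitly here on the assumption that í was intended.

```python
import unicodedata

def to_precomposed(text):
    """Replace base letter + combining accent pairs by precomposed characters.

    Hypothetical helper; assumes dotless i + acute was meant to be í.
    """
    # NFC does not compose U+0131 U+0301, so handle that pair by hand first.
    text = text.replace("\u0131\u0301", "\u00ed")
    # NFC composes every canonical pair that has a precomposed form.
    return unicodedata.normalize("NFC", text)

fixed = to_precomposed("Mari\u0301a")
print(fixed == "Mar\u00eda")  # prints True
```

Running the source file through such a step before typesetting leaves already precomposed text unchanged.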
Can something similar happen for other "foreign" characters (like ß, umlauts, ae, etc.) or is this sort of error only possible with accents?
IIRC, in principle it can happen with some other characters as well, but I do not think that happens often. It is mostly combining accents.

Best wishes,
Taco
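As a rough check of which of the characters asked about can show up in a decomposed form at all, Python's `unicodedata.decomposition` reports the canonical decomposition recorded in the Unicode database. This is only a sketch of the data, not a statement about what any particular editor or font does.

```python
import unicodedata

# Characters from the question above; an empty string means the character
# has no decomposition and can only appear as a single code point.
for char in ["\u00ed", "\u00e4", "\u00df", "\u00e6"]:  # í, ä, ß, æ
    decomposition = unicodedata.decomposition(char)
    print(f"U+{ord(char):04X}", unicodedata.name(char), "->", decomposition or "(none)")
```

In this data, í and the umlauts decompose into a base letter plus a combining mark (U+0301 or U+0308), while ß and æ have no decomposition, so the problem should be limited to accented and umlauted letters.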