On Mon, Feb 13, 2012 at 12:42 PM, Ulrike Fischer wrote:
On Fri, 10 Feb 2012 12:14:15 +0100, luigi scarso wrote:
>> if you mean ASCII with code range 0-255 *and* ISO-8859-1 (Latin 1) encoding, there is no need for conversion;
>
> This is not true. You are mixing up Unicode positions and UTF-8 encoding.
> E.g. "ä" has the same position in Unicode and Latin 1 (decimal 228, hex E4). But its UTF-8 code consists of 16 bits (1100001110100100, hex C3A4), while its Latin 1 code is 8 bits long (11100100).

Ah yes, you are right -- I had made the implicit assumption that his file was already UTF-8 encoded. I have been using only UTF-8 for a long time, and I had almost forgotten about

! String contains an invalid utf-8 sequence.
system > tex > error on line 10 in file t1.txt: String contains an invalid utf-8 sequence ...

(I believe he met this error during his next tries, because he wrote "I cannot \input the file as this is not a valid ConTeXt source.")

What I meant was, as I wrote below, "To allow backward compatibility, the 128 ASCII and 256 ISO-8859-1 (Latin 1) characters are assigned Unicode/UCS code points that are the same as their codes in the earlier standards", and this is true only for ISO-8859-1.
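
To make the difference between code point and encoded bytes concrete, here is a minimal Python sketch. It is only an illustration: the helper names `bits`, `decodes_as_utf8` and `latin1_to_utf8` are made up for this example and have nothing to do with ConTeXt itself.

```python
# Illustration only: "ä" is the sample character from the discussion above.
ch = "\u00e4"  # "ä"

code_point = ord(ch)                 # Unicode position, identical to the Latin 1 code
latin1_bytes = ch.encode("latin-1")  # one byte
utf8_bytes = ch.encode("utf-8")      # two bytes


def bits(data: bytes) -> str:
    """Return the bytes as one string of binary digits (hypothetical helper)."""
    return "".join(f"{b:08b}" for b in data)


def decodes_as_utf8(data: bytes) -> bool:
    """Report whether the bytes form a valid UTF-8 sequence (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode Latin 1 bytes as UTF-8 (hypothetical helper)."""
    return data.decode("latin-1").encode("utf-8")


print(code_point, hex(code_point))             # 228 0xe4
print(latin1_bytes.hex(), bits(latin1_bytes))  # e4 11100100
print(utf8_bytes.hex(), bits(utf8_bytes))      # c3a4 1100001110100100
print(decodes_as_utf8(latin1_bytes))           # False: a lone E4 byte is not valid UTF-8
print(latin1_to_utf8(latin1_bytes).hex())      # c3a4
```

So the position is the same (228), but a Latin 1 file still has to be converted, or declared as Latin 1, before it is read as UTF-8.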
-- luigi