On 3/12/2015 9:41 PM, luigi scarso wrote:
On Thu, Mar 12, 2015 at 7:55 PM, Hans Hagen <pragma@wxs.nl> wrote:
it's actually a bug ... it is ok to map an invalid character in the input to 0xFFFD, halt, and continue when permitted, but the method used in luatex thereby obscures a valid 0xFFFD in the input
FFFD REPLACEMENT CHARACTER • used to replace an incoming character whose value is unknown or unrepresentable in Unicode
the question is not what to do when an invalid character comes in: in that case luatex can replace it by 0xFFFD and issue an error as now. but when the input itself has an 0xFFFD then luatex should just carry on, as 0xFFFD is a *valid* character

it is quite easy for a macro package to trigger an error, as \catcode"FFFD=15 will do that, but it's impossible for a macro package to intercept the weird interception by luatex's input handler
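To make the distinction concrete, here is a minimal sketch in Python. It is not luatex's input handler; the function name `decode_tracking_errors` is hypothetical and only illustrates the idea. A reader that records whether it had to substitute 0xFFFD can tell an invalid byte apart from a genuine 0xFFFD in the input, while a reader that only inspects the decoded text cannot.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def decode_tracking_errors(data: bytes):
    """Decode UTF-8 and report whether any invalid input was replaced.

    Hypothetical helper: returns (text, had_invalid_input).
    """
    try:
        # Strict decoding succeeds for any well-formed input,
        # including input that legitimately contains U+FFFD.
        return data.decode("utf-8", errors="strict"), False
    except UnicodeDecodeError:
        # Only here was a substitution actually needed.
        return data.decode("utf-8", errors="replace"), True


# Well-formed input that contains a real U+FFFD (bytes EF BF BD).
valid = "a\ufffdb".encode("utf-8")
# Malformed input: 0xFF can never appear in UTF-8.
invalid = b"a\xffb"

text_valid, bad_valid = decode_tracking_errors(valid)
text_invalid, bad_invalid = decode_tracking_errors(invalid)

# The decoded text is the same in both cases, so checking the text
# for U+FFFD cannot distinguish them; the separate flag can.
print(text_valid == text_invalid, bad_valid, bad_invalid)
```

With the flag kept separate, an error would only need to be raised for the second input, and the first could be passed on untouched.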
The meaning of FFFD is not "typeset a question mark on a black box" as in � (which depends on the font in any case, so in principle it's possible to see something completely different in a new version of the font) but to signal something potentially wrong, with a symbol that currently in most cases is �. Misusing the meaning is not bad per se, but in this specific case I think luatex is correct to be conservative and ask the user what to do; context --batchmode typesets the document, writes the messages to the log, and ends with -1, so an automatic agent is also alerted.
you cannot force a user to use \batchmode, and -1 would abort a wrapper, thereby leading to an invalid document; it means that luatex can never typeset a document where char 0xFFFD is being typeset, and luatex should not be normative

not accepting 0xFFFD in the input is a bug

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
| www.pragma-pod.nl
-----------------------------------------------------------------