David Kastrup wrote:
Taco Hoekwater writes:
David Kastrup wrote:
So the kind of utf-8 support (OTP or something) used for Omega needs to be somewhat optional.
No, the error is simply a bug. All I/O characters that are visible to the bare engine are, and will be, utf-8 encoded.
What is "the bare engine"? From the TeX side, one sees Unicode characters.
If you want to do bare bytes, you have to preprocess them in lua.
How do you interpret input bytes that don't form valid utf-8 sequences? As long as they are preserved in some recognizable manner, it should be possible to do this sort of reverse conversion to the original bytes, but it certainly does not sound like it would make for attractive speed.
you can define a callback that will intercept each line and do whatever you want with the content, as long as what you pipe back into tex is utf-8

the internal dataflow is utf-8 and, as the manual states, getting non-utf (8 bit) output is a matter of remapping to a reserved private area in unicode (for instance, pdf literals may need 8 bit instead of utf, and that's how it's done)

this keeps luatex internally clean, but permits macro writers to do what they want; it's also the principle of luatex ... provide access and points of interception but stay as clean as possible internally

anyhow, good old tex was never 8 bit clean (at least not till recently and then only with natural.tcx or -8bit)

also keep in mind that macro packages need to adapt to luatex and not the reverse :-)

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74
www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------
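
The remapping idea discussed in this thread (intercept each input line, hand only utf-8 to the engine, and park stray 8-bit bytes in a private area so they can be recovered later) can be illustrated with a small sketch. This is not LuaTeX code and not its actual callback interface: it is plain Python, and the names `PRIVATE_BASE`, `bytes_to_internal`, `internal_to_bytes` and `process_line` are hypothetical. The private-area base chosen here is an assumption for illustration only; the thread does not say which code points LuaTeX reserves.

```python
import codecs

# Assumption: a hypothetical base inside Supplementary Private Use Area-A.
# The value actually reserved by LuaTeX is not stated in this thread.
PRIVATE_BASE = 0xF0000


def _to_private(exc):
    """Decode error handler: map each undecodable byte to PRIVATE_BASE + byte."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    return "".join(chr(PRIVATE_BASE + b) for b in bad), exc.end


codecs.register_error("to_private", _to_private)


def bytes_to_internal(raw: bytes) -> str:
    """Turn one raw input line into Unicode text, preserving invalid bytes."""
    return raw.decode("utf-8", errors="to_private")


def internal_to_bytes(text: str) -> bytes:
    """Reverse conversion: private-area characters become single raw bytes."""
    out = bytearray()
    for ch in text:
        offset = ord(ch) - PRIVATE_BASE
        if 0 <= offset <= 0xFF:
            out.append(offset)             # remapped byte, emit as 8 bit
        else:
            out.extend(ch.encode("utf-8"))  # ordinary character
    return bytes(out)


def process_line(raw: bytes) -> bytes:
    """Stand-in for a per-line input hook: whatever comes in, utf-8 goes out."""
    return bytes_to_internal(raw).encode("utf-8")
```

With this sketch, a line that mixes a stray Latin-1 byte with valid utf-8, such as `b"caf\xe9 \xce\xb1"`, survives a round trip through `bytes_to_internal` and `internal_to_bytes` unchanged. One limitation of the approach as sketched: input that already contains characters from the chosen private range would be turned into raw bytes on the way out, so the mapping is only lossless if that range is otherwise unused.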