Re: [Dev-luatex] Utf-8 too dominant?

27 Mar 2007

David Kastrup wrote:
...
Taco Hoekwater writes:
...
David Kastrup wrote:
...
So the kind of utf-8 support (OTP or something) used for Omega needs
to be somewhat optional.
No, the error is simply a bug. All I/O characters that are visible to
the bare engine are, and will be, utf-8 encoded.
What is "the bare engine"?  From the TeX side, one sees Unicode
characters.
...
If you want to do bare bytes, you have to preprocess them in lua.
How do you interpret input bytes that don't form valid utf-8
sequences?  As long as they are preserved in some recognizable manner,
it should be possible to do this sort of reverse conversion to the
original bytes, but it certainly does not sound like it would make for
attractive speed.
you can define a callback that will intercept each line and do whatever
you want with the content, as long as what you pipe back into tex is utf-8
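
To make the per-line idea concrete, here is a minimal sketch of the transformation such a callback might perform. It is written in Python only so it can be run on its own. In LuaTeX itself this would be a Lua function registered as a callback (the current manual lists `process_input_buffer` for this purpose, which may differ from the 2007 API). The source encoding (ISO-8859-1) and the names `latin1_line_to_utf8` and `filter_lines` are assumptions made for illustration, not part of LuaTeX.

```python
def latin1_line_to_utf8(raw_line: bytes) -> bytes:
    """Re-encode one input line from ISO-8859-1 (assumed) to UTF-8."""
    # Every byte 0x00-0xFF is a valid Latin-1 character, so decoding cannot fail.
    return raw_line.decode("latin-1").encode("utf-8")


def filter_lines(raw_lines, convert=latin1_line_to_utf8):
    """Stand-in for the engine calling the callback once per input line."""
    for raw_line in raw_lines:
        yield convert(raw_line)


# Two legacy 8-bit lines: "café" and "naïve" in Latin-1.
legacy_input = [b"caf\xe9", b"na\xefve"]
converted = list(filter_lines(legacy_input))
```

Whatever the callback does internally, the only contract described here is that each line it hands back is valid utf-8.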

the internal dataflow is utf-8 and, as the manual states, getting non-utf
(8 bit) output is a matter of remapping to a reserved private area in
unicode (for instance, pdf literals may need 8 bit instead of utf, and
that's how it's done)
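
A rough sketch of that remapping idea, again in Python so it runs on its own: raw bytes are represented internally as code points in a reserved 256-slot block, and the output stage turns those code points back into single bytes while everything else is written as utf-8. The block base `RAW_BYTE_BASE` (placed here in Supplementary Private Use Area-B) and the function names are hypothetical. The range LuaTeX actually reserves is whatever the manual specifies.

```python
# Hypothetical base of a 256-code-point block reserved for raw bytes.
# U+10FE00..U+10FEFF lies inside Supplementary Private Use Area-B.
RAW_BYTE_BASE = 0x10FE00


def byte_to_placeholder(byte_value: int) -> str:
    """Represent one raw byte as a private-use code point."""
    if not 0 <= byte_value <= 0xFF:
        raise ValueError("expected a value in 0..255")
    return chr(RAW_BYTE_BASE + byte_value)


def emit(text: str) -> bytes:
    """Write text as UTF-8, except placeholders, which become single raw bytes."""
    out = bytearray()
    for ch in text:
        offset = ord(ch) - RAW_BYTE_BASE
        if 0 <= offset <= 0xFF:
            out.append(offset)               # placeholder: emit the original byte
        else:
            out.extend(ch.encode("utf-8"))   # ordinary character: emit UTF-8
    return bytes(out)


# "é" as a normal character, followed by the raw byte 0xE9.
mixed_output = emit("\u00e9" + byte_to_placeholder(0xE9))
```

Because the mapping is one-to-one, the same scheme can in principle preserve input bytes that are not valid utf-8 and recover them later, at the cost of the extra conversion step.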

this keeps luatex internally clean, but permits macro writers to do what 
they want; it's also the principle of luatex ... provide access and 
points of interception but stay as clean as possible internally

anyhow, good old tex was never 8 bit clean (at least not till recently 
and then only with natural.tcx or -8bit)

also keep in mind that macro packages need to adapt to luatex and not
the reverse :-)
Hans

-- 

-----------------------------------------------------------------
                                          Hans Hagen | PRAGMA ADE
              Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                             | www.pragma-pod.nl
-----------------------------------------------------------------