Re: [Dev-luatex] Filename encoding

12 Dec 2013

2013/12/9, Philipp Stephani:
...
It's not that easy. For Windows, you need to convert the code points to
UTF-16...
or pass it without conversion. Characters beyond the basic multilingual
plane in filenames need not be allowed.
...
and then use _wfopen. For OS X and Linux, you need to convert it to
UTF-8 and then call fopen. In such cases it's often easier to only store
one version internally (e.g. the UTF-8 version)
or just the string of code points as it is stored internally by LuaTeX
(I think it is a string of ints or unsigned integers; I can't remember
now).
...
and then convert to the
system encoding at the very edge of the program, i.e., replace all calls to
fopen by a wrapper function that fans out to fopen or _wfopen depending on
the operating system. I tried this once with LuaTeX, but never finished
because I really underestimated the amount of work required. fopen is
called from dozens of places, and there are other filesystem functions to
take care of. In essence you need to replace each call to any filesystem
function. There are some drop-in wrappers available, e.g. GLib (
https://developer.gnome.org/glib/2.38/glib-File-Utilities.html#g-fopen).
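
To make the wrapper idea concrete, here is a minimal sketch in C of a
function that fans out to fopen or _wfopen. It assumes the filename is
kept internally as UTF-8; the name fopen_utf8 is hypothetical and is not
taken from the LuaTeX sources. On Windows it uses MultiByteToWideChar to
obtain UTF-16, elsewhere it passes the bytes straight to fopen.

```c
#include <assert.h>
#include <stdio.h>
#ifdef _WIN32
#include <stdlib.h>
#include <wchar.h>
#include <windows.h>
#endif

/* Hypothetical drop-in wrapper: open a file whose name is given in UTF-8.
   Returns NULL on failure, like fopen. */
FILE *fopen_utf8(const char *path, const char *mode)
{
    assert(path != NULL && mode != NULL);
#ifdef _WIN32
    /* First pass: ask for the required lengths in wchar_t units,
       terminator included; invalid UTF-8 is rejected. */
    int plen = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                   path, -1, NULL, 0);
    int mlen = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                   mode, -1, NULL, 0);
    if (plen <= 0 || mlen <= 0)
        return NULL;

    wchar_t *wpath = malloc((size_t)plen * sizeof *wpath);
    wchar_t *wmode = malloc((size_t)mlen * sizeof *wmode);
    FILE *f = NULL;

    /* Second pass: convert to UTF-16, then call the wide-character API. */
    if (wpath != NULL && wmode != NULL
        && MultiByteToWideChar(CP_UTF8, 0, path, -1, wpath, plen) > 0
        && MultiByteToWideChar(CP_UTF8, 0, mode, -1, wmode, mlen) > 0)
        f = _wfopen(wpath, wmode);

    free(wpath);
    free(wmode);
    return f;
#else
    /* OS X and Linux: the file system API takes the UTF-8 bytes as they are. */
    return fopen(path, mode);
#endif
}
```

A call such as fopen_utf8("whatever\xc3\xa1\xc3\xa9\xc3\xa8.tex", "r")
would then behave the same on every platform. The other filesystem
functions (stat, remove, rename, opendir and so on) would each need a
similar wrapper, which is where most of the work lies.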
I thought, as you once did, that the amount of work required was
small. In any case, this is something that ought to have been
programmed from the outset but has been left undone till now. To call
it by its name, this is a bug. Writing

\input whateveráéè.tex

and LuaTeX not finding the file is a bug. As a Spanish speaker this is
not a serious issue for me, but I wonder how people using different
scripts, e.g. Greek, Russian, Hebrew, etc., and using Windows manage to
get around this problem. Is it that they just don't \input files?
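
For reference, the conversion from code points discussed earlier in this
message is small in itself. The following is only a sketch, assuming
the filename is available as a sequence of 32-bit code points; the
function names encode_utf8 and encode_utf16 are hypothetical and not
LuaTeX's. Each function encodes one code point and returns the number of
units written, or 0 for a surrogate or out-of-range value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one code point as UTF-8 (for fopen on OS X and Linux).
   Returns the number of bytes written (1 to 4), or 0 if cp is invalid. */
size_t encode_utf8(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                     /* surrogates are not characters */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                             /* beyond U+10FFFF */
}

/* Encode one code point as UTF-16 (for _wfopen on Windows).
   Returns the number of 16-bit units written (1 or 2), or 0 if invalid. */
size_t encode_utf16(uint32_t cp, uint16_t out[2])
{
    assert(out != NULL);
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;
        out[0] = (uint16_t)cp;            /* BMP: passed without conversion */
        return 1;
    }
    if (cp <= 0x10FFFF) {
        cp -= 0x10000;                    /* beyond the BMP: surrogate pair */
        out[0] = (uint16_t)(0xD800 | (cp >> 10));
        out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        return 2;
    }
    return 0;
}
```

For characters in the basic multilingual plane the UTF-16 case is just a
cast, which is why such code points can be passed without conversion;
only code points above U+FFFF need the surrogate-pair branch.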


Javier Múgica de Rivera