[Dev-luatex] Filename encoding

Javier Múgica de Rivera javieraritz.ribadeo at gmail.com
Thu Dec 12 11:44:38 CET 2013

2013/12/9, Philipp Stephani <p.stephani2 at gmail.com>:
> It's not that easy. For Windows, you need to convert the code points to
> UTF-16...

or pass them without conversion: a code point in the Basic Multilingual
Plane is already a single UTF-16 code unit. Characters beyond the Basic
Multilingual Plane in filenames need not be allowed.

>and then use _wfopen. For OS X and Linux, you need to convert it to
> UTF-8 and then call fopen. In such cases it's often easier to only store
> one version internally (e.g. the UTF-8 version)

or just the string of code points as it is stored internally by LuaTeX
(I think it is an array of int or unsigned int; I can't remember which).

>and then convert to the
> system encoding at the very edge of the program, i.e., replace all calls to
> fopen by a wrapper function that fans out to fopen or _wfopen depending on
> the operating system. I tried this once with LuaTeX, but never finished
> because I really underestimated the amount of work required. fopen is
> called from dozens of places, and there are other filesystem functions to
> take care about. In essence you need to replace each call to any filesystem
> function. There are some drop-in wrappers available, e.g. GLib (
> https://developer.gnome.org/glib/2.38/glib-File-Utilities.html#g-fopen).

I thought, as you once did, that the amount of work required was
small. In any case, this is something that ought to have been
programmed from the outset but has been left undone until now. To call
it by its name, this is a bug. Writing

\input whateveráéè.tex

and LuaTeX not finding the file is a bug. As a Spanish speaker this is
not a serious issue for me, but I wonder how people using different
scripts, e.g. Greek, Russian, Hebrew, etc. and using Windows manage to
get around this problem. Is it that they just don't \input files?
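
For what it is worth, the wrapper Philipp describes could look roughly
like the sketch below. It is only an illustration, not LuaTeX code: the
names cp_to_utf8, cp_to_utf16 and fopen_codepoints are made up, and it
assumes the file name is available as an array of unsigned int code
points, which I have not checked against the actual sources. On Windows
it converts to UTF-16 and calls _wfopen; elsewhere it converts to UTF-8
and calls fopen.

```c
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#ifdef _WIN32
#include <wchar.h>
#endif

/* A code point is usable in a file name here if it is nonzero,
   in the Unicode range, and not a surrogate. */
static int cp_valid(unsigned int c)
{
    return c != 0 && c <= 0x10FFFF && !(c >= 0xD800 && c <= 0xDFFF);
}

/* Encode n code points as a malloc'ed, NUL-terminated UTF-8 string.
   Returns NULL on invalid input or allocation failure. */
char *cp_to_utf8(const unsigned int *cp, size_t n)
{
    char *out = malloc(4 * n + 1);      /* at most 4 bytes per code point */
    size_t i, k = 0;
    if (out == NULL) return NULL;
    for (i = 0; i < n; i++) {
        unsigned int c = cp[i];
        if (!cp_valid(c)) { free(out); return NULL; }
        if (c < 0x80) {
            out[k++] = (char)c;
        } else if (c < 0x800) {
            out[k++] = (char)(0xC0 | (c >> 6));
            out[k++] = (char)(0x80 | (c & 0x3F));
        } else if (c < 0x10000) {
            out[k++] = (char)(0xE0 | (c >> 12));
            out[k++] = (char)(0x80 | ((c >> 6) & 0x3F));
            out[k++] = (char)(0x80 | (c & 0x3F));
        } else {
            out[k++] = (char)(0xF0 | (c >> 18));
            out[k++] = (char)(0x80 | ((c >> 12) & 0x3F));
            out[k++] = (char)(0x80 | ((c >> 6) & 0x3F));
            out[k++] = (char)(0x80 | (c & 0x3F));
        }
    }
    out[k] = '\0';
    return out;
}

/* Encode n code points as a malloc'ed, zero-terminated UTF-16 string.
   Code points above U+FFFF become a surrogate pair. */
unsigned short *cp_to_utf16(const unsigned int *cp, size_t n)
{
    unsigned short *out = malloc((2 * n + 1) * sizeof *out);
    size_t i, k = 0;
    if (out == NULL) return NULL;
    for (i = 0; i < n; i++) {
        unsigned int c = cp[i];
        if (!cp_valid(c)) { free(out); return NULL; }
        if (c < 0x10000) {
            out[k++] = (unsigned short)c;   /* BMP: passed through as is */
        } else {
            c -= 0x10000;
            out[k++] = (unsigned short)(0xD800 | (c >> 10));
            out[k++] = (unsigned short)(0xDC00 | (c & 0x3FF));
        }
    }
    out[k] = 0;
    return out;
}

/* Hypothetical drop-in for fopen that takes the name as code points. */
FILE *fopen_codepoints(const unsigned int *cp, size_t n, const char *mode)
{
    FILE *f;
#ifdef _WIN32
    wchar_t wmode[16];
    size_t i;
    unsigned short *wname = cp_to_utf16(cp, n);
    if (wname == NULL) return NULL;
    /* fopen mode strings are ASCII, so widening each char is enough */
    for (i = 0; mode[i] != '\0' && i < 15; i++)
        wmode[i] = (wchar_t)(unsigned char)mode[i];
    wmode[i] = L'\0';
    /* assumes wchar_t is 16 bits wide, as it is on Windows */
    f = _wfopen((const wchar_t *)wname, wmode);
    free(wname);
#else
    char *name = cp_to_utf8(cp, n);
    if (name == NULL) return NULL;
    f = fopen(name, mode);
    free(name);
#endif
    return f;
}
```

The conversion is the easy part. As Philipp says, the real work is that
every call to fopen and to the other filesystem functions would have to
go through a wrapper of this kind.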
