Hi,

I'm using Windows. LuaTeX interprets the command line as if it were UTF-8, so I cannot write accents in the file names. E.g.:
luaplainJA Canción.tex
This is LuaTeX, Version beta-0.76.0-2013052306 (rev 4627)
! String contains an invalid utf-8 sequence.
<*> Canci ¾n.tex
?
Can't LuaTeX just ask some C library I/O function to open the name it was passed, without trying to be too smart about parsing the name?

If this is not possible, those of you who use Windows, how do you solve it?

Not being able to write accents or other kinds of characters in the file name is a déjà vu from the protohistory of informatics. If LuaTeX cannot just open the file, it should at least process the command line, transforming it from the platform's locale-specific encoding to UTF-8, and then pass the arguments and file name to whatever routines it wants.

Regards,
--
Javier A. M.
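The garbled `Canci ¾n.tex` in the log is consistent with the argument arriving as Windows-1252 bytes and being echoed on a code page 850 console. The thread does not state which code pages were in use, so both are assumptions in this minimal Python sketch, which only shows why the same bytes are rejected as UTF-8 and how they would be displayed:

```python
# Assumption: the argument reached LuaTeX as Windows-1252 bytes and the
# console echoed them using code page 850. Neither is stated in the thread.
name = "Canción.tex"
raw = name.encode("cp1252")  # the letter ó becomes the single byte 0xF3


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# 0xF3 announces a multi-byte UTF-8 sequence, but the next byte is a plain
# ASCII 'n', so the string is not valid UTF-8.
raw_is_utf8 = is_valid_utf8(raw)

# The same byte read through code page 850 is the three-quarters sign.
shown_on_cp850 = raw.decode("cp850")
```

Under these assumptions `raw_is_utf8` is false and `shown_on_cp850` contains `¾` where the `ó` was, which matches the character shown in the error message.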
On 12/3/2013 11:05 AM, Javier Múgica de Rivera wrote:
Hi,
I'm using Windows. LuaTeX interprets the command line as if it were UTF-8, so I cannot write accents in the file names. E.g.:

luaplainJA Canción.tex
This is LuaTeX, Version beta-0.76.0-2013052306 (rev 4627)
! String contains an invalid utf-8 sequence.
<*> Canci ¾n.tex
?

Can't LuaTeX just ask some C library I/O function to open the name it was passed, without trying to be too smart about parsing the name?

If this is not possible, those of you who use Windows, how do you solve it?

Not being able to write accents or other kinds of characters in the file name is a déjà vu from the protohistory of informatics. If LuaTeX cannot just open the file, it should at least process the command line, transforming it from the platform's locale-specific encoding to UTF-8, and then pass the arguments and file name to whatever routines it wants.
this is a tricky issue, as jobnames can end up anywhere in (lua)tex, so also in places where it definitely has to be utf

also, there is nothing that forbids filenames to have any characters, so there is no robust way to identify in what encoding (codepage) the filename is, especially as files can come from anywhere (a unix based nas running samba, external resources like graphic studios using osx or whatever, some script that makes up a name)

(you can of course create files with utf 8 filenames on windows)

ps. this is not unique to windows and/or luatex ... i've run into issues with media servers running on linux machines that also had issues, not barking like tex, but for instance entering loops, which to some extent is worse

(one option is to write a wrapper script that translates from your current codepage to utf, hoping of course that you don't get files from someplace else with another encoding; internally windows uses utf16 but i'm not sure if that helps much)

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------
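The transcoding step of such a wrapper could look roughly like the Python sketch below. The function name `to_utf8` and the sample bytes are hypothetical, and the sketch deliberately leaves out how the resulting bytes would be handed to LuaTeX, since that depends on the platform. It also illustrates the caveat about guessing the code page: the same bytes give different names depending on the assumption.

```python
def to_utf8(raw: bytes, codepage: str) -> bytes:
    """Re-encode a file name from the given code page to UTF-8.

    Raises UnicodeDecodeError if raw is not valid in that code page.
    """
    return raw.decode(codepage).encode("utf-8")


# Hypothetical input: the bytes of an argument as some shell delivered them.
raw = b"Canci\xf3n.tex"

# The result depends entirely on which code page the wrapper assumes.
as_cp1252 = to_utf8(raw, "cp1252").decode("utf-8")
as_cp850 = to_utf8(raw, "cp850").decode("utf-8")
```

Here `as_cp1252` is the intended `Canción.tex`, while `as_cp850` has `¾` in place of `ó`. Both are valid UTF-8, so a wrong guess does not raise an error; it silently produces a different file name.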
On 2013-12-03 12:05, Javier Múgica de Rivera wrote:
Can't LuaTeX just ask some C library I/O function to open the name it was passed, without trying to be too smart about parsing the name?

Quite the opposite. The Windows command shell passes accented letters in an 8-bit DOS encoding (prehistoric, don't you see?). So, if LuaTeX were extremely pedantic, it could convert these characters to UTF-16, which is how file names are actually stored in the filesystem, and only in those cases where the input comes directly from the terminal. At the moment LuaTeX is quite simple and clear about any input: it is UTF-8.

Maybe a solution would be to use .bat files in which the command lines are written in UTF-8? Just a guess. You could try the Cygwin terminal as well; it uses UTF-8 for input. What about Total Commander? I don't have it at hand at the moment.

Not being able to write accents or other kinds of characters in the file name is a déjà vu from the protohistory of informatics.

Think of file names as a kind of variable name. Do you feel any discomfort about the lack of accents in those? Is this really the very first time you have had problems with accented file names? Especially across different filesystems, as Hans mentioned?
Mindaugas.
On 2013-12-03 12:05, Javier Múgica de Rivera wrote:
If LuaTeX cannot just open the file, it should at least process the command line, transforming it from the platform's locale-specific encoding to UTF-8, and then pass the arguments and file name to whatever routines it wants.

Anyway, I could grep neither CreateFile nor CreateFileW in the TeX part of LuaTeX, so I doubt a simple fopen() will handle accented file names. Another option would be to use the short 8.3 file name variants. Use dir /X to obtain them.

Regards,
Mindaugas.
participants (4)
- Hans Hagen
- Javier Múgica de Rivera
- Khaled Hosny
- Mindaugas Piešina