On 12/29/2013 10:10 AM, Khaled Hosny wrote:
> On Sun, Dec 29, 2013 at 01:07:15AM +0000, Philipp Maximilian Stephani wrote:
>> but honestly, we're not living in the 1960s any more.
>
> No, we are not, but Windows is.

I always wonder why folks need comments like this. I can come up with Linux aspects that are 1960. I more and more tend to ignore discussions (and mails) that have this OS bad-this-or-that undertone. (And I've left mailing lists because of it.) If Windows was that bad, then why do desktop builders try to mimic it? Much is a matter of getting accustomed to.

Anyway, if at some point utf16 had become the favourite (on Linux) we would be in bigger problems, as it can have many zeros in strings. At least Windows could serve multiple GUI languages rather early, so we have to live with some aspects (large companies wouldn't like sudden changes and want to use programs for decades).

Fwiw: it's comparable to (MySQL) database content, where different assumptions about what bytes represent can give weird side effects. It's about mutual agreements.

Lua(tex) is rather neutral with respect to what bytes go into a filename: if I save some data using a utf8 filename (from Lua for instance) I can perfectly well reload that file. Some applications will show proper (utf8) names; others, like 'dir' in the console, will show the bytes as e.g. latin. Not much different from what one gets when one logs into a remote machine with a different terminal setup.

Which reminds me: last week I entered a Lua interactive console on Ubuntu and magically ^3 was turned into a superscript Unicode 3 character ... so, talking of a mess up ... to some extent I can understand such default behaviour, so I'll live with it.

It's cut 'n paste and the assumptions of other applications that (at least on Windows) can turn something utf8 into something looking weird. It's really not much different from typesetting a utf8 encoded document in a tex that expects 8 bit texnansi encoding. The typeset stream looks weird but in fact is honest utf8 visualized.

Of course we could introduce an abstract filename object (including all the attributes that relate to a file) but it's not really a solution.
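
To make the remarks above about zero bytes in utf16 and about utf8 names being shown as latin a bit more concrete, here is a small sketch in Python (not something from Lua(tex) itself). The filename and the helper c_string_view are made up for the illustration, and little-endian utf16 without a byte order mark is an assumption.

```python
# Hypothetical filename, chosen only for illustration: "één.tex"
name = "\u00e9\u00e9n.tex"

utf8_bytes = name.encode("utf-8")
utf16_bytes = name.encode("utf-16-le")  # assumption: little-endian, no BOM

# utf8 only produces a zero byte for the NUL character itself.
utf8_has_zero = 0 in utf8_bytes

# In utf16 every character up to U+00FF carries one zero byte.
utf16_zero_count = utf16_bytes.count(0)


def c_string_view(data: bytes) -> bytes:
    """Return what a reader of a zero-terminated string would see."""
    return data.split(b"\x00", 1)[0]


# The utf16 name is cut off at the first zero byte.
clipped = c_string_view(utf16_bytes)

# The same utf8 bytes, shown by a tool that assumes latin-1.
as_latin1 = utf8_bytes.decode("latin-1")

print(utf8_has_zero, utf16_zero_count, len(clipped))
print(as_latin1)
```

The utf8 bytes survive a zero-terminated string untouched, while the utf16 bytes are cut after the first byte, and the latin-1 view is the kind of weird-looking name a console can show.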

Simply converting utf8 encoded filenames into utf16 doesn't work out well, because in between we use C-strings and these have this 1960s property of being zero-terminated. For the common latin characters every second byte of a utf16 string is zero, so in practice one ends up with utf16 names clipped to length 1.

When on Windows one mixes applications in a workflow, it is important to make sure that one doesn't get code page translations in the way. Annoying indeed, but using computers is full of annoyances. You don't want to know what troubles we sometimes have with graphics coming from Apple infrastructures to Linux infrastructure where users. Filenames are always a bit of an issue.

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
| www.pragma-pod.nl
-----------------------------------------------------------------