Salvete, while I am aware that my Japanese is ages away from creating anything releasable, I thought about creating a lang-jap.tex file for my personal use (and maybe for having it corrected by someone actually speaking the language). Now, checking lang-chi.tex, I find it is encoded in a way I don't really want to copy. I'd much rather write the whole file in “proper” UTF-8. Is it possible to simply enclose the file in a \startregime[utf]...\stopregime pair or do I risk havoc by doing this? (Should I start with this project, I'll have more questions, such as: How do I make a Unicode character such as 。 active, for good line breaks?) Christopher
Christopher Creutzig said this at Thu, 22 Sep 2005 14:17:57 +0200:
Is it possible to simply enclose the file in a \startregime[utf]...\stopregime pair or do I risk havoc by doing this?
Well, if you're using a regime, it still (usually) depends on symbolic character names being defined under the hood. Also, such an approach (explicitly calling \startregime[utf]) doesn't make XeTeX as happy as it could be (XeTeX is happiest if you just pass through Unicode characters. Regimes imply ConTeXt processing.) This isn't a definitive answer, just a couple of issues off the top of my head. -- =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Adam T. Lindsay, Computing Dept. atl@comp.lancs.ac.uk Lancaster University, InfoLab21 +44(0)1524/510.514 Lancaster, LA1 4WA, UK Fax:+44(0)1524/510.492 -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Adam Lindsay wrote:
Is it possible to simply enclose the file in a \startregime[utf]...\stopregime pair or do I risk havoc by doing this?
Well, if you're using a regime, it still (usually) depends on symbolic character names being defined under the hood. Also, such an approach
Sure. But editing the file is oh so much easier when I can just type \def\japChapterNumber#1{第#1章} than if I have to look up the Unicode numbers first and type \def\japChapterNumber#1{\uchar{123}{44}#1\uchar{122}{224}}
(explicitly calling \startregime[utf]) doesn't make XeTeX as happy as it could be (XeTeX is happiest if you just pass through Unicode characters.
That implies that ConTeXt should switch off all conversions when running in XeTeX and seeing \startregime[utf], right? (I certainly want to use the whole thing in XeTeX, if I ever do start it. I would prefer not to make the code depend on that. I could live with some \if... switches at the beginning and end, sure.) Christopher
Christopher Creutzig said this at Thu, 22 Sep 2005 18:25:01 +0200:
Sure. But editing the file is oh so much easier when I can just type \def\japChapterNumber#1{第#1章} than if I have to look up the Unicode numbers first and type \def\japChapterNumber#1{\uchar{123}{44}#1\uchar{122}{224}}
True, but this is scriptable...
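A minimal sketch of what such a script could look like, in Python. It assumes, based only on the example in this thread, that \uchar takes the high and low byte of the code point as decimal numbers (第 is U+7B2C, i.e. 123 and 44). The function name to_uchar is hypothetical, and characters beyond U+FFFF are rejected because a two-byte form cannot represent them.

```python
import re

def to_uchar(text):
    """Replace every non-ASCII character with \\uchar{high}{low}.

    Assumption: \\uchar expects the high and low byte of the code
    point in decimal, as in the hand-written example in this thread.
    """
    def replace(match):
        code = ord(match.group(0))
        if code > 0xFFFF:
            # Two bytes cannot hold code points above the BMP.
            raise ValueError("U+%X does not fit into two bytes" % code)
        return "\\uchar{%d}{%d}" % (code >> 8, code & 0xFF)

    # Match any single character outside the 7-bit ASCII range.
    return re.sub(r"[^\x00-\x7f]", replace, text)

print(to_uchar(r"\def\japChapterNumber#1{第#1章}"))
```

Run on the readable definition, this should print the same \uchar{123}{44}#1\uchar{122}{224} body that was typed by hand above, so the file could be edited in UTF-8 and converted as a separate step.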
That implies that ConTeXt should switch off all conversions when running in XeTeX and seeing \startregime[utf], right?
That's a good point, actually. In fact, XeTeX now has its own regime-like mechanism (\XeTeXinputencoding "charset-name") that someone could/should address, either bypassing ConTeXt's existing regimes or supplementing them. I'm not really active in that space right now, so I can't do it, but I'd be willing to give hints to someone who wants this feature. adam
Christopher Creutzig wrote:
Adam Lindsay wrote:
Is it possible to simply enclose the file in a \startregime[utf]...\stopregime pair or do I risk havoc by doing this?
Well, if you're using a regime, it still (usually) depends on symbolic character names being defined under the hood. Also, such an approach
Sure. But editing the file is oh so much easier when I can just type \def\japChapterNumber#1{第#1章} than if I have to look up the Unicode numbers first and type \def\japChapterNumber#1{\uchar{123}{44}#1\uchar{122}{224}}
(explicitly calling \startregime[utf]) doesn't make XeTeX as happy as it could be (XeTeX is happiest if you just pass through Unicode characters.
If XeTeX handles UTF-8 by just looking at whether characters have the letter catcode, you don't need a regime; you just have to make sure that, when the file is loaded, the chars 128->255 have the right catcode:
\dostepwiserecurse{128}{255}{1}{\catcode\recurselevel=11\relax}
Hans
----------------------------------------------------------------- Hans Hagen | PRAGMA ADE Ridderstraat 27 | 8061 GH Hasselt | The Netherlands tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com | www.pragma-pod.nl -----------------------------------------------------------------
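The range 128 to 255 in the loop above is no accident: in UTF-8, every byte of a non-ASCII character falls in that range, so giving those bytes catcode 11 (letter) lets them pass through untouched. The following Python sketch only illustrates that byte-level property for the characters mentioned in this thread. The helper name utf8_bytes is hypothetical, and whether XeTeX actually sees individual bytes at this stage is the open question Hans raises, not something this sketch settles.

```python
def utf8_bytes(ch):
    """Return the UTF-8 encoding of one character as a list of byte values."""
    return list(ch.encode("utf-8"))

for ch in "第章。":
    data = utf8_bytes(ch)
    # Every byte of a non-ASCII character lies in 128..255,
    # the range covered by the catcode loop.
    assert all(128 <= b <= 255 for b in data)
    print("U+%04X -> %s" % (ord(ch), data))
```

ASCII characters, by contrast, encode to a single byte below 128, so the loop leaves the catcodes of ordinary TeX syntax characters alone.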
participants (3): Adam Lindsay, Christopher Creutzig, Hans Hagen