Hans wrote:
chinese is not yet defined in utf so if you want that, we need to do it ... assuming this, how about making a set of tfm,enc,map files that match the unicode positions (volunteers ...)
I'm very willing to help, especially if there is some drudge work involved in constructing the files. I don't know enough (yet) about the logic of it all to help with setting up the system, but if someone can supply skeleton files and/or a method for constructing the necessary files, I'm happy to do any leg-work. Duncan
Duncan Hothersall wrote:
I'm very willing to help, especially if there is some drudge work involved in constructing the files. I don't know enough (yet) about the logic of it all to help with setting up the system, but if someone can supply skeleton files and/or a method for constructing the necessary files, I'm happy to do any leg-work.
what we need is a set of encoding files like
/UniEncoding52 [
....
/uni52DF
/uni52E0
/uni52E1
/uni52E2
/uni52E3
/uni52E4
...
/.notdef
....
] def
that represent the ranges and can be used to construct tfm files.
(or whatever index entry is needed in order to filter the metrics from
the ttf file)
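A minimal sketch of how one such encoding vector could be generated, one file per high byte. The function name and the `available` set (the code points the font actually provides, e.g. taken from a glyph list) are assumptions made for illustration, not part of any existing tool:

```python
def make_uni_encoding(high_byte, available):
    """Return the text of one 256-slot encoding vector for U+hh00..U+hhFF.

    high_byte: integer 0..255, e.g. 0x52
    available: set of code points the font provides (assumed input);
               every other slot is filled with /.notdef
    """
    lines = ["/UniEncoding%02X [" % high_byte]
    for low in range(256):
        codepoint = (high_byte << 8) | low
        if codepoint in available:
            lines.append("/uni%04X" % codepoint)
        else:
            lines.append("/.notdef")
    lines.append("] def")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # made-up coverage: pretend the font only has U+52DF..U+52E4
    print(make_uni_encoding(0x52, set(range(0x52DF, 0x52E5))))
```

Slot E1 of the resulting vector then holds /uni52E1, so the position inside the file is simply the low byte of the Unicode value.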
maybe patrick's font code can already do that:
- read in a ttf file (or a glyph list produced by ttf2tfm or ttf2afm)
- make a range of enc and tfm files
actually, this is rather generic, since pdftex can handle symbolic names
like /index... and /uni..., so if we have such a set, we can stick to
one bunch of enc files
the utf handler can then simply access char E1 from htsong-52.tfm
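The lookup described here can be sketched as follows: the high byte of the code point selects the tfm file and the low byte is the character slot. The function names and the lowercase hex in the file name are assumptions, and only code points up to U+FFFF fit this scheme:

```python
def subfont_slot(codepoint, base="htsong"):
    """Split a code point into (tfm name, slot within that tfm)."""
    if not 0 <= codepoint <= 0xFFFF:
        raise ValueError("only U+0000..U+FFFF fit a one-byte prefix")
    return "%s-%02x" % (base, codepoint >> 8), codepoint & 0xFF


def utf8_to_slots(data, base="htsong"):
    """Decode UTF-8 bytes and map each character to (tfm name, slot)."""
    return [subfont_slot(ord(ch), base) for ch in data.decode("utf-8")]


if __name__ == "__main__":
    name, slot = subfont_slot(0x52E1)
    print(name, "%02X" % slot)
```

For U+52E1 this yields the font htsong-52 and slot E1, matching the example in the text.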
testing is rather simple:
\pdfmapline{htsong-52
On 13 Dec 2005, at 10:52, Hans Hagen wrote:
what we need is a set of encoding files like
/UniEncoding52 [ .... /uni52DF /uni52E0 /uni52E1 /uni52E2 /uni52E3 /uni52E4 ... /.notdef .... ] def
I have written a Ruby script (for personal use, loosely based on Adam's xsl files) which generates all the encoding and symbol files from a given cmap file. If someone could send me the ttf font, I can generate all the necessary encoding files for you. Sjoerd
sjoerd siebinga wrote:
I have written a Ruby script (for personal use, loosely based on Adam's xsl files) which generates all the encoding and symbol files from a given cmap file. If someone could send me the ttf font, I can generate all the necessary encoding files for you.
the chinese fonts mentioned in the context garden qualify for such a treatment (htsong cum suis) Hans
On 13 Dec 2005, at 11:34, Hans Hagen wrote:
the chinese fonts mentioned in the context garden qualify for such a treatment (htsong cum suis)
Ok. Where can I send the chinese encodingfiles?
sjoerd siebinga wrote:
Ok. Where can I send the chinese encodingfiles?
you can send me a zip
maybe we should start thinking about how to set up a repository at https://foundry.supelec.fr/
taco and patrick have more experience in this area than i have, so maybe they have some ideas on how to organize this
Hans
Hi,
sjoerd siebinga wrote:
I have written a Ruby script (for personal use, loosely based on Adam's xsl files) which generates all the encoding and symbol files from a given cmap file. If someone could send me the ttf font, I can generate all the necessary encoding files for you.
Nice! The TrueType fonts recommended by Xiao Jianfeng are listed at http://wiki.contextgarden.net/Chinese
They are:
ftp://ftp.ctex.org/pub/tex/fonts/truetype/ttf/htfs.ttf
ftp://ftp.ctex.org/pub/tex/fonts/truetype/ttf/hthei.ttf
ftp://ftp.ctex.org/pub/tex/fonts/truetype/ttf/htkai.ttf
ftp://ftp.ctex.org/pub/tex/fonts/truetype/ttf/htsong.ttf
Richard Gabriel wrote:
But yet another question: What about Japanese? I've done only a little research so far, but unlike Chinese, there's almost no information about Japanese in TeX. How much work would it be to adapt the current "chinese" ConTeXt module for Japanese? What would you need for it? [Of course, meanwhile I'll investigate some other ways of typesetting Japanese...] (I don't know much about Japanese.)
In Japanese, unlike in Chinese, several character sets are mixed:
- The Chinese characters ("Kanji"), which seem to make up most of the (scientific) text I've seen.
- In addition, some pronunciation-based characters ("Kana"): Hiragana and Katakana. The former are rather round characters; the most prominent one in Japanese texts should be "の" [it means something like "of" in English]. Hiragana are mostly used for suffixes/prefixes where no Chinese equivalent exists, whereas Katakana is used to write words that have been taken from (mostly) European languages.
For Kanji there should be no problem with the Chinese module; for Kana you need additional support for these characters. Since they are pronunciation-based, each set consists of fewer than 50 characters.
Tobias
(Hmm, I never thought I would end up so deep in linguistics during my PhD thesis in physics. But having three Chinese in the group and regularly doing some measurements at a research centre in Taiwan, I couldn't help picking something up.)
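To make the distinction between the three character sets concrete, here is a rough sketch that classifies a character by its Unicode block. The function name is hypothetical, and only the main CJK Unified Ideographs block is treated as Kanji (extension blocks are ignored):

```python
def script_of(ch):
    """Classify one character as hiragana, katakana, kanji or other."""
    cp = ord(ch)
    if 0x3040 <= cp <= 0x309F:   # Hiragana block
        return "hiragana"
    if 0x30A0 <= cp <= 0x30FF:   # Katakana block
        return "katakana"
    if 0x4E00 <= cp <= 0x9FFF:   # CJK Unified Ideographs (main block only)
        return "kanji"
    return "other"


if __name__ == "__main__":
    for ch in "\u306e\u30ab\u52e1a":
        print("U+%04X" % ord(ch), script_of(ch))
```

A font set that covers only the ideograph range would therefore need extra subfonts for the two Kana blocks, which both start with the high byte 30.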
Tobias Burnus wrote:
(Hmm, I never thought I would end up so deep in linguistics during my PhD thesis in physics. But having three Chinese in the group and regularly doing some measurements at a research centre in Taiwan, I couldn't help picking something up.)
well, there is a certain charm in those characters, even if you cannot read them (during a 2*10 hour trip in a chinese bus during the last tug conference one quickly learns to recognize the symbols for gas stations and such -) browsing a chinese-english dictionary is also fun (i have a small one on my desk; some day i should start collecting dictionaries of all languages that context supports -); with a bit of puzzling one can find out the system behind the way words are made up Hans
Hans Hagen wrote:
what we need is a set of encoding files like
/UniEncoding52 [ .... /uni52DF /uni52E0
I hate to be negative, but I have doubts about how generic this approach may be. In some tentative experiments, I discovered that many (most?) CJK fonts don't use traditional postscript names, but rather map from unicode to an indexed glyph number. Fortunately, ttf2tfm's -w enco@Unicode@ notation seems to address this in most of the old test cases I tried.
adam
--
Adam T. Lindsay, Computing Dept.    atl@comp.lancs.ac.uk
Lancaster University, InfoLab21     +44(0)1524/510.514
Lancaster, LA1 4WA, UK              Fax:+44(0)1524/510.492
Adam Lindsay wrote:
I hate to be negative, but I have doubts about how generic this approach may be. In some tentative experiments, I discovered that many (most?) CJK fonts don't use traditional postscript names, but rather map from unicode to an indexed glyph number.
Fortunately, ttf2tfm's -w enco@Unicode@ notation seems to address this in most of the old test cases I tried.
afaik pdftex can handle the indexXXXX and uniXXXX entries as alternatives for glyphnames Hans
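For fonts without usable glyph names, the same kind of vector could be written with index entries instead. This is only a sketch: the `cmap` dictionary (code point to glyph index) is a hypothetical input, and the decimal `/indexNNN` spelling used here is an assumption that should be checked against the pdftex manual:

```python
def make_index_encoding(high_byte, cmap):
    """Build a 256-slot vector using glyph indices instead of names.

    cmap: dict mapping code point -> glyph index (hypothetical input,
          e.g. extracted from the font's cmap table by another tool)
    """
    lines = ["/UniEncoding%02X [" % high_byte]
    for low in range(256):
        glyph_index = cmap.get((high_byte << 8) | low)
        if glyph_index is None:
            lines.append("/.notdef")
        else:
            lines.append("/index%d" % glyph_index)
    lines.append("] def")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # made-up mapping: U+52E1 lives at glyph index 1234
    print(make_index_encoding(0x52, {0x52E1: 1234}))
```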
Hans Hagen wrote:
Adam Lindsay wrote:
Fortunately, ttf2tfm's -w enco@Unicode@ notation seems to address this in most of the old test cases I tried.
afaik pdftex can handle the indexXXXX and uniXXXX entries as alternatives for glyphnames
Yes. Sorry I wasn't clear on that. It's just that ttf2tfm is the tool that does a good job at extracting those entries when other tools fail.
participants (5)
- Adam Lindsay
- Duncan Hothersall
- Hans Hagen
- sjoerd siebinga
- Tobias Burnus