Hi,
is there someone on this list who has tried korean with mkiv? next week taco and i travel to korea (user group meeting) so we'd better know how to do korean
Hans
-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------
On 03.02.2009 at 21:27, Hans Hagen wrote:
Hi,
is there someone on this list who has tried korean with mkiv? next week taco and i travel to korea (user group meeting) so we'd better know how to do korean
Korean uses the same line-breaking rules as Chinese, but spaces in the input are not removed and remain in the output.
Wolfgang
2009/2/4 Wolfgang Schuster
On 03.02.2009 at 21:27, Hans Hagen wrote:
Hi,
is there someone on this list who has tried korean with mkiv? next week taco and i travel to korea (user group meeting) so we'd better know how to do korean
Korean uses the same line-breaking rules as Chinese, but spaces in the input are not removed and remain in the output.
Several months ago I tried Korean typesetting with MkIV, but the result was less than satisfactory. These are the minimum typesetting rules for Korean documents. ``Korean characters'' here means all Hangul syllables (U+AC00 .. U+D7A3) plus Chinese ideographs.
- Spaces between words should be preserved, as Wolfgang said.
- Line breaking should be allowed between Korean characters. Here we prefer \penalty50 or \discretionary{}{}{} rather than \hskip.
- Line breaking is allowed between a Korean character and a Latin character. Here we prefer \hskip0pt, so that hyphenation inside the Latin word remains possible.
- The Kumchik (``Kinsoku'' in Japanese) rules are the same as in Chinese or Japanese typesetting:
  * Line breaking should not occur after, for example, an opening parenthesis.
  * Line breaking should not occur before, for example, a closing parenthesis, comma, or full stop.
That's all.
Best,
Dohyun Kim
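For readers who want to experiment with these rules outside TeX, here is a minimal Python sketch of the pairwise decision they describe. It is not ConTeXt code: the function names, the returned labels, and the small sets of opening and closing characters are assumptions made for illustration, and only the basic CJK ideograph block is checked. Cases the message does not mention simply return "nothing".

```python
# Illustrative sketch of the minimum rules listed above (not ConTeXt code).
# OPENING/CLOSING are small assumed sample sets, not a complete Kumchik table.
OPENING = set("([{")
CLOSING = set(")]}.,")


def is_korean(ch):
    """Hangul syllables (U+AC00..U+D7A3) plus the basic CJK ideograph block."""
    cp = ord(ch)
    return 0xAC00 <= cp <= 0xD7A3 or 0x4E00 <= cp <= 0x9FFF


def is_latin(ch):
    """ASCII letters only, which is enough for this sketch."""
    return ch.isascii() and ch.isalpha()


def between(left, right):
    """Return a label for what would be inserted between two adjacent characters."""
    if left == " " or right == " ":
        return "keep-space"      # inter-word spaces are preserved
    if left in OPENING or right in CLOSING:
        return "nobreak"         # Kumchik: no break after opening / before closing
    if is_korean(left) and is_korean(right):
        return r"\penalty50"     # break allowed, no glue
    if (is_korean(left) and is_latin(right)) or (is_latin(left) and is_korean(right)):
        return r"\hskip0pt"      # break allowed, Latin hyphenation stays possible
    return "nothing"


if __name__ == "__main__":
    text = "한자(chinese characters)로는"
    for left, right in zip(text, text[1:]):
        print(repr(left), repr(right), between(left, right))
```

Running the file prints one decision per adjacent pair of the sample string, which makes it easy to compare against what a typesetting engine actually does.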
On 03.02.2009 at 21:27, Hans Hagen wrote:
is there someone on this list who has tried korean with mkiv? next week taco and i travel to korea (user group meeting) so we'd better know how to do korean
Here is a short example with text from one of the Korean LaTeX manuals.
\definefontfeature[korean][mode=node,script=hang,lang=kor]
\starttypescript [serif] [unbatang]
  \definefontsynonym [UnBatang-Regular] [name:unbatang] [features=korean]
  \definefontsynonym [UnBatang-Bold] [name:unbatangbold] [features=korean]
\stoptypescript
\starttypescript [serif] [unbatang]
  \setups[font:fallback:serif]
  \definefontsynonym [Serif] [UnBatang-Regular] [features=korean]
  \definefontsynonym [SerifBold] [UnBatang-Bold] [features=korean]
\stoptypescript
\definetypeface [unbatang] [rm] [serif] [unbatang] [default]
\setupbodyfont[unbatang]
\starttext
% the font setting for cjk gobbles spaces between words
% and at the end of the lines, a little hack is needed
% to get a correct output in the example
\catcode`\ = 13 \def {\hskip.25emplus.15emminus.15em\relax}
나라의 말이 중국과 달라서 한자로 는 서로 통하지 아니하므로 이런 까 닭으로 어리석은 백성들이 말하고 자 하는 바가 있어도 마침내 제뜻 을 능히 펴지 못하는 사람이 많으 니라. 내가 이를 불쌍히 여겨 새로 스물여덟자를 만드나니
\blank
아아, 나는 이제야 도 (道) 를 알았도다. 마음이 어두 운 자는 이목이 누 (累) 가 되지 않는다. 이목만을 믿는 자는 보고 듣는 것 이 더욱 밝혀져서 병이 되는 것이다. 이제 내 마부가 발을 말굽에 밟혀 서 뒷차에 실리었으 므로, 나는 드디어 혼자 고삐를 늦추어 강에 띄우고, 무릎을 구부려 발을 모으고 안장 위에 앉았다. 한번 떨어지면 강이나 물로 땅을 삼고, 물로 옷을 삼으며, 물로 몸을 삼고, 물로 성정을 삼을 것이 다. 이제야 내 마 음은 한번 떨어질 것을 판단한 터이므로, 내 귓속에 강 물 소리가 없어졌 다. 무릇 아홉 번 건너는데도 걱정이 없어 의자 위에 서 좌와 (坐臥) 하고 기거 (起居) 하는 것 같았다.
\stoptext
Wolfgang
2009-02-04 (Wed) 9:25 AM, Wolfgang Schuster
나라의 말이 중국과 달라서 한자로 는 서로 통하지 아니하므로 이런 까 닭으로 어리석은 백성들이 말하고 자 하는 바가 있어도 마침내 제뜻 을 능히 펴지 못하는 사람이 많으 니라. 내가 이를 불쌍히 여겨 새로 스물여덟자를 만드나니
나라의 말이 중국과 달라서 한자로는 % endline space should be honoured
서로 통하지 아니하므로 이런 까닭으로 어리석은 백성들이 말하고자 하는 바가 있어도 마침내 제뜻을 능히 펴지 못하는 사람이 많으니라. 내가 이를 불쌍히 여겨 새로 스물여덟자를 만드나니
아아, 나는 이제야 도 (道) 를 알았도다. 마음이 어두운 자는 이목이 누 (累) 가 되지 않는다. 이목만을 믿는 자는 보고 듣는 것이 더욱 밝혀져서 병이 되는 것이다. 이제 내 마부가 발을 말굽에 밟혀서 뒷차에 실리었으 므로, 나는 드디어 혼자 고삐를 늦추어 강에 띄우고, 무릎을 구부려 발을 모으고 안장 위에 앉았다. 한번 떨어지면 강이나 물로 땅을 삼고, 물로 옷을 삼으며, 물로 몸을 삼고, 물로 성정을 삼을 것이다. 이제야 내 마 음은 한번 떨어질 것을 판단한 터이므로, 내 귓속에 강물 소리가 없어졌 다. 무릇 아홉 번 건너는데도 걱정이 없어 의자 위에서 좌와 (坐臥) 하고 기거 (起居) 하는 것 같았다.
아아, 나는 이제야 도(道)를 알았도다. 마음이 어두운 자는 이목이 누(累)가 되지 않는다. 이목만을 믿는 자는 보고 듣는 것이 더욱 밝혀져서 병이 되는 것이다. 이제 내 마부가 발을 말굽에 밟혀서 뒷차에 실리었으므로, 나는 드디어 혼자 고삐를 늦추어 강에 띄우고, 무릎을 구부려 발을 모으고 안장 위에 앉았다. 한번 떨어지면 강이나 물로 땅을 삼고, 물로 옷을 삼으며, 물로 몸을 삼고, 물로 성정을 삼을 것이다. 이제야 내 마음은 한번 떨어질 것을 판단한 터이므로, 내 귓속에 강물 소리가 없어졌다. 무릇 아홉 번 건너는데도 걱정이 없어 의자 위에서 좌와(坐臥)하고 기거(起居)하는 것 같았다.
Korean orthography has rules about where spaces should and should not be inserted, so I have proofread the Korean text provided by Wolfgang.
Best,
Dohyun Kim
2009/2/4 Dohyun Kim
나라의 말이 중국과 달라서 한자로는 % endline space should be honoured
This is a side effect of the font handling: I used the "hang" script, which at the moment supports only Chinese and removes all spaces from the input (between words and at the end of the line). You get better results with "features=default" in the typescript, because the spaces then remain in the input, but ConTeXt now breaks lines only at the spaces. ConTeXt's CJK support is very limited at the moment because the rules have to be defined (with all exceptions); in the past we added them piecewise, and many things are still missing.
Korean orthography has rules of where spaces should be inserted and where not. So here I proofread Korean texts provided by Wolfgang.
It would be better if you could provide us with better examples; copying and pasting from other texts is not the best solution.
Wolfgang
Korean orthography has rules of where spaces should be inserted and where not. So here I proofread Korean texts provided by Wolfgang.
It would be better if you could provide us with better examples; copying and pasting from other texts is not the best solution.
The dvipdfmx* site has a few examples; can you provide them in text form with correct spaces?
* http://project.ktug.or.kr/dvipdfmx/
Wolfgang
2009/2/4 Wolfgang Schuster
2009/2/4 Dohyun Kim:
나라의 말이 중국과 달라서 한자로는 % endline space should be honoured
This is a side effect of the font handling, I used the "hang" script which supports only chinese and removes all spaces from the input (between words and at the end of the line).
You get better results with "features=default" in the typescript because the spaces remain now in the input but ConTeXt makes a line break now only at the spaces.
Yes! Much better with default features. The only thing missing is allowing line breaks between characters. On the other hand, since the script tag "hang" denotes "Hangul" according to the OpenType specification, it would be confusing to use this name for Chinese typesetting. See: http://www.microsoft.com/typography/otspec/scripttags.htm
It would be better if you could provide us with better examples; copying and pasting from other texts is not the best solution.
The example provided by Wolfgang and corrected by me seems sufficient for simple testing. It contains Hangul, Chinese characters, parentheses, commas, and Latin full stops. Only Latin words are missing, so how about this?
나라의 말이 중국과 달라서 한자(chinese characters)로는
The following link shows the result compiled with LaTeX under the [a5paper] option: http://people.ktug.or.kr/~nomos/mine/koreantypesettingwithlatex.png
Dohyun Kim
On Wed, Feb 4, 2009 at 10:57 AM, Dohyun Kim
You get better results with "features=default" in the typescript because the spaces remain now in the input but ConTeXt makes a line break now only at the spaces.
Yes! Much better with default features. Only missing is allowing line break between characters.
This is already on my list of what is missing in ConTeXt's CJK support, but I think it makes sense to first collect what is needed and which features we want.
On the other hand, as script tag "hang" denotes "Hangul" according to opentype specification, it would be confusing to use this name for Chinese typesetting. See: http://www.microsoft.com/typography/otspec/scripttags.htm
The Chinese script uses "hani", and "hang" is for the moment only a synonym. In the end each language will use its correct script tag. You can also find the list in font-ott.lua.
It would be better if you could provide us with better examples; copying and pasting from other texts is not the best solution.
The example provided by Wolfgang and corrected by me seems to be sufficient for simple testing. It contains Hangul, Chinese characters, parentheses, commas, and Latin fullstops. Only Latin words are missing; so how about this?
나라의 말이 중국과 달라서 한자(chinese characters)로는
Following link is the result compiled with latex under [a5paper] option: http://people.ktug.or.kr/~nomos/mine/koreantypesettingwithlatex.png
Looks good, we can use it to test ConTeXt's output after the code is finished. Wolfgang
Dohyun Kim wrote:
On the other hand, as script tag "hang" denotes "Hangul" according to opentype specification, it would be confusing to use this name for Chinese typesetting. See: http://www.microsoft.com/typography/otspec/scripttags.htm
can you give the correct list to use then? but anyhow, since these scripts are used mixed they need to share the logic anyway
It would be better if you could provide us with better examples; copying and pasting from other texts is not the best solution.
ok, extended the analyzer with penalty5 between chars and adaptive spacing between scripts
i uploaded a beta .. btw, that one already has new math stuff so one needs a really recent luatex as well (so if the zip fails you can try the experimental minimals once they're in sync)
the best you can do is to collect specs (together with other cjk users) and make examples
forget about latex or other tex solutions since i won't look into them anyway (i can't process them and it's easier for me to start from scratch; also, whenever i run into an example it uses other encodings than utf and/or the multiple pfb approach)
\starttext
\enabletrackers[otf.analyzing]
\setupcolors[state=start]
\definefontfeature[korean][script=hani,language=kor,mode=node,analyze=yes] % hangul
\definedfont[arialuni*korean at 15pt]
[begin]
나라의 말이 중국과 달라서 한자로는 서로 통하지 아니하므로 이런 까닭으로 어리석은 백성들이 말하고자 하는 바가 있어도 마침내 제뜻을 능히 펴지 못하는 사람이 많으니라. 내가 이를 불쌍히 여겨 새로 스물여덟자를 만드나니

아아, 나는 이제야 도(道)를 알았도다. 마음이 어두운 자는 이목이 누(累)가 되지 않는다. 이목만을 믿는 자는 보고 듣는 것이 더욱 밝혀져서 병이 되는 것이다. 이제 내 마부가 발을 말굽에 밟혀서 뒷차에 실리었으므로, 나는 드디어 혼자 고삐를 늦추어 강에 띄우고, 무릎을 구부려 발을 모으고 안장 위에 앉았다. 한번 떨어지면 강이나 물로 땅을 삼고, 물로 옷을 삼으며, 물로 몸을 삼고, 물로 성정을 삼을 것이다. 이제야 내 마음은 한번 떨어질 것을 판단한 터이므로, 내 귓속에 강물 소리가 없어졌다. 무릇 아홉 번 건너는데도 걱정이 없어 의자 위에서 좌와(坐臥)하고 기거(起居)하는 것 같았다.
[end]
\stoptext
anyway, it looks korean to me
(later arthur and i will look into the composition problem, not too hard and fun to do, but it might need a different place in the processing sequence)
currently we do this in an analysis pass, but that might change because it can interfere with feature processing (future versions of luatex will have a few more tricks for manipulating the widths of glyphs and so on, which in turn means that some of the logic can be redone in more clever ways)
Hans
Hi, Hans:
Great work! One comment:
In lines 3 and 4 of the second paragraph the text is too stretched;
in fact you can break a Korean word anywhere you want, and no
hyphenation is needed.
Yue Wang
___________________________________________________________________________________ If your question is of interest to others as well, please add an entry to the Wiki!
maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context webpage : http://www.pragma-ade.nl / http://tex.aanhet.net archive : https://foundry.supelec.fr/projects/contextrev/ wiki : http://contextgarden.net ___________________________________________________________________________________
Yue Wang wrote:
Hi, Hans:
Great work! One comment: Line 3 and 4, the second paragraph, the line is too stretched, in fact you can break a Korean word anywhere you want, and no hyphenation is needed.
well, the spec was: inject penalty5
Hans
Yue Wang wrote:
Hi, Hans:
Great work! One comment: Line 3 and 4, the second paragraph, the line is too stretched, in fact you can break a Korean word anywhere you want, and no hyphenation is needed.
this version inserts a penalty5 and glue0
Hans
Hans:
Looks much better now.
I think we should have a similar injection for Korean in ConTeXt.
(I think Mr. Kim's \penalty50 is for LaTeX? Maybe in ConTeXt it is different...)
Anyway, I think Mr. Kim can give more comments and help with fine-tuning :)
Yue Wang
2009/2/4 Hans Hagen
Yue Wang wrote:
Hi, Hans:
Great work! One comment: Line 3 and 4, the second paragraph, the line is too stretched, in fact you can break a Korean word anywhere you want, and no hyphenation is needed.
this version inserts a penalty5 and glue0
The result has
- too wide a space between a Korean char (including Chinese chars) and a full stop.
- too wide a space between a Korean char (including Chinese chars) and a comma.
- too wide a space between a Korean char (including Chinese chars) and an opening parenthesis.
- too wide a space between a closing parenthesis and a Korean char (including Chinese chars).
- too wide a space between an opening parenthesis and a Korean char (including Chinese chars).
- too wide a space between a Korean char (including Chinese chars) and a closing parenthesis.
Korean typesetting is somewhat different from Chinese or Japanese typesetting. It has inter-word spaces (i.e. glue). So inter-word spaces, full stops, commas, and parentheses should be treated the same as in Latin typesetting, except that a line break is allowed
1. between a Korean char and an opening parenthesis,
2. between a closing parenthesis and a Korean char.
In these cases, glue with large plus or minus stretch is not desirable.
Dohyun Kim
Dohyun Kim wrote:
2009/2/4 Hans Hagen:
Yue Wang wrote:
Hi, Hans:
Great work! One comment: Line 3 and 4, the second paragraph, the line is too stretched, in fact you can break a Korean word anywhere you want, and no hyphenation is needed.
this version inserts a penalty5 and glue0
The result has - too wide space between Korean char(including Chinese char) and fullstop. - too wide space between Korean char(including Chinese char) and comma. - too wide space between Korean char(including Chinese char) and opening parenthesis. - too wide space between closing parenthesis and Korean char(including Chinese char). - too wide space between opening parenthesis and Korean char(including Chinese char) - too wide space between Korean char(including Chinese char) and closing parenthesis.
these spacings are configurable but keep in mind that when doing justification tex will try to distribute according to what is set
maybe for korean the general space might be larger than normal?
Hans
2009/2/4 Hans Hagen
Dohyun Kim wrote:
2009/2/4 Hans Hagen:
Yue Wang wrote:
Hi, Hans:
Great work! One comment: Line 3 and 4, the second paragraph, the line is too stretched, in fact you can break a Korean word anywhere you want, and no hyphenation is needed.
this version inserts a penalty5 and glue0
The result has - too wide space between Korean char(including Chinese char) and fullstop. - too wide space between Korean char(including Chinese char) and comma. - too wide space between Korean char(including Chinese char) and opening parenthesis. - too wide space between closing parenthesis and Korean char(including Chinese char). - too wide space between opening parenthesis and Korean char(including Chinese char) - too wide space between Korean char(including Chinese char) and closing parenthesis.
these spacings are configurable but keep in mind that when doing justification tex will try to distribute according to what is set
maybe for korean the general space might be larger than normal?
Saying "too wide space" was my mistake. I mean that in all these cases "no space" is desirable. Just allowing a line break (without stretching) before an opening parenthesis and after a closing parenthesis will be good enough. In all other cases, "do nothing" is what I want.
Dohyun Kim
Hans and Mojca:
So the Korean support begins.
I think the unfonts could be added to the ConTeXt minimals distribution (or provided as an extra package)?
It is a high-quality collection of free fonts which contains the batang and dotum styles and so on.
You can download the fonts at http://kldp.net/projects/unfonts/
Yue Wang
Yue Wang wrote:
Hans and Mojca:
So the Korean support begins, I think the unfonts can be add into the ConTeXt minimals distribution (or as a extra package)? It is a high-quality free fonts collection which contains batang, dotum style and so on. You can download the fonts at http://kldp.net/projects/unfonts/
idem for sil fonts
Hans
On Wed, Feb 4, 2009 at 8:16 PM, Hans Hagen
Yue Wang wrote:
Hans and Mojca:
So the Korean support begins, I think the unfonts can be add into the ConTeXt minimals distribution (or as a extra package)? It is a high-quality free fonts collection which contains batang, dotum style and so on. You can download the fonts at http://kldp.net/projects/unfonts/
idem for sil fonts
ditto for cwtex fonts (GPL Traditional Chinese fonts): http://cle.linux.org.tw/fonts/cwttf/center/
Yue Wang wrote:
On Wed, Feb 4, 2009 at 8:16 PM, Hans Hagen wrote:
Yue Wang wrote:
Hans and Mojca:
So the Korean support begins, I think the unfonts can be add into the ConTeXt minimals distribution (or as a extra package)? It is a high-quality free fonts collection which contains batang, dotum style and so on. You can download the fonts at http://kldp.net/projects/unfonts/
idem for sil fonts
ditto for cwtex fonts (GPL Traditional Chinese fonts): http://cle.linux.org.tw/fonts/cwttf/center/
ok, so how about making a page on the wiki where users can mention these fonts; when we have a complete list of redistributable fonts we can decide what to include
per font we then need to mention: what scripts are supported, how good the quality is, and how big the fonts are
this also makes it easier to provide the typescripts
Hans
Hi, Hans:
ok, so how about making a page on the wiki where users can mention these fonts, when we have a complete list of redistributable fonts we can decide what to include
Thanks. Good idea, here is the link: http://wiki.contextgarden.net/CJK_fonts [sorry, I am a wiki newbie]
Some well-known free fonts are listed. Most users will choose one of them for their TeX documents. I think we can follow the choice of the local distributions, since it is widely used in the local community. The following are the default fonts in the standard local TeX distributions (CTeX, ktug Collection, cwTeX).
Simplified Chinese: the sim* fonts distributed with Windows. We provide the typescript only, and we can also provide typescripts for Adobe's 4 Chinese fonts.
Traditional Chinese: cwTeX fonts. We can provide the typescript and distribute them.
Korean: unfonts. We can provide the typescript and distribute them.
per font we need to mention then: what scripts are supported, how good is the quality, and how big are the fonts
this also makes it easier to provide the typescripts
Oh, Dohyun, if I said something wrong about Korean typefaces, please point it out. I am not a native Korean speaker :)
Yue Wang
On Wed, Feb 4, 2009 at 3:57 PM, Yue Wang
Hi, Hans:
ok, so how about making a page on the wiki where users can mention these fonts, when we have a complete list of redistributable fonts we can decide what to include
Thanks.Good idea, here is the link: http://wiki.contextgarden.net/CJK_fonts [sorry, I am a wiki newbie]
Why not CJVK_fonts ? http://www.amazon.de/review/product/0596514476/ref=sr_1_1_cm_cr_acr_txt?_encoding=UTF8&showViewpoints=1 -- luigi
Hi, Luigi:
Why not CJVK_fonts ? http://www.amazon.de/review/product/0596514476/ref=sr_1_1_cm_cr_acr_txt?_encoding=UTF8&showViewpoints=1
Yes, Vietnamese is an Asian language, but it is very different from CJK. It uses an alphabet-based writing system.
Yue Wang
On Wed, Feb 4, 2009 at 4:22 PM, Yue Wang
Hi, Luigi:
Why not CJVK_fonts ?
Yes, Vietnamese is an Asian language, but it is very different from CJK. It uses an alphabet-based writing system.
yes I know. But just to keep some kind of uniformity, as the book seems to suggest. Of course, it can just be confusing.
-- luigi
Hi,
yes I know. But just to keep some kind of uniformity, as the book seems to suggest.
no. The more widely used name is CJK. See http://en.wikipedia.org/wiki/CJKV, it redirects to CJK. The Unicode standard also classifies the group as CJK (Version 5.1.0, page 409).
Chinese, Japanese and Korean use similar writing systems; all of them use Chinese characters. Modern Vietnamese does not include Chinese characters; it uses the Latin writing system. Only ancient Vietnamese includes Chinese characters, and that's why some websites/books use CJKV.
Of course, can be just confusing.
Yue Wang
On Wed, Feb 4, 2009 at 4:52 PM, Yue Wang
Hi,
yes I know. But just to keep some kind of uniformity, as the book seems to suggest.
no. the more widely used name is CJK. see http://en.wikipedia.org/wiki/CJKV, it will redirect to CJK. Unicode standard also classified the group as CJK. (Version 5.1.0, page 409)
CJK uses the similar writing systems, all of them use Chinese Characters. Modern Vietnamese do not include Chinese characters, it uses the Latin writing system. Only ancient Vietnamese includes Chinese characters, and that's why some websites/books use CJKV.
good to know -- as usual, I'm not able to read all the books I bought.
-- luigi
2009/2/4 Yue Wang
Oh, Dohyun, if I said something wrong on Korean's typeface, please point out. I am not a native Korean speaker:)
There are many more free Korean fonts. The unfonts have good (but, frankly speaking, not excellent) quality and originated in the Korean TeX community, so Korean TeX users normally use the unfonts "officially". In practice, however, they use other commercial or free fonts for really critical or private purposes. Moreover, there has recently been a noticeable trend in Korea for big private companies and local governments to release free fonts of high quality. Among them, the "Nanum" fonts released by the number one Korean portal site, Naver, are noteworthy.
Serif and Sans series of Nanum fonts: http://hangeul.naver.com/index.nhn?goto=fonts#fonts
Monospace series of Nanum fonts: http://dev.naver.com/projects/nanumfont
These fonts are freely redistributable, as far as I know. I will introduce some more good-quality free Korean fonts on another occasion.
Dohyun Kim
can you give the correct list to use then?
hang for Hangul syllables (U+AC00 to U+D7A3), hani for Chinese (Han) ideographs (U+3400 to U+4DBF, U+4E00 to U+9FFF, U+20000 to U+2A6DF, amongst others -- the vast majority of characters in modern use is in the second range).
but anyhow, since these scripts are used mixed they need to share the logic anyway
Not really; they can be used together in the same text, but they still behave very differently. Besides, the essential script for Korean really is hang, not hani.
Arthur
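As a small illustration of the split between the two script tags, the following Python sketch maps a character to "hang" or "hani" by code point. The helper names are made up for this note, the range list is deliberately incomplete ("amongst others" above), and the first Han range is taken here as the CJK Extension A block, U+3400 to U+4DBF, which is an assumption on my side about what was meant.

```python
# Illustrative only: map a character to the OpenType script tag discussed above.
# The ranges are a partial list; HANI_RANGES[0] assumes the Extension A block.
HANG_RANGES = [(0xAC00, 0xD7A3)]
HANI_RANGES = [(0x3400, 0x4DBF), (0x4E00, 0x9FFF), (0x20000, 0x2A6DF)]


def in_ranges(cp, ranges):
    """True if code point cp falls inside any (low, high) pair."""
    return any(low <= cp <= high for low, high in ranges)


def script_tag(ch):
    """Return 'hang', 'hani', or None for characters outside both sets."""
    cp = ord(ch)
    if in_ranges(cp, HANG_RANGES):
        return "hang"
    if in_ranges(cp, HANI_RANGES):
        return "hani"
    return None


def tags_in(text):
    """Set of script tags that occur in a piece of text."""
    return {tag for tag in map(script_tag, text) if tag is not None}
```

Applied to a mixed fragment such as 도(道)를, tags_in reports both tags, which is the situation a Korean document typically ends up in.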
Arthur Reutenauer wrote:
can you give the correct list to use then?
hang for Hangul syllables (U+AC00 to U+D7A3), hani for Chinese (Han) ideographs (U+3400 to U+4DBF, U+4E00 to U+9FFF, U+20000 to U+2A6DF, amongst others -- the vast majority of characters in modern use is in the second range).
but anyhow, since these scripts are used mixed they need to share the logic anyway
Not really; they can use together in the same text, but they still are very different behaviour. Besides, the essential script for Korean really is hang, not hani.
ah, so what then about: hngl
Hans
ah, so what then about: hngl
You mean as an internal alias for hang + hani, to be used for Korean? Why not; after all, ISO 15924 registered a Jpan code as an alias for Han characters + Hiragana + Katakana. But of course, in the actual fonts, you will find only hang or hani. Arthur
ah, so what then about: hngl
Oh, I just got it (by reading the second edition of Ken Lunde's _CJKV_): you're confusing the script tag with the *feature* tag 'hngl', which is also short for Hangul, but has different semantics: it's a feature supposed to replace a Hanja by the Hangul(s) that correspond to its possible pronunciation(s). Not very common, I guess, and in any case, ConTeXt doesn't have the interface to address this feature: it would need to present the user with all the possible Hanguls, and ask him to choose the appropriate one among them (i.e., this is one of those OpenType features that need some interactivity, like, for example, 'aalt').
Anyway, I really wonder how that feature made it into the OpenType spec: it seems really specific, and OpenType doesn't have the equivalent feature for Japanese Kanji, which could be tortured just the same if one wanted to.
Arthur
participants (6)
- Arthur Reutenauer
- Dohyun Kim
- Hans Hagen
- luigi scarso
- Wolfgang Schuster
- Yue Wang