Linebreaking in Japanese in ConTeXt
I am currently testing with a Japanese translation of something I originally created in English using the ConTeXt engine (LMTX). I run into the fact that Japanese has no use for whitespace, so most of the time TeX has no idea where to break. Is there something I can tell ConTeXt to solve this for me, or do I have to talk with the translator to put whitespace in the correct places (which kind of breaks the automatic linebreaking of TeX, though)?

Gerben Wierda (LinkedIn https://www.linkedin.com/in/gerbenwierda, Mastodon https://newsie.social/@gctwnl)
R&A IT Strategy https://ea.rna.nl/ (main site)
Book: Chess and the Art of Enterprise Architecture https://ea.rna.nl/the-book/
Book: Mastering ArchiMate https://ea.rna.nl/the-book-edition-iii/
On 30 Apr 2023, at 09:12, Gerben Wierda via ntg-context
wrote: I am currently testing with a Japanese translation of something I originally created in English using the ConTeXt engine (LMTX).
I run into the fact that Japanese has no use for whitespace, so most of the time TeX has no idea where to break. Is there something I can tell ConTeXt to solve this for me, or do I have to talk with the translator to put whitespace in the correct places (which kind of breaks the automatic linebreaking of TeX, though)?
I forgot to add that I use:

\setuplanguage[ja][patterns={ja}]\mainlanguage[ja]

G
Gerben Wierda via ntg-context wrote on 30 Apr 2023 at 09:12:
I am currently testing with a Japanese translation of something I originally created in English using the ConTeXt engine (LMTX).
I run into the fact that Japanese has no use for whitespace, so most of the time TeX has no idea where to break. Is there something I can tell ConTeXt to solve this for me, or do I have to talk with the translator to put whitespace in the correct places (which kind of breaks the automatic linebreaking of TeX, though)?
Below is an example with some dummy text for a document which contains only Japanese. The main points are:

1. use a font which is suitable for Japanese, e.g. Noto Serif/Sans CJK JP, and
2. enable line breaking with \setscript[nihongo].

The \mainlanguage setting is only needed to get Japanese labels for chapters, float titles etc.

When you have a document with mixed scripts, the approach below doesn't work, because the spacing around braces and other punctuation marks is wrong for non-Japanese text. This can be fixed, but you have to include language switches in your document.

Wolfgang

%%%% begin example
\definefontfamily [noto-jp] [rm] [Noto Serif CJK JP]
\definefontfamily [noto-jp] [ss] [Noto Sans CJK JP]

\definetypeface [noto-jp] [mm] [math] [pagella] [default]

\setupbodyfont [noto-jp]

\mainlanguage [ja]

\setscript [nihongo]

\starttext

\startbuffer
打構セト読役いゆお及層大コモクカ軟毎ホアヲト極書た球87本野ぎレべ襲画売ぎ関負ら断紀チラネキ質紋キタ資私トゃふん。江コキトサ狂毎現ンづ応通サシ続36島性84界ぼゆゃが実書ちぽふず辞躍事シ世図二ハス盛堀人わレをう第紙きかイの浅69乳キメタ事争つょ社以ヨナム法例やほにル済療にしいむ。子け記多ゅくよ正役モ母憲トぞ説帝け務賞ク付打ワナ内桁ぼでクフ因倒おふフ印気地レタケマ冬4雇各裁微ひるぼ。

事ーやずお両遇ほがお職喜じきちそ経岡ウナオケ型児てきろ置実イハ隊留社びか車斉ムマ数今せぴさし都聞最へぼう康自コミニク月言ム検代捕ぶねやう。6本うげがぎ無多シワソ多転のづが加分がど臣次オモエヱ務教すそ必組テタレソ極渡げフ資聞ぽへトづ禁昔キイウニ食度地ヱ考性ユエ内生ヱセ音94党と経陸咲幼勤りだフ。83男融録ょもな氷社くにルべ害変でおけう変世7超キ減安リヌ株崎5転理済ほだをな黒府コワスソ地賢堂衛イぶにし。

広げがぴ視立ス位心もスわ止想ツクケラ産竹聞ヌ落38判境ノア賞7写カチ華供ソミワ名認ばゆずほ始色らふひも属対73写けで止突兄呂びの。策イルリヘ行十織キヘ興必む万初ゃづは勝米い館独ム感一チヒナ希8写タムシイ室情づがい焼殺つか広質トクロ毎新ソモ患関済えめね働表め白告こリあ杯困摘屈裂れ。

名テホユ稿国む生覧ヨナケヤ文6垣めこル社2市むあし保止転済ノ問判りッ集員セワ全宣コ森領で温子うラちぎ夏査せすむ来資ニ面三わぼ際6之減喰杜き。堂ヲ賞了かえー単惑イめ秋投クツ意拡平ヘタヱ時83死れる互武ぽほ拉51届ニネラヘ掲団ヨ度両議ク。状親日値ル断大示ヘニコク米界を連禁イが性正ルッ問句サルユ試片サロヤ前毎到くぜ黄雑らりえ報同ょきむ督晴岡確たごげむ。

両おぎゅら写合ル書殖だインぎ孝自氷ばぞひ事権ウノレヲ少止ヲマヨミ惑磁ぞくゅ竹月おやま手51質サム付2仏い職断年にーばざ場現ムワホ襲印講行否えうご。5帰7報ク用講ルのよす媛表サシメマ札変ナヲ案積出ソトマセ決情注モハイ際県らぶドご食績どはあだ徳増フシ断県ぼね了件親億かゆ。申府ホ線全る訪親ケヒネ芸比歳よをづう供記モケ正同とき練1紀ゅくふい割市ざ著法べちめせ客答ヤヌキオ入朗ルヌカス況写天境月郎ろたづ。

想ネシモヤ人時ワ化新リぱイこ京希っじ索視ケア風月受マカ東初ぞ月公て無速だ文保か受4香刃尿憎潔だクばづ。果テカ川程イ相安べろ方4映チ米闘ヤタレ会民クをづい治果とルせ字社かイよ新軍こ持芸ワラウケ掲三58治館壌ゃ。行びまむ快野分真タネス応住えじで半記キネト正6者理ぼあ面出ヱシレワ慎首レトアハ託展オ良育圏ーを申傷説ひよもぶ索構を持試奇拝括ぱふ。

写みを参念タ三辞よルさき想3込ぜトリ会図ロ法部じち之杉ぎば厳近ワオコ光傷治ワ整区童熊習本さはリん。一ナカヒハ経松隊整キメケト需湖モユエウ秘千そ含覧ふリさ挙言ねをク凍誉ょい利際ゃぼの織用ユリル育4緊ご選1面とし胃危雪異みのひき。竹品ハ著供うげル島該ょ城豊ぎすご腹独べじり皆5座わーう化著式サマ木陸ミレノ義軍80鯨ヒキトフ読大フニノネ読覧ロヨヤイ投時ょうす都活きっみざ。

上なラ済経ロ有実マ乳装ざなぴま武1守い掲6町ぼ禁28担惑らばんフ社策送案物んまへ。人際クょゃね験伝め念変クマヌ白馳ナツエヤ著側ッごフぞ済川さ趣載ツア挙85適クウネレ稿並ドむわの論棋ナソスヒ教伎締潤革フ。球尊話者が心三いむり達尼スか町過ミタオ上止ワカヤ正着ざぱぎぼ失成行偽ワマヘ北奮く。

済開ぼスじと力紹オア手混チコ成意コテモケ定因ド情最どいとづ写今えりめ問景ルオフチ度自す作12行トヱワ録消ひイ。縦右で後賞たじつド諸場後し面詐でばト題社のれふ年結すたぜじ下労リえたゅ高作セレテ給5山ワ任共ご京前ミレシ本写午吉活ぽし。播るせ部3席ヒ投揮ぴしんけ安図んき同更展将ごさぽ画9用サキ購21十ツヘト京締乱僕ろ。

図ラ象45韓ど考小ヌニ野初ムユイシ河止中ヤトキヒ巻女ッ除稿イ輔余た研弔カ被悪テヘ業購ヘタ詳支ルれ身日だがん出療ずー盛趣なばぐ離女放佳怖れずへ。理じかぐ討15枠制8五袋ヒク選会ソアシカ主手付みとこ猪月メ語経マオヱミ社接フセユア幌財ー香口衆ンろ著瀬なこン身個供悪射窃どスあ。切テセ重購ニホ覧権じゆむげ影文ヒヘ相断66哉ノホテフ黒植くね氏員更あきぱゆ文記ふわひ載更上ぜいひば。
\stopbuffer

\rm \getbuffer

\page

\ss \getbuffer

\stoptext
%%%% end example
On 30 Apr 2023, at 10:04, Wolfgang Schuster via ntg-context
wrote:
%%%% begin example
\definefontfamily [noto-jp] [rm] [Noto Serif CJK JP]
\definefontfamily [noto-jp] [ss] [Noto Sans CJK JP]
\definetypeface [noto-jp] [mm] [math] [pagella] [default]
\setupbodyfont [noto-jp]
\mainlanguage [ja]
\setscript [nihongo]
Thank you. How do I change that in my setup? My setup is creating the same stuff in different languages from XML input (LMTX using Lua and MetaPost), which means I need to adapt the following setup (Cyrillic and Greek are shown as the other languages with different character sets that I use):

\startmode[JA]
\setuplanguage[ja][patterns={ja}]\mainlanguage[ja]
\stopmode

\definefallbackfamily [archimate] [ss] [Helvetica] [preset=range:cyrillic, tf=style:light, it=style:lightoblique, bf=style:regular, bi=style:oblique, force=yes, rscale=1.0]
\definefallbackfamily [archimate] [ss] [Helvetica] [preset=range:greek, tf=style:light, it=style:lightoblique, bf=style:regular, bi=style:oblique, force=yes]
\definefallbackfamily [archimate] [ss] [Hiragino Sans] [preset=range:japanese, tf=style:W3, it=style:W3, bf=style:W5, bi=style:W5, force=yes]
\definefontfamily [archimate] [ss] [Optima]

\setupbodyfont[archimate]

\starttext

After \starttext, Lua code creates MetaPost code, which creates images with embedded ConTeXt 'vboxes', which in turn contain language-setting ConTeXt code for each piece of text.
Gerben Wierda via ntg-context wrote on 30 Apr 2023 at 10:48:
On 30 Apr 2023, at 10:04, Wolfgang Schuster via ntg-context
wrote:
%%%% begin example
\definefontfamily [noto-jp] [rm] [Noto Serif CJK JP]
\definefontfamily [noto-jp] [ss] [Noto Sans CJK JP]
\definetypeface [noto-jp] [mm] [math] [pagella] [default]
\setupbodyfont [noto-jp]
\mainlanguage [ja]
\setscript [nihongo]
Thank you.
How do I change that in my setup?
My setup is creating the same stuff in different languages from XML input (LMTX using Lua and MetaPost), which means I need to adapt the following setup (Cyrillic and Greek are shown as the other languages with different character sets that I use):
\startmode[JA]
\setuplanguage[ja][patterns={ja}]\mainlanguage[ja]
\stopmode
When your document prints only text in a single language, change the setup above to

\startmode [JA]
\setscript [nihongo]
\mainlanguage [ja]
\stopmode

but for documents which use multiple scripts/languages at the same time, replace the previous setup with

\startsetups [japanese]
\setscript [nihongo]
\stopsetups

\setuplanguage [ja] [setups=japanese]

and add \language[ja] before Japanese text to ensure line breaking is enabled.

Wolfgang
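For readers who want to try the multi-language variant, here is a minimal sketch of how the pieces might fit together in one file. It only combines the commands already given in this thread (\definefontfamily, \startsetups, \setscript, \setuplanguage, \language). The typeface name mixed-demo, the English sentences and the use of \begingroup ... \endgroup to keep the switch local are assumptions made for illustration and were not part of the replies. It also assumes that Noto Serif CJK JP is installed and is acceptable for the Latin text as well. It has not been checked against the real XML/MetaPost setup.

%%%% begin example
% Assumption: one CJK font that also covers Latin, under a made-up typeface name.
\definefontfamily [mixed-demo] [rm] [Noto Serif CJK JP]
\setupbodyfont [mixed-demo]

% From the reply above: switching to language "ja" also enables the script handling.
\startsetups [japanese]
  \setscript [nihongo]
\stopsetups

\setuplanguage [ja] [setups=japanese]

% Labels and default hyphenation stay English in this sketch.
\mainlanguage [en]

\starttext

This English paragraph is broken at spaces and hyphenation points as usual.

% Assumption: the group keeps the language switch local to this paragraph.
\begingroup
\language [ja]
打構セト読役いゆお及層大コモクカ軟毎ホアヲト極書た球87本野ぎレべ襲画売ぎ関負ら断紀チラネキ質紋キタ資私トゃふん。
\par
\endgroup

English text continues here.

\stoptext
%%%% end example

Ending the paragraph with \par inside the group is a cautious choice, so that the Japanese paragraph is finished while the switch is still active.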
Participants (3): Gerben Wierda, Henning Hraban Ramm, Wolfgang Schuster