How to define a new language?
I'm trying to define a new language for use in a bilingual document, but
my hyphenation patterns are being ignored and I'm sure I must be doing
something wrong.
I've devised a minimal example that should hyphenate between vowels but
doesn't. Here's the hyphenation patterns file:
----------------------------- lang-foo.pat -----------------------------
% Hyphenation patterns for Foo.
\begingroup
\patterns{
a1a a1e a1i a1o a1u
e1a e1e e1i e1o e1u
i1a i1e i1i i1o i1u
o1a o1e o1i o1o o1u
u1a u1e u1i u1o u1u
}
\endgroup
------------------------------------------------------------------------
I tried to convert this to Lua code but couldn't figure out the right
invocation of _mtxrun --script pattern --convert_.
Then I wrote a minimal Context file and hoped it would pick up
lang-foo.pat anyhow:
----------------------------- test-foo.tex -----------------------------
\installlanguage[foo][
spacing=broad,
leftsentence=---,
rightsentence=---,
leftsubsentence=---,
rightsubsentence=---,
leftquote=\upperleftsinglesixquote,
rightquote=\upperrightsingleninequote,
leftquotation=\upperleftdoublesixquote,
rightquotation=\upperrightdoubleninequote,
date={month,\ ,day,{,\ },year},
state=stop,
]
\mainlanguage[foo]
\setuppapersize[A6]
\setupalign[hyphenated,morehyphenation,flushleft]
\starttext
umleoeoikaoukkoiiaaiewuewniimoaralkokolwwuiuirmlkipetnoaeteuntoilooooamm
uuuauemepokawpoieoomtkeaipeailloaukoiwaeuaewurawiueanwtaoaoemmuuonwleaue
mrtmuweokmikariurtlluiraapnkowuaueolmrneraiiioeauemmaamauiiolluwrltounte
aaaunnoitwueemwlniowotuauwomaupaapwtawiikiuolnrolouletouunwptoioououoael
kiitinouoeolopewourtuineuaermnmmioaiuienkewuiaaklinipprurainouioiomuwokr
koriiaieulkppwwieoemlipiakinppprnpaiaaekapnpatritotoeormaetualeonemlppau
earoeauoiimekoilliiuomirnrwarpupieiilaaueeoaekwriummikpatakeairolwpiaoaa
tniweuiineeetuiektpalukluanetoiklwnemutlonnuaalpnelpniiuiutmaoakmkoiomia
ineioporrieopatakomonipiuwuiwiiaueiueoluoeaaiootwuekinkwawrrwwppilewoioo
itroreraeonkeuiaawalunnueoreiaaeewknailaeomomilkruruekuonkaonwwiieunonow
\stoptext
------------------------------------------------------------------------
It does seem to see lang-foo.pat:
$ context --pdf test-foo.tex | fgrep patterns
mkiv lua stats > loaded patterns: en::2 foo::67, load time: 0.000
But in the resulting PDF, nothing is hyphenated. (I'll post the PDF if
anyone really wants to see it.)
Here's my _context --version_ output:
mtx-context | ConTeXt Process Management 1.02
mtx-context |
mtx-context | main context file: /usr/local/context/tex/texmf-context/tex/context/base/mkiv/context.mkiv
mtx-context | current version: 2018.01.19 13:42
Can someone point me to my mistake?
Thanks in advance,
Paul.
--
Paul Hoffman
On Wed, Feb 27, 2019 at 05:05:05PM -0500, Paul Hoffman wrote:
> I'm trying to define a new language for use in a bilingual document, but
> my hyphenation patterns are being ignored and I'm sure I must be doing
> something wrong.
Never mind, I solved the problem. I'll describe what I did here, in
case anyone finds it helpful down the road.
First, I figured out how to create lang-foo.lua manually -- which wasn't
too painful, since the hyphenation rules for the language are very
simple -- and found that Context uses it if it sits next to the file
that uses \language[foo].
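For later readers, here is a rough sketch of what a hand-written
lang-foo.lua could look like, written out from the shell so it lands
next to test-foo.tex. The table layout is modeled on the lang-*.lua
files that ship with ConTeXt; the exact set of required fields is an
assumption on my part, so compare it with a shipped file (or with the
output of the conversion script) before relying on it.

```shell
# Write a hypothetical lang-foo.lua into the current directory.
# The quoted 'EOF' keeps the shell from expanding anything in the Lua text.
cat > lang-foo.lua <<'EOF'
-- Hand-written pattern file for the invented language Foo.
-- Field names follow the shipped lang-*.lua files (assumed, not verified
-- against the 2018.01.19 loader).
return {
 ["comment"]="hand-written hyphenation patterns for Foo",
 ["exceptions"]={
  ["characters"]="",
  ["data"]="",
  ["n"]=0,
 },
 ["metadata"]={
  ["mnemonic"]="foo",
  ["source"]="hyph-foo",
 },
 ["patterns"]={
  ["characters"]="aeiou",
  ["data"]="a1a a1e a1i a1o a1u e1a e1e e1i e1o e1u i1a i1e i1i i1o i1u o1a o1e o1i o1o o1u u1a u1e u1i u1o u1u",
  ["n"]=25,
 },
 ["version"]="1.001",
}
EOF
```

The "data" string holds the same 25 vowel-pair patterns as lang-foo.pat
above, just joined on one line.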
Then, after some detective work, I found that I can generate
/bar/lang-foo.lua from /foo/hyph-foo.tex by running the following
command:
mtxrun --script patterns --convert --path=/foo --destination=/bar \
--specification=foo,hyph-foo,Foobar
This prints a lot of errors ("no valid file", "convertion aborted")
because, after converting hyph-foo.tex, mtx-patterns.lua tries to
convert everything in its hard-coded list, but that's not a big deal.
Besides lang-foo.lua, which is all I really need, I also get
lang-foo.rme, lang-foo.hyp, and lang-foo.pat; the latter two are for
mkii, I gather.
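If the noise bothers you, a small wrapper can hide it. This is only a
sketch: the paths and the language triple are the placeholders from the
command above, the two filtered strings are the messages as I quoted
them, and the function name and the MTXRUN/SRC/DST/SPEC/OUT variables
are made up for this example.

```shell
# Placeholders from the command above; override them in the environment.
MTXRUN=${MTXRUN:-mtxrun}
SRC=${SRC:-/foo}
DST=${DST:-/bar}
SPEC=${SPEC:-foo,hyph-foo,Foobar}
OUT=${OUT:-lang-foo.lua}

# Convert one pattern file and drop the messages caused by the script
# walking its built-in language list afterwards.
convert_patterns() {
    "$MTXRUN" --script patterns --convert \
        --path="$SRC" --destination="$DST" \
        --specification="$SPEC" 2>&1 |
        grep -v -e 'no valid file' -e 'convertion aborted'
    # The pipeline's exit status says nothing useful here, so success is
    # defined as: the Lua file exists and is not empty.
    [ -s "$DST/$OUT" ]
}
```

Run it as "convert_patterns && echo done"; it returns non-zero if
lang-foo.lua did not appear in the destination directory.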
Would a patch for mtx-patterns.lua that adds an option to convert *only*
a particular language's file be useful? I'm thinking an option like
--only that one can use like this:
mtxrun --script patterns --convert --path=/foo --destination=/bar \
--only \
--specification=foo,hyph-foo,Foo \
--specification=bar,hyph-bar,Bar
The simplest implementation would be to clear the list first (if --only
is used), then add foo and bar to it. Something like this:
------------------------------------------------------------------------
--- OLD/mtx-patterns.lua 2019-02-28 11:10:27.180857745 -0500
+++ NEW/mtx-patterns.lua 2019-02-28 11:16:27.952426988 -0500
@@ -28,6 +28,7 @@
<flag name="path"><short>source path where hyph-foo.tex files are stored</short></flag>
<flag name="destination"><short>destination path</short></flag>
<flag name="specification"><short>additional patterns: e.g.: =cy,hyph-cy,welsh</short></flag>
+ <flag name="only"><short>convert only the specified patterns</short></flag>
<flag name="compress"><short>compress data</short></flag>
<flag name="words"><short>update words in given file</short></flag>
<flag name="hyphenate"><short>show hypephenated words</short></flag>
@@ -42,6 +43,7 @@
<example><command>mtxrun --script pattern --check --path=c:/data/develop/svn-hyphen/trunk/hyph-utf8/tex/generic/hyph-utf8/patterns</command></example>
<example><command>mtxrun --script pattern --convert --path=c:/data/develop/svn-hyphen/trunk/hyph-utf8/tex/generic/hyph-utf8/patterns/tex --destination=e:/tmp/patterns</command></example>
<example><command>mtxrun --script pattern --convert --path=c:/data/develop/svn-hyphen/trunk/hyph-utf8/tex/generic/hyph-utf8/patterns/txt --destination=e:/tmp/patterns</command></example>
+ <example><command>mtxrun --script pattern --convert --path=/foo --destination=/bar --only --specification=cy,hyph-cy,welsh</command></example>
<example><command>mtxrun --script pattern --hyphenate --language=nl --left=3 nogalwiedes inderdaad</command></example>
</subcategory>
</category>
@@ -497,6 +499,9 @@
--
local specification = environment.argument("specification")
if specification then
+ if environment.argument("only") then
+ scripts.patterns.list = {}
+ end
local components = utilities.parsers.settings_to_array(specification)
if #components == 3 then
table.insert(scripts.patterns.list,1,components)
------------------------------------------------------------------------
Paul.
--
Paul Hoffman
Paul Hoffman wrote on 27.02.19 at 23:05:
> I'm trying to define a new language for use in a bilingual document, but
> my hyphenation patterns are being ignored and I'm sure I must be doing
> something wrong.
For which language do you need patterns? Did you try contacting Arthur or
Mojca to have the missing patterns added to their repository, which holds
the hyphenation patterns for the other languages? Once that is done, you
can ask Hans to add support for your language in ConTeXt.
Wolfgang
On Thu, Feb 28, 2019 at 06:19:24PM +0100, Wolfgang Schuster wrote:
> Paul Hoffman wrote on 27.02.19 at 23:05:
>> I'm trying to define a new language for use in a bilingual document, but
>> my hyphenation patterns are being ignored and I'm sure I must be doing
>> something wrong.
> For which language do you need patterns?
It's an invented language, so no one else will ever need to use it.
Context is great for making dictionaries, descriptive grammars, etc.
Paul.
--
Paul Hoffman
On Thu, Feb 28, 2019 at 01:23:51PM -0500, Paul Hoffman wrote:
> It's an invented language, so no one else will ever need to use it.
Maybe so -- though you can’t know that for sure -- but if you’re down the
path of requesting a change in a ConTeXt script to add it locally, you
might as well publish the patterns for Mojca and me to add to the
repository.
Arthur
Participants (3): Arthur Reutenauer, Paul Hoffman, Wolfgang Schuster