Hi everybody,

best wishes for (the still new) 2019! My question is not strictly a ConTeXt problem, but about the way luatex (and pure Lua) can handle utf8 in lpeg. Here is my Lua example:

mystring = "abcdeφὴὰabcde"

local replace_table = {
  a = "y",
  c = "z",
  ὴ = "ή",
  ὰ = "ά",
}

function replace(s)
  local patt = (lpeg.Cs(1)) / replace_table
  local parser = lpeg.Cs((patt + 1)^0)
  t = parser:match(s)
  return t
end

newstring = replace(mystring)
print(newstring)

This will successfully replace "a" and "c," but not "ὴ" or "ὰ" because lpeg.Cs(1) sees only the first byte of these multibyte characters. Pure Lua complains with an error message; luatex runs, but does not do the replacement. What would be a good way to work around this limitation?

All best

Thomas
Thomas A. Schmitz wrote on 06.01.19 at 18:53:
Hi everybody,
best wishes for (the still new) 2019! My question is not strictly a ConTeXt problem, but about the way luatex (and pure Lua) can handle utf8 in lpeg. Here is my Lua example:
mystring = "abcdeφὴὰabcde"
local replace_table = { a = "y", c = "z", ὴ = "ή", ὰ = "ά", }
function replace(s)
  local patt = (lpeg.Cs(1)) / replace_table
  local parser = lpeg.Cs((patt + 1)^0)
  t = parser:match(s)
  return t
end
newstring = replace(mystring)
print(newstring)
This will successfully replace "a" and "c," but not "ὴ" or "ὰ" because lpeg.Cs(1) sees only the first byte of these multibyte characters. Pure Lua complains with an error message; luatex runs, but does not do the replacement. What would be a good way to work around this limitation?
Below is a modified version of the example on page 103 of the ConTeXt Lua Documents (cld-mkiv.pdf) manual.

\starttext
\startluacode
print("abcdeφὴὰabcde")
local remap = utf.remapper { a = "y", c = "z", ὴ = "ή", ὰ = "ά" }
print(remap("abcdeφὴὰabcde"))
\stopluacode
\stoptext

Wolfgang
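For comparison, the original lpeg approach can probably be kept if the pattern consumes one whole UTF-8 character instead of one byte. The following is only a minimal sketch and not code from this thread: the name utf8char and the byte ranges are assumptions written out by hand, and the table uses bracketed string keys because stock Lua does not accept non-ASCII identifiers (which is presumably what the pure Lua error message was about). It assumes the lpeg module can be loaded with require; inside luatex, lpeg is already available as a global.

```lua
local lpeg = require("lpeg")
local R, Cs = lpeg.R, lpeg.Cs

-- One UTF-8 encoded character: a lead byte plus its continuation bytes.
-- (utf8char is a name chosen for this sketch, not an lpeg built-in.)
local cont = R("\128\191")
local utf8char = R("\0\127")
               + R("\194\223") * cont
               + R("\224\239") * cont * cont
               + R("\240\244") * cont * cont * cont

-- Bracketed string keys are valid in stock Lua as well as in luatex.
local replace_table = {
  ["a"] = "y",
  ["c"] = "z",
  ["ὴ"] = "ή",
  ["ὰ"] = "ά",
}

local function replace(s)
  -- Query capture: look up each whole character in the table.
  -- Characters without an entry produce no value and are kept as they are.
  local patt = utf8char / replace_table
  -- The "+ 1" fallback copies any byte that is not part of valid UTF-8.
  local parser = Cs((patt + 1)^0)
  return parser:match(s)
end

print(replace("abcdeφὴὰabcde"))
```

The only real change from the original example is that the query capture is applied to a multibyte-aware pattern rather than to lpeg.Cs(1), so the table lookup sees "ὴ" as one key instead of three separate bytes.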
On 06.01.19 19:20, Wolfgang Schuster wrote:
\starttext
\startluacode
print("abcdeφὴὰabcde")
local remap = utf.remapper { a = "y", c = "z", ὴ = "ή", ὰ = "ά" }
print(remap("abcdeφὴὰabcde"))
\stopluacode
\stoptext
Wolfgang, thank you, I should have looked into this manual! Is there an easy way to load the necessary additional libraries if I'm running luatex alone, outside of mtxrun?

Thomas
On 1/6/2019 8:14 PM, Thomas A. Schmitz wrote:
Wolfgang, thank you, I should have looked into this manual! Is there an easy way to load the necessary additional libraries if I'm running luatex alone, outside of mtxrun?

What do you mean by "running luatex alone"? You can run scripts with

mtxrun --script yourscript foo.txt

(or do you mean something different?)

Hans
On 06.01.19 23:08, Hans Hagen wrote:
What do you mean by "running luatex alone"? You can run scripts with
mtxrun --script yourscript foo.txt
(or do you mean something different?)
Hi Hans, thanks, this is what I meant! I guess I was trying to reinvent the wheel and doing things in pure Lua, but processing utf8 is indeed much easier with all the context functions. So thanks, I hope I can take it from here!

Thomas
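For readers who do want to stay in pure Lua without lpeg or the ConTeXt libraries, a character-by-character replacement can be sketched with string.gsub and a byte-range pattern. This is an illustration under assumptions, not something proposed in the thread: the pattern string and the name utf8char are hand-written for this sketch, and malformed input is simply left untouched rather than reported.

```lua
-- Lua pattern for one UTF-8 encoded character: a lead byte followed by
-- any continuation bytes. Decimal escapes keep it valid in Lua 5.1 and later.
-- (utf8char is a name chosen for this sketch.)
local utf8char = "[\1-\127\194-\244][\128-\191]*"

-- Bracketed string keys, since stock Lua rejects non-ASCII identifiers.
local replace_table = {
  ["a"] = "y",
  ["c"] = "z",
  ["ὴ"] = "ή",
  ["ὰ"] = "ά",
}

local function replace(s)
  -- With a table as replacement, gsub looks up every match and keeps
  -- the original text when the table has no entry for it.
  -- The parentheses drop gsub's second return value (the match count).
  return (s:gsub(utf8char, replace_table))
end

print(replace("abcdeφὴὰabcde"))
```

This covers simple one-to-one character mappings only; the ConTeXt helpers mentioned above remain the more convenient route once mtxrun is available.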
participants (3)
- Hans Hagen
- Thomas A. Schmitz
- Wolfgang Schuster