[NTG-context] lua lpeg + utf8

Wolfgang Schuster wolfgang.schuster.lists at gmail.com
Sun Jan 6 19:20:26 CET 2019


Thomas A. Schmitz wrote on 06.01.19 at 18:53:
> Hi everybody,
> 
> best wishes for (the still new) 2019! My question is not strictly a 
> ConTeXt problem, but about the way luatex (and pure Lua) can handle utf8 
> in lpeg. Here is my Lua example:
> 
> mystring = "abcdeφὴὰabcde"
> 
> local replace_table = {
>    a = "y",
>    c = "z",
>    ὴ = "ή",
>    ὰ = "ά",
> }
> 
> function replace(s)
>      local patt = (lpeg.Cs(1)) / replace_table
>      local parser = lpeg.Cs((patt + 1)^0)
>      t = parser:match(s)
>      return t
> end
> 
> newstring = replace(mystring)
> 
> print(newstring)
> 
> This successfully replaces "a" and "c", but not "ὴ" or "ὰ", because 
> lpeg.Cs(1) matches a single byte, not a whole multibyte character. Pure 
> Lua complains with an error message; luatex runs, but does not do the 
> replacement. What would be a good way to work around this limitation?
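
The byte-versus-character point can be checked in plain Lua without lpeg. The following is only a sketch under a few assumptions: the source file is saved as UTF-8, the input is valid UTF-8, and the pattern string `utf8char` is my own helper, not something from the original post. The bracketed string keys are used because stock Lua, as far as I know, does not accept non-ASCII identifiers in a table constructor, which is probably where the error message in pure Lua comes from.

```lua
-- Keys with non-ASCII characters are written as bracketed strings so that
-- stock Lua can parse the table.
local replace_table = {
  a = "y",
  c = "z",
  ["ὴ"] = "ή",
  ["ὰ"] = "ά",
}

-- One UTF-8 character: an ASCII byte or a lead byte (0xC2-0xF4), followed
-- by any continuation bytes (0x80-0xBF). Assumes valid UTF-8 input.
local utf8char = "[\1-\127\194-\244][\128-\191]*"

local function replace(s)
  -- gsub looks up each matched character in the table; characters without
  -- an entry are left unchanged. The parentheses drop the match count.
  return (s:gsub(utf8char, replace_table))
end

print(#"a", #"φ")                -- length in bytes, not in characters
print(replace("abcdeφὴὰabcde"))  --> ybzdeφήάybzde
```

Because every match covers a whole character, the multibyte keys in the table can be found, which a one-byte match can never do.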

ConTeXt already provides utf.remapper for this kind of replacement. Below 
is a modified version of the example on page 103 of the ConTeXt Lua 
Documents manual (cld-mkiv.pdf).

\starttext

\startluacode

print("abcdeφὴὰabcde")

local remap = utf.remapper { a = "y", c = "z", ὴ = "ή", ὰ = "ά" }

print(remap("abcdeφὴὰabcde"))

\stopluacode

\stoptext
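
Outside ConTeXt, where utf.remapper is not available, the same idea can be sketched with lpeg directly by matching one whole UTF-8 character instead of one byte. This is a minimal sketch, not ConTeXt's implementation: the names `utf8char` and `remapper` are hypothetical, the file is assumed to be saved as UTF-8, and lpeg is assumed to be either built in (as in luatex) or installed as a module.

```lua
-- Use the built-in lpeg (luatex) or load the module (plain Lua).
local lpeg = lpeg or require("lpeg")
local R, Cs = lpeg.R, lpeg.Cs

-- One UTF-8 character, chosen by its lead byte.
local cont = R("\128\191")                       -- continuation byte
local utf8char = R("\0\127")                     -- 1 byte (ASCII)
               + R("\194\223") * cont            -- 2 bytes
               + R("\224\239") * cont * cont     -- 3 bytes
               + R("\240\244") * cont * cont * cont  -- 4 bytes

-- Hypothetical helper: returns a function that replaces every character
-- that has an entry in `map`. Characters without an entry are kept, and
-- the trailing "+ 1" lets stray bytes of invalid input pass through.
local function remapper(map)
  local patt = Cs((utf8char / map + 1)^0)
  return function(s)
    return patt:match(s)
  end
end

local remap = remapper { a = "y", c = "z", ["ὴ"] = "ή", ["ὰ"] = "ά" }

print(remap("abcdeφὴὰabcde"))
```

Compared with the original lpeg.Cs(1), the only real change is that the table lookup now sees a complete character as its key.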

Wolfgang

