string library with utf-8 support
Hello, I developed a Japanese dictionary in XeTeX. It compiles just fine in LuaTeX, but I would like to write the commands in Lua instead of TeX, to make the code faster, cleaner, simpler and more powerful. However, I need to be able to manipulate UTF-8 strings (find, substring, replace, length, etc.). Is there a simple way to do that in Lua? Will there one day be a LuaTeX library for UTF-8 string management? (The string library seems to handle only strings of 1-byte characters, which is a real limitation.) Keep up the good work. Olivier Binda
Is there a simple way to do that in Lua? Will there one day be a LuaTeX library for UTF-8 string management?
That is the unicode library; it has been in LuaTeX for years. unicode.utf8.find corresponds to string.find for UTF-8-encoded strings, and so on for the other string functions. The full name is “Selene Unicode” (slnunico).
(because it seems the string library is just for strings with 1-byte characters)
Indeed. Arthur
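To illustrate the difference Arthur describes, here is a minimal sketch comparing the byte-oriented string library with unicode.utf8. It assumes a LuaTeX build that ships slnunico as the global `unicode` table; the sample string and variable names are only illustrative, and outside LuaTeX the guarded part is simply skipped.

```lua
-- Sketch: byte-oriented vs. character-oriented string functions.
-- Assumption: running under a LuaTeX build that provides the global
-- `unicode` table (slnunico). Elsewhere only the byte count is printed.
local s = "日本語の辞書"        -- 6 characters, each 3 bytes in UTF-8

local byte_len = string.len(s)   -- string.len counts bytes, not characters
print("string.len:", byte_len)

local u8 = unicode and unicode.utf8   -- nil when not running under LuaTeX
if u8 then
  print("unicode.utf8.len:", u8.len(s))        -- counts characters
  print("unicode.utf8.sub:", u8.sub(s, 1, 3))  -- first three characters
  -- gsub returns the new string and a count; parentheses keep the string only
  print("unicode.utf8.gsub:", (u8.gsub(s, "辞書", "字典")))
end
```

Inside a TeX document, this can be tried from a \directlua call or a separate .lua file loaded with dofile.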
Olivier Binda wrote:
Hello, I developed a Japanese dictionary in XeTeX. It compiles just fine in LuaTeX, but I would like to write the commands in Lua instead of TeX, to make the code faster, cleaner, simpler and more powerful.
However, I need to be able to manipulate UTF-8 strings (find, substring, replace, length, etc.). Is there a simple way to do that in Lua? Will there one day be a LuaTeX library for UTF-8 string management? (The string library seems to handle only strings of 1-byte characters, which is a real limitation.)
There is unicode.utf8.gsub etc. (you can also use lpeg, which is 8-bit clean, and for UTF-8 manipulations that's good enough). Hans Hagen
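Hans's point about 8-bit clean tools can be sketched as follows. Note the swap: this example uses plain Lua string patterns rather than lpeg, relying on the same property (bytes above 127 pass through untouched). The helpers `utf8_chars` and `utf8_sub` are hypothetical names, not part of any library, and the sketch assumes well-formed UTF-8 input and positive indices.

```lua
-- One UTF-8 character: a byte that is not a continuation byte (0x80-0xBF),
-- followed by zero or more continuation bytes.
local UTF8_CHAR = "[^\128-\191][\128-\191]*"

-- Split a UTF-8 string into a table of characters (hypothetical helper).
local function utf8_chars(s)
  local chars = {}
  for ch in s:gmatch(UTF8_CHAR) do
    chars[#chars + 1] = ch
  end
  return chars
end

-- Character-indexed substring built on the split (hypothetical helper).
-- Assumes i >= 1; j defaults to the last character and is clamped to it.
local function utf8_sub(s, i, j)
  local chars = utf8_chars(s)
  j = math.min(j or #chars, #chars)
  return table.concat(chars, "", i, j)
end

local word = "辞書を引く"
local chars = utf8_chars(word)
print(#word, #chars)          -- byte count, then character count
print(utf8_sub(word, 1, 2))   -- first two characters
```

Literal search and replace need no special handling at all: string.find with the plain flag and string.gsub with a pattern free of magic characters operate correctly on UTF-8 byte sequences, though the positions they return are byte offsets.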
participants (3)
- Arthur Reutenauer
- Hans Hagen
- Olivier Binda