On Jan 23, 2009, at 12:13 PM, Hans Hagen wrote:
here is a variant that implements a function (and does not use the env trick)
do
    local add = function (x,y) return x+y end
    local P, Ca, Cc = lpeg.P, lpeg.Ca, lpeg.Cc
    local symbols = {
        I=1, V=5, X=10, L=50, C=100, D=500, M=1000,
        IV=4, IX=9, XL=40, CD=400, CM=900
    }
    local adders = { }
    for s,n in pairs(symbols) do
        adders[s] = P(s)*Cc(n)/add
    end
    local MS = adders.M^0
    local CS = (adders.D*adders.C^(-4)+adders.CD+adders.CM+adders.C^(-4))^(-1)
    local XS = (adders.L*adders.X^(-4)+adders.XL+adders.X^(-4))^(-1)
    local IS = (adders.V*adders.I^(-4)+adders.IX+adders.IV+adders.I^(-4))^(-1)
    local p = Ca(Cc(0)*MS*CS*XS*IS)
    function string:romantonumber()
        return p:match(self:upper())
    end
end
print(string.romantonumber("MMIX"))
print(string.romantonumber("MMIIIX"))
just run such a script using
mtxrun --script yourscript.lua
(as luatex (texlua) has the latest lpeg built in)
Brilliant! This one does work when I use it with luatex (not with my system lua though, even though I have the latest released version of lpeg, 0.9, installed). Bizarre...
2. How can I check if a string begins with one of a class of words "(Der | Die |Das |The |An )" etc. and strip these words from the string? I do it with a compiled regexp in python, but "Programming in Lua" has this to say: "Unlike some other systems, in Lua a modifier can only be applied to a character class; there is no way to group patterns under a modifier. For instance, there is no pattern that matches an optional word (unless the word has only one letter). Usually you can circumvent this limitation using some of the advanced techniques that we will see later." I haven't found these techniques yet.
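-- words to strip
local stripped = { "Der", "Die", "Das" }
-- start with a pattern that never matches, then add each word as an alternative
local p = lpeg.P(false)
for k, v in ipairs(stripped) do p = p + lpeg.P(v) end
-- a listed word followed by a space
local w = p * " "
-- substitution capture: replace each "word " by "", copy every other character
local stripper = lpeg.Cs(((w/"") + lpeg.C(1))^0)
-- dump the compiled pattern (debugging aid)
lpeg.print(stripper)
str = "Germans somehow always talk about Der Thomas and Der Hans"
print(stripper:match(str))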
Brilliant again! I can run with that, looks great! And who doesn't want a "local stripper" in his code?
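Note that the lpeg stripper above removes the listed words wherever they occur in the string, not only at the beginning. For the narrower case in the question (strip an article only when the string begins with it), a minimal sketch in plain Lua, without lpeg, could look like this. The helper name strip_leading_article and the word list are assumptions made for illustration, not something from the thread.

```lua
-- Hypothetical helper: strip one leading article from a string.
-- The word list and the function name are illustrative assumptions.
local articles = { "Der", "Die", "Das", "The", "An" }

local function strip_leading_article(s)
    for _, word in ipairs(articles) do
        local prefix = word .. " "
        -- compare the first #prefix bytes literally; no pattern matching involved
        if s:sub(1, #prefix) == prefix then
            return s:sub(#prefix + 1)
        end
    end
    return s -- no listed article at the start: return the string unchanged
end

print(strip_leading_article("Der Zauberberg"))
print(strip_leading_article("Anna Karenina"))
```

Because the comparison includes the trailing space, a title such as "Anna Karenina" is left alone even though it starts with the letters "An".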
3. How can I compare strings with utf8 characters? My naive approach if string.find(record, "Résumé") doesn't appear to work (while the same method does work if the string has only ASCII characters).
since lua is 8 bit clean utf should just work
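Since Lua strings are plain byte sequences, a search succeeds only when both sides contain the same bytes. Here is a minimal sketch of that, assuming the mismatch comes from differing encodings (UTF-8 in the script, Latin-1 in the data). That is only one possible cause and is not confirmed anywhere in this thread; the sample strings are made up for illustration.

```lua
-- "Résumé" written as explicit UTF-8 bytes (é = 0xC3 0xA9 = \195\169),
-- so the example does not depend on the encoding of this source file.
local needle = "R\195\169sum\195\169"
local record = "Title: R\195\169sum\195\169 of the year"

-- Fourth argument true = plain substring search, no pattern matching.
local first, last = string.find(record, needle, 1, true)
print(first, last)

-- Assumed failure mode: the same word stored as Latin-1 (é = 0xE9 = \233)
-- has different bytes, so a byte-wise search does not find it.
local latin1 = "Title: R\233sum\233 of the year"
local miss = string.find(latin1, needle, 1, true)
print(miss)
```

If the search fails on real data, printing the bytes of both strings with string.byte is a quick way to see whether they actually agree.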
OK, then the problem must be somewhere else. I'll investigate. Thanks a lot, and best wishes Thomas