Taco Hoekwater wrote:
With lpeg, the code looks a bit harder, but is still short and relatively straightforward:
Here is a smarter lpeg that takes care of embedded \{ as well:

    local P, S, V = lpeg.P, lpeg.S, lpeg.V
    local matchtable = {
      "TEXT",
      TEXT     = (V("FOOTNOTE") + 1)^0 * -1,
      SP       = S(" \n\t")^0,
      BODY     = "{"*(P("\\{")+P("\\}")+(1 - S("{}"))+V("BODY"))^0*"}",
      FOOTNOTE = "\\footnote" * V("SP") * V("BODY") / print,
    }
    lpeg.match(matchtable, data)

Quick explanation: the symbol + is alternation, * is concatenation, - is exclusion, ^ is a repeat modifier, "" and P("") match strings, numbers match bytes, S("") matches byte sets.

The "matchtable" describes the input data. In there, matchtable[1] points to the "top" rule, which is "TEXT". The other key-value pairs in the table say roughly this:

TEXT = an optional sequence [^0] of either footnote items [V("FOOTNOTE")] or non-footnote bytes [1], until the end of the data is reached [-1]

FOOTNOTE = the string '\footnote', followed by optional space, followed by the footnote body.

SP = zero or more occurrences of a space, tab, or newline byte

BODY = a left brace, followed by an optional sequence of four possible things:

  1. the string "\{" [P("\\{")]
  2. the string "\}" [P("\\}")]
  3. a byte that is not "{" nor "}" [(1 - S("{}"))]
  4. a recursively included braced body [V("BODY")],

followed by a right brace.

The [/print] runs the print() command on each successful FOOTNOTE match.

Best wishes,
Taco
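A possible variant of the grammar above, offered only as a minimal sketch: instead of printing each match, it collects the footnote bodies into a Lua table. The names `grammar`, `sample`, and `footnotes` are made up for illustration, and the sketch assumes lpeg is loadable with `require("lpeg")`. It relies on `lpeg.C` (capture the matched text) and `lpeg.Ct` (gather captures into a table), and wraps the literal strings in `P(...)` so that it does not depend on string operands being converted implicitly.

```lua
local lpeg = require("lpeg")
local P, S, V, C, Ct = lpeg.P, lpeg.S, lpeg.V, lpeg.C, lpeg.Ct

-- Same rules as above, except that FOOTNOTE captures its BODY with C()
-- and TEXT gathers all captures into one table with Ct().
local grammar = P{
  "TEXT",
  TEXT     = Ct((V("FOOTNOTE") + 1)^0) * -1,
  SP       = S(" \n\t")^0,
  BODY     = P("{") * (P("\\{") + P("\\}") + (1 - S("{}")) + V("BODY"))^0 * P("}"),
  FOOTNOTE = P("\\footnote") * V("SP") * C(V("BODY")),
}

-- Hypothetical input: one footnote with an escaped brace, one with nested braces.
local sample = "Text\\footnote{first \\{ note} more\\footnote {nested {braces} here} end."

local footnotes = lpeg.match(grammar, sample)
for i, body in ipairs(footnotes) do
  print(i, body)
end
```

Each captured body still includes its outer braces, because C() wraps the whole BODY rule. A \footnote whose braces do not balance is not captured: the FOOTNOTE rule fails at that position and the text is consumed byte by byte as ordinary input.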