[NTG-context] removing word in filtered XML
Taco Hoekwater
taco at elvenkind.com
Thu Aug 20 11:08:08 CEST 2020
> On 19 Aug 2020, at 18:10, Pablo Rodriguez <oinos at gmx.es> wrote:
>
> Dear list,
>
> I have the following sample:
>
> \startbuffer[demo]
> <html>
> <body>
> <div id="First">
> <p>This is
> <span class="special">One of the best</span> a paragraph.</p>
> <p>This is another paragraph.</p>
> <p>This is another
> <span class="special">Two of the best</span> paragraph.</p>
> <p>This is another
> <span class="special">Three</span> paragraph.</p>
> <p>This is another
> <span class="special">Four of five</span> paragraph.</p>
> </div>
> </body>
> </html>
> \stopbuffer
>
> \startxmlsetups xml:initialize
> \xmlsetsetup{#1}{html}{xml:gen}
> \stopxmlsetups
>
> \xmlregistersetup{xml:initialize}
>
> \startxmlsetups xml:gen
> \xmlfilter{#1}{/**/div/command(xml:special)}
> \stopxmlsetups
>
> \startxmlsetups xml:special
> %~ \startitem
> \cldcontext{string.gsub(lxml.flush([[#1]]),
> " of the ", "")}\stopitem
> \stopxmlsetups
>
> \starttext
> \xmlprocessbuffer{main}{demo}{}
> \stoptext
>
> Is there any way to remove " of " and " of the " in the filtered content
> (xml:special)?
There is pretty much always ‘a way’, but I do not know of a ‘nice’ way.
Your problem is that lxml.flush() and friends do not return a value,
they just do a direct context("xxxx") call behind the scenes with no
return string for you to modify.
Also, the special (catcode, space handling) rules for setups and \cldcontext
do not help you.
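To make the first point concrete, here is a plain Lua sketch that runs
outside ConTeXt. flush_like is a made-up stand-in for a function that
writes its output directly and returns nothing, which is the behaviour
described above for lxml.flush(); it is not the real implementation.

```lua
-- Hypothetical stand-in for a flusher: it writes to a buffer and returns nothing.
local written = {}
local function flush_like(text)
    written[#written + 1] = text
end

-- A call that returns nothing yields nil when used as a non-final argument,
-- so gsub receives nil instead of a string and raises an error.
local ok, err = pcall(function()
    return string.gsub(flush_like("One of the best"), " of the ", "")
end)

print(ok)          -- false
print(written[1])  -- One of the best
```

The text has already been written by the time gsub is called, so there
is nothing left to modify at that point.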
That does not mean it can’t be done. As I don’t know of a nice way,
here is a low-level ‘ugly’ way:
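\startluacode
-- Walk the tree below an element and edit every text node in place.
-- Entries of dt that are strings are text; anything else is treated
-- as a child element and visited recursively.
local function process(node)
    for c = 1, #node.dt do
        if type(node.dt[c]) == "string" then
            -- the assignment keeps only the first result of gsub
            node.dt[c] = string.gsub(node.dt[c], " of the ", "")
        else
            process(node.dt[c])
        end
    end
end

-- Global on purpose: it is called from the xml:special setup.
function filter(a)
    local div = lxml.getid(a)
    process(div)
    lxml.flush(div)
end
\stopluacode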
\startxmlsetups xml:special
\ctxlua{filter([[#1]])}
\stopxmlsetups
process() is recursive because your xml:special gets the whole <div>. Not sure if you intended it that way.
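The same walk can be tried outside ConTeXt on a mock tree. This is only
a sketch: plain tables with a dt array stand in for the real lxml
elements, the tg field is just a label for readability, and
strip_phrases is a name made up here. It removes " of " as well as
" of the ", with the longer phrase first, because removing " of " first
would leave "the " behind.

```lua
-- Mock of the recursive walk; not the real lxml data structures.
-- phrases are used as Lua patterns, so they must not contain magic characters.
local function strip_phrases(element, phrases)
    for i = 1, #element.dt do
        local child = element.dt[i]
        if type(child) == "string" then
            for _, phrase in ipairs(phrases) do
                -- parentheses drop the second result of gsub (the match count)
                child = (child:gsub(phrase, ""))
            end
            element.dt[i] = child
        else
            strip_phrases(child, phrases)
        end
    end
end

-- Two paragraphs from the sample buffer, written out as nested tables.
local div = { tg = "div", dt = {
    { tg = "p", dt = { "This is ",
        { tg = "span", dt = { "One of the best" } }, " a paragraph." } },
    { tg = "p", dt = { "This is another ",
        { tg = "span", dt = { "Four of five" } }, " paragraph." } },
} }

strip_phrases(div, { " of the ", " of " })

print(div.dt[1].dt[2].dt[1])  -- Onebest
print(div.dt[2].dt[2].dt[1])  -- Fourfive
```

Note that the words on either side end up joined, because the spaces
are part of what is removed.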
And if it can be done nicer, I am sure someone will correct me :)
Best wishes,
Taco