Hi all,

I thought this was easy, but I overestimated my competence… I want to filter xml elements via their attributes and retrieve and typeset parts belonging together. Here is a small test file that explains what I’m trying:

\startbuffer[test]
<document>
  <topics>
    <topic id="test1">
      <title>This is the first test</title>
      <date>22/11/16</date>
    </topic>
    <topic id="test2">
      <title>This is the second test</title>
      <date>22/11/17</date>
    </topic>
  </topics>
  <chapters>
    <chapter id="test1">
      <content>
        This will be the content of the <emph>first</emph> chapter.
      </content>
    </chapter>
    <chapter id="test2">
      <content>
        This will be the content of the <emph>second</emph> chapter.
      </content>
    </chapter>
  </chapters>
</document>
\stopbuffer

\startxmlsetups xml:testsetups
    \xmlsetsetup{#1}{*}{-}
    \xmlsetsetup{#1}{document|chapters|chapter|content|emph}{xml:*}
\stopxmlsetups

\xmlregistersetup{xml:testsetups}

\startxmlsetups xml:document
    \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:chapters
    \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:chapter
    \xmlfunction {#1} {chapter}
    \xmlflush {#1}
\stopxmlsetups

\startxmlsetups xml:chapter:content
    \xmltext {#1} {content}
\stopxmlsetups

\startxmlsetups xml:emph
    {\em \xmlflush {#1}}
\stopxmlsetups

\startluacode
function xml.functions.chapter (t)
    local ch_id = t.at.id
    local metadata = xml.filter (root, '../../topics/topic[@id=="%s"]', ch_id)
    print (inspect(metadata))
    lxml.command(t, ".", "xml:chapter:content")
    context.par ()
    context (ch_id)
    context.par ()
end
\stopluacode

\starttext
\xmlprocessbuffer{main}{test}{}
\stoptext

The line with xml.filter does not work as I expected. How can I walk the tree, find the “topic” element with the same “id” attribute as the chapter I’m currently in, and then typeset the different children of the topic element?

Thanks a lot and all best
Thomas
Hi Thomas.

I'm not sure about the code, sorry, but I do know that an XML document can't have two IDs of the same value. Typically you would use a linkend attribute on the element which is referencing an id (in this case the topics, I think).

Probably doesn't help with your problem, but it's likely a prerequisite for it to work.

Bests,
Duncan

On Wed, 16 Nov 2022 at 16:11, Thomas A. Schmitz via ntg-context <ntg-context@ntg.nl> wrote:
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / https://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : https://www.pragma-ade.nl / http://context.aanhet.net
archive  : https://bitbucket.org/phg/context-mirror/commits/
wiki     : https://contextgarden.net
___________________________________________________________________________________
Hi Duncan,

Thank you for pointing this out! I knew this was true inside the xmlns namespace, so you can’t have identical xml:id tags, but you’re probably right that it’s better to avoid this confusion altogether. Alas, this doesn’t help with my problem. Since there was a typo in my minimal example from my experiments, I include a corrected version that avoids the identical tags.

All best
Thomas

\startbuffer[test]
<document>
  <topics>
    <topic t:id="test1">
      <title>This is the first test</title>
      <date>22/11/16</date>
    </topic>
    <topic t:id="test2">
      <title>This is the second test</title>
      <date>22/11/17</date>
    </topic>
  </topics>
  <chapters>
    <chapter ch:id="test1">
      <content>
        This will be the content of the <emph>first</emph> chapter.
      </content>
    </chapter>
    <chapter ch:id="test2">
      <content>
        This will be the content of the <emph>second</emph> chapter.
      </content>
    </chapter>
  </chapters>
</document>
\stopbuffer

\startxmlsetups xml:testsetups
    \xmlsetsetup{#1}{*}{-}
    \xmlsetsetup{#1}{document|chapters|chapter|content|emph}{xml:*}
\stopxmlsetups

\xmlregistersetup{xml:testsetups}

\startxmlsetups xml:document
    \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:chapters
    \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:chapter
    \xmlfunction {#1} {chapter}
    \xmlflush {#1}
\stopxmlsetups

\startxmlsetups xml:chapter:content
    \xmltext {#1} {content}
\stopxmlsetups

\startxmlsetups xml:emph
    {\em \xmlflush {#1}}
\stopxmlsetups

\startluacode
function xml.functions.chapter (t)
    local ch_id = t.at.ch:id
    local metadata = xml.filter (t, '../../topics/topic[@t:id=="%s"]', ch_id)
    print (inspect(metadata))
    lxml.command(t, ".", "xml:chapter:content")
    context.par ()
    context (ch_id)
    context.par ()
end
\stopluacode
On 16/11/22 18:33, Thomas A. Schmitz via ntg-context wrote:

> local ch_id = t.at.ch:id

local ch_id = t.at["ch:id"]

Best wishes,
Massi
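For readers wondering why the bracket form is needed: in Lua, `a.b` is shorthand for `a["b"]` only when `b` is a legal identifier, and a colon is the method-call syntax, which must be followed by call arguments. So `t.at.ch:id` is not read as a lookup of the key "ch:id". The following is a minimal plain-Lua sketch of the difference; the table `t` is a hypothetical stand-in for the element that ConTeXt passes to the function, not the real lxml structure.

```lua
-- Hypothetical stand-in for an element: only the attribute table "at" is modelled.
local t = { at = { ["ch:id"] = "test1", ch_id = "test1" } }

-- Bracket indexing works for any string key, including one that contains a colon.
local with_colon = t.at["ch:id"]

-- Dot indexing is only available for keys that are legal Lua identifiers.
local with_underscore = t.at.ch_id

print(with_colon, with_underscore)
```

With an underscore in the attribute name, both spellings reach the same entry; with a colon, only the bracket form does.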
On 11/16/22 19:56, mf via ntg-context wrote:
> local ch_id = t.at["ch:id"]

You're right, of course, using a colon was a stupid idea. When I replace it with an underscore, you can see that both are in fact identical:

\startbuffer[test]
<document>
  <topics>
    <topic t_id="test1">
      <title>This is the first test</title>
      <date>22/11/16</date>
    </topic>
    <topic t_id="test2">
      <title>This is the second test</title>
      <date>22/11/17</date>
    </topic>
  </topics>
  <chapters>
    <chapter ch_id="test1">
      <content>
        This will be the content of the <emph>first</emph> chapter.
      </content>
    </chapter>
    <chapter ch_id="test2">
      <content>
        This will be the content of the <emph>second</emph> chapter.
      </content>
    </chapter>
  </chapters>
</document>
\stopbuffer

\startxmlsetups xml:testsetups
    \xmlsetsetup{#1}{*}{-}
    \xmlsetsetup{#1}{document|chapters|chapter|content|emph}{xml:*}
\stopxmlsetups

\xmlregistersetup{xml:testsetups}

\startxmlsetups xml:document
    \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:chapters
    \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:chapter
    \xmlfunction {#1} {chapter}
    % \xmlflush {#1}
\stopxmlsetups

\startxmlsetups xml:chapter:content
    \xmltext {#1} {content}
\stopxmlsetups

\startxmlsetups xml:emph
    {\em \xmlflush {#1}}
\stopxmlsetups

\startluacode
function xml.functions.chapter (t)
    local chapter_id = t.at.ch_id
    local other_chapter_id = t.at["ch_id"]
    context (chapter_id)
    context.par ()
    context (other_chapter_id)
    context.par ()
    local metadata = xml.filter (t, '../../topics/topic[@t:id=="%s"]', ch_id)
    print (inspect(metadata))
    lxml.command(t, ".", "xml:chapter:content")
    context.par ()
end
\stopluacode

\starttext
\xmlprocessbuffer{main}{test}{}
\stoptext
Just a quick question regarding this: is xml.filter equivalent to \xmlfilter? If so, how do you pass the match to a command as you'd do with \xmlfilter?

Best,
Denis
-----Original Message-----
From: ntg-context On behalf of mf via ntg-context
Sent: Wednesday, 16 November 2022 20:56
To: ntg-context@ntg.nl
Cc: mf
Subject: Re: [NTG-context] Xml filtering in Lua

This works:
local metadata = xml.filter (t, '../../topics/topic[@t:id=="' .. ch_id .. '"]')
also this:
local lpath = string.format('../../topics/topic[@t:id=="%s"]', ch_id)
local metadata = xml.filter (t, lpath)
It looks like xml.filter supports only 2 arguments (see lxml-tex.lua), and so it doesn't let you use string formatting patterns like the "context" command does.
You can write:
context('the value of @t:id is "%s"', ch_id)
but you can't write:
xml.filter (t, '../../topics/topic[@t:id=="%s"]', ch_id)
Best wishes,
Massi
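The point of Massi's message is that the formatting has to happen before the call, because the filter only receives the path string. A minimal plain-Lua sketch of that step follows; the helper name `topic_lpath` is hypothetical and not part of ConTeXt, and the actual `xml.filter` call is only shown in a comment because it needs a running ConTeXt.

```lua
-- Hypothetical helper (not part of ConTeXt): build the lpath string up front,
-- so that the filter call only ever receives two arguments.
local function topic_lpath(id)
    return string.format('../../topics/topic[@t:id=="%s"]', id)
end

local lpath = topic_lpath("test1")
print(lpath)

-- Inside ConTeXt one would then call, as in the message above:
--   local metadata = xml.filter(t, lpath)
```

Keeping the path construction in one small function also makes it easy to print the resulting string when a filter unexpectedly matches nothing.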
On 11/16/22 21:51, Denis Maier via ntg-context wrote:
> Just a quick question regarding this? Is xml.filter equivalent to \xmlfilter? If so, how do you pass the match to a command as you'd do with \xmlfilter?

I'm still digesting and playing with Massi's reply; will probably be back with more questions :-)

Anyway: if you've filtered something out like this

local tree = xml.filter (t, "../chapter[@title='mytitle']")

you can then apply a command to it

lxml.command(lxml.id(tree), ".", "xml:chapter:command")

and have to define the command as

\startxmlsetups xml:chapter:command
    \xmltext {#1} {content}
\stopxmlsetups

for example.

Thomas
On 11/16/2022 10:09 PM, Thomas A. Schmitz via ntg-context wrote:

> If you've filtered something out like this
>
> local tree = xml.filter (t, "../chapter[@title='mytitle']")

Always keep in mind that some expressions return a list of matches that can be looped over, and some commands just process the first one.

Anyway, it can sometimes help to add

print(tostring(tree))

to see what you got.

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------
On 11/16/2022 8:56 PM, mf via ntg-context wrote:

> You can write:
>
> context('the value of @t:id is "%s"', ch_id)
>
> but you can't write:
>
> xml.filter (t, '../../topics/topic[@t:id=="%s"]', ch_id)

In Thomas's example this is also an approach:

\startxmlsetups xml:document
    \xmlfunction{#1} {document}
    \xmlflush{#1}
\stopxmlsetups

with

\startluacode
local topics   = { }
local chapters = { }

function xml.functions.document(t)
    for c in xml.collected(t,"/topics/topic") do
        topics[c.at.t_id] = c
    end
 -- for c in xml.collected(t,"/chapters/chapter") do
 --     chapters[c.at.ch_id] = c
 --     -- or flush here
 -- end
end

function xml.functions.chapter (t)
    local ch_id    = t.at.ch_id
    local metadata = topics[ch_id]
    lxml.command(t, ".", "xml:chapter:content")
    context.par ()
    context (ch_id)
    context.par ()
end
\stopluacode

So, basically you collect data and use it later ... for huge datasets that saves some time.

If you have only chapters to process you can even decide to flush in that function.

Hans
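The collect-then-look-up idea can be tried outside ConTeXt as well. The sketch below is plain Lua under stated assumptions: the element tables are hypothetical stand-ins that only mimic the `at` attribute table, the `title` and `date` fields are simplifications of the child elements, and `topic_for_chapter` is an illustrative name, not a ConTeXt function.

```lua
-- Plain-Lua stand-ins for parsed elements; in ConTeXt these would come from
-- xml.collected, as in the message above. All names here are hypothetical.
local topic_elements = {
    { at = { t_id = "test1" }, title = "This is the first test",  date = "22/11/16" },
    { at = { t_id = "test2" }, title = "This is the second test", date = "22/11/17" },
}

-- Pass 1: index every topic by its id, once.
local topics = { }
for _, c in ipairs(topic_elements) do
    topics[c.at.t_id] = c
end

-- Pass 2: a chapter finds its topic with a single table lookup.
local function topic_for_chapter(chapter)
    return topics[chapter.at.ch_id]
end

local metadata = topic_for_chapter({ at = { ch_id = "test2" } })
print(metadata.title, metadata.date)
```

The lookup returns nil for a chapter whose id has no matching topic, so real code would probably want to check for that before typesetting anything.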
On 17. Nov 2022, at 11:04, Hans Hagen via ntg-context wrote:

> so, basically you collect data and use it later ... for huge datasets that saves some time
>
> if you have only chapters to process you can even decide to flush in that function

I think this is exactly the approach I’m looking for: collecting everything in Lua tables and then retrieving and typesetting it later. I’m experimenting with it right now. I will have to define a proper lxml.command for every xml tag, I guess; otherwise, the xml gets serialized? I’ll play some more and will certainly be back with questions :-)

Thank you, as always, and all best
Thomas
On 11/17/22 11:04, Hans Hagen via ntg-context wrote:
> so, basically you collect data and use it later ... for huge datasets that saves some time
>
> if you have only chapters to process you can even decide to flush in that function

Alright, I'm making very good progress here, but right now I'm stumbling upon a problem I can't solve. It's difficult to make a minimal example, so bear with some snippets.

I load data from an external xml file (not the one I'm processing) and store some of it in a lua table.

local examples = lxml.load ("my_examples", "examples.xml")
local sets = lxml.load ("my_sets", "example_sets.xml")

for e in xml.collected (examples, "/examples/chapter/example") do
    local ex_id = e.at.id
    all_examples [ex_id] = e
end

This works as expected; with print (inspect (all_examples)), I can see that the table looks the way I expect. I then retrieve some entries of the table by their key:

local current_example = all_examples [key]

Again, this appears to work; when I have a

lxml.displayverbatim (current_example)

in my file, the xml is typeset and looks like I would expect it to look. However, whatever I try, I get the serialized xml typeset, with all <tags> verbatim, instead of processed. Here's what I've tried:

\startxmlsetups xml:chapter:example
    \xmlfirst {#1} {.} \par
\stopxmlsetups

lxml.command (current_example, ".", "xml:chapter:example")

or

xml.sprint (lxml.id (current_example))

or

local problem = xml.text (lxml.id (current_example), "./[text()]")
xml.sprint (problem)

I was expecting at least the last version to retrieve the pure text, but it typesets again with the tags included. So I guess my question is: how can I tell ConTeXt to parse my xml as xml and apply the proper setups instead of serializing it?

All best wishes
Thomas
On 11/20/22 19:19, Thomas A. Schmitz via ntg-context wrote:
> I load data from an external xml file (not the one I'm processing) and store some of it in a lua table.
>
> local examples = lxml.load ("my_examples", "examples.xml")

Replying to myself, and sorry for the noise (this was fairly easy, should have seen it earlier): instead of loading the file "examples.xml", I simply include it via xmlinclude into the tree; this way the proper setups are applied.

All best
Thomas
participants (5)
- denis.maier@unibe.ch
- Duncan Hothersall
- Hans Hagen
- mf
- Thomas A. Schmitz