XML: Correct usage of hash symbol in url
Hi,

I need to process URLs in XML documents differently depending on the target. Thanks to Hans and Thomas I can now deal with the attributes. Now I need to prefix certain kinds of targets with a certain URL (a web viewer for IIIF images in this particular case). The prefix contains a #, which, as a special character, seems to cause an error. Adding a double hash seems to work, but those ## will end up in the link URL.

How can I properly escape a single #? I've already tried using \Ux{23}, but that won't work either.

Best,
Denis

\setupinteraction[state=start]

\startxmlsetups xml:test
    \xmlsetsetup{#1}{*}{-}
    \xmlsetsetup{#1}{doc|element}{xml:*}
\stopxmlsetups

\xmlregisterdocumentsetup{test}{xml:test}

\startxmlsetups xml:doc
    \xmlflush{#1}
\stopxmlsetups

\startluacode
function xml.finalizers.tex.url(e,a)
    local u = #e > 0 and e[1].at[a]
    local s = u and lpeg.match(lpeg.patterns.urlunescaper,u)
    context(s)
end
\stopluacode

\startxmlsetups xml:element
    \xmldoifelse{#1}{.[@href and contains(@href,'https://iiif.ub.unibe.ch')]}
        {\goto{\xmlflush{#1}}[url(https://uv-v4.netlify.app/##?manifest=\xmlatt{#1}{href})]} % gives me two hashes in the Link
        %{\goto{\xmlflush{#1}}[url(https://uv-v4.netlify.app/#?manifest=\xmlatt{#1}{xlink:href})]} \par % does not work
        {\goto{\xmlflush{#1}}[url(\xmlatt{#1}{href})]}
\stopxmlsetups

\startbuffer[test]
<?xml version="1.0" encoding="UTF-8"?>
<doc>
<element href="https://iiif.ub.unibe.ch/presentation/v2.1/berner-inkunabeln/manifest/Inc%20I%20104%20fol%20a1r">IIIF-Link</element>
<element href="https://wiki.contextgarden.net/">Other Link</element>
</doc>
\stopbuffer

\starttext
    \xmlprocessbuffer{test}{test}{}
\stoptext
On 8/25/2023 10:16 AM, denis.maier@unibe.ch wrote:
> How can I properly escape a single #? I've already tried using \Ux{23},
> but that won't work either.

\#
----------------------------------------------------------------- Hans Hagen | PRAGMA ADE Ridderstraat 27 | 8061 GH Hasselt | The Netherlands tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl -----------------------------------------------------------------
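Applied to the setup from the question, the \# from the reply above would go where the doubled hash was. This is a minimal, untested sketch: only the escape in the viewer prefix changes, everything else is taken from Denis's example, and whether the final link carries exactly one # may depend on the ConTeXt version in use.

```tex
\startxmlsetups xml:element
    \xmldoifelse{#1}{.[@href and contains(@href,'https://iiif.ub.unibe.ch')]}
        % IIIF targets: prefix the viewer URL, with \# standing for a literal hash
        {\goto{\xmlflush{#1}}[url(https://uv-v4.netlify.app/\#?manifest=\xmlatt{#1}{href})]}
        % all other targets: link to the href unchanged
        {\goto{\xmlflush{#1}}[url(\xmlatt{#1}{href})]}
\stopxmlsetups
```

It is worth checking the resulting link in the PDF viewer, since that is where a stray second hash would show up.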
Hi Denis,
> Adding a double hash seems to work, but those ## will end up in the
> link URL. How can I properly escape a single #?

The duplication of hashes is kind of special. Consider this:
    \def\foo#1{#1 test ##}
    \def\foo#1{#1 test\def\more##1{(##1)}}

the internal representation of #1 is a reference to parameter 1, while the ## becomes one # (cc parameter), in the second example followed by a character 1 (cc other)

then, when tex serializes, e.g. in tracing, it duplicates the hash (with cc parameter) so that the user is not confused (an alternative could have been to indicate a parameter reference differently, but changing that now is no real option - what symbol to use anyway?)

with lua(meta)tex opening up matters this duplication becomes a bit annoying, so i decided to see if it could be avoided in some cases

first i decided to provide an escape, possible because we already have such a mechanism using # (not advertised because it is still experimental)

then i decided to just not duplicate unless we trace

in the lowlevel macro manual there is a section added that goes into more detail

to come back to the escaping: here is a list:

    #I  loop iterator
    #P  parent loop iterator
    #G  grandparent loop iterator
    #H  hash
    #S  space (ascii 32)
    #T  tab \t
    #L  newline \n
    #R  return \r
    #X  backslash

considering #N nbsp and some more. Keep in mind that #1-#9 #A-#E are in use as parameter references

but in the case of your urls escaping should not be needed (hopefully)

Hans

ps. all kind of experimental because it's a bit hairy and we don't want to lose compatibility (hashes get interpreted in the macro preamble, macro body, running text, serialization etc)
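The duplication on serialization can be observed with \meaning. The following is a small sketch using only TeX primitives; the output shown in the comment is what traditional engines print, while a recent LuaMetaTeX may, per the note above about not duplicating unless tracing, show a single hash instead.

```tex
% the body stores ONE hash token (cc parameter) after "test"
\def\foo#1{#1 test ##}

% serializing the macro doubles that hash again;
% traditional TeX engines print: macro:#1->#1 test ##
\message{\meaning\foo}
```

This may also be why the ## in the url(...) of the question ended up as two hashes in the link: the setup body is stored like a macro body, and the hash is doubled again when the stored tokens are serialized.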
Participants (2): denis.maier@unibe.ch, Hans Hagen