Dear list,

this is the first episode of a series on nbsp in XML in LMTX. Unfortunately, it is not as catchy as Netflix.

The XML input has two types of non-breakable space:

- Unicode character
- HTML entity (in fact the ugly output of an HTML editor)

The HTML entity is preprocessed with the ConTeXt preprocessor (great feature!) and replaced by either the Unicode nbsp character or a tilde.

The MWE shows that the Unicode nbsp characters are non-breakable (see the ends of the first lines), but they are not stretchable (see the second line of each paragraph).

Does the Unicode nbsp have a fixed width in ConTeXt?

When the tilde is the replacement (uncomment the first replacement in the preprocessor), \xmlflush displays a literal tilde (which, as a character, is non-breakable and unstretchable, no surprise).

Why is the tilde displayed?

Replacing or adding nbsp (tilde) with finalizers gives different results; see the next episode once this one is understood.

Thank you,
Jano

MWE (better to use the attached file so as not to lose the invisible characters):

\startbuffer[doc]
<?xml version="1.0"?>
<document>
<p>Temperature 20 °C 20 °C 20 °C 20 °C average.</p>
<p>Altitude 6000 m 6000 m 6000 m 6000 m average.</p>
</document>
\stopbuffer

\startluacode
function lxml.preprocessor(data)
    -- data = string.gsub(data, "&nbsp;", "~")
    -- replacement nbsp invisible in luacode
    data = string.gsub(data, "&nbsp;", " ")
    return data
end
\stopluacode

\startxmlsetups xml:name
    \xmlsetsetup{\xmldocument}{*}{-}
    \xmlsetsetup{\xmldocument}{document|p}{xml:name:*}
\stopxmlsetups

\xmlregistersetup{xml:name}

\startxmlsetups xml:name:document
    \xmlflush{#1}\par
\stopxmlsetups

\startxmlsetups xml:name:p
    \parfillskip0pt\xmlflush{#1}\par
\stopxmlsetups

\startTEXpage[offset=5mm,width=60mm]
    \xmlprocessbuffer{xml:name}{doc}{}
\stopTEXpage
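Since a literal U+00A0 is invisible in Lua source, one option is to build the replacement from its code point instead of typing it. The following is only a minimal sketch in plain Lua 5.3 or later, assuming the entity appears in the input as the literal text `&nbsp;`. The names `NBSP` and `replace_nbsp_entity` are hypothetical, and wiring the function up as `lxml.preprocessor` inside \startluacode is left out.

```lua
-- Hypothetical names; the thread itself only shows lxml.preprocessor.
-- U+00A0 NO-BREAK SPACE, built from its code point so it stays visible in the source.
NBSP = utf8.char(0x00A0)

-- Replace every literal "&nbsp;" by the Unicode no-break space.
-- The extra parentheses drop the match count that string.gsub also returns.
function replace_nbsp_entity(data)
    return (string.gsub(data, "&nbsp;", NBSP))
end

print(replace_nbsp_entity("Temperature 20&nbsp;°C average."))
```

The same idea works with a tilde or any other replacement string; only the second argument of `string.gsub` changes.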
On 21 Apr 2021, at 20:17, Jano Kula wrote:
Why tilde is displayed?

Wouldn't the simple answer be: because XML is not TeX?

dr. Hans van der Meer
On Wed, Apr 21, 2021 at 8:37 PM Hans van der Meer wrote:
Why tilde is displayed?
Wouldn't the simple answer be: because XML is not TeX?

I see your point about the tilde: with finalizers in mind I was already in the stomach while the mouth was still looking at the menu. Teaser for S01E02: finalizers.

I would still expect the Unicode nbsp to be stretchable; otherwise I would have to treat it somehow (no problem with that). Remember the times when a non-stretchable, non-shrinkable nbsp was the first clue that a book had been typeset in Word? I've checked it now and it's still the case.

Thank you,
Jano
Re tilde: maybe the answer is in the entities section of the xml mkiv manual.
Denis
Hi,
On 21 Apr 2021, at 23:09, Jano Kula wrote:
On Wed, Apr 21, 2021 at 8:37 PM Hans van der Meer wrote:
Why tilde is displayed?
Wouldn't the simple answer be: because XML is not TeX?

You are never going back to “TeX mode”: the preprocessor converts XML into *other* XML. And a tilde in XML is just that: the ASCII tilde glyph.

I would still expect the Unicode nbsp to be stretchable,
I agree with that, but for fine-tuning XML output I would use a trick like this:

\startluacode
function lxml.preprocessor(data)
    return string.gsub(data, "&nbsp;", "<nbsp/>")
end
\stopluacode

\startxmlsetups xml:name
    ...
    \xmlsetsetup{\xmldocument}{document|nbsp}{xml:name:*}
\stopxmlsetups

\startxmlsetups xml:name:nbsp
    \penalty10000\hskip .3em plus 2em % or something, just a wild example.
\stopxmlsetups

Using an XML element would also allow your code to ‘look around’ to make sure all is well with its (typesetting) environment.

Best wishes,
Taco
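One detail of the element trick that may be worth keeping in mind: in Lua, `string.gsub` returns two values, the new string and the number of substitutions. The thread does not say whether the preprocessor hook looks at a second return value, so the sketch below, in plain Lua, simply shows how parentheses keep only the string. The function name `to_nbsp_element` is hypothetical, and the pattern `&nbsp;` is an assumption about what the input contains.

```lua
-- Hypothetical helper; assumes the input carries the literal entity "&nbsp;".
-- Turn each entity into an empty <nbsp/> element that a setup can pick up later.
function to_nbsp_element(data)
    -- Parentheses truncate the result to the first value (the string).
    return (string.gsub(data, "&nbsp;", "<nbsp/>"))
end

-- Without parentheses both values come back: the string and the match count.
local converted, count = string.gsub("6000&nbsp;m 6000&nbsp;m", "&nbsp;", "<nbsp/>")
print(converted, count)
print(to_nbsp_element("6000&nbsp;m"))
```

Returning a single value is the more defensive choice when the caller's expectations are unknown.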
Try this:

%\xmltexentity{nbsp}{\nobreakspace}
\xmlsetentity{nbsp}{ } % U+00A0 NBSP between braces
%\xmlsetentity{nbsp}{ } % U+0020 normal space between braces

\startbuffer[doc]
<?xml version="1.0"?>
<document>
<p>Temperature 20 °C 20 °C 20 °C 20 °C average.</p>
<p>Altitude 6000 m 6000 m 6000 m 6000 m average.</p>
</document>
\stopbuffer

\startluacode
--[[
function lxml.preprocessor(data)
    -- data = string.gsub(data, "&nbsp;", "~")
    -- replacement nbsp invisible in luacode
    data = string.gsub(data, "&nbsp;", " ")
    return data
end
--]]
\stopluacode

\startxmlsetups xml:name
    \xmlsetsetup{\xmldocument}{*}{-}
    \xmlsetsetup{\xmldocument}{document|p}{xml:name:*}
\stopxmlsetups

\xmlregistersetup{xml:name}

\startxmlsetups xml:name:document
    \xmlflush{#1}\par
\stopxmlsetups

\startxmlsetups xml:name:p
    \parfillskip0pt\xmlflush{#1}\par
\stopxmlsetups

\startTEXpage[offset=5mm,width=60mm]
    \xmlprocessbuffer{xml:name}{doc}{}
\stopTEXpage

Massi

On 21/04/21 20:17, Jano Kula wrote:
On 4/21/2021 8:17 PM, Jano Kula wrote:
Does unicode nbsp have fixed width in ctx?

sometimes ... but you just uncovered an old bug

if attr >= 1 or attr <= 3 then -- flushright

someplace should be

if attr >= 1 and attr <= 3 then -- flushright

Hans
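To see why the `or` version misbehaves: every number is either at least 1 or at most 3, so that condition is true for any value, while the `and` version is true only inside the range 1 to 3. A minimal sketch in plain Lua follows; the function names are hypothetical and only the two conditions are taken from the message above.

```lua
-- Hypothetical wrappers around the two conditions quoted in the thread.
function buggy_in_range(attr)
    return attr >= 1 or attr <= 3   -- true for every number
end

function fixed_in_range(attr)
    return attr >= 1 and attr <= 3  -- true only for 1 <= attr <= 3
end

-- Compare both conditions for values below, inside and above the range.
for _, attr in ipairs({ 0, 1, 2, 3, 4 }) do
    print(attr, buggy_in_range(attr), fixed_in_range(attr))
end
```

With the `or` form the branch is taken for values such as 0 and 4 as well, which fits the description of behaviour that only shows up "sometimes".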
Hello,
the first episode was more dramatic than expected; it seems to be a good series.

On Thu, Apr 22, 2021 at 11:36 AM Hans Hagen wrote:
On 4/21/2021 8:17 PM, Jano Kula wrote:
Does unicode nbsp have fixed width in ctx?

sometimes ... but you just uncovered an old bug

if attr >= 1 or attr <= 3 then -- flushright

someplace should be

if attr >= 1 and attr <= 3 then -- flushright
After the patch, nbsp is working as expected.
On Thu, Apr 22, 2021 at 8:03 AM Taco Hoekwater wrote:
for fine-tuning XML output I would use a trick like this:

\startluacode
function lxml.preprocessor(data)
    return string.gsub(data, "&nbsp;", "<nbsp/>")
end
\stopluacode

\startxmlsetups xml:name
    \xmlsetsetup{\xmldocument}{document|nbsp}{xml:name:*}
\stopxmlsetups

\startxmlsetups xml:name:nbsp
    \penalty10000\hskip .3em plus 2em % or something, just a wild example.
\stopxmlsetups
Using an xml element would also allow your code to ‘look around’ to make sure all is well with its (typesetting) environment.
It didn't occur to me to use the preprocessor to turn it into new XML elements. You are right, one can have even more control.

Thank you all for your help,
Jano

And thanks for watching!
participants (6)

- denis.maier@ub.unibe.ch
- Hans Hagen
- Hans van der Meer
- Jano Kula
- mf
- Taco Hoekwater