[NTG-context] ignore not closed tags in XML input

Thangalin thangalin at gmail.com
Wed May 18 19:14:27 CEST 2022


Hey Pablo,

> One of the not irrelevant tasks for me is finding examples of XML code.

To clarify, XHTML documents *are* XML documents. XHTML happens to use a
standardized set of XML element and attribute names. All XHTML examples are
also XML examples.
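
As a small illustration of that point, a generic XML parser with no knowledge of XHTML can read an XHTML document directly. The sketch below uses Python's standard `xml.etree.ElementTree`; the sample document and the `is_well_formed` helper are made up for this example and are not from the blog post or from KeenWrite.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# A made-up, minimal XHTML document: every element is closed.
xhtml = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body><p>Hello<br/>world</p></body>
</html>"""

# The generic XML parser accepts it; element names carry the XHTML namespace.
root = ET.fromstring(xhtml)
title = root.find(f"{{{XHTML_NS}}}head/{{{XHTML_NS}}}title").text


def is_well_formed(markup):
    """Hypothetical helper: True if the markup parses as XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False
```

Plain HTML that leaves a tag open, such as `<p>Hello<br>world</p>`, is rejected by the same parser, which is why HTML sources need sanitizing first.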

> But my worries came from having to sanitize HTML sources (which aren’t

The blog post discussed that problem: finding a source of well-formed XHTML
documents. There are a number of tools to sanitize HTML, as mentioned in
the thread. KeenWrite uses the Java-based jsoup library (https://jsoup.org/)
to sanitize HTML and then create an XHTML version.
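
To give a rough idea of what such a sanitizing step involves, here is a stdlib-only Python sketch. It is not jsoup and not what KeenWrite does; the `XhtmlWriter` class and `to_xhtml_fragment` function are hypothetical names for this illustration. It only closes unclosed tags, self-closes void elements, and escapes text. It drops comments and doctypes and ignores HTML's implicit-close rules, so a real tool is still the better choice.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# HTML elements that never have a closing tag.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class XhtmlWriter(HTMLParser):
    """Hypothetical sketch: re-emit tolerant HTML as well-formed markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        # Boolean attributes (value None) become name="name".
        text = "".join(
            f" {name}={quoteattr(value if value is not None else name)}"
            for name, value in attrs
        )
        if tag in VOID:
            self.out.append(f"<{tag}{text}/>")
        else:
            self.out.append(f"<{tag}{text}>")
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray closing tag: ignore it
        # Close everything opened after (and including) this tag.
        while self.open_tags:
            closed = self.open_tags.pop()
            self.out.append(f"</{closed}>")
            if closed == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data))


def to_xhtml_fragment(html):
    writer = XhtmlWriter()
    writer.feed(html)
    writer.close()
    # Close whatever is still open at the end of the input.
    while writer.open_tags:
        writer.out.append(f"</{writer.open_tags.pop()}>")
    return "".join(writer.out)
```

For example, `to_xhtml_fragment("<p>Tom &amp; Jerry<br><b>bold</p>")` closes the `<b>` element and self-closes `<br>`, so the result can be handed to an XML parser.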

All the best!