Hey Pablo,
> One of the not irrelevant tasks for me is finding examples of XML code.
To clarify, XHTML documents are XML documents. XHTML happens to use a standardized set of XML element and attribute names. All XHTML examples are also XML examples.
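One way to see this in practice is to hand an XHTML string to an ordinary XML parser. The following is a minimal sketch using the JDK's built-in DOM parser; the class name `XhtmlIsXml`, the helper `parseXml`, and the sample markup are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class XhtmlIsXml {
    // Parses a string with the JDK's generic XML parser; throws
    // SAXException when the input is not well-formed XML.
    static Document parseXml(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder()
                      .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // XHTML: every element is closed, so a plain XML parser accepts it.
        String xhtml = "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
                     + "<head><title>Example</title></head>"
                     + "<body><p>Hello<br/>world</p></body></html>";
        Document doc = parseXml(xhtml);
        System.out.println("Root element: " + doc.getDocumentElement().getLocalName());

        // Typical HTML: the unclosed <br> makes it ill-formed as XML.
        try {
            parseXml("<p>Hello<br>world</p>");
            System.out.println("Parsed as XML");
        } catch (SAXException e) {
            System.out.println("Not well-formed XML");
        }
    }
}
```

The parser knows nothing about XHTML specifically. It accepts the first string only because the markup is well-formed XML.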
> But my worries came from having to sanitize HTML sources (which aren’t
The blog post discusses that problem: finding a source of well-formed XHTML documents. There are a number of tools to sanitize HTML, as mentioned in the thread. KeenWrite uses the Java-based jsoup library (https://jsoup.org/) to sanitize HTML and then create an XHTML version.
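As a rough sketch of that kind of conversion (not necessarily how KeenWrite does it), jsoup can parse lenient HTML and write it back out using XML syntax. This assumes jsoup is on the classpath; the class name `HtmlToXhtml`, the helper `toXhtml`, and the sample input are made up for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Entities;

public class HtmlToXhtml {
    // Parses possibly sloppy HTML and serializes it with XML syntax,
    // so that void elements such as <br> are closed.
    static String toXhtml(String html) {
        Document document = Jsoup.parse(html);
        document.outputSettings()
                .syntax(Document.OutputSettings.Syntax.xml)
                .escapeMode(Entities.EscapeMode.xhtml)
                .charset("UTF-8");
        return document.html();
    }

    public static void main(String[] args) {
        // Unclosed <p> and <br>, as commonly found in HTML sources.
        String html = "<p>Hello<br>world &amp; more";
        System.out.println(toXhtml(html));
    }
}
```

jsoup repairs the tag structure while parsing, and the output settings control how the repaired tree is written. The result can then be checked with any XML parser before being used as an XML example.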
All the best!