[NTG-context] Typesetting unicode characters
Thangalin
thangalin at gmail.com
Thu Mar 31 10:06:27 CEST 2022
On the off chance that someone else stumbles across this problem ...
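By default, Java's Xalan transformer for creating XML documents does not
correctly encode emojis. Instead of writing the thumbs up emoji (👍, U+1F44D)
as the single character reference &#128077;, Xalan encodes it as the surrogate
pair &#55357;&#56397;. As Arthur pointed out, this is not a valid entity
encoding: a character reference must name a code point, not a UTF-16
surrogate.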
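To make the difference concrete, here is a minimal sketch in plain Java (no XML library involved) that builds both forms of the reference. The class and method names are made up for illustration, and the second method only mimics the faulty per-code-unit behaviour; it is not Xalan's actual serializer code.

```java
public class EmojiReference {
    /** One decimal reference per Unicode code point (the valid form). */
    static String toCodePointReferences(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp < 0x80) {
                out.append((char) cp);              // ASCII passes through unchanged
            } else {
                out.append("&#").append(cp).append(';');
            }
        });
        return out.toString();
    }

    /** One decimal reference per UTF-16 code unit (mimics the invalid form). */
    static String toCodeUnitReferences(String text) {
        StringBuilder out = new StringBuilder();
        for (char unit : text.toCharArray()) {
            if (unit < 0x80) {
                out.append(unit);
            } else {
                out.append("&#").append((int) unit).append(';');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String thumbsUp = new String(Character.toChars(0x1F44D));
        System.out.println(toCodePointReferences(thumbsUp)); // &#128077;
        System.out.println(toCodeUnitReferences(thumbsUp));  // &#55357;&#56397;
    }
}
```

The sketch does not escape markup characters such as < or &, so it is only meant to show where the two numbers in the broken output come from.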
One solution is to use Saxonica's Saxon 11 transformer, which produces the
expected output:
<html>
<head><meta charset="utf8"/></head>
<body>
<p id="caret">the 👍 emoji</p>
</body>
</html>
In Java, switching to Saxon entails adding the Jar files for Saxon and its
resolvers to the classpath. Then set the following system property before
creating the XML transformer:

    System.setProperty( "javax.xml.transform.TransformerFactory",
      "net.sf.saxon.TransformerFactoryImpl" );
ConTeXt handles the emoji from the transformed XML file without any issues.
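For anyone who wants to try the switch to Saxon, here is a rough sketch using only the standard JAXP classes. The class name, the helper methods, and the identity transform are my own choices for illustration, not part of the original setup. It selects Saxon only when the factory class is found on the classpath, so it still runs (with the JDK's built-in transformer) when Saxon is missing.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class SaxonSwitch {
    static final String SAXON_FACTORY = "net.sf.saxon.TransformerFactoryImpl";

    /** Sets the JAXP system property if Saxon is on the classpath. */
    static boolean selectSaxonIfAvailable() {
        try {
            Class.forName(SAXON_FACTORY);
            System.setProperty(
                "javax.xml.transform.TransformerFactory", SAXON_FACTORY);
            return true;
        } catch (ClassNotFoundException e) {
            return false; // fall back to the JDK's default transformer
        }
    }

    /** Copies the input XML to a string using an identity transform. */
    static String identityTransform(String xml) throws TransformerException {
        // The system property is read here, so it must be set beforehand.
        Transformer transformer =
            TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        StringWriter out = new StringWriter();
        transformer.transform(
            new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println("Saxon selected: " + selectSaxonIfAvailable());
        String thumbsUp = new String(Character.toChars(0x1F44D));
        System.out.println(
            identityTransform("<p id=\"caret\">the " + thumbsUp + " emoji</p>"));
    }
}
```

How the emoji appears in the output depends on which transformer ends up being used, so compare the result with and without the Saxon Jar files on the classpath.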
Thank you, Arthur.