On 01/14/2015 09:29 PM, Idris Samawi Hamid ادريس سماوي حامد wrote:
[...] I was just about to ask about how pandoc handles xhtml. Some questions for you:
Hi Idris,
Context produces three relevant files:
darwin-xml-div.xhtml darwin-xml-tag.xhtml darwin-xml-raw.xml
1. Which one of these three files is the one we want to convert to docx?
Only the *-div.xhtml files are (X)HTML files.
2. I modified Alan's test file [same preamble]:
====
\starttext
\startquotation
\input darwin

\bf \input darwin
\stopquotation
\stoptext
====
darwin-xml-div.xhtml and darwin-xml-tag.xhtml show up in the browser, but the bold does not.
Well, I cannot see any bold in "<div class="break"><!--empty--></div>". It seems to be how ConTeXt handles the blank line and the \bf command. BTW, the quotation environment is not translated as blockquote and paragraphs lack their <p> tags.
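A quick way to check this without reading the whole file is to count which tags the export really contains. What follows is only a rough sketch: SAMPLE is a made-up fragment built around the one <div> quoted above, not the real ConTeXt output, and count_tags is a hypothetical helper name. The idea is simply that pandoc's HTML reader can only pick up bold, block quotes and paragraphs if elements such as <b>/<strong>, <blockquote> and <p> are present.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts every start tag seen in the markup."""

    def __init__(self):
        super().__init__()
        self.tags = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases the tag name.
        self.tags[tag] += 1


def count_tags(markup):
    """Return a Counter mapping tag name -> number of occurrences."""
    parser = TagCounter()
    parser.feed(markup)
    parser.close()
    return parser.tags


# Made-up fragment (assumption): only the inner <div class="break"> comes
# from the exported file discussed in this thread.
SAMPLE = '<div class="document"><div class="break"><!--empty--></div>Some text</div>'

tags = count_tags(SAMPLE)
for name in ("p", "blockquote", "b", "strong"):
    # A count of 0 means the export carries no such element.
    print(name, tags[name])
```

To try it on the real export, read darwin-xml-div.xhtml into a string and pass that to count_tags instead of SAMPLE.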
In Opera 12.17, darwin-xml-raw.xml gives a syntax error
==== XML parsing failed: syntax error (Line: 17, Character: 0) ====
It seems weird to me that a <break /> element is placed outside the <document> element. That should be the error Opera reports.
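To illustrate why an element after the root would be rejected: an XML document may have only one top-level element, so anything following the closing root tag is a well-formedness error. A minimal sketch with Python's standard XML parser follows. The two strings are made up for the example (I have not copied them from darwin-xml-raw.xml), and is_well_formed is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as a single well-formed XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        # The parser reports the position of the offending markup.
        print("not well-formed:", err)
        return False
    return True


# Made-up examples (assumption): same content, but in BAD the <break/>
# element sits after the closing </document> tag.
GOOD = "<document><break/><p>text</p></document>"
BAD = "<document><p>text</p></document>\n<break/>"

print(is_well_formed(GOOD))  # True
print(is_well_formed(BAD))   # False
```

An HTML parser is far more forgiving, which would explain why reparsing the same file as HTML works.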
But the "Reparse document as HTML" does work.
I guess this format won’t be understood by pandoc (unless you write a specific reader for it). The -raw.xml file is not what you need.
In each of the three cases, there is no bold effect at all.
What is needed to get the typography info transmitted?
I have no experience with xhtml export in ConTeXt. This is beyond my knowledge. Sorry.
3. My assumption is this: If I can get the xml/xhtml file looking right in the browser, I should be able to build a working docx file via pandoc.
If it doesn’t look good in the browser, you won’t get it in pandoc. But it might be that you get it right in the browser and not in pandoc. It depends on how ConTeXt outputs the XHTML.

I hope it helps,

Pablo
--
http://www.ousia.tk