On Sat, 16 Nov 2013, Bill Meahan wrote:
I would /expect/ to get a valid EPUB file, or so I'm led to believe.
At the moment, I'm simply trying it out using Hans' "export-example.tex" file
that comes as part of the standard ConTeXt distribution, either Standalone or
part of one of the other distributions. I haven't even opened the
export-example.tex file in an editor (yet) in this round of trials and I've
even run the script against it right in the ..../base/ directory where it is
found in the distribution so I don't understand why it is not producing a
valid EPUB. Once I've got that sorted out, I can try applying the lessons
learned to my own documents.
ConTeXt provides two types of exports. The first is an XML export.
Consider a sample file:
~~~ {test.tex}
\setupbackend[export=yes]
\starttext
\startsection[title={This is a test}]
\startparagraph
Some random text
\startitemize
\item First
\item Second
\stopitemize
\stopparagraph
\stopsection
\stoptext
~~~
Running `context test.tex` generates a `test.export` file that looks as
follows:
~~~ {test.export}
<?xml version='1.0' encoding='UTF-8' standalone='yes' ?>
<!-- input filename : test -->
<!-- processing date : Sat Nov 16 12:19:59 2013 -->
<!-- context version : 2013.11.01 15:02 -->
<!-- exporter version : 0.30 -->
<document language="en" file="test" date="Sat Nov 16 12:19:59 2013"
context="2013.11.01 15:02" version="0.30"
xmlns:m="http://www.w3.org/1998/Math/MathML">
<section detail="section" location='aut:1'>
<sectionnumber>1</sectionnumber>
<sectiontitle>This is a test</sectiontitle>
<sectioncontent>
<paragraph>Some random text <itemgroup detail="itemize"
symbol="1"><item><itemtag><!-- begin m:mrow
-->•<!-- end m:mrow
--></itemtag><itemcontent>First</itemcontent></item>
<item><itemtag><!-- begin m:mrow
-->•<!-- end m:mrow
--></itemtag><itemcontent>Second</itemcontent></item></itemgroup></paragraph>
</sectioncontent>
</section>
</document>
~~~
which is simply an XML representation of the document.
In principle, if one supplies an appropriate CSS file along with that XML,
any recent browser will be able to display it. So, if you change the first
line of `test.tex` to
~~~
\setupbackend[export=yes, xhtml=yes, css=yes]
~~~
and run `context test.tex`, you will get four additional files:
`test.xhtml`, `test-styles.css`, `test-images.css`, and
`test.specification`.
The `test.xhtml` file looks as follows:
~~~{test.xhtml}
<?xml version='1.0' encoding='UTF-8' standalone='yes' ?>
<!-- input filename : test -->
<!-- processing date : Sat Nov 16 12:22:58 2013 -->
<!-- context version : 2013.11.01 15:02 -->
<!-- exporter version : 0.30 -->
<?xml-stylesheet type="text/css" href="test-styles.css"?>
<?xml-stylesheet type="text/css" href="test-images.css"?>
<?xml-stylesheet type="text/css" href="export-example.css"?>
<document language="en" version="0.30" file="test"
xmlns:xhtml="http://www.w3.org/1999/xhtml"
xmlns:m="http://www.w3.org/1998/Math/MathML" date="Sat Nov 16 12:22:58
2013" context="2013.11.01 15:02">
<section location="aut:1" detail="section">
<sectionnumber>1</sectionnumber>
<sectiontitle>This is a test</sectiontitle>
<sectioncontent>
<paragraph>Some random text <itemgroup symbol="1"
detail="itemize"><item><itemtag><!-- begin m:mrow
-->•<!-- end m:mrow
--></itemtag><itemcontent>First</itemcontent></item>
<item><itemtag><!-- begin m:mrow
-->•<!-- end m:mrow
--></itemtag><itemcontent>Second</itemcontent></item></itemgroup></paragraph>
</sectioncontent>
</section>
</document>
~~~
Notice that apart from the three lines specifying the CSS files (and an
extra `xmlns:xhtml` namespace declaration), the rest of the document is the
same as in the XML export. The two CSS files, `test-styles.css` and
`test-images.css`, contain the relevant code for the style modifications
and images in the document. The CSS file `export-example.css` comes with
the ConTeXt distribution and has the default values for most ConTeXt
elements.
If you open the `test.xhtml` file in any browser, it will work correctly
(because XHTML markup is extensible and can use any XML tags as long as
the behavior of each tag is specified in a CSS file). This is, however, not
an XHTML file that uses the default XHTML markup (`<h1>`, `<p>`, `<ul>`,
etc.).
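
To make that point concrete, here is a minimal Python sketch that lists the
element names used in the export. The `EXPORT` string is a hand-trimmed copy
of the file shown above (comments and most attributes dropped), and
`tag_names` and `XHTML_TAGS` are names made up for this illustration, not
part of ConTeXt.

```python
import xml.etree.ElementTree as ET

# Hand-trimmed copy of the export shown above (an assumption of this sketch:
# comments and most attributes are dropped to keep it short).
EXPORT = """<document language="en" file="test">
  <section detail="section">
    <sectionnumber>1</sectionnumber>
    <sectiontitle>This is a test</sectiontitle>
    <sectioncontent>
      <paragraph>Some random text
        <itemgroup detail="itemize">
          <item><itemtag>•</itemtag><itemcontent>First</itemcontent></item>
          <item><itemtag>•</itemtag><itemcontent>Second</itemcontent></item>
        </itemgroup>
      </paragraph>
    </sectioncontent>
  </section>
</document>"""

# A handful of familiar XHTML element names, for comparison only.
XHTML_TAGS = {"h1", "p", "ul", "ol", "li"}


def tag_names(xml_text):
    """Return the sorted list of distinct element names in an XML string."""
    root = ET.fromstring(xml_text)
    return sorted({element.tag for element in root.iter()})


names = tag_names(EXPORT)
print(names)
# Names that a reader limited to the XHTML tags above would recognise.
print([name for name in names if name in XHTML_TAGS])
```

Every name in the first list is a ConTeXt-specific tag, so a renderer that
only knows the listed XHTML tags has nothing to work with unless it honours
the accompanying CSS.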
Now, let's come back to the last file generated by the export:
`test.specification`. This is a Lua file that contains:
~~~{test.specification}
return {
["files"]={ "test-styles.css", "test-images.css", "export-example.css",
"test.xhtml" },
["identifier"]="e6a91a13-4e08-9494-3817-bfffe872be2c",
["images"]={},
["language"]="en",
["name"]="test",
["root"]="test.xhtml",
}
~~~
When you run `mtxrun --script epub --make test`, it just takes the files
specified in the "files" field and zips them into an EPUB file.
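
The zipping step described here can be pictured with a short Python sketch.
This is not the code of the `mtxrun` epub script: `zip_listed_files` is a
hypothetical helper, the `spec` dictionary is a hand-copied Python mirror of
the Lua table (nothing is parsed), and the input files are empty
placeholders. A valid EPUB container also needs metadata entries such as a
`mimetype` file, which the sketch leaves out.

```python
import os
import tempfile
import zipfile


def zip_listed_files(spec, source_dir, target_path):
    """Zip every file named in spec["files"] into target_path."""
    with zipfile.ZipFile(target_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in spec["files"]:
            # Store each file under its bare name, without the directory.
            archive.write(os.path.join(source_dir, name), arcname=name)
    return target_path


# Python mirror of test.specification (hand-copied, an assumption of the sketch).
spec = {
    "files": ["test-styles.css", "test-images.css", "export-example.css",
              "test.xhtml"],
    "name": "test",
    "root": "test.xhtml",
}

# Create empty placeholder files in a scratch directory for the demo.
workdir = tempfile.mkdtemp()
for name in spec["files"]:
    with open(os.path.join(workdir, name), "w", encoding="utf-8") as handle:
        handle.write("")

epub_path = zip_listed_files(spec, workdir,
                             os.path.join(workdir, spec["name"] + ".epub"))
with zipfile.ZipFile(epub_path) as archive:
    packed = archive.namelist()
print(packed)
```

The point of the sketch is only that the archive contains exactly the files
named in the "files" field; nothing in this step rewrites the ConTeXt tags
into XHTML ones.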
Now, in principle, any EPUB reader should support any XHTML file; in
practice, they only support the default XHTML tags. The XML+CSS files that
ConTeXt generates are not handled correctly by most (all?) EPUB readers.
So there are three options:
1. Wait until the EPUB readers catch up. It took 10-15 years for the
browsers to catch up with the HTML standards, and I don't have much
hope for EPUB readers here. Last I checked, none of them supported
even MathML 2.
2. Write a script (either using xmlproc, or using your favorite XML parser
in your favorite language) that converts the XML generated by ConTeXt into
a "standard" XHTML file. This is the easiest and the least time-consuming
alternative.
3. Modify the way in which ConTeXt generates the XML files. Ideally, I
should be able to write something like
~~~
\setupparagraph[tag=p, class=default]
~~~
to tell ConTeXt that `\startparagraph ... \stopparagraph` should translate
to `<p class="default"> ... </p>`. Last I checked the code that generates
the XML file, there was no easy way to change the tags and classes.
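
As a rough illustration of option 2, here is a minimal Python sketch that
renames ConTeXt export elements to XHTML ones using the standard library's
ElementTree. The mapping in `TAG_MAP`, the `DROP` set, and the `convert`
function are all assumptions made up for this example; ConTeXt does not
ship them, and a real converter would need a much fuller mapping.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from ConTeXt export names to XHTML names.
TAG_MAP = {
    "section": "div",
    "sectioncontent": "div",
    "sectionnumber": "span",
    "sectiontitle": "h1",
    "paragraph": "p",
    "itemgroup": "ul",
    "item": "li",
    "itemcontent": "span",
}
# Elements dropped entirely: <ul> already draws its own bullets.
DROP = {"itemtag"}


def convert(source):
    """Return an XHTML-flavoured copy of a ConTeXt export element."""
    # Unknown tags fall back to <div>; the original name is kept as the class.
    target = ET.Element(TAG_MAP.get(source.tag, "div"), {"class": source.tag})
    target.text = source.text
    last = None
    for child in source:
        if child.tag in DROP:
            # Keep any text that followed the dropped element.
            tail = child.tail or ""
            if last is None:
                target.text = (target.text or "") + tail
            else:
                last.tail = (last.tail or "") + tail
            continue
        last = convert(child)
        last.tail = child.tail
        target.append(last)
    return target


# Simplified fragment: here the list is a sibling of the paragraph.
SAMPLE = (
    '<section detail="section">'
    "<sectiontitle>This is a test</sectiontitle>"
    "<sectioncontent>"
    "<paragraph>Some random text</paragraph>"
    '<itemgroup detail="itemize">'
    "<item><itemtag>•</itemtag><itemcontent>First</itemcontent></item>"
    "<item><itemtag>•</itemtag><itemcontent>Second</itemcontent></item>"
    "</itemgroup>"
    "</sectioncontent>"
    "</section>"
)

xhtml = ET.tostring(convert(ET.fromstring(SAMPLE)), encoding="unicode")
print(xhtml)
```

Note that the sample puts the list next to the paragraph, whereas the
export shown earlier nests `<itemgroup>` inside `<paragraph>`. XHTML does
not allow `<ul>` inside `<p>`, so a real converter would also have to
split or re-nest such paragraphs, and it would still need to wrap the
result in `<html>` and `<body>`.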
I hope that the above description clarifies the situation.
Aditya