LuaMetaTEX as LaTeX to XHTML/ePub transpiler?
I have documents in LaTeX, and would like to generate XHTML (ePub) output without going through an intermediate DVI or PDF step. Markup to markup, translating or transpiling rather than typesetting.
My use case is that I have two tabletop gaming books, 60 - 80 pages of text and diagrams, written for pdfLaTeX and now with XeLaTeX. I'm very happy with LaTeX and the wonderful PDF output for print.
But now I also want to create ePub/XHTML as well as print versions. So far I've tried tex4ebook and tex4ht and neither works for me. Firstly, some of the LaTeX commands are not recognised or cause errors.
And secondly, when I managed to get a small test section to work, the generated XHTML/HTML is very large, full of tiny <span>s. The problem seems to be that tex4ht runs TeX, which typesets everything into DVI with every element carefully placed on a page, and then tex4ht tries to reverse that back into HTML. All this extra HTML will slow down / interfere with the ebook reader, which is doing the final page layout at runtime on a particular device.
How I would like it to work is directly from LaTeX to HTML without any low level typesetting. If I have a LaTeX source paragraph
    This is some text with \textbf{some parts} in bold.
The <whatever>TEX will copy the source text to the destination. If there's a TeX command, here \textbf, it looks for a Lua function with that name and invokes it with whatever argument text is present. The Lua function emits <b>, then recursively processes the argument text, then emits </b>. Similarly there would be an implied lookup of \beginParagraph and \endParagraph which would emit <p> and </p>. Plain text just gets copied through unchanged.
So (finally) my question: is LuaMetaTEX what I'm looking for?
Yes is the answer I'm hoping for. And any guidance would be much appreciated.
No, but best starting point? I've never tried modifying TeX code itself, but I am an experienced and sometimes competent programmer who has written a compiler parser and a high level code generator.
No and not a good idea to try?
Any other responses?
--
cheers, Hugh Fisher
On Fri, 10 Sept 2021 at 21:26, Henning Hraban Ramm wrote:
> No.
> LuaMetaTeX is ConTeXt-only. You would need a LaTeX -> ConTeXt conversion, and there is none.
Well, I am thinking about switching to ConTeXt/LuaMetaTEX anyway, because at the moment I draw vector art in the last non-subscription version of Adobe Illustrator, now approaching ten years old. I'll be trying out MetaPost as a replacement. My markup isn't that complicated, so at worst I could translate by hand.
But it occurs to me that if I get this markup to markup text translation going, I'd be able to write a LaTeX -> ConTeXt converter as a set of Lua functions named after LaTeX commands.
--
cheers, Hugh Fisher
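A rough sketch of what such a LaTeX -> ConTeXt converter could look like for a very small command subset. It is written in Python rather than Lua purely for illustration; the rule table and the function name latex_to_context are hypothetical, and only arguments without nested braces are handled.

```python
import re

# Hypothetical rewrite table for a small LaTeX subset; each entry is
# (compiled pattern, function building the ConTeXt replacement).
RULES = [
    (re.compile(r"\\begin\{document\}"), lambda m: r"\starttext"),
    (re.compile(r"\\end\{document\}"), lambda m: r"\stoptext"),
    (re.compile(r"\\begin\{itemize\}"), lambda m: r"\startitemize"),
    (re.compile(r"\\end\{itemize\}"), lambda m: r"\stopitemize"),
    # Assumption: the argument contains no nested braces.
    (re.compile(r"\\textbf\{([^{}]*)\}"), lambda m: "{\\bf " + m.group(1) + "}"),
    (re.compile(r"\\emph\{([^{}]*)\}"), lambda m: "{\\em " + m.group(1) + "}"),
]


def latex_to_context(source):
    """Apply every rule once; anything not listed passes through unchanged."""
    for pattern, build in RULES:
        source = pattern.sub(build, source)
    return source


sample = r"\begin{itemize}\item \textbf{Armour} is \emph{heavy}\end{itemize}"
print(latex_to_context(sample))
```

Commands such as \item and \section{...} are spelled the same way in both systems, so they need no rule here. Nested arguments like \textbf{a \emph{b}} would need a real parser instead of regular expressions.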
On 9/10/2021 1:13 PM, Hugh Fisher via ntg-context wrote:
> I have documents in LaTeX, and would like to generate XHTML (ePub) output without going through an intermediate DVI or PDF step. Markup to markup, translating or transpiling rather than typesetting.
> My use case is that I have two tabletop gaming books, 60 - 80 pages of text and diagrams, written for pdfLaTeX and now with XeLaTeX. I'm very happy with LaTeX and the wonderful PDF output for print.
indeed, stay with what you're happy working with
> But now I also want to create ePub/XHTML as well as print versions. So far I've tried tex4ebook and tex4ht and neither works for me. Firstly, some of the LaTeX commands are not recognised or causing errors.
i suppose that you can define commands that somehow make your own commands export something; i have no experience with latex or tex4ht
> And secondly, when I managed to get a small test section to work, the generated XHTML/HTML is very large, full of tiny <span>s. The problem seems to be that tex4ht runs TeX which typesets everything into DVI with every element carefully placed on a page, and then tex4ht tries to reverse that back into HTML. All this extra HTML will slow down / interfere with the ebook reader which is doing the final page layout at runtime on a particular device.
that's probably because there is not enough info in the dvi file ... maybe you can use xslt to sanitize the spans?
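One way such a clean-up pass might look, sketched with Python's xml.etree.ElementTree instead of XSLT. The function name merge_adjacent_spans and the sample fragment are made up for illustration (this is not real tex4ht output), and the sketch assumes un-namespaced tags and only merges leaf spans with identical attributes and no text between them.

```python
import xml.etree.ElementTree as ET


def merge_adjacent_spans(parent):
    """Merge neighbouring leaf <span> siblings that carry identical attributes."""
    for child in list(parent):
        merge_adjacent_spans(child)
    previous = None
    for child in list(parent):
        mergeable = (
            previous is not None
            and previous.tag == child.tag == "span"
            and previous.attrib == child.attrib
            and not previous.tail  # no text between the two spans
            and len(previous) == 0
            and len(child) == 0
        )
        if mergeable:
            previous.text = (previous.text or "") + (child.text or "")
            previous.tail = child.tail
            parent.remove(child)
        else:
            previous = child


# Made-up fragment in the style described above, not real tex4ht output.
fragment = (
    '<p><span class="a">Hel</span><span class="a">lo</span> '
    '<span class="b">world</span></p>'
)
root = ET.fromstring(fragment)
merge_adjacent_spans(root)
print(ET.tostring(root, encoding="unicode"))
```

Real XHTML carries a namespace, so the tag comparison would have to use the qualified name in that case.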
> How I would like it to work is directly from LaTeX to HTML without any low level typesetting. If I have a LaTeX source paragraph
> This is some text with \textbf{some parts} in bold.
so kind of interpreting
> The <whatever>TEX will copy the source text to the destination. If there's a TeX command, here \textbf, it looks for a Lua function with that name and invokes it with whatever argument text is present. The Lua function emits <b>, then recursively processes the argument text, then emits </b>. Similarly there would be an implied lookup of \beginParagraph and \endParagraph which would emit <p> and </p>. Plain text just gets copied through unchanged.
i once played with this (context speak):
    \def\textbf#1{\type{<bf>}#1\type{</bf>}}
so, you define all the commands that you use (normally a subset of what a macro package provides, you just ignore what doesn't make sense)
then you define a very large page (say A1) that you use completely
then you typeset the document in verbatim (nil headers and footers)
the resulting pdf can then be converted to html with pdftotext or something like that
so, basically, you just typeset the html
> So (finally) my question: is LuaMetaTEX what I'm looking for?
in this area there is nothing in luametatex that luatex can't do
> Yes is the answer I'm hoping for. And any guidance would be much appreciated.
as said, i don't know latex but context has an xml export option
> No, but best starting point? I've never tried modifying TeX code itself, but I am an experienced and sometimes competent programmer who has written a compiler parser and a high level code generator.
so, if your source uses a limited set of commands you can write a parser (in any language)
> No and not a good idea to try?
> Any other responses?
you can consider coding your documents in xml and convert them to latex and html .. neutral input so to say
Hans
-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------
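The "neutral input" suggestion at the end of the message above can be illustrated with a small sketch: one XML source rendered to both LaTeX and HTML. Everything here is an assumption made for the example, including the element vocabulary (doc, p, b, em), the RULES table and the render function; it is written in Python and is not how ConTeXt's own XML handling works.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: tag -> (LaTeX open, LaTeX close, HTML open, HTML close)
RULES = {
    "doc": ("", "", "", ""),
    "p": ("", "\n\n", "<p>", "</p>\n"),
    "b": ("\\textbf{", "}", "<b>", "</b>"),
    "em": ("\\emph{", "}", "<i>", "</i>"),
}

LATEX_SPECIALS = {
    "&": "\\&", "%": "\\%", "$": "\\$", "#": "\\#", "_": "\\_",
    "{": "\\{", "}": "\\}", "~": "\\textasciitilde{}",
    "^": "\\textasciicircum{}", "\\": "\\textbackslash{}",
}


def tex_escape(text):
    """Escape characters that are special in LaTeX text mode."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


def html_escape(text):
    """Escape &, < and > for HTML element content."""
    return html.escape(text, quote=False)


def render(elem, target):
    """Render an element tree to 'latex' or 'html' using the RULES table."""
    tex_open, tex_close, html_open, html_close = RULES[elem.tag]
    if target == "latex":
        opening, closing, escape = tex_open, tex_close, tex_escape
    else:
        opening, closing, escape = html_open, html_close, html_escape
    parts = [opening, escape(elem.text or "")]
    for child in elem:
        parts.append(render(child, target))
        parts.append(escape(child.tail or ""))  # text following the child
    parts.append(closing)
    return "".join(parts)


source = "<doc><p>Roll 2d6 &amp; add <b>bonus</b>.</p></doc>"
tree = ET.fromstring(source)
print(render(tree, "latex"))
print(render(tree, "html"))
```

The same tree is walked twice, so any structure that exists in the XML is available to both outputs; an unknown tag raises a KeyError instead of being silently dropped.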
On Fri, 10 Sept 2021 at 21:47, Hans Hagen wrote:
[ munch ]
> in this area there is nothing in luametatex that luatex can't do
As in my earlier reply, I'm thinking about replacing Adobe Illustrator with MetaPost, and LuaMetaTEX seems to have better integration?
> so, if your source uses a limited set of commands you can write a parser (in any language)
This parser would need to understand TeX source files and conventions such as % for comments, recognise commands starting with \ and with arguments/parameters bracketed by [] and {}, and look up command names that might be written in Lua, then call them.
Isn't that what LuaMetaTEX does? No, I haven't looked at the actual source code yet, but starting with something that already does most of what you want is always quicker than writing from scratch.
--
cheers, Hugh Fisher
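For comparison, the stand-alone parser described in the message above is small when the command subset is small. The following is only a sketch under several assumptions: it is in Python rather than Lua, the HANDLERS table and the names translate and to_html are hypothetical, blank lines are taken as paragraph breaks, and TeX details such as macro definitions, catcode changes and space gobbling after control words are ignored.

```python
import html
import re

# Hypothetical handler table: command name -> (number of {} arguments, function).
# Each function receives the optional [] argument (or None) and the list of
# already translated {} arguments, and returns an HTML string.
HANDLERS = {
    "textbf": (1, lambda opt, args: "<b>" + args[0] + "</b>"),
    "emph": (1, lambda opt, args: "<i>" + args[0] + "</i>"),
}

NAME = re.compile(r"[A-Za-z]+")


def translate(src, pos=0, stop=None):
    """Translate src from pos until the stop character or the end of input.

    Returns (html_text, position after the consumed input).
    """
    out = []
    while pos < len(src):
        ch = src[pos]
        if ch == stop:
            return "".join(out), pos + 1
        if ch == "%":  # comment: skip to the end of the line
            newline = src.find("\n", pos)
            pos = len(src) if newline < 0 else newline + 1
        elif ch == "\\":
            match = NAME.match(src, pos + 1)
            if match is None:  # escaped character such as \% or \{
                out.append(html.escape(src[pos + 1:pos + 2], quote=False))
                pos += 2
                continue
            name, pos = match.group(), match.end()
            if name not in HANDLERS:
                raise KeyError("no handler for \\" + name)
            arity, handler = HANDLERS[name]
            opt, args = None, []
            if src.startswith("[", pos):
                opt, pos = translate(src, pos + 1, "]")
            for _ in range(arity):
                if not src.startswith("{", pos):
                    raise ValueError("missing argument for \\" + name)
                arg, pos = translate(src, pos + 1, "}")  # recursive descent
                args.append(arg)
            out.append(handler(opt, args))
        elif ch == "{":  # bare group: translate the contents
            inner, pos = translate(src, pos + 1, "}")
            out.append(inner)
        else:  # plain text is copied through, escaped for HTML
            out.append(html.escape(ch, quote=False))
            pos += 1
    return "".join(out), pos


def to_html(src):
    """Treat blank lines as paragraph breaks and wrap each paragraph in <p>."""
    paragraphs = []
    for chunk in re.split(r"\n[ \t]*\n", src.strip()):
        body, _ = translate(chunk)
        body = " ".join(body.split())  # collapse runs of whitespace
        if body:
            paragraphs.append("<p>" + body + "</p>")
    return "\n".join(paragraphs)


print(to_html("This is some text with \\textbf{some parts} in bold. % note\n\nNext \\emph{one}."))
```

An unknown command raises an error here instead of being guessed at, so the handler table documents exactly which subset of LaTeX the books are allowed to use.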
On 9/11/2021 1:19 PM, Hugh Fisher wrote:
> On Fri, 10 Sept 2021 at 21:47, Hans Hagen wrote:
> [ munch ]
>> in this area there is nothing in luametatex that luatex can't do
> As in my earlier reply, I'm thinking about replacing Adobe Illustrator with Metapost, and LuaMetaTEX seems to have better integration?
indeed the interfaces in lmtx/luametatex are better than in mkiv/luatex, and new things will only be done in lmtx anyway; context users most likely will move to lmtx (mkiv is not really frozen as it is also the test case for luatex, but there will be no fundamental new things added)
>> so, if your source uses a limited set of commands you can write a parser (in any language)
> This parser would need to understand TeX source files and conventions such as % for comments, recognise commands starting with \ and with arguments/parameters bracketed by [] and {}, and look up command names that might be written in Lua, then call them.
> Isn't that what LuaMetaTEX does? No, I haven't looked at the actual source code yet, but starting with something that already does most of what you want is always quicker than writing from scratch.
sure, any tex engine is better at parsing tex input
the main differences between luatex and luametatex (much is discussed in articles and manuals) are that luametatex has no backend built in and has some better interfaces in the front end; there are extensions to the subsystems of the tex engine (fonts, language, math, inserts, marks, alignments, conditionals, macro definition, par handling) that are not in luatex (which is basically frozen in order to permit other macro packages to support it); lua helpers have been cleaned up and there are some more; luametatex has a smaller binary, is more efficient wrt memory and has better performance (if used well) than luatex
Hans
You may want to have a look at the lwarp package as an alternative to tex4ht.
Denis
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!
maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://context.aanhet.net
archive  : https://bitbucket.org/phg/context-mirror/commits/
wiki     : http://contextgarden.net
___________________________________________________________________________________
Oh, as a more general response I can only reiterate what has been said already: depending on your particular needs it might be better to start with some neutral input and generate output formats from there. There are plenty of options, each with particular upsides and downsides, e.g. Markdown via pandoc, AsciiDoc, or XML with XSLT. In the Racket ecosystem there's also Scribble/Pollen, which looks quite interesting. I have never used that though.
Denis
Collating several suggestions into one:
On Fri, 10 Sept 2021 at 21:26, Henning Hraban Ramm wrote:
> Did you try pandoc?
On Fri, 10 Sept 2021 at 21:47, Hans Hagen wrote:
> you can consider coding your documents in xml and convert them to latex and html .. neutral input so to say
On Sat, 11 Sept 2021 at 01:07, T. Kurt Bond wrote:
> You might also consider hevea (a LaTeX to HTML translator) and pandoc (which bills itself as a universal document converter and can convert into and out of LaTeX). I use pandoc a lot, although not for LaTeX to HTML translation. Pandoc can output EPUB, BTW.
On Sat, 11 Sept 2021 at 01:34,
> You may want to have a look at the lwarp package as an alternative to tex4ht.
Thanks T. Kurt Bond and Denis Maier for the suggestions. A better alternative to tex4ht / tex4ebook would certainly be much easier for me, even if I'm still somewhat offended by the intermediate steps.
As for xml or pandoc, I'd rather not because I want to keep print (PDF) as the primary output, and I don't want to lose what TeX/LaTeX can do that most markup languages can't.
From what I know of pandoc, it is like Sphinx in that the way it generates PDF output is by translating pandoc into LaTeX/TeX, then running TeX! So instead of my current toolchain where I write the LaTeX I want directly, I'd be examining the pandoc output and if it isn't what I want, poking at pandoc in the hope of making things better.
It may be unfair, but my impression is that TeX and typesetting / layout systems based on TeX can do more interesting things than say XML or Sphinx. Moving to a more "universal" markup format might broaden my options, but I don't want a lowest common denominator solution.
--
cheers, Hugh Fisher
On 9/11/2021 1:49 PM, Hugh Fisher via ntg-context wrote:
> It may be unfair, but my impression is that TeX and typesetting / layout systems based on TeX can do more interesting things than say XML or Sphinx. Moving to a more "universal" markup format might broaden my options, but I don't want a lowest common denominator solution.
As soon as documents become more complex and one wants control over the layout, all these alternative-to-tex formats in the end are not better than structured tex input. The simpler the input tagging, the more complex the escaping from that. So in the end it all depends on what kind of documents one has to deal with. And it's all about abstraction and structure: the more, the easier.
Hans
participants (4)
- denis.maier@unibe.ch
- Hans Hagen
- Henning Hraban Ramm
- Hugh Fisher