Re: [NTG-context] DOC/RTF to ConTeXt via XML
Slightly OT, sorry:
OpenOffice.org does allow you to attach an XSLT stylesheet to an export process, which lets you do a (limited) transformation from the visual markup of its native format to something more structured.
Why "limited"?
Well, XSLT seems to have been designed, and certainly tends to be implemented, as a tool for simple transformations of small XML chunks. Obviously complex transformations can be constructed from a bunch of simple transformations, but there comes a point when you should really just use a better tool - though these tend to cost serious money (e.g. OmniMark). Also, most XSLT implementations use the DOM model, which is fine for a 50 KB file but will be incredibly resource-hungry if you're processing files of 5 MB. At that point you want a streaming model, and for a streaming model you want a better-suited language than XSLT. As I say, horses for courses. For article-length pieces and simple transforms, XSLT might suffice.
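To make the "simple transformation" case concrete, here is a minimal XSLT 1.0 sketch: an identity transform plus one override that maps a visually styled paragraph onto a structural element. The names (p, style, title, "Heading 1") are hypothetical stand-ins; OOo's actual format uses namespaced elements such as text:p with text:style-name.

  <?xml version="1.0"?>
  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Copy everything through unchanged by default. -->
    <xsl:template match="@*|node()">
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
    </xsl:template>

    <!-- Map one visual style to a structural element. -->
    <xsl:template match="p[@style='Heading 1']">
      <title><xsl:apply-templates/></title>
    </xsl:template>

  </xsl:stylesheet>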
Also, don't limit your authors to Word. Offering Word is obviously a requirement, but if you go the OOo route, there would be no point in not offering an OOo template file as well. If you are using a standard XML format, such as (a subset of) DocBook or TEI, you should probably accept articles in that format, too. And, of course, ConTeXt.
Absolutely; particularly if you can offer authors an incentive or direct benefit from adopting OOo, such as faster turnaround of proofs, etc.
Duncan Hothersall wrote:
> Well, XSLT seems to have been designed, and certainly tends to be implemented, as a tool for simple transformations of small XML chunks.
No, XSLT is a tool for arbitrary XML-to-XML conversions (and a little more than that). With a good implementation (say, Saxon), working with moderately large trees is pretty fast. The stylesheet is actually compiled before it is run.
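For illustration, running a transformation with Saxon 6 from the command line looks something like this (file names are hypothetical):

  java -jar saxon.jar input.xml stylesheet.xsl > output.xml

The stylesheet is parsed and compiled to an internal form once, then applied to the source tree.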
> Obviously complex transformations can be constructed from a bunch of simple transformations, but there comes a point when you should really
Just about any programming language gives you simple operations from which to build whatever you want.
> just use a better tool - though these tend to cost serious money (e.g.
"Better" depends on the task at hand.
> OmniMark). Also, most XSLT implementations use the DOM model, which is
XSLT uses a tree model of its own (the XPath data model), which is different from the W3C DOM.
> fine for a 50 KB file but will be incredibly resource-hungry if you're processing files of 5 MB. At that point you want a streaming model, and
That depends on what you want to do with your data. For many of my needs, a streaming model simply wouldn't work without keeping lots of information (to be processed later) in memory, which defeats the model.

I have found splitting my data into files that form conceptual units to be a good approach, both for editing the files and for turnaround times. (I am using Makefiles, so the granularity at which unchanged items are detected is, for me, the file; see the sketch below.) We are talking about almost 15 MB here, which I regard as quite a lot, considering it is almost pure text.

Again, I don't mind using something other than XSLT on XML data; I'm doing it myself. It all depends on what you want to do. In the case of transforming XML to ConTeXt, I would go for an XSLT implementation, but YMMV. After all, the choice of tools always depends on many factors, including familiarity. (For that reason I kept using Perl instead of Ruby for ages, until recently.)
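As an illustration of the Makefile-driven setup described above, a minimal sketch; the chapter file names and the Saxon invocation are hypothetical, and recall that make recipe lines begin with a tab. Only chapters whose XML source (or the shared stylesheet) changed get re-transformed:

  # One ConTeXt file per conceptual unit / chapter.
  CHAPTERS = ch01 ch02 ch03

  all: $(CHAPTERS:%=%.tex)

  # Rebuild a chapter only when its XML source or the
  # stylesheet has changed.
  %.tex: %.xml book.xsl
  	java -jar saxon.jar $< book.xsl > $@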
> for a streaming model you want a better-suited language than XSLT. As I say, horses for courses. For article-length pieces and simple transforms, XSLT might suffice.
For number crunching, XSLT is certainly inadequate. Transforming books of average length (say, 300-500 pages) is certainly doable, although I would go for a transformation chapter by chapter, especially considering that we are talking about a process where cross-references etc. are going to be handled later in the chain. But I thought we were talking about article-length pieces anyway?

Christopher
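To give an idea of what such a chapter-level XML-to-ConTeXt transformation might look like, here is a minimal XSLT 1.0 sketch. The input element names (chapter, title, para) are hypothetical, and escaping of TeX special characters is omitted:

  <?xml version="1.0"?>
  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <!-- A chapter title becomes a ConTeXt \chapter head. -->
    <xsl:template match="chapter/title">
      <xsl:text>\chapter{</xsl:text>
      <xsl:apply-templates/>
      <xsl:text>}&#10;</xsl:text>
    </xsl:template>

    <!-- Paragraphs are separated by blank lines. -->
    <xsl:template match="para">
      <xsl:apply-templates/>
      <xsl:text>&#10;&#10;</xsl:text>
    </xsl:template>
  </xsl:stylesheet>

Run per chapter file, this slots directly into the Makefile sketch above.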
participants (2)
- Christopher Creutzig
- Duncan Hothersall