Hi Christopher, Duncan, Hans, and Adam,

Thank you so much for your detailed comments and suggestions. Again, I'm completely new to xml and feel like a fish out of water. OTOH, I spend so much time just manually extracting text (with innumerable transliteration diacritics) and then copy-pasting it into WinEDT that I am willing to explore the xml approach if it can be made sane enough...
===== Original Message From Christopher Creutzig
===== Duncan Hothersall wrote: Well, XSLT seems to have been designed, and certainly tends to be implemented, as a tool for simple transformations of small XML chunks.
No, xslt is a tool for arbitrary xml -> xml conversions (and a little more than that).
Ok, you guys have lost me now :-) Maybe the best thing to do is try something practical: take an average Word article and see what's involved in converting it to ConTeXt. From what I gather so far, the process goes something like

doc => rtf
rtf => OO.o
OO.o => xml

But here things get dicey because

\startHans
converting open office xml is not always easy; stay away from tabs and use high level constructs as much as possible
\stopHans

Question: Will a proper doc (or OO.o) template solve this problem, or is this a post-OO.o-processing problem no matter what I do beforehand?
From this discussion it seems that I (as an xml ignoramus) would be better off converting to ConTeXt code rather than processing pure xml blocks (but maybe I'm wrong).
Once I get a sane xml file (this seems to be the biggest problem), what is the best tool to convert this to ConTeXt?

We are all extremely busy, of course, but if anyone finds this interesting I can send a sample doc article from my journal. Maybe we can do a MyWay or something to document this process for ourselves and others, as well as find the most practical approach to creating a sane workflow. Besides, this kind of project seems to be exactly the kind of thing to illustrate the full power of ConTeXt. This is a mid-term project, so no urgency (I'll keep copying and pasting for now ;->)

Thanks again to you all for your advice.

Best
Idris
============================
Professor Idris Samawi Hamid
Department of Philosophy
Colorado State University
Fort Collins, CO 80523
Idris Samawi Hamid wrote:
Ok, you guys have lost me now :-) Maybe the best thing to do is try something
Just ignore the detail of what xslt can and can't do for the moment. That just influences the choice of tools for one particular step and we all agree that there are tools for this step.
it to ConTeXt. From what I gather so far the process goes something like
doc => rtf
rtf => OO.o
OO.o => xml
No need for rtf. That would lose lots of information anyway, wouldn't it?
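To make the "OO.o => xml" step concrete: an OpenOffice.org document is a zip archive, and the document body lives in a member named content.xml. Here is a minimal Python sketch of pulling that member out. The function names and the tiny stand-in document are made up for illustration (a real file saved from OO.o contains far more markup), so take it as a rough outline rather than a finished tool.

```python
import io
import zipfile


def read_content_xml(source):
    """Return the raw bytes of content.xml from an OpenOffice.org document.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        return archive.read("content.xml")


def make_sample_document():
    """Build a tiny in-memory stand-in for an OO.o file (hypothetical content)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # A real document has much richer XML; this is only a placeholder.
        archive.writestr("content.xml", "<doc><p>Hello</p></doc>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(read_content_xml(make_sample_document()).decode("utf-8"))
```

With a real article you would pass the path of the saved OO.o file to `read_content_xml` instead of the in-memory sample.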
\startHans converting open office xml is not always easy; stay away from tabs and use high level constructs as much as possible \stopHans
I'm not really sure what Hans meant by this. I assume he does have a valid point, since so far I have only had a short and theoretical look at the format, but I can only guess what it is. Hans, could you give an example or two?
From this discussion it seems that I (as an xml ignoramus) would be better off converting to ConTeXt code rather than processing pure xml blocks (but maybe I'm wrong).
XML is much, much easier to parse than just about anything else. That means that whatever your conversion process uses, you can simply reuse an XML parser in whatever language you want to use. (Interpreting the file may be easy or hard, depending on the xml structure at hand.) The only exception I can see right now would be a rather large and error-prone “Visual” Basic program to create a sort of export filter for Word to write ConTeXt. I certainly don't think that's easier.
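As a rough illustration of reusing an existing parser, here is a small Python sketch using the standard library's xml.etree.ElementTree. The element names (article, title, para, em) are invented for the example and are not what OO.o actually writes; the ConTeXt commands used (\starttext, \title, \em, \stoptext) are ordinary ones. It does not escape characters that are special to TeX, so treat it as a sketch only.

```python
import xml.etree.ElementTree as ET

# Hypothetical, already "sane" input; real OO.o output looks quite different.
SAMPLE = """<article>
<title>On Transliteration</title>
<para>The term <em>wujūd</em> is central.</para>
</article>"""


def inline(element):
    """Flatten mixed content, mapping <em> to ConTeXt's {\\em ...}."""
    parts = [element.text or ""]
    for child in element:
        if child.tag == "em":
            parts.append("{\\em " + inline(child) + "}")
        else:
            # Unknown inline elements are passed through as plain text.
            parts.append(inline(child))
        parts.append(child.tail or "")
    return "".join(parts)


def to_context(xml_text):
    """Convert the toy article format to a ConTeXt source string."""
    root = ET.fromstring(xml_text)
    lines = ["\\starttext"]
    for child in root:
        if child.tag == "title":
            lines.append("\\title{" + inline(child) + "}")
        elif child.tag == "para":
            lines.append("")  # blank line starts a new paragraph
            lines.append(inline(child))
    lines.append("")
    lines.append("\\stoptext")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_context(SAMPLE))
```

The parsing itself is one call (`ET.fromstring`); all the remaining code is the interpretation of the structure, which is the part that gets easy or hard depending on the xml at hand.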
Once I get a sane xml file (this seems to be the biggest problem) what is the best tool to convert this to ConTeXt?
It depends on who is going to write the conversion. Of the languages I've used so far, it's probably easiest to do in xslt, but if you are (or have at hand) a programmer who's good at ruby but would have to learn xslt first, the whole thing may not be big enough to warrant learning another language first. Unless that programmer wants to, which would be a very good sign. Learning a new language per year is not really a bad idea.
We are all extremely busy, of course, but if anyone finds this interesting I can send a sample doc article from my journal. Maybe we can do a MyWay or something to document this process for ourselves and others, as well as find
It might be a pretty specific thing, though. My guess is that you could make more progress by thinking about what sort of structures you would like to have, rather than looking at what you have right now.

Christopher