Context, LaTeX, or an XML for academic writing?
Hi,
I'm returning to graduate study after a few years out in the workplace. I'm a bit rusty on what good stuff there is out there for academic writing, and after a bit of research I've come up with: ConTeXt, LaTeX or an XML DTD (tbook or DocBook?) plus appropriate tools. I'm ruling out Word (having wrestled with it at work), and am reluctant to use anything similar like OpenOffice. I have used LaTeX for some things in the past. There will be a little maths in my writing, but it's not central.
Here are my main criteria for choice, in order of priority:
1) future-proofing. i.e. I want my text to be always available to me forever, or until I die, whichever comes first. I take this to mean that I want the canonical form of my documents to be plain text of some sort. It also means that the system needs to be widely used enough that it will be translatable into essential future formats as they arise.
2) semantic rather than layout-oriented markup as much as possible. I'm impatient with, and only marginally interested in, layout. I'm very interested in what my text means. As much as possible, I want to set up my layouts early in the piece, and never think about them again.
3) relatively easy integration with some form of bibliographic database(ish) system (bibtex would do).
4) ability to produce pdf's, html, and rtf versions (for interoperation with Word-users) at least.
5) no need for me to write any code. I used to be a programmer, and when I left, promised myself, my wife, and my cat that I would never write a line of code again. I don't mind a bit of TeXish fiddling if *absolutely* necessary.
ConTeXt seems to fit the bill for 1, 3 and 5. I'm not sure about 4 (html? rtf?) or 2 (I haven't had a proper look at the nature of the available macros yet). Would anyone with first-hand knowledge of writing in academia care to comment either on the above or on your own reasons for your choice of tools?
I am doing my own research on all this stuff, but I know that until I get into the fray, there will be things I haven't thought of.
Cheers, CB.
or an XML dtd (tbook or DocBook?) plus appropriate tools. I'm ruling out Word (having wrestled with it at work), and am reluctant to use anything similar like OpenOffice. I have used LaTeX for some things in the past.
I was in a similar situation a few years ago (writing my PhD thesis). I think you are absolutely right to avoid Word and everything Wordish. Making a big document with Word requires a lot of knowledge about what you should avoid. And in the end you'll still spend your nights wondering why the **** the cross-references or page numbers go wrong. I ended up using LaTeX. I didn't know much about ConTeXt at that time, and also had a lot of maths in the book. I am not sure which one I'd take if I could choose right now. I think your choice is one of the following: LaTeX, DocBook, ConTeXt, ConTeXt+XML. However, your wishlist looks a bit difficult. A few comments:
1) future-proofing.
LaTeX is more common. On the other hand, you can (and should) take a snapshot of your working environment when you've finished what you're doing. All TeX variants (and XML stuff) are future-proof in the sense that all text and images are easy to recover if needed. Use only PDF, JPG, and PNG for images to be on the safe side. Reproducing the same layout depends on many other issues; even small changes in font metrics may change things. It is also quite possible that 30 years from now nobody remembers ConTeXt (or DocBook or LaTeX or TeX). XML is in a way a safe bet, but even there you're in for some programming if the tools disappear.
2) semantic rather than layout-oriented markup as much as possible.
I think this is something you can do with all alternatives. In a typical ConTeXt (and LaTeX) file there is a lot of layout stuff in the beginning, but in the document itself the tagging is really independent from layout, if you've done the preliminary work right. At least I consider it bad style to use explicit font switches or equivalent in a document. However, even if you think the layout is not that important, you'll need to do a lot of things with it before having a printable book. In this sense ConTeXt seems to give a lot of possibilities, but the documentation is not very complete. LaTeX is a bit more difficult, and in principle you need to do more TeXing, but in practice you usually don't, as someone else has done it before (packages). Fonts are difficult in any case :) I am not a DocBook specialist, but my impression is that it is really not so much geared towards printable layout. This, of course, makes the markup separate from the layout. This is the key to making successful documents with any system: the content and the layout are two different layers. Word processing programs mix them into a sorry mess, but for the smoothest workflow they should be separated. It should even be possible for different people to carry out the two different tasks.
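As a rough illustration of the two layers described above, here is a minimal LaTeX sketch. The macro names \term and \worktitle are made up for this example; they are not part of LaTeX or of any package. The body only says what a piece of text is, and the preamble decides once how it looks.

```latex
% Minimal sketch: semantic markup in the body, layout decided once in the preamble.
\documentclass{article}

% Layout layer: hypothetical semantic macros, defined in one place.
\newcommand{\term}[1]{\emph{#1}}        % a technical term being introduced
\newcommand{\worktitle}[1]{\textit{#1}} % the title of a cited work

\begin{document}
% Content layer: no font switches, only meaning.
A \term{float} is an element, such as a figure, that the typesetter may move.
The idea is discussed in \worktitle{Some Placeholder Book Title}.
\end{document}
```

If the look of every introduced term should later change (say, to bold), only the one \newcommand line in the preamble needs editing; the text itself stays untouched.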
3) relatively easy integration with some form of bibliographic database(ish) system (bibtex would do).
(.*)TeX will do.
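For the LaTeX case, a minimal sketch of what the BibTeX integration looks like. The file name refs.bib, the key example2005 and the whole bibliography entry are placeholders invented for this example, not a real reference.

```latex
% Minimal sketch: one placeholder BibTeX entry and one citation.
% The filecontents environment writes refs.bib next to this file on the first run.
\begin{filecontents}{refs.bib}
@book{example2005,
  author    = {Author, Ann},
  title     = {A Placeholder Title},
  publisher = {Placeholder Press},
  year      = {2005}
}
\end{filecontents}

\documentclass{article}

\begin{document}
As argued elsewhere~\cite{example2005}, the database stays in plain text.

\bibliographystyle{plain} % a standard BibTeX style; others can be swapped in
\bibliography{refs}       % reads refs.bib
\end{document}
```

The usual cycle is latex, then bibtex, then latex twice more, so that the citation and the reference list are resolved.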
4) ability to produce pdf's, html, and rtf versions (for interoperation with Word-users) at least.
PDF is a must. HTML can be produced from (.*)TeX, but DocBook is the only one designed with HTML in mind. On the other hand, this may show in the print quality; TeX is a real typesetting system. There are ways to make TeX out of DocBook (e.g. PassiveTeX), but the quality is not always as good as with other alternatives. HTML is more a matter of taste. A nicely working PDF is, IMHO, much easier to use. It is easy to search the complete document, and links from the index and ToC make the use straightforward. Modern displays are sufficiently high-res for PDF to be read on-screen. Also, printing a complete PDF document is easy. The situation becomes much more complicated if you need RTF. It is a completely different story, a word processor editable format. I guess you don't really want to distribute your work in editable format, and PDF can be read with virtually any computer. So, I'd concentrate on making a visually pleasing high-quality PDF with working links in it. That will make most readers happy.
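A minimal sketch of a PDF with working links, assuming LaTeX with the hyperref package (the package is not mentioned in this thread; it is one common way to get a clickable table of contents and cross-references). The label name sec:tools is a placeholder.

```latex
% Minimal sketch: clickable ToC entries and cross-references in the PDF.
\documentclass{article}
\usepackage[colorlinks=true, linkcolor=blue]{hyperref} % usually loaded last

\begin{document}
\tableofcontents % each entry becomes a link to its section

\section{Choosing the tools}
\label{sec:tools}
Some running text.

\section{Conclusions}
See Section~\ref{sec:tools} for the comparison. % the number is a link
\end{document}
```

Run pdflatex twice so that the table of contents and the reference are filled in.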
5) no need for me to write any code. I used to be a programmer, and when I left, promised myself, my wife, and my cat that I would never write a line of code again. I don't mind a bit of TeXish fiddling if *absolutely* necessary.
All alternatives are equivalent in this sense. Of course, if you plan on doing something with ConTeXt/XML, that requires some work, but not really programming. And all layout stuff with (.*)TeX requires some serious head scratching in the beginning, anyway.
ConTeXt seems to fit the bill for 1,3 and 5. I'm not sure about 4 (html? rtf?) or 2 (I haven't had a proper look at the nature of the available macros yet) .
I'd say it'll fill number 2, as well. But RTF, no. There may be kludges to make it kind of, you know, a bit like, errr, RTFish, but nothing really good. The reason is simple: the two things are far apart from each other. - Ville
Hi Ville, Thanks for your reply. I don't have much more to say on this yet, but have added a few comments below.
I was in a similar situation a few years ago (writing my PhD thesis). I think you are absolutely right when you avoid Word and everything Wordish. Making a big document with Word requires a lot of knowledge about what you should avoid. And in the end you'll still spend your nights wondering why the **** the crossreferences or page numbers go wrong.
Absolutely. Word seems easy at first, but I've watched people go gray trying to get large texts to do what they want, close to deadline.
However, your wishlist looks a bit difficult.
Actually your comment here might suggest how far we have to go then, as I'd consider my wishlist a very roughly stated but really quite minimal set of requirements for academic writing.
The situation becomes much more complicated if you need RTF. It is a completely different story, a word processor editable format. I guess you don't really want to distribute your work in editable format, and PDF can be read with virtually any computer.
I'd say it'll fill number 2, as well. But RTF, no. There may be kludges to make it kind of, you know, a bit like, errr, RTFish, but nothing really good. The reason is simple: the two things are far apart from each other.
Since posting I've thought a bit more about why I wanted RTF, and realised it wouldn't do what I wanted anyway. The 'inter-operation with Word users' I was referring to is primarily this: it's common amongst academics I know here in Australia to use some of the collaboration features of Word (marginal comments and revision control, particularly). RTF wouldn't actually help with those anyway. So there's really no way around this without using Word, which I will only do at gunpoint.
On 11.05.2005 at 01:52, CB wrote:
Since posting I've thought a bit more about why I wanted RTF, and realised it wouldn't do what I wanted anyway. The 'inter-operation with Word users' I was referring to is primarily this: it's common amongst academics I know here in Australia to use some of the collaboration features of Word (marginal comments and revision control, particularly). RTF wouldn't actually help with those anyway. So there's really no way around this without using Word, which I will only do at gunpoint.
If you and your collaborators have Acrobat (full) or Jaws PDF Editor you could at least use the comment features of PDF and perhaps the workflow possibilities of Acrobat 6+.
Greetings from Hraban!
---
http://www.fiee.net/texnique/
http://contextgarden.net
Actually your comment here might suggest how far we have to go then, as I'd consider my wishlist a very roughly stated but really quite minimal set of requirements for academic writing.
Well, if you drop the RTF part, then your wishlist is not that difficult. However, there are some requirements which look trivial at first but are rather difficult to do well. The most important of these is the difference between HTML and a printed book.
As long as you use only running text (no illustrations, graphs, images, formulae, tables), there is no problem. By making suitable templates the text may be typeset well and it works as a web page (or a collection of web pages). In HTML you have less control over the layout, but as the user has the control, everything is well.
Some problems arise when you add any special elements to the text. Formulae are a good example. Even though you might in principle use MathML or equivalent, the browser support is not built-in, so most users cannot read the formulae. You'll need to use images, but then the best resolution is hard to find. The same goes for images: SVG is not ready yet, so resolution problems are really difficult. Illustrations which print well at high resolution do not necessarily look good at screen resolution.
But the real problems start with floats. Where do you put a picture with its caption on a web page? Or a footnote? One common solution is to put them behind a link. However, some people (yours truly included) find that following the links back and forth is clumsy. Another solution would be to place the figures within the text, but then we have all sorts of typesetting problems without having a typesetting engine.
Of course, you can work miracles with XHTML/CSS. You can make something that looks like pages from a book, for example. But then, why not really use PDF instead? Because then you can be sure of the layout.
The hyperlink navigation paradigm of HTML is a good one for many purposes. It is not a good one for a book. If I have a book (or a PDF), I can easily verify I've read it to the last comma. With a more complicated HTML document (even a simple tree without loops), trying the same reminds me of the "Maze all different" in the old "Adventure" game (Colossal Cave Adventure by Will Crowther).
I am not saying HTML is bad and PDF good. HTML is extremely good for many purposes. Wiki is a good example of this, and so are many web pages. But as HTML is not necessarily a good form for a book, concentrating on PDF is probably a better idea.
---
Since posting I've thought a bit more about why I wanted RTF, and realised it wouldn't do what I wanted anyway. The 'inter-operation with Word users' I was referring to is primarily this: it's common amongst academics I know here in Australia to use some of the collaboration features of Word (marginal comments and revision control, particularly). RTF wouldn't actually help with those anyway. So there's really no way around this without using Word, which I will only do at gunpoint.
Well, if everyone around you is using Word and requires you to collaborate by using Word, you are up to your lower back in alligators. On the other hand, there are ways around this. When commenting on other people's texts, I want to have the texts as PDF. Then I simply write a mail with my comments:
"p. 123, paragraph 2: Not so. Dr. Frankenstein proved this to be wrong in 1974, see Journal of Unlikely Science, 1865, pp. 1456-1505"
"p. 127, figure 2.13: I don't get it."
Exactly the same thing as scribbling things into the margin. This method is independent of the programs used and does not really take any more time. I have found only two shortcomings with this method: 1. it is difficult to combine comments from several reviewers, 2. you cannot edit the text yourself even if you wanted to. The first one is a problem with Word documents as well, and the second one is not always so desirable anyway.
Really, I hate it when people send me their Word files. I am quite convinced I am not the only one. The annotation mechanism in Word is similar to almost everything else in the program: looks easy, feels easy at first, makes you run circles on the walls in the end.
- Ville
Ville Voipio wrote:
I am not saying HTML is bad and PDF good. HTML is extremely good for many purposes. Wiki is a good example of this, and so are many web pages. But as HTML is not necessarily a good form for a book, concentrating on PDF is probably a better idea.
I hadn't thought of half the stuff you mention, which comes of the fact that my requirements come from anticipation rather than recent use (I'm returning to academia after 10 years in jobs where the only writing I've had to do is reports in Word for semi-literate business people). I thought it might be good to pick and learn a system now rather than start with one format only to find deficiencies and have to switch later. I can see a place for books and articles in HTML, but as a supplement to PDF for fast online browsing (and in that context I don't see a problem with just reducing layout standards). But I agree PDF is the thing to concentrate on for fully-formatted output.
Well, if everyone around you is using Word and requires you to collaborate by using Word, you are up to your lower back in alligators. On the other hand, there are ways around this. What I use when commenting on other people's texts, I want to have the texts as PDF. Then I just simply write a mail with my comments:
"p. 123, paragraph 2: Not so. Dr. Frankenstein proved this to be wrong in 1974, see Journal of Unlikely Science, 1865, pp. 1456-1505"
"p. 127, figure 2.13: I don't get it."
That seems fine to me, but many people are often so wowed by GUI stuff that they wouldn't consider using this rather than the pretty marginal notes that Word produces. I have a friend in academia here who does successfully resist the (sometimes quite heavy) insistence on Word. She just says that she's not willing to be forced to use the products of a foreign monopolist which has been found guilty of large-scale corporate malfeasance in multiple jurisdictions worldwide. Being a humanities-based academic, she can get away with this ;) Her colleagues yawn and tell her to use what she wants.
Really, I hate it when people send me their Word files. I am quite convinced I am not the only one. The annotation mechanism in Word is similar to almost everything else in the program: looks easy, feels easy at first, makes you run circles on the walls in the end.
- Ville
That's also my experience. I've worked in a company which hired very expensive Microsoft consultants to come in and set up some Sharepoint+Word-based workflow for documentation. The system was so complex and fragile that it got dumped within weeks, and everyone went back to hacking up ad hoc Word docs again, copying and pasting like fury.
participants (3): CB, Henning Hraban Ramm, Ville Voipio