[NTG-context] Microsoft Word -> Context

Henning Hraban Ramm hraban at fiee.net
Tue Apr 3 23:26:08 CEST 2007

On 2007-04-03 at 09:20, Mari Voipio wrote:

> Note! If your files contain graphics, for ConTeXt you have to ask
> people to send them in separately as pdf, png or jpg (instead of
> putting them inline in the Word file). I have found *this* hard to
> achieve once in a while and I still often spend substantial time
> chasing down originals of graphics I get in Word files.

A good way is to save the docs as OpenOffice docs, unzip them (they
are plain zip archives) and collect the images from the picture folder
inside.
But pictures in Word documents are crap anyway, most of the time.
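
The unzip-and-collect step is easy to script. Here is a minimal Python sketch, assuming the document was saved as .odt and that the images sit under "Pictures/" inside the archive, which is where OpenOffice normally puts them. The function name, the suffix list and the output folder are made up for illustration.

```python
import zipfile
from pathlib import Path

# Suffixes treated as images; an assumption, extend as needed.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".svg", ".wmf", ".emf", ".pdf"}


def extract_images(odt_path, out_dir):
    """Copy every image stored in an OpenDocument file into out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    with zipfile.ZipFile(odt_path) as archive:
        for name in archive.namelist():
            # Only look at entries below the Pictures/ folder.
            if not name.startswith("Pictures/"):
                continue
            target_name = Path(name).name
            # Skips the folder entry itself and any non-image files.
            if Path(target_name).suffix.lower() not in IMAGE_SUFFIXES:
                continue
            target = out_dir / target_name
            target.write_bytes(archive.read(name))
            saved.append(target)
    return saved
```

Calling extract_images("article.odt", "images") would then leave the embedded pictures in "images/", with whatever names OpenOffice gave them.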

For my main project at work (a city magazine, typeset with InDesign)
I got everything as Word docs until a few issues ago. After
struggling with useless text formatting (hyperlinks! blech!) we
copy-pasted only the plain text and redid the formatting manually.
Now I've written an editorial system as a web application, where the
authors have to fill in fixed text boxes (title, intro, text, infos,
author etc.). Once everything's ready, I pull the whole lot from the
database and apply formatting (InDesign tagged text, but it could be
anything) to ease the layout work.
Event timetable data works similarly, but via XML. (Why? InDesign can
place images with XML, but not with TaggedText, and we need some
icons in the calendar. We could use XML for everything, but InDesign
is much faster with TaggedText.)
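
To give an idea of the "pull from the database and apply formatting" step, here is a rough Python sketch, not the actual system. It assumes each article arrives as a dict keyed by the text box names above. The style names are invented, and the tag syntax (a "<UNICODE-WIN>" header line and "<ParaStyle:...>" prefixes) is what I believe tagged text looks like, so check it against your InDesign version. Escaping of special characters is left out.

```python
# Field name -> paragraph style name; the style names are made up for this sketch.
STYLE_FOR_FIELD = {
    "title": "Headline",
    "intro": "Intro",
    "text": "Body",
    "infos": "InfoBox",
    "author": "Byline",
}


def to_tagged_text(record):
    """Turn one article record (a dict of text fields) into tagged lines."""
    lines = ["<UNICODE-WIN>"]  # assumed file header; verify for your version
    for field, style in STYLE_FOR_FIELD.items():
        value = record.get(field, "").strip()
        if not value:
            continue  # empty text boxes produce no output
        for paragraph in value.splitlines():
            if paragraph.strip():
                # One tagged line per non-empty paragraph of the field.
                lines.append(f"<ParaStyle:{style}>{paragraph.strip()}")
    return "\n".join(lines)
```

Keeping the field-to-style mapping in one table is the point: swapping the output format for something else (ConTeXt macros, say) only means changing how each line is built.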

Of course that's no solution for most Word-to-ConTeXt cases, just
a side note...
And BTW: I really like InDesign as a layout app, but its text
handling (regarding XML or TaggedText import) is horrible! (Badly
coded: it doesn't understand different line endings or different text
encodings, only incomplete UTF-16 without BOM and predeclared Win or
Mac line endings... XML is always whitespace-sensitive...)
Enough OT.
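
(One last bit for anyone generating import files themselves: the encoding limits above can be handled on the writing side. A small Python sketch, assuming little-endian byte order is what the importer wants, which I have not verified. The function name is made up.)

```python
def encode_for_import(text, line_ending="\r\n"):
    """Return text as BOM-less UTF-16 (little-endian) with uniform line endings."""
    # Collapse Windows and old-Mac endings to "\n" first, then apply the chosen one.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    normalized = normalized.replace("\n", line_ending)
    # "utf-16-le" never writes a byte order mark; plain "utf-16" would.
    return normalized.encode("utf-16-le")
```

Pass line_ending="\r" for Mac-style endings, and write the result in binary mode so nothing gets converted again on the way to disk.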

> [I've found that generally my fellow office workers don't want to deal
> with *anything* like this, but professional translators have no
> problems with ConTeXt code; and anybody with html-by-hand experience
> usually gets the drift very fast.]

Unfortunately even my HTML-coding colleagues fear the command line.
And providing GUIs for my nice automation scripts (e.g. a CD cover
generator with ConTeXt) is tedious...

> For example about now I have to start writing a product manual where
> some parts of text come from an old Word file. I'll probably just cut
> and paste what I need from the pdf file, but it's still faster than
> fighting with Word over the original 9 MB (!) doc - and consistency
> can be guaranteed, unlike if I used Word, because the old file is done
> with Word95 and 97 and we now use Word 2003 where the list functions
> and styles work slightly differently and don't open quite as they
> used to.

Yup, I get a lot of crashes if the Word versions don't match. I then
use TextEdit.app to extract the text, but (like with most other
Word converters) you have to clean up the hyperlink and versions crap
afterwards.

Greetlings from Lake Constance!
https://www.cacert.org (I'm an assurer)
