On 8/6/06, Aditya Mahajan wrote:
On Sun, 6 Aug 2006, Mojca Miklavec wrote:
Based on those three answers I got a clearer idea of two (different, but complementary) methods that might be sensible:
a) ctxtools --wordcount filename[tex|pdf] to do the wordcount for the whole document using pdftotext + ruby regexp
b) \usemodule[wordcount]
whatever
\startstatistics[name][words|letters|lines] some more-or-less plain text \stopstatistics
whatever
and, according to Aditya's idea, run a (ruby) regular expression (instead of detex) on it, which would write the nicely formatted desired number to the output/log file. (I don't know if it's possible to use the first approach for the second problem, but it doesn't make sense to complicate things too much.)
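For method a), the counting step itself could look roughly like the sketch below. It assumes the PDF has already been turned into plain text (for example with `pdftotext file.pdf -`, which is not run here); the function name `text_statistics` and the definition of a "word" are my own assumptions, not what ctxtools does.

```ruby
# Hypothetical helper: statistics for plain text, e.g. the output of
# `pdftotext file.pdf -`. A "word" is taken to be a run of letters,
# optionally joined by apostrophes or hyphens (an assumption).
def text_statistics(text)
  {
    words:   text.scan(/[[:alpha:]]+(?:['-][[:alpha:]]+)*/).length,
    letters: text.scan(/[[:alpha:]]/).length,
    lines:   text.lines.count { |line| !line.strip.empty? } # non-empty lines only
  }
end

sample = "Word counts are approximate.\n\nIt's a well-known problem.\n"
stats = text_statistics(sample)
puts "words: #{stats[:words]}, letters: #{stats[:letters]}, lines: #{stats[:lines]}"
```

The three keys mirror the [words|letters|lines] choice proposed for \startstatistics, so the same function could serve both methods.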
If you have a script that counts words in a ConTeXt document, the second approach is straightforward. Write everything to a buffer and run the script on the buffer. However, such a mechanism will never be perfect (or close to perfect) in the sense of parsing arbitrary input.
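To illustrate what such a script might do with a saved buffer, here is a minimal sketch of the regular-expression idea. It is not Hans's script; `strip_tex` and `count_words` are hypothetical names, and the rules are deliberately crude (for instance, text inside braces is always counted, even when it is a command argument rather than prose).

```ruby
# Hypothetical, deliberately naive removal of ConTeXt markup.
def strip_tex(source)
  source
    .gsub(/(?<!\\)%.*$/, "")                    # drop comments (unescaped %)
    .gsub(/\\[a-zA-Z]+(?:\[[^\]]*\])*/, " ")    # drop commands and their [..] arguments
    .gsub(/\\./, " ")                           # drop control symbols such as \% or \\
    .gsub(/[{}$~]/, " ")                        # drop grouping and math delimiters
end

# Count words in what is left after stripping the markup.
def count_words(source)
  strip_tex(source).scan(/[[:alpha:]]+(?:['-][[:alpha:]]+)*/).length
end

buffer = <<~'TEX'
  \startitemize[n]
  \item Some {\em plain} text % a comment
  \stopitemize
TEX

puts count_words(buffer)
```

Anything that expands to text at typesetting time (abbreviations, bibliography entries, macros) is invisible to this kind of script, which is exactly the imperfection mentioned above.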
The most naive solution that I could think of (using a slightly modified version of Hans's ruby script):

\unprotect
\def\startstatistics
  {\dodoubleempty\dostartstatistics}
\def\dostartstatistics[#1][#2]#3\stopstatistics
  {\setbuffer[#1]#3\endbuffer
   \executesystemcommand{ruby wordcount.rb \jobname-#1.tmp}%
   \getbuffer[#1]}
\protect
\doifnotmode{demo}{\endinput}
...

but a friend who asked me for a favour actually wants to use abbreviations and a bibliography as well, so only the first method (creating the PDF first) would work. He currently keeps copy-pasting the resulting PDF into Word and uses Word's statistics to count the words and/or characters for him. But I guess that his wishes will have to wait for some more time in this case.
ftp://tug.ctan.org/pub/tex-archive/macros/plain/contrib/misc/xii.tex
But of course, you will not write anything like this in an abstract :-)
Nevertheless, I love the story (and esp. the document which creates it)! All the best, Mojca