Peter Heslin wrote:
A little while ago, I said that I hoped to convert Dimitrios Filippou's ancient Greek hyphenation patterns (the elhyphen package) to utf-8, in order to use them with xetex. Before thinking about starting this work, I decided to look to see if anyone else had done it, and I came across something interesting in ConTeXt, which is not a package I normally use.
There appears to be a whole subdirectory in the ConTeXt distribution that is full of utf-8 hyphenation patterns, including Filippou's ancient Greek ones, but also including German, French, etc. They are in the file: http://www.pragma-ade.com/context/current/cont-tmf.zip, in the tex/context/patterns directory.
Can anyone who knows about ConTeXt explain where these patterns come from and how ConTeXt manages to use them? (I thought that non-xetex TeX could only use single-byte encoded patterns.)
some time ago i decided to ship patterns with context because
(1) there is no sound infrastructure in the tex world for managing patterns
(2) i need encoding neutral patterns [most patterns are ec only]
(3) i want control over what gets loaded in context
(4) i wanted to get rid of every year's disappearing, renamed, changed patterns
(5) i wanted patterns that were not, in a sense, hard-wired latex patterns
If there is a script that was used to convert these from the source to utf-8, is it available? A quick glance at the ancient greek patterns (in the file lang-agr.pat) shows that there is a bug in the conversion that I'd like to report and fix.
  ctxtools --pat [en nl agr ...]
  ctxtools --pat --utf [en nl agr ...]

the greek conversions were done with the help of greek language users on the context list, so in case of trouble, i cc that list; bugs need to be fixed indeed

in ctxtools.rb you can grep for 'agr' and see which conversions take place for greek

more info can be found in: http://www.pragma-ade.com/general/manuals/mpattern.pdf (also published in tugboat)

there is a file lang-all.xml in the context distribution
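To make the kind of conversion under discussion concrete, here is a minimal Python sketch of a table-driven rewrite of pattern lines into Unicode text. It is not the code in ctxtools.rb (which is Ruby). The three-entry mapping table, the function names, and the assumption that the legacy patterns use a Latin-letter transliteration are all hypothetical, chosen only for illustration. Text after a `%` is treated as a TeX comment and left alone.

```python
# Hypothetical mapping from a legacy single-byte transliteration to Unicode.
# These three entries are illustrative only; they are NOT the table used by
# ctxtools for the 'agr' patterns.
LEGACY_TO_UNICODE = {"a": "α", "b": "β", "g": "γ"}


def convert_pattern(pattern, table=LEGACY_TO_UNICODE):
    """Map each legacy character to its Unicode form.

    Characters missing from the table (digits, dots, spaces) pass through
    unchanged, so the hyphenation weights in a pattern are preserved.
    """
    return "".join(table.get(ch, ch) for ch in pattern)


def convert_lines(lines, table=LEGACY_TO_UNICODE):
    """Convert pattern lines, leaving anything after '%' untouched."""
    for line in lines:
        body, sep, comment = line.partition("%")
        yield convert_pattern(body, table) + sep + comment


if __name__ == "__main__":
    for converted in convert_lines(["a1b", "1ga % a comment"]):
        # Each Greek letter occupies two bytes once encoded as utf-8.
        print(converted, len(converted.encode("utf-8")))
```

A conversion bug of the kind mentioned above would typically show up as a wrong or missing entry in such a table, which is why grepping for 'agr' in ctxtools.rb is the place to start.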
On a more general level, if both ConTeXt and XeTeX are engaged in converting legacy TeX hyphenation patterns to utf-8, should they be coordinated in order to avoid duplication of effort?
anyone can use the patterns; of course bugs need to be sorted out, but given my experiences with pattern maintenance i will not drop them from context; too much has gone wrong in the past; but you can consider them to be generic, so indeed we can avoid duplication of work.

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------