On Tue, May 14, 2013 at 6:17 PM, Hans Hagen wrote:
On 5/14/2013 6:07 PM, luigi scarso wrote:
I hope that someone can help here.
As Mojca mentioned Thai at BachoTeX, I'll add the patterns as a start.
Given specs, examples and time, adding support for Thai to ConTeXt shouldn't be too hard (assuming that there are users).
But it's not trivial either. There's an open-source project implementing word segmentation:
http://linux.thai.net/projects/swath
The specification (someone's thesis) can be found here:
http://www.cs.cmu.edu/~paisarn/papers/thesis99.pdf

The ugly part of the pdfTeX approach is that it requires an external text processor to digest the input TeX document and return a copy with word segmentation; pdfTeX is then run on the resulting file. XeTeX can use the ICU library to do the segmentation. In LuaTeX one would have to plug the word segmentation in somewhere (but writing that part is slightly non-trivial).

Mojca
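
To give a rough idea of what such an external preprocessing step involves, here is a minimal sketch of dictionary-based segmentation using greedy longest matching. This is only a textbook technique shown for illustration, not necessarily what swath or ICU actually do. The three-word dictionary, the function names and the "|" break marker are all made up for the example; a real preprocessor would use a full word list and emit whatever break macro the TeX side expects.

```python
# Toy word list (assumption): a real tool would load a full Thai dictionary.
WORDS = {"ภาษา", "ไทย", "สวัสดี"}


def segment(text, words):
    """Split text into tokens by greedy longest match against `words`.

    Characters that match no dictionary word are collected into
    "unknown" runs so that no break is inserted inside them.
    """
    max_len = max(map(len, words), default=1)
    tokens = []
    unknown = ""
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then shorter ones.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in words:
                match = candidate
                break
        if match is None:
            unknown += text[i]
            i += 1
            continue
        if unknown:
            tokens.append(unknown)
            unknown = ""
        tokens.append(match)
        i += len(match)
    if unknown:
        tokens.append(unknown)
    return tokens


def mark_breaks(text, words, marker="|"):
    """Return text with a (hypothetical) break marker between tokens."""
    return marker.join(segment(text, words))
```

Calling mark_breaks("ภาษาไทย", WORDS) marks one break point between "ภาษา" and "ไทย". Greedy matching can pick the wrong split when words overlap, which is part of why proper segmentation is not trivial.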