On 5/15/2013 4:09 PM, Mojca Miklavec wrote:
On Tue, May 14, 2013 at 6:17 PM, Hans Hagen wrote:
On 5/14/2013 6:07 PM, luigi scarso wrote:
I hope that someone can help here.
As Mojca mentioned Thai at BachoTeX, I'll add the patterns as a start.
Given specs, examples and time, adding support for Thai to ConTeXt shouldn't be too hard (assuming that there are users).
But it's not trivial either.
It depends ... we're using a dictionary to determine word boundaries, aren't we? I'm pretty sure that I've done more complex coding.
There's an open-source project implementing word segmentation: http://linux.thai.net/projects/swath
The specification (someone's thesis) can be found here: http://www.cs.cmu.edu/~paisarn/papers/thesis99.pdf
Ok, so there are some text files there with words.
The ugly part of the pdfTeX approach is that it requires an external text processor to digest an input TeX document and return a copy with word segmentation. Then pdfTeX is run on the resulting file. XeTeX can use the ICU library to do the segmentation.
In LuaTeX one would have to plug the word segmentation somewhere (but writing that part is slightly non-trivial).
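A minimal sketch of the dictionary-based boundary detection discussed above, written in Python only for illustration. It uses plain greedy longest matching, which is an assumption made here and not necessarily what swath or any TeX engine does; the function name `segment` and the tiny word set are hypothetical.

```python
def segment(text, dictionary):
    """Split text into words by greedy longest match against a word set.

    Characters that start no dictionary word are emitted one at a time.
    """
    # Longest dictionary entry bounds how far ahead we need to look.
    max_len = max((len(word) for word in dictionary), default=1)
    words = []
    pos = 0
    while pos < len(text):
        # Try the longest candidate first and shrink until a hit.
        for end in range(min(len(text), pos + max_len), pos, -1):
            if text[pos:end] in dictionary:
                break
        else:
            # No dictionary word starts here: emit a single character.
            end = pos + 1
        words.append(text[pos:end])
        pos = end
    return words


# Hypothetical three-word dictionary; a real one would be loaded
# from word-list files.
thai_words = {"ภาษา", "ไทย", "สวัสดี"}
pieces = segment("ภาษาไทย", thai_words)
```

A pre-processing step could then join `pieces` with whatever break marker the engine expects. Greedy matching can pick the wrong split when words overlap, which is one reason real segmenters are more involved.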
I just did a quick test using those dictionaries (abusing some code that I already had on my machine). Quite doable. It all depends on having the dictionaries available (on the garden or in the distribution).

Anyhow, it's not that much font related, just language / script support, and we already have that for some languages, so adding Thai to it doesn't hurt. Of course we'd need some testing. It doesn't make much sense to add features to ConTeXt that no one would use at some point.

But ... Luigi is already teaching himself Thai, so ...

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------