[NTG-context] Translating PDF-files

R. Ermers
Wed Jan 19 16:03:17 CET 2011

Off topic: Well, I speak Russian and some other languages. Yes, you are right, the 5th case is the locative (after o), the 6th case is the instrumental. One does not count the cases every day :-)

It is not a language in general that is difficult, but the pair a language is in: the pair English-Russian is, in some respects, more difficult than the other way around because of the choice between the perfective and the imperfective aspect of the tenses. An English text does not offer any clues as to which aspect to choose, but anyone who wants to speak Russian has to decide instantly. A program is unlikely to do that.

These problems might not exist for the pair Ukrainian-Russian, or perhaps (?) Polish-Russian, or - who knows - Basque-Russian.
The options for determining the appropriate aspect, if programmers succeed in building them at all, are, for example, not needed for the pair English-Dutch.

The reverse pair Russian-English poses different problems, such as when and where to put an article. The program has to derive from the context whether a given Russian noun in the text should be interpreted as definite or indefinite, and then whether it is appropriate to put an article at all, et cetera.


>> Even though the result will no doubt show Cyrillic words, which looks interesting, the factual result will be rubbish, and most likely unintelligible to any Russian.
>  That's an interesting statement; do you have any experience with that
> at all, or are you simply speculating?  I have never heard any claim
> that machine translation would be more difficult for some particular
> languages.  It's generally a hard problem, and each language has its
> specific issues, not only Russian (that has 6 cases, by the way, not 7,
> and really only one fully conjugated tense).
> 	Arthur
