On 22/05/2009 at 17:51, Arthur Reutenauer wrote:
but instead of arbitrarily adding 0.25em before and 1em after the punctuation mark, you should use the real nnbsp (U+202F) before and a real normal space (U+0020) after.
I don't think so. Space characters don't mix very well with TeX glue and should best be avoided, generally speaking. In particular, all inter-word spaces that are input in the TeX source as one or more of U+0020 are simply ignored, and replaced by normal inter-word glue, with its appropriate stretchability and shrinkability. This has always been the case in TeX and is not going to change. All other types of Unicode spaces should really, in my opinion, be processed in the same way, while respecting their additional properties in the case of non-breakable spaces, for instance.
Not knowing the internals, that's what I tried to say by suggesting a space after instead of 1em, i.e. "calculated by the engine". If an em is added after, it is neither stretchable nor shrinkable, right?
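To make the kern-versus-glue distinction concrete, here is a minimal plain TeX sketch. It is not ConTeXt's actual implementation; the box contents and the stretch and shrink amounts are chosen only for illustration.

```tex
% Plain TeX sketch (illustrative values, not taken from ConTeXt):
% a \kern is rigid, whereas glue carries stretch and shrink components.
\setbox0=\hbox{a?\kern 1em b}                        % fixed 1em kern
\setbox2=\hbox{a?\hskip 1em plus .5em minus .3em b}  % 1em of glue

% Both boxes have the same natural width.
\message{kern: \the\wd0, glue: \the\wd2}

% Ask for a box 3pt wider than the natural width.
\dimen0=\wd0 \advance\dimen0 by 3pt
\setbox4=\hbox to \dimen0{a?\kern 1em b}                        % nothing can stretch
\setbox6=\hbox to \dimen0{a?\hskip 1em plus .5em minus .3em b}  % the glue absorbs the 3pt
\bye
```

With the default plain TeX settings, the kern version should produce an underfull box warning in the log, while the glue version should not, because its stretch component absorbs the extra width.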
In addition, characters like U+202F are very badly supported across fonts, and if you take into account the fact that the most appropriate width will probably change depending on the language, you're likely to observe much more arbitrary results if you use the glyph for that character in the font. I seriously doubt you want to rely on the font for that.
Why? Let me take your example again:
{\setcharacterspacing[frenchpunctuation]a? aa? aaa? abba?}
a\,? aa\,? aaa\,? abba\,?
Surprise: the first line is longer than the second. It's because the sizes of U+0020 and U+202F depend on the font design; their sizes are not exactly 1em and 0.25em.
That's not the reason. The reason is simply that \, is defined as a \kern of one sixth of an em (see core-spa.mkiv: it's equivalent to \thinspace, which is \kern .16667em). In the first line, the value of .25em is defined in core-spa.mkiv; you can redefine it if you want. In any case, every space is completely controlled by ConTeXt; we don't let the font mess around.
For that matter, Latin Modern doesn't have a glyph for U+202F, so if we used it, we'd just see nothing: there would be no space at all, see attached file.
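As a rough check of that arithmetic, the following plain TeX sketch measures the two spacings side by side. It assumes plain TeX's own \thinspace (\kern .16667em) and uses a literal \kern .25em to stand in for the value that ConTeXt sets in core-spa.mkiv; it does not call \setcharacterspacing itself.

```tex
% Plain TeX sketch: compare a .25em kern with \thinspace before "?".
% The .25em kern is a stand-in for ConTeXt's frenchpunctuation setting.
\setbox0=\hbox{a\kern .25em ?}   % spacing as in the first line
\setbox2=\hbox{a\thinspace ?}    % spacing as in the second line

% Difference in width for a single punctuation mark.
\dimen0=\wd0 \advance\dimen0 by -\wd2
\message{difference per mark: \the\dimen0}
\bye
```

The difference per mark is .25em - .16667em, or about .083em, so over the four question marks in the example the first line comes out roughly a third of an em longer, whatever the font's own space glyphs look like.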
Thank you so much for the detailed technical explanation! So, as far as I can tell, the space before should be equivalent to thinspace.
All this really calls for more coordination in order to produce decent specifications, in my opinion. If you think ConTeXt's default should be different, that's fine, and I encourage you to contact Sébastien to discuss it. Then report to Hans and Peter for the implementation.
Thanks, I'll look into that. Maybe I could write some detailed specs in the wiki. Regards, Bob.