Hello all,

I've found some strange behavior with hyphenation; but since hyphenation has been discussed frequently, and might involve some settings in conf files somewhere I'm not aware of, I wouldn't like to submit an unnecessary tracker item in case the problem is well known. The code is:

\hsize=0pt \overfullrule=0pt \hyphenation{W-h-a-t-e-v-e-r}
\uchyph=1 Acceptation Behavior Dying Expression Whatever
\uchyph=0 Acceptation Behavior Dying Expression Whatever
\bye

And comparing the plain TeX compilation in PDFTeX and LuaTeX (all from TL2011, except LuaTeX, which is rev 4358), I notice that:

- LuaTeX doesn't obey \uchyph=0; it does hyphenate the words, although not as completely as when \uchyph=1.
- "Acceptation" is hyphenated in LuaTeX even though it's the first word, and a hyphenable word should be preceded by a glue (see TeXbook p.256); it is not hyphenated in PDFTeX.
- As shown by the hyphenation of "Whatever", LuaTeX disregards \lefthyphenmin and \righthyphenmin.

Does it ring any bell, or should I add an item to the tracker?

Best,
Paul
On 18-2-2012 12:08, Paul Isambert wrote:
Hello all,
I've found some strange behavior with hyphenation; but since hyphenation has been discussed frequently, and might involve some settings in conf files somewhere I'm not aware of, I wouldn't like to submit an unnecessary tracker item in case the problem is well known. The code is:
\hsize=0pt \overfullrule=0pt \hyphenation{W-h-a-t-e-v-e-r}
\uchyph=1 Acceptation Behavior Dying Expression Whatever
\uchyph=0 Acceptation Behavior Dying Expression Whatever \bye
And comparing the plain TeX compilation in PDFTeX and LuaTeX (all from TL2011, except LuaTeX, which is rev 4358), I notice that:
- LuaTeX doesn't obey \uchyph=0; it does hyphenate the words, although not as completely as when \uchyph=1.
I must admit that I never used \uchyph (and even wonder if we really need it as it's a kind of tweaking that seldom is done at the document level). I would have expected this primitive to do nothing at all in luatex.
- "Acceptation" is hyphenated in LuaTeX even though it's the first word, and a hyphenable word should be preceded by a glue (see TeXbook p.256); it is not hyphenated in PDFTeX.
luatex hyphenates the whole list in one go and does not do this delayed and partial as in traditional tex ... this is on purpose as it provides callback code with the whole lot (in luatex the hyphenation / ligature building and justification steps are separated) ... also, there is no reason not to hyphenate the first word if you have real narrow columns.
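A minimal sketch of what receiving "the whole lot" looks like from the callback side, assuming plain LuaTeX. The `hyphenate` callback and `lang.hyphenate` are described in the LuaTeX manual; the log message is made up for illustration, and the exact behavior may differ between revisions.

% Replace the built-in hyphenation pass with a Lua function.
% The callback receives the head and tail of the node list in one piece;
% lang.hyphenate then inserts the discretionaries into that list.
\directlua{
  callback.register("hyphenate", function(head, tail)
    texio.write_nl("hyphenate: got the whole list")
    lang.hyphenate(head, tail)
  end)
}
\hsize=0pt \overfullrule=0pt
Acceptation Behavior
\bye

Because hyphenation is its own step here, ligature building and line breaking run afterwards on a list that already carries all its discretionaries.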
- As shown by the hyphenation of "Whatever", LuaTeX disregards \lefthyphenmin and \righthyphenmin.
Explicit \hyphenation always wins and is not influenced by the *min values (after all, it is mostly meant as an escape for providing exceptions and not for extending the patterns, so obeying the *min parameters would defeat that purpose).

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------
Hans Hagen
On 18-2-2012 12:08, Paul Isambert wrote:
Hello all,
I've found some strange behavior with hyphenation; but since hyphenation has been discussed frequently, and might involve some settings in conf files somewhere I'm not aware of, I wouldn't like to submit an unnecessary tracker item in case the problem is well known. The code is:
\hsize=0pt \overfullrule=0pt \hyphenation{W-h-a-t-e-v-e-r}
\uchyph=1 Acceptation Behavior Dying Expression Whatever
\uchyph=0 Acceptation Behavior Dying Expression Whatever \bye
And comparing the plain TeX compilation in PDFTeX and LuaTeX (all from TL2011, except LuaTeX, which is rev 4358), I notice that:
- LuaTeX doesn't obey \uchyph=0; it does hyphenate the words, although not as completely as when \uchyph=1.
I must admit that I never used \uchyph (and even wonder if we really need it as it's a kind of tweaking that seldom is done at the document level). I would have expected this primitive to do nothing at all in luatex.
Actually, it is explicitly mentioned in section 6.1 of the manual, stating that it is effective immediately, not at the end of the paragraph, since its value is stored in nodes themselves. I don't find it terribly useful either, but who knows what trickery has been based on it.
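A small test for that statement, as a sketch only: it assumes LuaTeX behaves as section 6.1 describes, which the original report suggests may not hold for rev 4358. The leading word "the" is there only so that no capitalized word comes first in the paragraph.

\hsize=0pt \overfullrule=0pt
% One paragraph; only the middle "Expression" is typeset while \uchyph=0.
% If the value is stored in the nodes, only that word should stay unbroken.
% In TeX3 the value in force at the end of the paragraph (here 1) should
% apply to all three capitalized words.
the Expression {\uchyph=0 Expression} Expression
\bye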
- "Acceptation" is hyphenated in LuaTeX even though it's the first word, and a hyphenable word should be preceded by a glue (see TeXbook p.256); it is not hyphenated in PDFTeX.
luatex hyphenates the whole list in one go and does not do this delayed and partial as in traditional tex ... this is on purpose as it provides callback code with the whole lot (in luatex the hyphenation / ligature building and justification steps are separated) ... also, there is no reason not to hyphenate the first word if you have real narrow columns.
I agree, it's just that TeX3 does otherwise.
- As shown by the hyphenation of "Whatever", LuaTeX disregards \lefthyphenmin and \righthyphenmin.
Explicit \hyphenation always wins and is not influenced by the *min values (after all, it is mostly meant as an escape for providing exceptions and not for extending the patterns, so obeying the *min parameters would defeat that purpose).
The parameters are obeyed in all other engines, exception or not; I suppose the reason is that they may very well change in the course of the document.

But actually I've just found things are a little bit more complicated: LuaTeX doesn't hyphenate a word if it is shorter than \lefthyphenmin + \righthyphenmin (which makes sense); but if the word is hyphenated, and it is an exception, then it gets hyphens everywhere, whatever the values of the parameters:

\hsize=0pt \overfullrule=0pt
\hyphenation{a-b-c d-e-f-g h-i-j-k-l m-n-o-p-q-r}
\lefthyphenmin=2 \righthyphenmin=3
abc defg hijkl mnopqr castle keepers
% PDFTeX = abc defg hi-jkl mn-o-pqr cas-tle keep-ers
% LuaTeX = abc defg h-i-j-k-l m-n-o-p-q-r cas-tle keep-ers
\lefthyphenmin=4 \righthyphenmin=1
abc defg hijkl mnopqr castle keepers
% PDFTeX = abc defg hijk-l mnop-q-r castle keep-er-s
% LuaTeX = abc defg h-i-j-k-l m-n-o-p-q-r castle keep-er-s
\bye

This half-obeying the parameters does seem quite strange to me.

Best,
Paul
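The PDFTeX results for the exceptions are consistent with a simple filter on the listed break points: a break after the k-th letter of an n-letter word survives only if k >= \lefthyphenmin and n - k >= \righthyphenmin. The check below is derived from the outputs quoted in this message, not from either engine's source.

\lefthyphenmin=2 \righthyphenmin=3
abc    (n=3): k >= 2 and 3 - k >= 3 has no solution   -> abc
defg   (n=4): k >= 2 and 4 - k >= 3 has no solution   -> defg
hijkl  (n=5): k >= 2 and 5 - k >= 3 gives k = 2       -> hi-jkl
mnopqr (n=6): k >= 2 and 6 - k >= 3 gives k = 2, 3    -> mn-o-pqr

\lefthyphenmin=4 \righthyphenmin=1
abc    (n=3): k >= 4 and 3 - k >= 1 has no solution   -> abc
defg   (n=4): k >= 4 and 4 - k >= 1 has no solution   -> defg
hijkl  (n=5): k >= 4 and 5 - k >= 1 gives k = 4       -> hijk-l
mnopqr (n=6): k >= 4 and 6 - k >= 1 gives k = 4, 5    -> mnop-q-r

The LuaTeX output matches this filter only for abc and defg, where no break survives at all; for hijkl and mnopqr every listed break is kept.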
On 19-2-2012 07:25, Paul Isambert wrote:
luatex hyphenates the whole list in one go and does not do this delayed and partial as in traditional tex ... this is on purpose as it provides callback code with the whole lot (in luatex the hyphenation / ligature building and justification steps are separated) ... also, there is no reason not to hyphenate the first word if you have real narrow columns.
I agree, it's just that TeX3 does otherwise.
btw, similar differences in engines can be found in ligature de/re composition where tex3 does some juggling that luatex doesn't (for good reason)
The parameters are obeyed in all other engines, exception or not; I suppose the reason is that they may very well change in the course of the document.
Ok, but as luatex can have more advanced content in \hyphenation it is more tricky (and probably costly) to have dynamic adaption to those min values so I wonder if it's worth the trouble.
But actually I've just found things are a little bit more complicated: LuaTeX doesn't hyphenate a word if it is shorter than \lefthyphenmin + \righthyphenmin (which makes sense); but if the word is hyphenated, and it is an exception, then it gets hyphens everywhere, whatever the values of the parameters:
ok, that's weird indeed and could qualify as a bug

Hans
Hans Hagen
On 19-2-2012 07:25, Paul Isambert wrote:
But actually I've just found things are a little bit more complicated: LuaTeX doesn't hyphenate a word if it is shorter than \lefthyphenmin + \righthyphenmin (which makes sense); but if the word is hyphenated, and it is an exception, then it gets hyphens everywhere, whatever the values of the parameters:
ok, that's weird indeed and could qualify as a bug
Added to the bug tracker, then. Paul
participants (2)
- Hans Hagen
- Paul Isambert