A design philosophy question...
I have this year reported something as a bug to Barbara Beeton (the bug collector for TeX) which I find disturbing: TeX's arithmetic on units truncates and thus is biased. This has the annoying effect that "1in", supposedly representing one inch which is _exactly_ 72.27 points, will in TeX stand for the value "72.26999pt" and be actually smaller than the value "72.27pt". Of course, fixing this might change existing documents. Also we have:

**\dimen0=1in
entering extended mode
*\showthe\dimen0
72.26999pt.
*\showthe\dimexpr100\dimen0\relax
7226.9989pt.
*\dimen0=72.27pt
*\showthe\dimen0
72.27pt.
*\showthe\dimexpr100\dimen0\relax
7227.00043pt.
So while "72.27pt" is _quite_ a bit closer to one inch than "1in" is, it is larger.

Anyway, is something like that an area that LuaTeX would actually ever consider touching? After all, it would break document compatibility. Which might be, I believe, less of a holy grail for ConTeXt than for LaTeX.

Opinions?

--
David Kastrup, Kriemhildstr. 15, 44793 Bochum
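The session above can be reproduced outside TeX. The following is a minimal Python sketch of how tex.web scans a dimension with a unit, written from the published algorithm (`round_decimals`, then scaling by the unit's numerator and denominator). The function name `scan_dimen` and its argument layout are chosen here for illustration and are not TeX's interface; only non-negative values are handled.

```python
UNITY = 1 << 16  # scaled points (sp) per pt


def round_decimals(digits):
    """Turn decimal fraction digits into a rounded multiple of 2**-16 pt."""
    a = 0
    for d in reversed(digits):          # last digit first, as in tex.web
        a = (a + d * 2 * UNITY) // 10
    return (a + 1) // 2


def scan_dimen(integer, digits, num=1, denom=1):
    """Return sp for `integer.digits` in a unit worth num/denom pt."""
    f = round_decimals(digits)
    if (num, denom) != (1, 1):
        integer, rem = divmod(integer * num, denom)
        # Truncating division: this is the step that biases the result down.
        f = (num * f + UNITY * rem) // denom
        integer += f // UNITY
        f %= UNITY
    return integer * UNITY + f


one_inch = scan_dimen(1, [], 7227, 100)   # "1in"
pt_value = scan_dimen(72, [2, 7])         # "72.27pt"
print(one_inch, pt_value)                 # → 4736286 4736287
```

The exact value of one inch is 4736286.72sp, so "72.27pt" lands on the nearest representable value while "1in" lands on the one below it.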
David Kastrup wrote:
Anyway, is something like that an area that LuaTeX would actually ever consider touching? After all, it would break document compatibility. Which might be, I believe, less of a holy grail for ConTeXt than for LaTeX.
in extended mode, when using dimexpr one also gets more precision so a switch to etex mode may break backward compatibility anyway

when looking at this aspect, keep in mind that other programs (like afm2tfm) also have their special way of truncating/rounding, so as soon as one changes tools compatibility

concerning context ... truncating vs rounding is less an issue than precision (roundtrip calculations resulting in 1+ sp off comparisons) but that is/will be covered with special test stuff

concerning latex and old docs ... there will always be pdftex vs < 2, so with luatex we have no intention to be 100% downward compatible at all (some internals may also be adapted / extended / opened up, some good old tex code may disappear (lig rebuilding already has), etc

Hans
Hans Hagen
David Kastrup wrote:
Anyway, is something like that an area that LuaTeX would actually ever consider touching? After all, it would break document compatibility. Which might be, I believe, less of a holy grail for ConTeXt than for LaTeX.
in extended mode, when using dimexpr one also gets more precision so a switch to etex mode may break backward compatibility anyway
Uh, how can there be backward compatibility broken when dimexpr did not exist previously? Do you mean compatibility to eTeX? Nobody claimed that dimexpr arithmetic is the same as TeX arithmetic (in fact, it is a **** nuisance that its integer division rounds instead of truncating).
concerning context ... truncating vs rounding is less an issue than precision (roundtrip calculations resulting in 1+ sp off comparisons) but that is/will be covered with special test stuff
TeX's way of always truncating dimensioned expressions is certainly bad for roundtripping. That "1in" is not resolved to the same value as "72.27pt" is what I'd call a roundtrip problem.
concerning latex and old docs ... there will always be pdftex vs < 2, so with luatex we have no intention to be 100% downward compatible at all (some internals may also be adapted / extended / opened up, some good old tex code may disappear (lig rebuilding already has), etc
Interesting. What do you do instead of ligature rebuilding?

--
David Kastrup
David Kastrup wrote:
(in fact, it is a **** nuisance that [dimexpr] integer division rounds instead of truncating).
I agree, but that is a completely different problem.
concerning latex and old docs ... there will always be pdftex vs < 2, so with luatex we have no intention to be 100% downward compatible at all (some internals may also be adapted / extended / opened up, some good old tex code may disappear (lig rebuilding already has), etc
Interesting. What do you do instead of ligature rebuilding?
What has happened already is that ligature replacement and kerning are totally separated inside luatex: at font loading time, the ligkern information from the tfm metrics is split into ligatures and kernings. This means there are changes inside main_control() and reconstitute(). The reconstitution process was not 'perfect' before, and that now makes it hard to ascertain whether the new code is 100% identical. I think so, but I am not willing to bet on it.

What will happen soon is that an alternative implementation will be used for the hyphenation algorithm: one that does not limit words to 64 characters arbitrarily, and that allows pattern loading and augmenting at run-time.

And there will likely be more changes along these lines.

Taco
David Kastrup wrote:
Anyway, is something like that an area that LuaTeX would actually ever consider touching?
Perhaps one day, but definitely not soon. There are a few ways out of this, but they all require dimens to occupy more than 32 bits internally, a step not to be taken lightly.
After all, it would break document compatibility.
In this particular case, I would not worry too much about that. Updated executables, hyphenation patterns, or font metrics will have a much deeper impact on line breaking, and I very much doubt anybody has actually saved their old versions of those files alongside their input file.
Which might be, I believe, less of a holy grail for ConTeXt than for LaTeX.
Holy grail or red herring? It has never been true that all different versions of TeX produce identical output in all cases. Identical versions of TeX should produce identical output on all platforms, but that is where it stops. The results from TeX 3.14 can be different from TeX 3.141592, even without any other change to your installation.

Best,
Taco
Taco Hoekwater
David Kastrup wrote:
Anyway, is something like that an area that LuaTeX would actually ever consider touching?
Perhaps one day, but definitely not soon. There are a few ways out of this, but they all require dimens to occupy more than 32 bits internally, a step not to be taken lightly.
Not at all. For fixing rounding of the units, the following patch should do it:
Index: luatex.web
===================================================================
--- luatex.web (revision 382)
+++ luatex.web (working copy)
@@ -11570,7 +11570,7 @@
@.sp@>
else @
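The patch text is cut off in this archive, so the following does not reconstruct it. It is only a sketch, in Python, of what "rounding instead of truncating" could mean for the unit conversion: adding half the denominator before the division. The helper `inch_to_sp`, its `frac_sp` parameter, and the round-half-up choice are assumptions made for illustration.

```python
UNITY = 1 << 16  # scaled points (sp) per pt


def inch_to_sp(integer, frac_sp=0, rounding=False):
    """Convert (integer + frac_sp/65536) inches to sp, with 1in = 7227/100 pt.

    rounding=False mimics TeX's truncating division; rounding=True adds
    half the denominator first (an assumed fix, not the actual patch).
    """
    num, denom = 7227, 100
    whole, rem = divmod(integer * num, denom)
    bias = denom // 2 if rounding else 0
    f = (num * frac_sp + UNITY * rem + bias) // denom
    return (whole + f // UNITY) * UNITY + f % UNITY


truncated = inch_to_sp(1)
rounded = inch_to_sp(1, rounding=True)
print(truncated, rounded)  # → 4736286 4736287
```

With rounding, "1in" comes out as 4736287sp, the same scaled value TeX assigns to "72.27pt", while exactly representable inputs such as "100in" are unchanged.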
2007/4/3, David Kastrup:
Anyway, this would make "72.27pt" the same value as "1in". Which was what this was all supposed to be about. Only that I wanted to check that I did not break anything, and things were broken already.
Looks good. Does this affect the trip test? We should try to keep passing it, if possible. :-)

Best
Martin
"Martin Schröder"
2007/4/3, David Kastrup:
Anyway, this would make "72.27pt" the same value as "1in". Which was what this was all supposed to be about. Only that I wanted to check that I did not break anything, and things were broken already.
Looks good.
Does this affect the trip test? We should try to keep passing it, if possible. :-)
If the trip test does a good job at capturing TeX's idiosyncrasies, it should break. I have scanned through the source code of the test and still feel sick. Anyway, there are surprisingly few uses of units. So I can't really guess either which way. In some locations he uses numbers with a ridiculous precision (making them exact): in those cases likely neither rounding nor truncation occur.

So the change _has_ a possibility of passing the trip test, but if it does so, it is more by accident than by spirit.

I have also filed this several months ago as a bug report to Bb, so there is a minuscule chance of Knuth considering a fix upstream at the end of the year. I know that he has an aversion to changing anything with such an impact, but then he wants TeX to become an epitaph, and what kind of epitaph for the author of "The Art of Computer Programming -- Seminumerical Algorithms" would it be if "1in" and "72.27pt" had different values?

--
David Kastrup
David Kastrup wrote:
I have also filed this several months ago as a bug report to Bb, so there is a minuscule chance of Knuth considering a fix upstream at the end of the year. I know that he has an aversion to changing anything with such an impact, but then he wants TeX to become an epitaph, and what kind of epitaph for the author of "The Art of Computer Programming -- Seminumerical Algorithms" would it be if "1in" and "72.27pt" had different values?
in a sense he's saying that a pt in tex is not really a pt (tex book) which means that 1in is not 72.27 pt; the problem is that in the tex book he mentions those two values (as he does with cm and in and such)

i guess that you have a better chance with filing a bug for the tex book: \eq should be \approx

dek could also say: use cm instead of in -)

I think that the assumption is that one stays within a similar unit.

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------
Hans Hagen
David Kastrup wrote:
I have also filed this several months ago as a bug report to Bb, so there is a minuscule chance of Knuth considering a fix upstream at the end of the year. I know that he has an aversion to changing anything with such an impact, but then he wants TeX to become an epitaph, and what kind of epitaph for the author of "The Art of Computer Programming -- Seminumerical Algorithms" would it be if "1in" and "72.27pt" had different values?
in a sense he's saying that a pt in tex is not really a pt (tex book) which means that 1in is not 72.27 pt;
No, as far as TeX is concerned, the ratio "in" is _exactly_ 72.27 times the ratio "pt". And indeed, "100in" is exactly "7227pt". But "1in" maps to a different value than "72.27pt". Of course, one inch (as opposed to a hundred of them) can't be represented exactly by TeX's scaled numbers, so an approximation has to be picked. The problem is that the approximations picked for "1in" and for "72.27pt" are different. In general, TeX will pick the lower enclosing approximation instead of the closest one for numbers with units, unless the unit happens to be "sp" or "pt".
the problem is that in the tex book he mentions those two values (as he does with cm and in and such)
i guess that you have a better chance with filing a bug for the tex book: \eq should be \approx
No, the ratios _are_ exactly 100/7227. But what TeX does with them when it does not have an exact representation of the value resulting from a "scaled" 15.16 value times a unit expressed as a fraction, results in "1in" being unequal to "72.27pt" (neither exactly representable), even though "100in" _are_ "7227pt" (both exactly representable).
dek could also say: use cm instead of in -)
I think that the assumption is that one stays within a similar unit.
I think this analysis is not supported by the source code.

--
David Kastrup
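The distinction drawn above between "100in" (exactly representable) and "1in" (not) can be checked with exact rational arithmetic. This is a small Python sketch using only the ratio 7227/100 and the 2^16 scaling mentioned in the thread; the variable names are illustrative.

```python
from fractions import Fraction
import math

UNITY = 1 << 16                         # scaled points (sp) per pt
exact = Fraction(7227, 100) * UNITY     # one inch in sp, as an exact rational

lower = math.floor(exact)               # the "lower enclosing" approximation
nearest = round(exact)                  # the closest representable value
hundred = 100 * exact                   # one hundred inches in sp

print(float(exact), lower, nearest)     # → 4736286.72 4736286 4736287
print(hundred.denominator == 1)         # → True
```

The lower approximation is off by 0.72sp and the nearest one by 0.28sp, which is why "72.27pt" ends up closer to a true inch than "1in" does, while a hundred inches needs no approximation at all.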
David Kastrup wrote:
Taco Hoekwater writes:
David Kastrup wrote:
Anyway, is something like that an area that LuaTeX would actually ever consider touching?
Perhaps one day, but definitely not soon. There are a few ways out of this, but they all require dimens to occupy more than 32 bits internally, a step not to be taken lightly.
Not at all. For fixing rounding of the units, the following patch should do it:
Yes, but not now. As Hans said, it is vital at this stage to compare against (pdf)tex and aleph. Also, this only fixes the rounding vs. truncation, not the deviations created by the fact that TeX talks in base10 but calculates in base2.

Best wishes,
Taco
Taco Hoekwater
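The base10 versus base2 deviation mentioned above is independent of the unit question and shows up even in plain "pt". A minimal Python sketch, assuming tex.web's `round_decimals` routine as the way decimal fractions are read:

```python
UNITY = 1 << 16  # scaled points (sp) per pt


def round_decimals(digits):
    """Turn decimal fraction digits into a rounded multiple of 2**-16 pt."""
    a = 0
    for d in reversed(digits):          # last digit first, as in tex.web
        a = (a + d * 2 * UNITY) // 10
    return (a + 1) // 2


tenth = round_decimals([1])             # "0.1pt" in sp
print(tenth, 10 * tenth, 10 * tenth - UNITY)  # → 6554 65540 4
```

Ten times "0.1pt" overshoots "1pt" by 4sp, because 0.1 has no finite binary expansion and the nearest multiple of 2^-16 is slightly too large.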
David Kastrup wrote:
Taco Hoekwater writes:
David Kastrup wrote:
Anyway, is something like that an area that LuaTeX would actually ever consider touching?
Perhaps one day, but definitely not soon. There are a few ways out of this, but they all require dimens to occupy more than 32 bits internally, a step not to be taken lightly.
Not at all. For fixing rounding of the units, the following patch should do it:
Yes, but not now. As Hans said, it is vital at this stage to compare against (pdf)tex and aleph.
I understand.
Also, this only fixes the rounding vs. truncation, not the deviations created by the fact that TeX talks in base10 but calculates in base2.
Certainly. But at least TeX's talk is round-trippable: the 5-digit values output by \the will convert back to the same value on input. This kind of quality is absent from units other than "pt" and "sp": "1in" is perhaps the ugliest example. While TeX only does one-way conversions with units, I consider the difference in implementation quality disturbing, given the author.

--
David Kastrup
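The round-trip property claimed for "pt" can be tested mechanically. The sketch below ports tex.web's `print_scaled` and `round_decimals` to Python for non-negative values and checks every scaled value between 72pt and 73pt; `read_pt` is a hypothetical helper standing in for TeX's dimension scanner restricted to the "pt" unit.

```python
UNITY = 1 << 16  # scaled points (sp) per pt


def round_decimals(digits):
    """Turn decimal fraction digits into a rounded multiple of 2**-16 pt."""
    a = 0
    for d in reversed(digits):          # last digit first, as in tex.web
        a = (a + d * 2 * UNITY) // 10
    return (a + 1) // 2


def print_scaled(s):
    """Decimal string for a non-negative value of s sp, after tex.web."""
    out = str(s // UNITY) + "."
    s = 10 * (s % UNITY) + 5
    delta = 10
    while True:
        if delta > UNITY:
            s += UNITY // 2 - 50000     # round the final digit
        out += str(s // UNITY)
        s = 10 * (s % UNITY)
        delta *= 10
        if s <= delta:
            return out


def read_pt(text):
    """Read a decimal 'pt' value back into sp (hypothetical helper)."""
    whole, frac = text.split(".")
    return int(whole) * UNITY + round_decimals([int(c) for c in frac])


ok = all(read_pt(print_scaled(v)) == v for v in range(72 * UNITY, 73 * UNITY))
print(ok)
print(print_scaled(4736286), print_scaled(4736287))  # → 72.26999 72.27
```

The two printed strings match the \showthe output for "1in" and "72.27pt" quoted at the start of the thread; feeding "72.26999pt" back in recovers the truncated value, whereas nothing comparable is guaranteed when the value is written in inches.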
participants (4)
- David Kastrup
- Hans Hagen
- Martin Schröder
- Taco Hoekwater