[dev-context] sortedkeys and compare in LMTX l-table.lua

Hans Hagen j.hagen at xs4all.nl
Sun Jan 26 22:55:44 CET 2020


On 1/26/2020 1:17 AM, Sebastian Miele wrote:
> On table.sortedkeys the page
> https://www.contextgarden.net/Table_manipulation says: "Returns a sorted
> array of all the keys in table t. The comparer used for sorting will
> treat everything as a string iff the compared values are of mixed type."
> 
> According to l-table.lua, sortedkeys first checks whether the keys in
> the given table are (1) all strings, (2) all numbers, or (3): neither
> (1) or (2). In cases (1) and (2) table.sort is used with the default
> comparer (Lua's '<'). In case (3) table.sort is used with a custom
> compare function named 'compare' and defined in l-table.lua.
> 
> According to the sentence from the garden cited above, 'compare' should
> return (something equivalent) to tostring(a) < tostring(b). However, it
> does something more complex that, e.g., makes 10 < "2" < 3 < 10 ('<'
> meaning infix 'compare' here). I.e. there are strict circles in the
> strict order.
> 
> https://www.contextgarden.net/Table_manipulation goes on explaining
> that, "In general this means that you will be fine as long as you avoid
> stuffing indices along with number hashes together in t, lest the order
> will end up confused depending on what type an element is compared with
> in what order (cf. example [2])."
> 
> This reflects the way 'compare' actually is at the moment. But it
> contradicts the previous sentence "The comparer used for sorting will
> treat everything as a string iff the compared values are of mixed type."
> If everything is treated as a string when neither (1) nor (2), then
> everything is well-defined and there is no confusion.
> 
> Why is 'compare' not just (something equivalent to) tostring(a) <
> tostring(b), violating the sentence cited above? Are there any
> meaningful uses of the way 'compare' is now?
> 
> I suspect that the intent was to create an order that is lexicographic
> and numeric at the same time. As far as I see it, it just is not
> possible the way it was tried: Lexicographically, 11<2. Numerically,
> 2<11.
> 
> I see three ways out: (1) Disallow mixed types of keys in tables given
> to sortedkeys. (2) Always use lexicographic order in the case of mixed
> key types. (3) Combine the orders by an order-theoretic sum. One
> possiblity: Every value that is neither a number nor a string is smaller
> than every number. Every number is smaller than every string. Otherwise
> elements are compared as they are now. Another possiblity: Define an
> order over all types, define an order for every type. Then sort by type
> first, and in type second.
> 
> (1) may not possible without breaking a lot. (2) has sort of a
> discontinuity: E.g., add one numeric key to a table with only strings,
> and the order changes globally. (3) contains well defined, very
> predictable solutions without discontinuities.
Normally keys are not mixed but when they are this is the way it's done. 
It's been so for more than a decade so unlikely to change.

These sorters are not used when typesetting because there we also deal 
with languages and such. If's mostly a convenience feature that makes 
sure we can handles hashes in a predictable order when needed (because 
hashes have random order, even per run) and it also tries to be 
efficient in doing so.

Hans

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
        tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------


More information about the dev-context mailing list