Bugs item #993, was opened at 2008-05-24 13:24
Status: Open
Priority: 3
Submitted By: Martin Schröder (oneiros)
Assigned to: The Thanh Han (hanthethanh)
Summary: /Names array not sorted correctly
Category: None
Group: v1.40.6
Resolution: Accepted
Initial Comment:
Axel Berger and Heiko Oberdiek report an error with the sorting of the
/Names array:
/Names [(cite.Kr07) 44 0 R (cite.Li08) 45 0 R (cite.M\37403) 23 0 R
(cite.Mi00) 47 0 R (cite.Mi84) 48 0 R (cite.Mo04) 50 0 R]
"cite.Mü03" (= "cite.M\37403") should come after "cite.Mo04".
A test file with output is attached.
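
The misordering can be reproduced outside pdftex. The following is a minimal sketch, assuming the sort compares the escaped text exactly as written in the PDF source; the key list is taken from the report, everything else is illustrative only:

```python
# Each pair holds the key as written in the PDF source (escaped form)
# and the bytes that this literal string actually denotes.
pairs = [
    (b"cite.Kr07", b"cite.Kr07"),
    (b"cite.Li08", b"cite.Li08"),
    (b"cite.M\\37403", b"cite.M\xfc03"),  # \374 is the single byte 0xFC
    (b"cite.Mi00", b"cite.Mi00"),
    (b"cite.Mi84", b"cite.Mi84"),
    (b"cite.Mo04", b"cite.Mo04"),
]

# Sorting on the escaped text: the backslash (0x5C) sorts before
# "i" (0x69) and "o" (0x6F), which gives the order seen in the report.
by_escaped = [raw for raw, _ in sorted(pairs, key=lambda p: p[0])]

# Sorting on the denoted bytes: 0xFC sorts after every ASCII letter,
# so the key moves behind "cite.Mo04".
by_bytes = [raw for raw, _ in sorted(pairs, key=lambda p: p[1])]

print(by_escaped)
print(by_bytes)
```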
----------------------------------------------------------------------
Comment By: The Thanh Han (hanthethanh)
Date: 2009-04-06 08:22
Message:
Thanks, Heiko, for the patch; I have therefore reopened this bug report.
----------------------------------------------------------------------
Comment By: Heiko Oberdiek (oberdiek)
Date: 2009-04-06 03:49
Message:
Sorry, I cannot accept the rejection, because the
problem cannot be fixed at macro level unless
* some characters (backslash, parentheses, ...)
are forbidden, or
* octal escape sequences are used for *every* character
of the destination name.
(Otherwise a large amount of programming work is needed
to avoid triggering the bug.)
Therefore I have written patch #3886 that fixes the
comparison function `str_less_str' that is used
for the sorting of destination names.
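
For illustration only, and not the code of patch #3886: a minimal sketch of comparing destination names after resolving the escape sequences of a PDF literal string. The function name `decode_pdf_literal` is hypothetical, and the input is assumed to be the string body without its enclosing parentheses.

```python
# Single-character escapes of PDF literal strings and the bytes they denote.
SIMPLE = {
    ord("n"): 0x0A, ord("r"): 0x0D, ord("t"): 0x09, ord("b"): 0x08,
    ord("f"): 0x0C, ord("("): 0x28, ord(")"): 0x29, ord("\\"): 0x5C,
}

def decode_pdf_literal(body: bytes) -> bytes:
    """Resolve backslash escapes in the body of a PDF literal string."""
    out = bytearray()
    i, n = 0, len(body)
    while i < n:
        c = body[i]
        if c != 0x5C:              # ordinary byte, copy as is
            out.append(c)
            i += 1
            continue
        i += 1                     # skip the backslash
        if i >= n:
            break                  # trailing backslash: nothing to escape
        c = body[i]
        if 0x30 <= c <= 0x37:      # one to three octal digits
            value, digits = 0, 0
            while i < n and digits < 3 and 0x30 <= body[i] <= 0x37:
                value = value * 8 + (body[i] - 0x30)
                i += 1
                digits += 1
            out.append(value & 0xFF)
        elif c in SIMPLE:
            out.append(SIMPLE[c])
            i += 1
        elif c in (0x0A, 0x0D):    # backslash + end of line: continuation
            i += 1
            if c == 0x0D and i < n and body[i] == 0x0A:
                i += 1
        else:                      # unknown escape: the backslash is dropped
            out.append(c)
            i += 1
    return bytes(out)

names = [b"cite.Kr07", b"cite.Li08", b"cite.M\\37403",
         b"cite.Mi00", b"cite.Mi84", b"cite.Mo04"]
# Sort on the decoded bytes while keeping the escaped form for output.
ordered = sorted(names, key=decode_pdf_literal)
print(ordered)
```

Using the decoded form only as the sort key leaves the strings that are written to the PDF file unchanged.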
Yours sincerely
Heiko
----------------------------------------------------------------------
Comment By: The Thanh Han (hanthethanh)
Date: 2009-04-05 22:55
Message:
From the beginning we chose not to parse the PDF strings of \pdf* commands. Fixing this now would require too big a change in the pdftex code. Sorry.
----------------------------------------------------------------------
Comment By: The Thanh Han (hanthethanh)
Date: 2008-05-27 16:48
Message:
OK, then this should be fixed as the spec says. BTW, it doesn't really solve the reported problem (ü will come after e.g. z).
----------------------------------------------------------------------
Comment By: Hans Hagen (hagen)
Date: 2008-05-27 16:03
Message:
the spec says:
The Names entries in the leaf (or root) nodes contain the
tree’s keys and their associated values, arranged in
key-value pairs and sorted lexically in ascending order by
key. Shorter keys appear before longer ones beginning with
the same byte sequence. The encoding of the keys is
immaterial as long as it is self-consistent; keys are
compared for equality on a simple byte-by-byte basis.
so, it's just bytes
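
The rule quoted above can be written down directly. A minimal sketch, with hypothetical function names, that compares keys byte by byte and places a shorter key before a longer one sharing the same prefix:

```python
from functools import cmp_to_key

def key_less(a: bytes, b: bytes) -> bool:
    """True if key a sorts strictly before key b, comparing plain bytes."""
    for x, y in zip(a, b):
        if x != y:
            return x < y
    # All shared positions are equal: the shorter key comes first.
    return len(a) < len(b)

def key_cmp(a: bytes, b: bytes) -> int:
    """Three-way comparison built on key_less, usable for sorting."""
    if key_less(a, b):
        return -1
    if key_less(b, a):
        return 1
    return 0

keys = [b"cite.Mz99", b"cite.M\xfc03", b"cite.Mo04", b"cite.Mo"]
ordered = sorted(keys, key=cmp_to_key(key_cmp))
print(ordered)
```

No character encoding enters the comparison, so the byte 0xFC simply sorts after "z" (0x7A).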
----------------------------------------------------------------------
Comment By: The Thanh Han (hanthethanh)
Date: 2008-05-27 15:43
Message:
This is a problem, but it's not clear to me how it should be fixed. Assume that we translate "\374" to the corresponding charcode (0374 = 252) before sorting; how do we then decide the lexicographic order of that char? It can be interpreted as ü, but also as e.g. ytilde or something different, depending on the encoding we use. Comments?
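
The encoding dependence raised here can be made concrete with a small sketch; the two codecs are chosen only as examples and have nothing to do with pdftex itself:

```python
code = 0o374                  # octal 374, decimal 252, hex 0xFC
byte = bytes([code])

# The same byte stands for different characters in different encodings.
as_latin1 = byte.decode("latin-1")   # "ü" in ISO 8859-1
as_cp437 = byte.decode("cp437")      # a different character in code page 437

print(code, repr(as_latin1), repr(as_cp437))
```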
----------------------------------------------------------------------
You can respond by visiting:
http://sarovar.org/tracker/?func=detail&atid=493&aid=993&group_id=106