[NTG-context] Permissible characters in ConTeXt reference labels

Hans Hagen pragma at wxs.nl
Thu Sep 18 00:18:08 CEST 2014


On 9/18/2014 12:06 AM, Mark Szepieniec wrote:
> Bump...
>
> If it's not too much trouble, I would greatly appreciate some feedback
> on this before I propose it to be merged into pandoc; even a "looks good
> to me" from one of the ConTeXt gurus would be very helpful.
>
> Thanks in advance,
>
> Mark
>
> On Tue, Sep 9, 2014 at 12:20 AM, Mark Szepieniec <mszepien at gmail.com>
> wrote:
>
>     I'm trying to fix a problem in pandoc (see
>     https://github.com/jgm/pandoc/pull/1589) where it doesn't properly
>     sanitize the reference labels in ConTeXt output, causing errors
>     during compilation when a label contains '#' for example. Note that
>     this sanitizing is needed in addition to the regular backslash
>     escaping used for control characters: '\#' is still illegal in a
>     label for example.
>
>     In the sanitizer function I'm writing, I'd like to properly escape
>     all illegal characters, but I couldn't find an explicit list of
>     allowed or illegal characters. Based on some testing I've conducted
>     (see attached file), I've arrived at the following set:
>
>     \#[]",{}%()|=

it depends on where these characters end up

#  : always tricky as it denotes an argument, so escape
[] : depends if it gets fed into a macro that uses [] as delimiters
{} : only an issue when not balanced
%  : escaping needed as it starts a comment otherwise
() : depends on where it ends up, like []
|  : is special in ConTeXt so needs escaping
\  : of course that one needs escaping
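
A rough way to encode the list above in a checker is sketched below. The function and constant names (`classify_label`, `ALWAYS_RISKY`, `DELIMITER_DEPENDENT`) are made up for illustration and are not part of pandoc or ConTeXt; the grouping simply mirrors the categories in this reply.

```python
# Illustrative only: categories follow the list in this reply.
ALWAYS_RISKY = set("#%|\\")         # argument marker, comment char, ConTeXt special, escape char
DELIMITER_DEPENDENT = set("[]()")   # a problem only inside macros delimited by them


def braces_balanced(label: str) -> bool:
    """Return True if every '{' has a matching later '}'."""
    depth = 0
    for ch in label:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:           # a '}' appeared before its '{'
                return False
    return depth == 0


def classify_label(label: str) -> dict:
    """Report which characters in a label fall into which category."""
    chars = set(label)
    return {
        "always_risky": sorted(chars & ALWAYS_RISKY),
        "delimiter_dependent": sorted(chars & DELIMITER_DEPENDENT),
        "braces_balanced": braces_balanced(label),
    }
```

For example, `classify_label("sec#1[a]{b")` flags `#` as always risky, `[` and `]` as delimiter dependent, and reports the braces as unbalanced.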

>     1) Does this look like a reasonable set? Are there other characters
>     or sequences that should be included, or are worth testing?

keep in mind that escapes should end up unescaped at some point
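
One possible way to sidestep the unescaping question, sketched here as an assumption rather than a recommendation, is to avoid TeX escapes in labels altogether and replace each problematic character with a plain alphanumeric code. The `ux` prefix plus hex code point is a hypothetical scheme, and `sanitize_label` is an illustrative name; the character set is the one proposed in the quoted message.

```python
# The set proposed in the quoted message: \ # [ ] " , { } % ( ) | =
UNSAFE = set('\\#[]",{}%()|=')


def sanitize_label(label: str) -> str:
    """Replace each unsafe character with 'ux' + its hex code point.

    The result contains no TeX-special characters from UNSAFE, so nothing
    has to be unescaped later, provided the same function is applied both
    where the label is defined and where it is referenced.
    """
    return "".join(f"ux{ord(ch):x}" if ch in UNSAFE else ch for ch in label)
```

Note that this mapping is not reversible in general: a label that already contains the literal text `ux23` would collide with the encoding of `#`. Whether that matters depends on how the labels are generated.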

>     2) I was told (see
>     https://groups.google.com/forum/#!topic/pandoc-discuss/tYpXMUkmbEY)
>     that if the characters " and , didn't work, it would count as a
>     ConTeXt bug, is there any truth to that? Please let me know if any
>     further info is needed on my part.

well, define bug ... one can say the same of < and > in xml -)

if the result ends up in a comma separated list then , can be an issue 
but one can always wrap an argument in {} to hide that
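
The effect of the braces can be modelled roughly as follows. This is a simplified Python stand-in for a comma-list parser that respects brace groups, not ConTeXt's actual implementation, and `split_top_level` is an illustrative name.

```python
def split_top_level(text: str) -> list:
    """Split on commas that are not inside a {...} group."""
    items, current, depth = [], [], 0
    for ch in text:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == "," and depth == 0:
            # A comma outside any brace group ends the current item.
            items.append("".join(current))
            current = []
        else:
            current.append(ch)
    items.append("".join(current))
    return items
```

In this model `split_top_level("a,b,c,d")` yields four items, while `split_top_level("a,{b,c},d")` yields three, with `{b,c}` kept together.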

>     3) Does anyone see issues with this general approach? I'm relatively
>     new to ConTeXt, so I might be missing either a huge problem, or an
>     obviously easier way to do this.

i don't know ... i never used pandoc input

Hans

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------