# [NTG-context] Permissible characters in ConTeXt reference labels

Thu Sep 18 04:26:47 CEST 2014

On Thu, 18 Sep 2014, Hans Hagen wrote:

> On 9/18/2014 12:06 AM, Mark Szepieniec wrote:
>> Bump...
>>
>> If it's not too much trouble, I would greatly appreciate some feedback
>> on this before I propose it to be merged into pandoc; even a "looks good
>> to me" from one of the ConTeXt gurus would be very helpful.
>>
>>
>> Mark
>>
>> On Tue, Sep 9, 2014 at 12:20 AM, Mark Szepieniec <mszepien at gmail.com> wrote:
>>
>>     I'm trying to fix a problem in pandoc (see
>>     https://github.com/jgm/pandoc/pull/1589) where it doesn't properly
>>     sanitize the reference labels in ConTeXt output, causing errors
>>     during compilation when a label contains '#' for example. Note that
>>     this sanitizing is needed in addition to the regular backslash
>>     escaping used for control characters: '\#' is still illegal in a
>>     label for example.

(LaTeX label) = (ConTeXt reference). What Mark meant was references such as

\section[...]{...} or \startplacefigure[reference={...}].

>>     In the sanitizer function I'm writing, I'd like to properly escape
>>     all illegal characters, but I couldn't find an explicit list of
>>     allowed or illegal characters. Based on some testing I've conducted
>>     (see attached file), I've arrived at the following set:
>>
>>     \#[]",{}%()|=
>
> it depends on where these characters end up in
>
> #  : always tricky as it denotes an argument, so escape
> [] : depends if it gets fed into a macro that uses [] as delimiters
> {} : only an issue when not balanced
> %  : escaping needed as it's comment otherwise
> () : depends on where it ends up, like []
> |  : is special in context so needs escaping
> \  : of course that one needs escaping
>
>>     1) Does this look like a reasonable set? Are there other characters
>>     or sequences that should be included, or are worth testing?
>
> keep in mind that escapes should end up unescaped at some point
>
>>     2) I was told (see
>>     that if the characters " and , didn't work, it would count as a
>>     ConTeXt bug, is there any truth to that? Please let me know if any
>>     further info is needed on my part.
>
> well, define bug ... one can say the same of < and > in xml -)

Since I made that comment on the pandoc mailing list, let me explain.

Consider:

\section["some" reference]{Title}

Given how " behaves elsewhere in ConTeXt, a user would expect the above to
be valid input. If it is not, then it is a bug (or at least surprising).

The same goes for

\section[some, reference]{Title}

> if the result ends up in a comma separated list then , can be an issue but
> one can always wrap an argument in {} to hide that
>
>>     3) Does anyone see issues with this general approach? I'm relatively
>>     new to ConTeXt, so I might be missing either a huge problem, or an
>>     obviously easier way to do this.
>
> i don't know ... i never used pandoc input
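
For the sanitizer itself, one possible approach is to replace each character
from Mark's set with a short alphanumeric token built from its hexadecimal
code point, so that nothing special ever reaches the label. The sketch below
is only an illustration under assumptions: the function name sanitize_label
and the "ux" prefix are made up for this example, and the character set is
simply the one Mark arrived at by testing, not an authoritative list.

```python
# Characters that Mark's testing flagged as unsafe inside a ConTeXt
# reference label (taken verbatim from his message, not an official list).
UNSAFE = set('\\#[]",{}%()|=')


def sanitize_label(label: str, prefix: str = "ux") -> str:
    """Replace each unsafe character with prefix + its hex code point.

    Both the function name and the default prefix are hypothetical
    choices made for this illustration.
    """
    return "".join(
        f"{prefix}{ord(ch):x}" if ch in UNSAFE else ch
        for ch in label
    )


if __name__ == "__main__":
    for raw in ["sec#1", "some, reference", "plain-label"]:
        print(repr(raw), "->", repr(sanitize_label(raw)))
```

Since the same function would be applied both where the reference is defined
and where it is used, the two sides still match, and no unescaping is needed
later. The mapping is not strictly one-to-one, however: a label that already
contains a literal sequence such as "ux23" would collide with the encoded
form of "#".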