[NTG-context] String substitution using regular expressions and backreferences

Thangalin thangalin at gmail.com
Mon Aug 1 21:58:53 CEST 2022


Hi list,

I'm looking to perform text replacements using regular expressions with
backreferences, along the lines of:

\definereplacement[SubstPostmeridian][
  match={[Pp][.][Mm][.]},
  replace={\cap{pm}}
]

The \replaceword command doesn't handle periods well. The translate module
doesn't seem flexible enough to cover edge cases. Consider the following
example document containing both sample inputs and sample outputs:

\starttext
  {\bf Markdown Input}

  Our grandmother clock rang 11 p.m. and we fled.

  Our grandmother clock rang 11 p.m., so we fled.

  Our grandmother clock rang 11 p.m. We fled.

  \blank[big]

  {\bf \ConTeXt{} Output}

  Our grandmother clock rang 11 \cap{pm} and we fled.

  Our grandmother clock rang 11 \cap{pm}, so we fled.

  Our grandmother clock rang 11 \cap{pm}. We fled.
\stoptext

It would be most convenient to write:

% Strip periods from p.m.
\definereplacement[SubstPostmeridianLowercase][
  match={[Pp][.][Mm][.] ([^[:upper:]])},
  replace={\cap{pm} \1}
]

% Preserve terminal period for p.m. (e.e. cummings notwithstanding)
\definereplacement[SubstPostmeridianTerminal][
  match={[Pp][.][Mm][.] ([[:upper:]])},
  replace={\cap{pm}. \1}
]
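
To pin down the behaviour I'm after for the two p.m. rules, here is a rough
sketch in Python rather than ConTeXt. It only illustrates the matching logic
and is not an existing ConTeXt feature. The function name replace_pm is made
up, [A-Z] stands in for [[:upper:]], and I've swapped the capture group for a
lookahead so that the "p.m.," case is covered as well:

```python
import re

# Plain-text stand-in for the ConTeXt markup that should be emitted.
PM = r"\cap{pm}"

# "p.m." followed by a space and an uppercase letter ends a sentence,
# so one period has to survive.
TERMINAL = re.compile(r"[Pp]\.[Mm]\.(?= [A-Z])")

# Every other "p.m." loses both periods.
INLINE = re.compile(r"[Pp]\.[Mm]\.")


def replace_pm(text: str) -> str:
    """Hypothetical helper: apply the terminal rule first, then the general one."""
    # Callables are used as replacements so that the backslash in PM is
    # inserted literally instead of being parsed as a template escape.
    text = TERMINAL.sub(lambda m: PM + ".", text)
    return INLINE.sub(lambda m: PM, text)
```

Rule order matters here: the terminal rule has to run first, otherwise the
general rule would swallow the sentence-ending period.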

% Apply a macron for lowercase 'c' (McAnulty, McGenius, etc.)
% Well, not quite a macron: https://tex.stackexchange.com/q/364024/2148
\definereplacement[SubstMac][
  match={Mc([[:upper:]]\w)},
  replace={M\macronbelow{c}\1}
]
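
The same kind of sketch for the Mc rule, again in Python and again only to
show the intended effect of the backreference. The name replace_mc is
hypothetical and [A-Z] stands in for [[:upper:]]:

```python
import re

# "Mc" followed by an uppercase letter and one more word character.
MC = re.compile(r"Mc([A-Z]\w)")


def replace_mc(text: str) -> str:
    """Hypothetical helper: wrap the 'c' and reinsert the captured letters."""
    # m.group(1) plays the role of \1 in the proposed replace={...} key.
    return MC.sub(lambda m: r"M\macronbelow{c}" + m.group(1), text)
```

Names that merely start with "Mc" followed by a lowercase letter are left
untouched.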

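The \1 backreference may be problematic, since TeX reads a backslash followed
by a digit as a control symbol. Other sigils include $1 and #1, which may also
have issues: $ toggles math mode and # marks macro parameters.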

Thank you!