------ Original Message ------
From "Hans Hagen via ntg-context" <ntg-context@ntg.nl>
To "ntg-context@ntg.nl" <ntg-context@ntg.nl>
Cc "Hans Hagen" <j.hagen@freedom.nl>
Date 9/22/2023 3:53:03 AM
Subject [NTG-context] Re: Toggling the symbol for the zero-width joiner and related Unicode control characters
1. Can this approach be generalized to get what we want, viz., a way to toggle the symbols?
 
given the inconsistency in what is or is not in a font the only way out
is to have our own visualization (consistent across fonts) and even then
it would add some mess because we're talking of a mix of characters that
can have gone (as part of rendering) or are not characters at all but
spacing
 
so, in that case only 'verbatim' is a candidate for visualization, not
so much typeset text

Hm, ok. Since almfixed is based on Knuth's mono, perhaps its visuals of the control characters can be extracted and used as fallback symbols. 

Yes: For typeset text/printing, visualization is generally unnecessary (the point of this thread).

2. \enabletrackers[typesetters.nbsp] gives a colored box, which is at least something.. But how can we get the NBSP symbol that's already in the font?
 
it's gone by that time ... the line break mechanism uses glue, not
characters

Ok

3. Ideally:
a. we want all Unicode control symbols to show up in verbatim or in \typebuffer (as in a text editor);
 
only there (with some non interfering rendering i guess) and even then
it's probably an additional pass over the node list

Ok, that would be good.

b. we want all Unicode control symbols to be suppressed in final pdf output (for, e.g., printing).
 
they basically are unless some font feature keeps them around which is
out of our control

If the symbols are in the font, then they are not suppressed. See below.
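
To make the two wishes in 3a and 3b concrete, here is a minimal sketch in Python (not ConTeXt) of the toggle as a pure text model. Everything in it is an assumption made for illustration: the function name `display`, the bracketed labels, and the choice of characters are hypothetical and do not correspond to any existing ConTeXt interface. It models only what would be shown, not the effect a control character has on shaping, which a real solution would have to preserve.

```python
import unicodedata

# Illustrative labels only; neither the names nor the selection of
# characters comes from ConTeXt or from any particular font.
SHOWN = {
    "\u00A0": "[NBSP]",  # no-break space
    "\u200C": "[ZWNJ]",  # zero-width non-joiner
    "\u200D": "[ZWJ]",   # zero-width joiner
    "\u200E": "[LRM]",   # left-to-right mark
    "\u200F": "[RLM]",   # right-to-left mark
}

# What remains visible when the symbols are switched off.
# NBSP still occupies a space; the others leave nothing behind.
HIDDEN = {"\u00A0": " "}


def is_control(ch: str) -> bool:
    """True for the listed characters and any other Unicode format character (Cf)."""
    return ch in SHOWN or unicodedata.category(ch) == "Cf"


def display(text: str, show_controls: bool) -> str:
    """Return the visible form of text with control symbols toggled on or off."""
    out = []
    for ch in text:
        if not is_control(ch):
            out.append(ch)
        elif show_controls:
            # Verbatim / editor-like view (3a): always show a symbol.
            out.append(SHOWN.get(ch, f"[U+{ord(ch):04X}]"))
        else:
            # Final output (3b): no symbol, whatever the font contains.
            out.append(HIDDEN.get(ch, ""))
    return "".join(out)
```

With `show_controls=True` this behaves like the verbatim or \typebuffer case; with `show_controls=False` it behaves like the final pdf case, where the symbol is dropped regardless of the font.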

But some fonts meant for printing have symbols for Unicode control chars -- that poses a challenge.
 
so an inconsistent mess not worth wasting time on (as this is hobbyism
only fun can be a motivational factor)

But there is a certain consistency -- see below.

And some fonts meant for verbatim/editing do not have symbols for the control chars -- that also poses a challenge. AlmFixed, of course, has them.
 
Most minimally decent Arabic fonts have symbols for the Unicode control chars as default, including Scheherazade, Amiri, Uthmanic, and Noto Naskh Arabic -- all free fonts.
 
Industry workhorses like Linotype Lotus (Arabic) also have them.
 
i'm not interested in those .. can't afford them for playing around
purposes .. we only look into commercial fonts if we get a dozen
unrestricted copies for context developers

Except for Linotype Lotus, each of the Arabic-script fonts mentioned above is free, not commercial -)

(There is also a free version of Lotus -- it also has the symbolic rendering of the control chars.)

Uniscribe applications like Notepad/Word allow for toggling in a WYSIWYG context -- can't speak for HarfBuzz -- so there is no harm in having explicit symbols in the font.
 
sure, as long as there is no rendering ... they show the input

But therein lies the problem: ConTeXt shows the rendering by default, and we need to turn it off. Since most non-Latin typography targets Uniscribe applications, which allow for toggling, the font developers (commercial or free) don't have to concern themselves with this issue.

Yet another curse of the WYSIWYG paradigm, which mixes form and content -)

The upshot is that, for non-Latin scripts, some toggling capability in ConTeXt is important to have -- even inescapable for Arabic-script publishing.
 
a bit subjective arguing -)

Not really -) This brings us to the point of consistency: For Arabic-script fonts, hard symbolic rendering of the Unicode control characters is the rule, not the exception. So not "an inconsistent mess" -- at least not as far as Arabic-script typography is concerned.

(Yes, for the upcoming Husayni I can add a font feature that does the trick, but that will be an exception to the rule.)

Perhaps others who use Arabic-script or Indic, etc., can chime in.. Am hopeful that we can figure something out!

sure, but not with 'instant priority' (unless it is some project)

My immediate project (not Husayni) is a book that features an English translation of an Arabic text (hence the interest in the recent streams thread). Using some Unicode control characters will be unavoidable to get the rendering effects correct, but the symbols will need to be suppressed.

Am thinking/hoping that a ConTeXt-specific font feature can do the trick. Since there appears to be consistency across Arabic fonts in this matter, it should not be messy at all: simply a fallback that sends the symbols to some no-man's land.

(A thought: Some of the code you kindly provided for transliteration might be reusable as well.. But a general solution for all ConTeXt users would be ideal.)

In any case, many thanks for your help in thinking this through.

Best wishes
Idris
--
Idris Samawi Hamid, Professor
Department of Philosophy
Colorado State University
Fort Collins, CO 80523