\unit parser ignoring case, in some cases.
Hello list,

I continue to work with the \unit command, and found some very surprising behavior. When I try to register units with capital-letter names, it breaks lowercase metric prefixes. For example, registering C=coulomb, K=kelvin, and N=newton breaks the metric units cm, kg, and ns. The file below demonstrates the behavior.

Obviously, I can use full names in my document. However, I would like the standard SI symbols, which are single capital letters, to work for me and my less TeX-savvy collaborators.

Thanks!
Gavin

\starttext

Units does not completely ignore case.
\startformula
  \unit{3 meter} \qquad \unit{6 Meter} \qquad \unit{3 mEtEr} \qquad
\stopformula

Units with lowercase prefixes (c, k, n).
\startformula
  \unit{3cm} \qquad \unit{6kg} \qquad \unit{3ns} \qquad
\stopformula

Units with capital letters, called with names (coulomb, kelvin, newton).
\startformula
  \unit{3 coulomb} \qquad \unit{6 kelvin} \qquad \unit{3 newton} \qquad
\stopformula

Units with capital letters, called with the capital letter (C, K, N) fail.
\startformula
  \unit{3 C} \qquad \unit{6 K} \qquad \unit{3 N} \qquad
\stopformula

Now I register some units with capital letter names: C=coulomb, K=kelvin, N=newton.
\registerunit[unit][ C=coulomb, K=kelvin, N=newton]

Units called by capital letter (C, K, N) now work.
\startformula
  \unit{3 C} \qquad \unit{6 K} \qquad \unit{3 N} \qquad
\stopformula

However, units with lowercase prefixes (c, k, n) are broken.
\startformula
  \unit{3cm} \qquad \unit{6kg} \qquad \unit{3ns} \qquad
\stopformula

\stoptext
On 3/9/2023 2:04 PM, Gavin via ntg-context wrote:
\startformula \unit{3 meter} \qquad \unit{6 Meter} \qquad \unit{3 mEtEr} \qquad \stopformula
Units with lowercase prefixes (c, k, n). \startformula \unit{3cm} \qquad \unit{6kg} \qquad \unit{3ns} \qquad \stopformula
Units with capital letters, called with names (coulomb, kelvin, newton). \startformula \unit{3 coulomb} \qquad \unit{6 kelvin} \qquad \unit{3 newton} \qquad \stopformula
you can look at phys-dim and see plenty of short and long keys and making all case insensitive is asking for troubles

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------
On Mar 13, 2023, at 3:44 PM, Hans Hagen via ntg-context
wrote: On 3/9/2023 2:04 PM, Gavin via ntg-context wrote:
\startformula \unit{3 meter} \qquad \unit{6 Meter} \qquad \unit{3 mEtEr} \qquad \stopformula Units with lowercase prefixes (c, k, n). \startformula \unit{3cm} \qquad \unit{6kg} \qquad \unit{3ns} \qquad \stopformula Units with capital letters, called with names (coulomb, kelvin, newton). \startformula \unit{3 coulomb} \qquad \unit{6 kelvin} \qquad \unit{3 newton} \qquad \stopformula
you can look at phys-dim and see plenty of short and long keys and making all case insensitive is asking for troubles
Indeed, I would like to make NONE of them case insensitive. But currently, when I register an upper case key (C=coulomb) it messes up the lower case prefix (“cm” gets typeset as C•m). I was expecting the parser to distinguish between the “C” and “c”, but it doesn’t. Is that intended?

Gavin
On Mon, 13 Mar 2023 15:55:50 -0600
Gavin via ntg-context
you can look at phys-dim and see plenty of short and long keys and making all case insensitive is asking for troubles
Indeed, I would like to make NONE of them case insensitive. But currently, when I register an upper case key (C=coulomb) it messes up the lower case prefix (“cm” gets typeset as C•m). I was expecting the parser to distinguish between the “C” and “c”, but it doesn’t. Is that intended?
Indeed, \unit{} should allow (and presently does not) K, C, etc.

Alan
Hi Alan, Hans, and List,
On Mar 13, 2023, at 8:10 PM, Alan Braslau via ntg-context
wrote: On Mon, 13 Mar 2023 15:55:50 -0600 Gavin via ntg-context wrote: you can look at phys-dim and see plenty of short and long keys and making all case insensitive is asking for troubles
Indeed, I would like to make NONE of them case insensitive. But currently, when I register an upper case key (C=coulomb) it messes up the lower case prefix (“cm” gets typeset as C•m). I was expecting the parser to distinguish between the “C” and “c”, but it doesn’t. Is that intended?
Indeed, \unit{} should allow (and presently does not) K, C, etc.
I agree. I added the following lines to phys-dim.lua, following line 461:

    C = "coulomb",
    K = "kelvin",
    N = "newton",

This provided the desired capital shortcuts without compromising the lowercase prefixes. Hans, could we get those added to phys-dim.lua in the distribution? I would be happy to do a more comprehensive search for shortcuts to add, but those are the three I and my collaborators are using now.

Looking at why my \registerunit attempt failed, I found that when you register a unit, both your capitalization and an all-lowercase version are registered. Here is an example, where I register “ReTeM” but \unit{1 retem} also works.

\starttext
\registerunit[unit][ReTeM=myunit]
\setupunittext[myunit=reTeM]
\startformula
  \unit{1 ReTeM} = \unit{1 retem} \neq \unit{1 reteM}
\stopformula
\stoptext

The results are case sensitive, so \unit{1 reteM} does not work. The lowercase version is produced for all “long” units, but not for shortcuts. (See phys-dim.lua, lines 766-771, where the Lua string function “lower” is used.) Perhaps we could use a \registershortcut command that does not get the “lower” treatment. I will look into it some more.

Thanks!
Gavin

P.S. I think there is a spelling error in phys-dim.lua, lines 974-981.

    local mapping = {
        prefix   = "prefixes",
        unit     = "units",
        operator = "operators",
        suffixe  = "suffixes",
        symbol   = "symbols",
        packaged = "packaged",
    }

The key “suffixe” should probably be “suffix”.
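A minimal sketch of the effect described above, written as a toy model and not taken from phys-dim.lua. The tables `prefixes` and `units`, the functions `register` and `classify_first`, and the rule that unit keys are tried before prefixes are all assumptions made for illustration. It only shows how storing an extra all-lowercase key can shadow a lowercase prefix.

```lua
-- Toy model only: this is NOT the code in phys-dim.lua.
local prefixes = { c = "centi", k = "kilo", n = "nano" }
local units    = { m = "meter", g = "gram", s = "second" }

-- Hypothetical registration that stores the key as given and,
-- like the behaviour reported for "long" units, an all-lowercase copy.
local function register(key, name)
    units[key] = name
    units[string.lower(key)] = name
end

-- Classify the first letter of an input such as "cm".
-- Assumption: unit keys are tried before prefixes.
local function classify_first(input)
    local ch = input:sub(1, 1)
    if units[ch] then
        return "unit " .. units[ch]
    end
    if prefixes[ch] then
        return "prefix " .. prefixes[ch]
    end
    return "unknown"
end

print(classify_first("cm"))  --> prefix centi
register("C", "coulomb")
print(classify_first("cm"))  --> unit coulomb
```

In this model, a registration path that skips the `string.lower` copy would leave the `c` prefix untouched, which is the idea behind a separate shortcut registration.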
On 3/14/2023 5:33 PM, Gavin via ntg-context wrote:
Hi Alan, Hans, and List,
On Mar 13, 2023, at 8:10 PM, Alan Braslau via ntg-context
wrote: On Mon, 13 Mar 2023 15:55:50 -0600 Gavin via ntg-context wrote: you can look at phys-dim and see plenty of short and long keys and making all case insensitive is asking for troubles
Indeed, I would like to make NONE of them case insensitive. But currently, when I register an upper case key (C=coulomb) it messes up the lower case prefix (“cm” gets typeset as C•m). I was expecting the parser to distinguish between the “C” and “c”, but it doesn’t. Is that intended?
Indeed, \unit{} should allow (and presently does not) K, C, etc.
I agree. I added the following lines to phys-dim.lua, following line 461
C = "coulomb", K = "kelvin", N = "newton",
This provided the desired capital shortcuts without compromising the lowercase prefixes. Hans, could we get those added to phys-dim.lua in the distribution? I would be happy to do a more comprehensive search for shortcuts to add, but those are the three I and my collaborators are using now.
maybe, when there are no conflicts
Looking at why my \registerunit attempt failed, I found that when you register a unit, both your capitalization, and an all lowercase version are registered. Here is an example, where I register “ReTeM” but \unit{1 retem} also works.
\starttext
\registerunit[unit][ReTeM=myunit] \setupunittext[myunit=reTeM]
\startformula \unit{1 ReTeM} = \unit{1 retem} \neq \unit{1 reteM} \stopformula
\stoptext
The results are case sensitive, so \unit{1 reteM} does not work. The lowercase version is produced for all “long” units, but not for shortcuts. (See phys-dim.lua, lines 766-771, where the Lua string function “lower” is used.) Perhaps we could use a \registershortcut command that does not get the “lower” treatment. I will look into it some more.
see previous mail, i already added that but no upload yet
P.S. I think there is a spelling error in phys-dim.lua, lines 974-981.
local mapping = { prefix = "prefixes", unit = "units", operator = "operators", suffixe = "suffixes", symbol = "symbols", packaged = "packaged", }
The key “suffixe” should probably be “suffix”.

indeed, i noticed that when extending
Hans
On 3/13/2023 10:55 PM, Gavin wrote:
On Mar 13, 2023, at 3:44 PM, Hans Hagen via ntg-context
wrote: On 3/9/2023 2:04 PM, Gavin via ntg-context wrote:
\startformula \unit{3 meter} \qquad \unit{6 Meter} \qquad \unit{3 mEtEr} \qquad \stopformula Units with lowercase prefixes (c, k, n). \startformula \unit{3cm} \qquad \unit{6kg} \qquad \unit{3ns} \qquad \stopformula Units with capital letters, called with names (coulomb, kelvin, newton). \startformula \unit{3 coulomb} \qquad \unit{6 kelvin} \qquad \unit{3 newton} \qquad \stopformula
you can look at phys-dim and see plenty of short and long keys and making all case insensitive is asking for troubles
Indeed, I would like to make NONE of them case insensitive. But currently, when I register an upper case key (C=coulomb) it messes up the lower case prefix (“cm” gets typeset as C•m). I was expecting the parser to distinguish between the “C” and “c”, but it doesn’t. Is that intended?
I added an option and an extra register command, but it's up to you to decide how to use it (and how to deal with conflicts in definitions).

\registerunit
  [unit]
  [Point=PT,
   point=pt,
   Basepoint=BP,
 % basepoint=bp,
  ]

\registerunitshortcut
  [unit]
  [C=coulomb]

\startlines
10 \unit {square meter per second}
10 \unit {square Meter per Second}
10 \unit {point}
10 \unit {Point}
10 \unit {basepoint}
10 \unit {Basepoint}
10 \unit {C}
\stoplines

\setupunit[unit][option=keep]

\startlines
10 \unit {square meter per second}
10 \unit {square Meter per Second}
10 \unit {point}
10 \unit {Point}
10 \unit {basepoint}
10 \unit {Basepoint}
10 \unit {C}
\stoplines

Hans Hagen
On Tue, 14 Mar 2023 15:36:37 +0100
Hans Hagen via ntg-context
Indeed, I would like to make NONE of them case insensitive. But currently, when I register an upper case key (C=coulomb) it messes up the lower case prefix (“cm” gets typeset as C•m). I was expecting the parser to distinguish between the “C” and “c”, but it doesn’t. Is that intended?

I added an option and an extra register command, but it's up to you to decide how to use it (and how to deal with conflicts in definitions).
There should not be conflicts, for, formally,

  c should be 1/100
  C should be Coulomb
  k should be 1000
  K should be Kelvin
  n should be 10^{-9}
  N should be Newton
  m should be meter
  M should be 10^6 (but m also means 10^{-3})
  etc.

The problems arise as \unit{} presently accepts

  Kelvin and kelvin
  Newton and newton
  Coulomb and coulomb
  Watt and watt
  etc.

Also, mm could be millimeters, or it could be m•m (m^2). Right now, \unit{1 mm-1} and \unit{1 m m-1} give the same result: inverse millimeters (whereas the second should be m•m^{-1}, also known as radians! ;-)

I suggest that it be limited to formal (and well-defined) unit names, respecting casing.

I also suggest that unrecognized units both give an error message in stdout and in the log file and show up in the output (as {\tt <K>}, to be coherent with other subsystems), rather than simply being ignored.

Alan
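The case-sensitive reading proposed above can be sketched in a few lines of Lua. This is a hypothetical strict reader, not the \unit parser: the tables and the functions `resolve` and `parse` are invented for illustration, only one-letter prefixes are handled, and exponents such as `-1` are left out. Spaces separate factors, and unknown tokens stay visible in angle brackets, in the spirit of the `<K>` suggestion.

```lua
-- Hypothetical strict reader; names and tables are illustrative
-- and are not taken from phys-dim.lua.
local prefixes = { c = "centi", k = "kilo", n = "nano", m = "milli", M = "mega" }
local units = {
    m = "meter", g = "gram", s = "second",
    C = "coulomb", K = "kelvin", N = "newton",
}

-- Resolve one token (no spaces): either a bare unit symbol or a
-- one-letter prefix followed by a unit symbol. Case is respected.
local function resolve(token)
    if units[token] then
        return units[token]
    end
    local prefix = prefixes[token:sub(1, 1)]
    local unit   = units[token:sub(2)]
    if prefix and unit then
        return prefix .. " " .. unit
    end
    return "<" .. token .. ">" -- keep unrecognized input visible
end

-- Spaces separate factors, so "m s" and "ms" are different inputs.
local function parse(input)
    local factors = { }
    for token in input:gmatch("%S+") do
        factors[#factors + 1] = resolve(token)
    end
    return table.concat(factors, " * ")
end

print(parse("ms"))     --> milli second
print(parse("m s"))    --> meter * second
print(parse("cm"))     --> centi meter
print(parse("C m"))    --> coulomb * meter
print(parse("Meter"))  --> <Meter>
```

Because a whole token is tried as a unit before it is split, `m` alone reads as meter while `mm` reads as milli meter, and `m m` stays a product of two meters.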
On Mar 14, 2023, at 10:08 AM, Alan Braslau via ntg-context
Right now, \unit{1 mm-1} and \unit{1 m m-1} give the same result: inverse millimeters (whereas the second should be m•m^{-1}…)
Alan
Alan,

I’d like to better understand how the \unit{} command works and why those choices were made. Some of the choices seem to be “asking for troubles,” but perhaps they are essential for some users.

I’m happy to have the unit command accept a variety of different forms for the unit, but I’d really like one of the acceptable forms to be the form prescribed by Le Système international d'unités, so that “m s” is a meter second and “ms” is a millisecond. However, I’m not sure if this goal conflicts with other important goals.

Would you like to explore \unit{} this summer to see if we can find a consistent solution? Perhaps we can produce a plan for \unit{} that does not conflict with other \unit{} features, or perhaps we can make a module that lacks some of the features of \unit{} but conforms to the SI for input as well as output. I would be happy with either.

I’d also like to work on the luagraph module this summer. I’m getting a lot better at MetaPost programming!

I can’t approach either issue in a comprehensive way until the summer, because I have a lot of content to produce for our physics class. This year we kept on schedule (for the first time ever!) and that means we will be studying a couple of topics that I haven’t prepared yet.

Obviously, anyone else interested in \unit{} or luagraph would be welcome to join us, either remotely or here in sunny Fort Collins, Colorado. (School ends here on May 26.)

Gavin
On Tue, 14 Mar 2023 12:03:23 -0600
Gavin
I’d really like one of the acceptable forms to be the form prescribed by Le Système international d'unités
I believe that this point is essential, regardless of history of use of the \unit{} command. Non-standard use of units can be *tolerated* as long as they do not conflict with the SI and do not impose non-standard syntax.

Alan

P.S. It is our guarded secret that the weather is nice here in Colorado. I do not know of any other place that has real seasons AND where it is (almost) always sunny! :-)
On 3/14/2023 7:14 PM, Alan Braslau via ntg-context wrote:
On Tue, 14 Mar 2023 12:03:23 -0600 Gavin
wrote: I’d really like one of the acceptable forms to be the form prescribed by Le Système international d'unités
I believe that this point is essential, regardless of history of use of the \unit{} command. Non-standard use of units can be *tolerated* as long as they do not conflict with the SI and do not impose non-standard syntax.
Alan
P.S. It is our guarded secret that the weather is nice here in Colorado. I do not know of any other place that has real seasons AND where it is (almost) always sunny! :-)

Here it fluctuates from zero to 15 (and behind the single pane glass in the office room with sun on it then 18 or more). Now of course, given units, you have to guess how much that is because you're with your French foot in Celsius, and the English one in Fahrenheit and with both feet in Kelvin (for your book).

Maybe the French title is one of the reasons for the USA not picking up on these units? (So let me threaten once again to kick the "in" unit out of context.)

Hans
On Mar 14, 2023, at 3:32 PM, Hans Hagen via ntg-context
On 3/14/2023 7:14 PM, Alan Braslau via ntg-context wrote:
On Tue, 14 Mar 2023 12:03:23 -0600 Gavin
wrote: I’d really like one of the acceptable forms to be the form prescribed by Le Système international d'unités
Maybe the french title is one of the reasons for the USA not picking up on these units? (So let me threaten once again to kick the "in" unit out of context.)
I believe that this point is essential,….
Alan and I will come up with a scheme that fastidiously follows the SI, and release it as a “North American localization.” That should leave everyone sufficiently puzzled, annoyed, or amused.

Signing off of this subject until 27 May. Thanks for all your help!

Gavin
On 3/14/2023 7:03 PM, Gavin via ntg-context wrote:
I’d like to better understand how the \unit{} command works and why those choices were made. Some of the choices seem to be “asking for troubles,” but perhaps they are essential for some users.
I wonder if Alan was using context when the first unit module showed up in (what wasn't even called mkii) because it is one of the oldest context modules and we use(d) it for typesetting education related documents. Among the reasons for it was that in the pre-unicode times one had to compromise on a math / text mixture due to the way fonts and input were handled.
I’m happy to have the unit command accept a variety of different forms for the unit, but I’d really like one of the acceptable forms to be the form prescribed by Le Système international d'unités, so that “m s” is a meter second and “ms” is a millisecond. However, I’m not sure if this goal conflicts with other important goals.
Would you like to explore \unit{} this summer to see if we can find a consistent solution? Perhaps we can produce a plan for \unit{} that does not conflict with other \unit{} features, or perhaps we can make a module that lacks some of the features of \unit{} but conforms to the SI for input as well as output. I would be happy with either.
In principle one can think of different schemes (for different purposes even), after all everything is in tables; that is probably easier than trying to come up with some complex compromise. There can be instances of unit with different properties.

Hans
participants (3)
- Alan Braslau
- Gavin
- Hans Hagen