LuaTeX and Unicode Math
Hello,

While trying to convert some stuff from HTML to PDF (using LuaTeX) I have noticed some minor problems: Unicode math characters work OK in text mode (under the assumption that the font has them), but not in math mode. In pdfTeX they work OK in both cases. (That behaviour is expected, but not necessarily desired.)

Is there any cure for it?

% In LM, most characters are not present in the OTF fonts, so with LM this wouldn't work at all.
% Iwona is slightly better in that respect.
\beginOLDTEX
\enableregime[utf-8]
\usetypescript[iwona][ec]
\endOLDTEX
\beginNEWTEX
\usetypescript[iwona]
\endNEWTEX
\setupbodyfont[iwona]

\def\testA{(φ)} % 03c6
\def\testB{(·)} % 00b7
\def\testC{(≤)} % 2264

\starttext
\testA\testB\testC\crlf
$\testA\testB\testC$
\stoptext

I could/should solve the problem with better handling of XML entities or with fallbacks (if φ is not present in the font, use \phi etc.), but I don't know how to do either of them.

I would normally say something like \catcode`φ=\active \defφ{\phi} to solve the problem, but that doesn't work when doing XML conversion.

Thanks,
Mojca
Hi,
While trying to convert some stuff from HTML to PDF (using LuaTeX) I have noticed some minor problems: Unicode math characters work OK in text mode (under the assumption that the font has them), but not in math mode. In pdfTeX they work OK in both cases. (That behaviour is expected, but not necessarily desired.)
I know about this, and this is one of the main reasons that I am still using pdftex for day-to-day work. I have become used to typing most of my math in unicode, and now none of it works (with luatex). But I have been too busy to actually try to understand what is happening behind the scenes.
Is there any cure for it?
AFAIK, luatex has not touched the math handling of tex yet. There are some old ideas on how TeX's math support can be improved, and with luatex that is a real possibility. I am sure that at some stage, once some aspects of the lua functionality are more stable, this will be looked into. It makes sense to finalize mkiv support only after that.
I could/should solve the problem with better handling of xml entities or with fallbacks (if φ is not present in font, use \phi etc.), but I don't know how to do either of them.
I would normally say something like \catcode`φ=\active \defφ{\phi} to solve the problem, but that doesn't work when doing XML conversion.
Actually, for a unicode aware engine it should be the other way around
\def\phi{φ}
That is, all \definemathcommand definitions should be changed to something like
\definemathcharacter[φ][font:location]
and so on. But my guess is that this will need an overhaul of the math character encoding. Again, that is something which I do not understand at the moment.
Aditya
Aditya Mahajan wrote:
Hi,
While trying to convert some stuff from HTML to PDF (using LuaTeX) I have noticed some minor problems: Unicode math characters work OK in text mode (under the assumption that the font has them), but not in math mode. In pdfTeX they work OK in both cases. (That behaviour is expected, but not necessarily desired.)
I know about this, and this is one of the main reasons that I am still using pdftex for day-to-day work. I have become used to typing most of my math in unicode, and now none of it works (with luatex). But I have been too busy to actually try to understand what is happening behind the scenes.
we just need to initialize the mappings, and i had no time for that yet; when done, we can get rid of most existing math definitions
AFAIK, luatex has not touched the math handling of tex yet. There are some old ideas on how TeX's math support can be improved, and with luatex that is a real possibility. I am sure that at some stage, once some aspects of the lua functionality are more stable, this will be looked into. It makes sense to finalize mkiv support only after that.
sure, but adding definitions for unicode math is not that complex and should work ok
I would normally say something like \catcode`φ=\active \defφ{\phi} to solve the problem, but that doesn't work when doing XML conversion.
indeed, no option
Actually, for a unicode aware engine it should be the other way around
\def\phi{φ}
That is, all \definemathcommand definitions should be changed to something like
\definemathcharacter[φ][font:location]
well, we can start thinking of virtual fonts; on the other hand, a year from now we will have math in the tex gyre fonts so maybe it's not worth the effort
and so on. But my guess is that this will need an overhaul of the math character encoding. Again, that is something which I do not understand at the moment.
Hans
-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74
www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------
On 10/3/07, Hans Hagen wrote:
well, we can start thinking of virtual fonts; on the other hand, a year from now we will have math in the tex gyre fonts so maybe it's not worth the effort
Also Cambria/Cambria Math is obtainable, and the STIX fonts have been a month away for half a year now. Truly Unicode-aware math handling in TeX will be really nice.
--Joel
Hi Mojca,
Your email message uses the Simplified Chinese (GB2312) encoding, is that intentional?
Mojca Miklavec wrote:
Hello,
While trying to convert some stuff from HTML to PDF (using LuaTeX) I have noticed some minor problems: Unicode math characters work OK in text mode (under the assumption that the font has them), but not in math mode. In pdfTeX they work OK in both cases. (That behaviour is expected, but not necessarily desired.)
Is there any cure for it?
Definitions like these should work in luatex (and xetex):
\definemathcharacter [φ] [nothing] [lcgreek] ["1E]
That is not font-related, it is just input remapping based on \mathchardef; the same thing happens in traditional tex.
Best wishes, Taco
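For readers who have not worked with math codes, the remapping Taco describes can be sketched in plain TeX, run with LuaTeX so that the UTF-8 input is read as single characters. This is only an illustration, not ConTeXt's implementation: it assumes the Computer Modern math families that plain TeX loads by default, and it reuses the slot values quoted in this thread ("1E, "01, "14), which happen to match plain TeX's \phi, \cdot and \leq.

```tex
% Plain TeX sketch for LuaTeX (UTF-8 input); not the ConTeXt implementation.
% A math code "cfss packs class c, family f and font slot ss (hexadecimal).
\mathcode`φ="011E % class 0 (ordinary), family 1 (math italic), slot "1E, as plain \phi
\mathcode`·="2201 % class 2 (binary),   family 2 (symbols),     slot "01, as plain \cdot
\mathcode`≤="3214 % class 3 (relation), family 2 (symbols),     slot "14, as plain \leq
$(φ)(·)(≤)$ \quad $a · b ≤ φ$
\bye
```

Nothing here depends on the text font containing the characters, which is the sense in which the mapping is "not font-related": the input character is only a key that selects a slot in one of the math families.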
Hello Taco,
On 10/3/07, Taco Hoekwater wrote:
Your email message uses the Simplified Chinese (GB2312) encoding, is that intentional?
Emmm ... no. But I have no influence on encoding - there seems to be some "smart" algorithm behind gmail, which tries to guess which encoding to use. Usually it takes ascii or utf-8, but apparently it sometimes favors other encodings for some reason :(
Definitions like these should work in luatex (and xetex):
\definemathcharacter [φ] [nothing] [lcgreek] ["1E]
That is not font-related, it is just input remapping based on \mathchardef, the same thing happens in traditional tex.
Thanks a lot. It has made my day :)
However, your example worked OK, but
\definemathsymbol [≤] [rel] [sy] ["14]
\definemathsymbol [·] [bin] [sy] ["01]
didn't.
Thanks,
Mojca
(čšž - hopefully gmail will choose utf-8 now :)
Mojca Miklavec wrote:
\definemathsymbol [≤] [rel] [sy] ["14]
\definemathsymbol [·] [bin] [sy] ["01]
You need \definemathcharacter, otherwise you are setting the math equivalent of the control sequence \≤, not the character ≤.
It still doesn't work then, but that could be some problem with initialization of the math collections, i don't know. A bare
\mathcode `≤ = "3214
works fine, so it must be a context macro issue.
Best wishes, Taco
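The difference between the control sequence \≤ and the character ≤ can be seen in a small plain TeX sketch (again for LuaTeX with the plain format, assuming the default Computer Modern symbol family; the ConTeXt macros themselves are not used here). \mathchardef gives a meaning to the control sequence, while \mathcode attaches the same value to the bare character.

```tex
% Plain TeX sketch for LuaTeX; illustrates the distinction only.
\mathchardef\≤="3214 % defines the control sequence \≤ (backslash followed by ≤)
$a \≤ b$             % uses the control sequence
\mathcode`≤="3214    % assigns a math code to the character ≤ itself
$a ≤ b$              % uses the bare character
\bye
```

Both formulas typeset the same relation symbol, but only the second definition makes the character usable without a backslash, which is what "unicode math input" needs.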
On 10/3/07, Taco Hoekwater wrote:
Mojca Miklavec wrote:
\definemathsymbol [≤] [rel] [sy] ["14]
\definemathsymbol [·] [bin] [sy] ["01]
You need \definemathcharacter, otherwise you are setting the math equivalent of the control sequence \≤, not the character ≤.
Oh, I had overlooked that. Thanks a lot, it works now for me, I don't know why it fails on your side :)
It still doesn't work then, but that could be some problem with initialization of the math collections, i don't know.
A bare
\mathcode `≤ = "3214
works fine, so it must be a context macro issue.
we need to start thinking about unicode math support ... see char-mth for a starting point, i want to use that file for initializing math, once i know the rules
Oh, great! I haven't seen that file :) I will send you some feedback. Thanks to both, Mojca
On 10/3/07, Mojca Miklavec wrote:
Emmm ... no. But I have no influence on encoding - there seems to be some "smart" algorithm behind gmail, which tries to guess which encoding to use. Usually it takes ascii or utf-8, but apparently it sometimes favors other encodings for some reason :(
On the "General" tab in Gmail settings is an option to set "Outgoing message encoding" either to "default" (= autodetect, with sometimes amusing results) or to UTF-8. --Joel
On 10/3/07, Taco Hoekwater wrote:
Definitions like these should work in luatex (and xetex):
\definemathcharacter [φ] [nothing] [lcgreek] ["1E]
That is not font-related, it is just input remapping based on \mathchardef, the same thing happens in traditional tex.
What about \neq and \[l]dots? How can I get those working in "unicode math input"?
\definemathcharacter [≠] {\neq}
is probably not adapted to such definitions.
Thanks a lot,
Mojca
Mojca Miklavec wrote:
What about \neq and \[l]dots? How can I get those working in "unicode math input"?
In general, it is better not to do that (because it is slower and needs lots of control sequences), but if the font does not contain what you need, you have no choice, of course.
\definemathcharacter [≠] {\neq} is probably not adapted to such definitions.
You are right, it is not. But the currently ignored command
\definemathcharacter [≠] [\neq]
could be made to work easily enough. Here is an example of such an approach (the implementation is very ugly, I am just trying to demonstrate):
\let\mydodefinemathcharacter\dodefinemathcharacter
\def\dodefinemathcharacter[#1][#2][#3][#4][#5][#6]%
  {\iffourthargument
     % four arguments given: hand over to the original definition
     \mydodefinemathcharacter[#1][#2][#3][#4][#5][#6]%
   \else
     % two arguments given: define the active version of #1 as #2
     \begingroup
       \catcode`#1=\active
       \uccode`~=`#1
       \uppercase{\gdef~{#2}}%
     \endgroup
     % "8000 makes #1 behave like an active character in math mode
     \mathcode`#1="8000
   \fi}
Best wishes, Taco
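Stripped of the ConTeXt argument handling, the same fallback idea can be tried in plain TeX with LuaTeX. This is a sketch under the assumption that the plain format is used, where \neq is defined as \not=; it is the mechanism plain TeX already uses for the prime character.

```tex
% Plain TeX sketch for LuaTeX; a reduced form of the fallback idea above.
\begingroup
  \catcode`≠=\active % make ≠ active only inside this group
  \gdef≠{\neq}       % globally define the active ≠ as plain TeX's \neq
\endgroup
\mathcode`≠="8000    % in math mode, treat ≠ as if it were active
$a ≠ b$
\bye
```

Outside math mode the character keeps its ordinary category code, so text containing ≠ is not affected; the price, as noted above, is one macro expansion per use instead of a direct math code lookup.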
Mojca Miklavec wrote:
Hello,
While trying to convert some stuff from HTML to PDF (using LuaTeX) I have noticed some minor problems: Unicode math characters work OK in text mode (under the assumption that the font has them), but not in math mode. In pdfTeX they work OK in both cases. (That behaviour is expected, but not necessarily desired.)
Is there any cure for it?
we need to start thinking about unicode math support ... see char-mth for a starting point, i want to use that file for initializing math, once i know the rules
Hans
participants (6)
- Aditya Mahajan
- Arthur Reutenauer
- Hans Hagen
- Joel C. Salomon
- Mojca Miklavec
- Taco Hoekwater