
On 12/5/07, Hans Hagen wrote:
Mojca Miklavec wrote:
I compared tex-text and trep, tlig. Since you map features=default to trep and tlig, and both of them further to tex-text (twice), tex-text could be split in two parts as well, so that you can make one-to-one mapping.
there is a difference here between xetex and luatex
I know, that's why I'm volunteering to make XeTeX behave the same as LuaTeX (and asking for some opinions about it).
first i need to understand the problem;
In LuaTeX you define two "features":
- trep (3 replacements)
- tlig (8 ligatures)
In XeTeX, these 11 and 4 others are combined in "mapping=tex-text". And I would like the ligatures to be the same on both engines, so I would like to clone the behavior from LuaTeX in XeTeX (once you decide what the default should be - from your reply I understand that you plan to change the behaviour) and split "mapping=tex-text" into more separate "mappings".
actually i'm even thinking of not defaulting (in luatex) the mapping other than -- and --- because they are (1) not sensible and (2) users should use quotation commands and/or (3) use the proper utf codes
That's fine by me, I would agree. Only: what about apostrophe (')? (I'm, you're, ..)
1.) TREP
function fonts.initializers.base.otf.texquotes(tfm,value)
    tfm.characters[0x0022] = table.fastcopy(tfm.characters[0x201D])
    tfm.characters[0x0027] = table.fastcopy(tfm.characters[0x2019])
    tfm.characters[0x0060] = table.fastcopy(tfm.characters[0x2018])
end
Corresponding lines from XeTeX:
U+0022 >  U+201D ; " -> right double quote
U+0027 <> U+2019 ; ' -> right single quote
U+0060 <> U+2018 ; ` -> left single quote
these i hate most, and personally never use them ... if i key in a char explicitly i want that char and not another
I'm not sure about it, but what about the apostrophe in I'm, don't, ...? I don't care too much about " -> right double quote and ` -> left single quote. The first one especially should better be left as it is, and I never use ` (and have no idea what it's used for except in MySQL and TeX source code).
2.) TLIG
{ "endash", "hyphen hyphen" },
U+002D U+002D <> U+2013 ; -- -> en dash

{ "emdash", "hyphen hyphen hyphen" },
U+002D U+002D U+002D <> U+2014 ; --- -> em dash

{ "quotedblright", "quotesingle quotesingle" },
U+0027 U+0027 <> U+201D ; '' -> right double quote

{ "quotedblleft", "grave grave" },
U+0060 U+0060 <> U+201C ; `` -> left double quote

{ "quotedblbase", "comma comma" }
U+002C U+002C <> U+201E ; ,, -> DOUBLE LOW-9 QUOTATION MARK
missing from tex-text (not needed so far)
actually there's even space + something becomes something else
I don't mind those, TeX never uses space anyway.
{ "quotedblleft", "quoteleft quoteleft" },
0x2018 0x2018 <> 0x201C ; 2x left single quote -> left double quote

{ "quotedblright", "quoteright quoteright" },
0x2019 0x2019 <> 0x201D ; 2x right single quote -> right double quote
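A minimal sketch of how the tlig substitutions listed above could be applied, written as stand-alone Lua rather than ConTeXt code. The table layout, the function names apply_tlig and matches, and the use of plain code point arrays are assumptions made for illustration; only the five rules that have a tex-text counterpart are included.

```lua
-- Hypothetical stand-alone model of the tlig rules; not ConTeXt's actual code.
-- Each rule is { output code point, { input code points } }.
local tlig = {
    { 0x2014, { 0x002D, 0x002D, 0x002D } }, -- --- -> em dash
    { 0x2013, { 0x002D, 0x002D } },         -- --  -> en dash
    { 0x201D, { 0x0027, 0x0027 } },         -- ''  -> right double quote
    { 0x201C, { 0x0060, 0x0060 } },         -- ``  -> left double quote
    { 0x201E, { 0x002C, 0x002C } },         -- ,,  -> double low-9 quotation mark
}

-- Return true if the sequence seq occurs in input starting at position i.
local function matches(input, i, seq)
    for k = 1, #seq do
        if input[i + k - 1] ~= seq[k] then
            return false
        end
    end
    return true
end

-- Scan left to right and replace matching sequences. Rules are tried in
-- table order, so the three-hyphen rule is listed before the two-hyphen rule.
local function apply_tlig(input)
    local output, i = { }, 1
    while i <= #input do
        local replaced = false
        for _, rule in ipairs(tlig) do
            local target, seq = rule[1], rule[2]
            if matches(input, i, seq) then
                output[#output + 1] = target
                i = i + #seq
                replaced = true
                break
            end
        end
        if not replaced then
            output[#output + 1] = input[i]
            i = i + 1
        end
    end
    return output
end
```

Listing the longer hyphen sequence first is what keeps "---" from being read as an en dash followed by a stray hyphen.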
and then those spanish ...
3.) Only in XeTeX's tex-text (Do people need it?):
U+0021 U+0060 <> U+00A1 ; !` -> inverted exclam
U+003F U+0060 <> U+00BF ; ?` -> inverted question
U+003C U+003C <> U+00AB ; << -> LEFT POINTING GUILLEMET
U+003E U+003E <> U+00BB ; >> -> RIGHT POINTING GUILLEMET
let's get rid of it
Fine by me.
i'd like to let caps and such go away completely for mkiv so maybe we end up with xetex defs versus luatex defs;
So ... perhaps Caps support needs to be rewritten in XeTeX as well :) I still don't understand how to get "bold italic sans caps" for example (just as I don't know how to get bold/bold italic math). Do it your way in LuaTeX ... support for XeTeX can follow.
i wonder if in practice users will use both at the same time (ok, you do)
I use LuaTeX because:
- sometimes Lua is really handy and you have a marvellous database for Unicode well integrated :), nice to inspect fonts etc.
I use XeTeX because:
- LuaTeX doesn't (or at least didn't) always work as it should. XeTeX "saved my life" in May because some dirty LuaTeX bugs stopped the show in the middle (I already had a working copy - final PDF, and after minor modifications it stopped working)
- others keep asking questions (and since I suggested or approved quite some bugs in XeTeX recently, I feel responsible for helping the victims :)
- to show off with Zapfino - I don't have it as OpenType :-)
In any case: what I really love about ConTeXt is that [more-or-less] no change is needed to compile the same (simple) document with either engine (and compare the result or to switch quickly if support in one engine is buggy). When using Lua that is no longer true, but still, same definitions and similar results with both engines would be nice. (Esp. if both engines will be merged in the future :)
The problem is that "features=default" implies "script=latn", which is not always desired. A copy of mapping=tex-text comes from tlig & trep substitution.
we can fall back to dflt which in practice boils down to latn
That's probably better.
I assume that script=latn;language=dflt;+liga;+kern; is always on by default (where needed), so basically mapping=tex-text is the only thing that really needs to be added.
well, i'd prefer ... only -- and --- and make anything else up to the user, which means, redefining default in cont-sys if needed
[Iwona-Bold.otf]:script=latn;language=dflt;+liga;+kern;mapping=tex-text;mapping=tex-text;
two mappings?
I mean: the font is called with mapping=tex-text;mapping=tex-text; which doesn't make much sense. I can create two new mappings, so that "tlig=yes" would then be "mapping=tlig" and "trep=yes" would be "mapping=trep" instead of "mapping=tex-text". We only need to agree which features are where (and you need to remove ?` from the beginner's manual). Spanish users probably have it on their keyboard and "others" either don't need it or will find it somehow. << and >> are not needed either in my opinion.
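The duplication described above can be illustrated with a small stand-alone Lua sketch. The function build_spec and both translation tables are hypothetical; "mapping=tlig" and "mapping=trep" are the proposed names from this thread, not mappings that are known to exist, and whether XeTeX would accept two mapping options on one font is not settled here.

```lua
-- Behaviour as described in the thread: both features become tex-text.
local current  = { tlig = "mapping=tex-text", trep = "mapping=tex-text" }
-- Proposed (hypothetical) one-to-one translation.
local proposed = { tlig = "mapping=tlig", trep = "mapping=trep" }

-- Build a XeTeX-style option string, emitting each option only once.
local function build_spec(base, features, translation)
    local parts, seen = { }, { }
    local function add(option)
        if not seen[option] then
            seen[option] = true
            parts[#parts + 1] = option
        end
    end
    for _, option in ipairs(base) do
        add(option)
    end
    for _, name in ipairs(features) do
        local option = translation[name]
        if option then
            add(option)
        end
    end
    return table.concat(parts, ";") .. ";"
end

local base = { "script=latn", "language=dflt", "+liga", "+kern" }
print(build_spec(base, { "trep", "tlig" }, current))
print(build_spec(base, { "trep", "tlig" }, proposed))
```

With the current table the duplicate collapses into a single mapping=tex-text; with the proposed table each feature keeps its own mapping.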
Some (non-latin) fonts complain when one requests non-existing features.
in xetex you mean?
Yes. Well, LuaTeX probably "complains" as well (as in report >> load otf: warning: Warning: Glyph 1423 is named fi which should mean it is mapped to Unicode U+FB01, but Glyph 207 already has that encoding. etc.) - complaining is a good thing in general, it only means "please do not use latin script for non-latin fonts". And "features=default" should probably not force latin. (Then it should at least be "features=default-latin" or something similar.)
Also, it might be handy to be able to define \definetypeface[basic][rm][Xserif][whatever][script=arab,language=...] (for now forget that one, interface needs to be extended once and properly).
\definefontsynonym[a][file:Iwona-Bold.otf][mapping=tex-text]
doesn't work, so the only way seems to be
\definefontfeature[xetex][mapping=tex-text]
\definefontsynonym[a][file:Iwona-Bold.otf][xetex]
I have tried to use
\definefontfeature[xetex][mapping=tex-text]
\definefontfeature[caps][+smcp]
\definefontsynonym[a][file:Iwona-Bold.otf][features={xetex,caps}]
but that didn't work.
indeed, handling comma separated lists is too slow there .. ok, we can do it for xetex and in luatex use lua for it ... or i could hash the commalist itself ... needs a bit of thinking but eventually we need to be able to combine features (this even more points into a separate definition file for xetex)
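One way to read "hash the commalist itself" is to cache the merged feature set under the unparsed list string, so that the list is split only once. The following stand-alone Lua sketch assumes that reading; the featuresets registry and the combine function are hypothetical and do not reflect ConTeXt internals.

```lua
-- Hypothetical registry of named feature sets (names taken from the thread).
local featuresets = {
    xetex = { mapping = "tex-text" },
    caps  = { smcp = true },
}

-- Merged sets, keyed by the unparsed comma-separated list.
local cache = { }

-- Merge the named sets in list order; the result is cached so that a
-- repeated request for the same list string skips the parsing step.
local function combine(list)
    local merged = cache[list]
    if merged then
        return merged
    end
    merged = { }
    for name in string.gmatch(list, "[^,%s]+") do
        local set = featuresets[name]
        if set then
            for key, value in pairs(set) do
                merged[key] = value -- later sets override earlier ones
            end
        end
    end
    cache[list] = merged
    return merged
end
```

A call such as combine("xetex,caps") would return one table holding both the mapping and the smcp entry, and a second call with the same string would return the same table.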
That's up to you. I don't know anything about internals here.

Mojca