Hi,

later taco will give more details about changes/additions in his change-log

for the moment:

- for index users: you can test utf 8, sort order (swedish) etc
- for cp1250 users: it's there and you only need to give \enableregime since the vector will be loaded at runtime
- for mojca: take a look at regi-syn and let me know what vectors need to be added to the distribution
- for taco: thanks for your patience but textext seems to work ok now

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
| www.pragma-pod.nl
-----------------------------------------------------------------
Hans Hagen wrote:
Hi,
later taco will give more details about changes/additions in his change-log
Mojca, it would be nice if you could give a go/nogo signal quickly. I am slowly getting drowned with all the diff files so I am really eager to have Hans go ahead and release a new version :)

Here is the list compared to the 2006.03.31 beta.

new files:
* regi-cp1250.tex, regi-syn.tex

tex.rb:
* additional tcxflag-s when calling pdftex and mpost
* extra \setupsystem[\c!method=2]
* use one dash in front of passed-on options (instead of gnu-style double dashes)
* processmpx fixes so that textext() works again

texutil.rb:
* downcasing (register sort routines)

context.tex:
* \input sort-ini.tex (only when newtexexec is used)
* \TEXEXECcommand{texmfstart newtexexec} when newtexexec is used (bugfix for textext())

core-itm.tex:
* new, expanded macro \getitemparameter

core-ref.tex:
* some \predefinereference and \predefinedreference changes (?)

core-sys.tex:
* new macro \systemparameter (for the \input sort-ini test)

enco-ini.tex:
* \docurrentencoding renamed to \dopureencodingname

meta-ini.tex:
* \settrue \manyMPspecials

page-flt.tex:
* "retain special alignments"

page-lay.tex:
* extended tests for layout states

page-txt.tex:
* optimization for background calculations
* extended tests for layout states

sort-ini.tex:
* extra \savesortkey magic

sort-lan.tex:
* a bit of extra (test) code for sorting

supp-mps.tex:
* use \TEXEXECcommand
On 4/4/06, Taco Hoekwater wrote:
Hans Hagen wrote:
Hi,
- for mojca: take a look at regi-syn and let me know what vectors need to be added to the distribution
Mojca, it would be nice if you could give a go/nogo signal quickly. I am slowly getting drowned with all the diff files so I am really eager to have Hans go ahead and release a new version :)
Taco & Hans: I'm really really really sorry. I didn't notice that question in thousands of mails on the list.

Thanks a lot for adding the file, Hans!

This line
\defineregimesynonym[cp-1250] [cp1250]
is not really needed: I never spotted any cp125* with a hyphen in between (in contrast to utf or iso encodings), otherwise everything seems to be working ok.

\defineregimesynonym[1250] [cp1250]
is also OK (didn't think about it ;).

If you're asking me about the other changes: here's the same list that I already suggested:

renaming:

windows -> cp1252
il1 -> iso-8858-1
latin2 -> iso-8858-2
iso88595 -> iso-8858-5
grk -> iso-8859-7

And then adding the following definitions (cp1250 is already there):

\defineregimesynonym[utf-8][utf]
\defineregimesynonym[utf8][utf]

\defineregimesynonym[windows-1250][cp1250]
\defineregimesynonym[windows-1251][cp1251]
\defineregimesynonym[windows-1252][cp1252]
\defineregimesynonym[windows-1253][cp1253]
\defineregimesynonym[windows-1254][cp1254]
%defineregimesynonym[windows-1255][cp1255] % not supported yet (Hebrew)
%defineregimesynonym[windows-1256][cp1256] % not supported yet (Arabic)
\defineregimesynonym[windows-1257][cp1257]
%defineregimesynonym[windows-1258][cp1258] % not supported yet (Vietnamese)

% for historical reasons / compatibility
\defineregimesynonym[windows][cp1252]

% 5 - Cyrillic
% 6 - Arabic (not supported)
% 7 - Greek
% 8 - Hebrew (3 signs missing)
% 11 - Thai (not supported)

\defineregimesynonym[il1][iso-8859-1]
\defineregimesynonym[il2][iso-8859-2]
\defineregimesynonym[il3][iso-8859-3]
\defineregimesynonym[il4][iso-8859-4]
\defineregimesynonym[il5][iso-8859-9]
\defineregimesynonym[il6][iso-8859-10]
\defineregimesynonym[il7][iso-8859-13]
%defineregimesynonym[il8][iso-8859-14]
\defineregimesynonym[il9][iso-8859-15]
\defineregimesynonym[il10][iso-8859-16]

\defineregimesynonym[latin1][iso-8859-1]
\defineregimesynonym[latin2][iso-8859-2]
\defineregimesynonym[latin3][iso-8859-3]
\defineregimesynonym[latin4][iso-8859-4]
\defineregimesynonym[latin5][iso-8859-9]
\defineregimesynonym[latin6][iso-8859-10]
\defineregimesynonym[latin7][iso-8859-13]
%defineregimesynonym[latin8][iso-8859-14]
\defineregimesynonym[latin9][iso-8859-15]
\defineregimesynonym[latin10][iso-8859-16]

% for historical reasons / compatibility
\defineregimesynonym[iso88595][iso-8859-5]
\defineregimesynonym[grk][iso-8859-7]

I don't know whether and how often people use all those encodings (I'm only pretty sure that people use the cp1250 one). LaTeX offers all of them for example. I would suggest at least to rename the five regimes mentioned above and to point to the more consistent names using synonyms.

The mentioned regimes are all present on http://pub.mojca.org/tex/enco/contextbase/, so it's up to you whether you add any of the other regimes to the distribution or perhaps better wait till someone requests them. (There are so many files that taking them all would almost require a separate folder.) I'm happy now that cp1250 is in and I'm not using any other regime, so it's really not my decision.

As far as I remember there were also some inconsistencies in the present greek and cyrillic regime. http://pub.mojca.org/tex/enco/contextbase/regi-vis.tex is slightly different than the file in the distro (uses named glyphs), but conceptually the same.

Mojca
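To illustrate the rename-plus-synonym suggestion from a user's point of view, here is a minimal sketch. It assumes the regime vector has already been renamed to iso-8859-2 and that the synonym is defined (in regi-syn.tex or in the document itself); whether a given distribution ships it that way is not confirmed here.

```tex
% Sketch only: assumes the vector is named iso-8859-2 and that this
% synonym is not yet provided by regi-syn.tex.
\defineregimesynonym[latin2][iso-8859-2]

% An existing document keeps using the old name, which is expected
% to resolve to iso-8859-2 through the synonym:
\enableregime[latin2]

% A new document could use the consistent name directly instead:
% \enableregime[iso-8859-2]
```

The point of the synonym is only backward compatibility: the files themselves do not have to be renamed or reorganized for old documents to keep working.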
Mojca Miklavec wrote:
renaming:
windows -> cp1252
il1 -> iso-8858-1
latin2 -> iso-8858-2
iso88595 -> iso-8858-5
                  ^^ Everywhere should be 8859!

Everything else seems all right to me.

Vit
_______________________________________________
ntg-context mailing list
ntg-context@ntg.nl
http://www.ntg.nl/mailman/listinfo/ntg-context
--
=======================================================
Ing. Vít Zýka, Ph.D. TYPOkvítek
database publishing
databazove publikovani
data maintaining and typesetting in typographic quality
priprava dat a jejich sazba v typograficke kvalite
tel.: (+420) 777 198 189 www: http://typokvitek.com
=======================================================
Mojca Miklavec wrote:
As far as I remember there were also some inconsistencies in the present greek and cyrillic regime. http://pub.mojca.org/tex/enco/contextbase/regi-vis.tex is slightly different than the file in the distro (uses named glyphs), but conceptually the same.
cyrillic is indeed a bit messy

can you check the alpha zips that i just uploaded?

(I wonder if we still need to preload regimes, maybe the ones mostly used ... which ones)

Hans
cyrillic is indeed a bit messy
can you check the alpha zips that i just uploaded?
(I wonder if we still need to preload regimes, maybe the ones mostly used ... which ones)
I'll check iso-8859-1 and iso-8859-7 (I think that I remember some differences from the auto-generated files), but I don't have any more time today.

cp1251 is now duplicated in regi-cyr (although I didn't check for differences today - see http://article.gmane.org/gmane.comp.tex.context/24677/ about cyrillic).
% todo: regi-cyp has more in it
% \defineregimesynonym [iso88595] [iso-8859-5]
Why is this commented out? You don't need to rename or to reorganize the files - you may only rename the regime vector and then put the synonym for backward compatibility (otherwise everything else but iso-8859-5 with hyphens will work: just because of consistency).
\defineregimesynonym [windows-1250] [cp1250]
duplicated
\defineregimesynonym [1250] [cp1250]
I guess that we don't need 1250 for historical reasons: Either we put all ten of them or none.

In what way does \startregime[cp1250pl] (in regi-cpl) differ from cp1250 except that it's incomplete? Could cp1250pl be a synonym for cp1250 (for compatibility reasons) as well? (I remember some --translate=cp1250cs switches as well, but I guess that this operated on another level)

Thanks a lot for adding all that,
Mojca
Mojca Miklavec wrote:
% todo: regi-cyp has more in it
% \defineregimesynonym [iso88595] [iso-8859-5]
Why is this commented out? You don't need to rename or to reorganize the files - you may only rename the regime vector and then put the synonym for backward compatibility (otherwise everything else but iso-8859-5 with hyphens will work: just because of consistency).
sure, but i was unsure about this vector (esp since regi-cyp has variants in it); anyhow, uncommented now
\defineregimesynonym [windows-1250] [cp1250]
duplicated
\defineregimesynonym [1250] [cp1250]
I guess that we don't need 1250 for historical reasons: Either we put all ten of them or none.
ok
In what way does \startregime[cp1250pl] (in regi-cpl) differ from cp1250 except that it's incomplete? Could cp1250pl be a synonym for cp1250 (for compatibility reasons) as well? (I remember some --translate=cp1250cs switches as well, but I guess that this operated
this translate stuff is not needed
on another level)
ah, that one mistakenly ended up in the zip because of a regi-cp* copy
Thanks a lot for adding all that,
well, it's you who maintain them now -)

Hans
On 4/8/06, Hans Hagen wrote:
(I wonder if we still need to preload regimes, maybe the ones mostly used ... which ones)
If I knew how to do that, I would implement it in the following way:

Somewhere it would be defined where a specific regime is located
\defineregimefile[cp1250][regi-cp1250]
...
\defineregimefile[isoir111][regi-cyp]
...

Then you don't need to read all those files / preload all those regimes (except for the really common ones). So if someone says \enableregime[windows-1250] then ConTeXt knows that it is a synonym for cp1250, that cp1250 is located in regi-cp1250.tex and finally loads the appropriate file and enables the regime.

I don't know what exactly "preloading" means (are the definitions already included in the format or are the files read in at runtime?) I doubt that it would save any resources that way, but in case it would, the regimes might be preloaded depending on mainlanguage. (If someone typesets Slovenian texts, it would preload utf, cp1250 and latin2, if someone typesets Russian, you would preload Cyrillic regimes, ...)

Otherwise I would say: if the regimes eat a lot of memory, don't preload the new ones yet (except cp1250) until someone requests them.

Mojca
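The lookup chain described in this proposal (synonym, then regime name, then file, loaded once) can be sketched in plain TeX roughly as follows. This is only an illustration under stated assumptions: every \sketch... macro is hypothetical, the file read is replaced by a terminal message so the example runs on its own, only one level of synonyms is followed, and none of it is taken from ConTeXt's regi-ini code.

```tex
% Plain TeX sketch of the lookup chain; every \sketch... macro is
% hypothetical and none of this is ConTeXt's own regi-ini code.
\def\sketchregimesynonym[#1][#2]{\expandafter\def\csname syn:#1\endcsname{#2}}
\def\sketchregimefile[#1][#2]{\expandafter\def\csname file:#1\endcsname{#2}}

\def\sketchenableregime[#1]{%
  \edef\sketchname{#1}%
  % 1. follow a synonym if one is registered for the requested name
  \expandafter\ifx\csname syn:\sketchname\endcsname\relax\else
    \edef\sketchname{\csname syn:\sketchname\endcsname}%
  \fi
  % 2. read the regime file only the first time the regime is needed
  %    (a real implementation would \input the file here)
  \expandafter\ifx\csname loaded:\sketchname\endcsname\relax
    \message{[would input \csname file:\sketchname\endcsname.tex]}%
    \expandafter\let\csname loaded:\sketchname\endcsname\empty
  \fi
  % 3. switch to the regime (only reported in this sketch)
  \message{[regime \sketchname\space enabled]}}

\sketchregimefile[cp1250][regi-cp1250]
\sketchregimesynonym[windows-1250][cp1250]

\sketchenableregime[windows-1250] % resolves the synonym, reports the file once
\sketchenableregime[cp1250]       % already marked as loaded, no second read
\bye
```

Keeping the "loaded" marker per regime name rather than per file is a simplification; a file such as regi-cyp that holds several vectors would need the marker keyed on the file name instead.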
Mojca Miklavec wrote:
If I knew how to do that, I would implement it in the following way:
Somewhere it would be defined where a specific regime is located
\defineregimefile[cp1250][regi-cp1250]
...
\defineregimefile[isoir111][regi-cyp]
...
Then you don't need to read all those files / preload all those regimes (except for the really common ones). So if someone says \enableregime[windows-1250] then ConTeXt knows that it is a synonym for cp1250, that cp1250 is located in regi-cp1250.tex and finally loads the appropriate file and enables the regime.
actually, this is what happens when a regime is not preloaded (i.e. see end of regi-ini)

currently:

\useregime[def,uni,iso-8858-1,iso-8858-2,cp1252,mac]

maybe:

\useregime[def,uni] % rest runtime
I don't know what exactly "preloading" means (are the definitions already included in format or are the files read in at runtime?) I
a few are in the format, eating up hash entries and memory; but since regimes are global anyway ...
doubt that it would save any resources that way, but in case it would, the regimes might be preloaded depending on mainlanguage. (If someone typesets Slovenian texts, it would preload utf, cp1250 and latin2, if someone typesets Russian, you would preload Cyrillic regimes, ...)
Otherwise I would say: if the regimes eat a lot of memory, don't preload the new ones yet (except cp1250) until someone requests them.
or preload none since loading is fast enough

Hans
Taco Hoekwater wrote:
Hans Hagen wrote:
or preload none since loading is fast enough
Agreed (there are never any assumptions about preloaded regimes, are there?). The mess with iso-8859-1 is more or less my fault. I fixed up iso-8859-15 but never bothered with the old one.
ok, so we only preload utf and the rest runtime

Hans
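With only utf preloaded, a document in any other input encoding has to name its regime explicitly. A minimal sketch of such a document, assuming a distribution that ships regi-cp1250.tex as listed under "new files" earlier in the thread:

```tex
% Sketch only: assumes a ConTeXt setup that includes regi-cp1250.tex.
% Per the first mail in this thread, the vector is loaded at runtime,
% so no separate loading command should be needed before this call.
\enableregime[cp1250]

\starttext
Text typed in a cp1250 encoded source file.
\stoptext
```

Because no document may rely on a regime being in the format, this form should behave the same whether or not a given regime happens to be preloaded.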
participants (4)
- Hans Hagen
- Mojca Miklavec
- Taco Hoekwater
- Vit Zyka