Karel Skoupý wrote:
\hsize=2in \the\list0 \par % typeset the node list
So \the\list0 will expand to tokens (consistent with \write), right? It won't just insert the list on the currently active list (would be inconsistent with \write), right?
indeed; btw, we have the same situation with lua: \lua{tex.print("\string\\relax")} results in just the word \relax being typeset, so in order to get it texed we need to feed it into \scantokens; so, i can imagine that there is something like \scanlist\expandafter{\the\toks0}
Sure, that's the multiple paragraph (stream) stuff. It will be the really tricky part, not so much for me, but in TeX, the whole model must be generalized/extended. It's not yet very clear to anybody, or is it? I think it's a real research topic.
indeed, stepwise refinement (start small -)
OK, but that won't bring much, just some funny shapes.
sure, but on the other hand, it can be used to 'replace' the current par builder by a more advanced (e.g. hyphenation) one; imagine that we have:

  \paroutput
    {write list to file (or pipe)
     call plugin in one-paragraph mode
     read list from file (or pipe)}

that way we can replace the current par builder, because by default it's something equivalent to:

  \paroutput{\scanlist\expandafter{\the\list255}}

i wonder how hard this is to implement, you and taco should know -)
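The \paroutput round trip described above can be modelled outside TeX. The following is only a sketch under loud assumptions: \paroutput and \scanlist do not exist, and the node representation (dicts with a "width" key), the JSON wire format, and the names `run_paroutput`, `default_par_builder` and `greedy_breaker` are all hypothetical, chosen for illustration.

```python
import json

def default_par_builder(node_list):
    """Stand-in for the default behaviour: hand the list back unchanged."""
    return node_list

def run_paroutput(node_list, plugin=default_par_builder):
    """Model of the three steps: write the list, call the plugin, read it back."""
    wire = json.dumps(node_list)             # "write list to file (or pipe)"
    result = plugin(json.loads(wire))        # "call plugin in one-paragraph mode"
    return json.loads(json.dumps(result))    # "read list from file (or pipe)"

def greedy_breaker(hsize):
    """Hypothetical replacement par builder: first-fit breaking into hboxes."""
    def plugin(nodes):
        lines, current, used = [], [], 0
        for node in nodes:
            # Start a new line when the next node would overflow hsize.
            if current and used + node["width"] > hsize:
                lines.append({"type": "hbox", "list": current})
                current, used = [], 0
            current.append(node)
            used += node["width"]
        if current:
            lines.append({"type": "hbox", "list": current})
        return lines
    return plugin

# Usage: five 4-unit-wide characters, broken for an hsize of 10 units.
nodes = [{"type": "char", "char": c, "width": 4} for c in "abcde"]
unchanged = run_paroutput(nodes)
broken = run_paroutput(nodes, greedy_breaker(10))
```

The point of the design is that the default plugin is the identity, so swapping in another par builder changes nothing else in the pipeline.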
- more boundary conditions
- possible page crossing
Not only page crossing, but also column/shape/container crossing ... The problem is that we are used to \parshape, which just specifies something for certain lines in the current paragraph. But if we want to introduce real page layouts, then the shapes are not relative to the paragraphs any more. It will be a matter of formatting where a particular paragraph starts in the layout.
it's a combination:

- a main gutter shape (can be columns or whatever)
- shapes bound to places on the gutter
- shapes bound to specific places in the stream
- shapes that may float (within boundary conditions)
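To make the difference from \parshape concrete, here is a minimal sketch of a shape that is bound to a place in the layout rather than to a paragraph. Everything in it is an assumption made for illustration: the `Cutout` class, the `line_widths` function, and the simplification that a shape just subtracts its width from every line whose vertical position falls inside it.

```python
from dataclasses import dataclass

@dataclass
class Cutout:
    """Hypothetical shape bound to the layout: occupies [top, bottom) vertically."""
    top: float
    bottom: float
    width: float

def line_widths(start_y, n_lines, baselineskip, column_width, cutouts):
    """Widths available to the lines of a paragraph that starts at start_y."""
    widths = []
    for k in range(n_lines):
        y = start_y + k * baselineskip
        # Every cutout that covers this vertical position narrows the line.
        taken = sum(c.width for c in cutouts if c.top <= y < c.bottom)
        widths.append(column_width - taken)
    return widths

# Usage: one cutout fixed in the column; the same paragraph gets different
# line widths depending on where it happens to start.
cutouts = [Cutout(top=24, bottom=48, width=50)]
at_top = line_widths(0, 5, 12, 200, cutouts)     # [200, 200, 150, 150, 200]
lower = line_widths(24, 3, 12, 200, cutouts)     # [150, 150, 200]
```

With \parshape the widths would be given per line of the paragraph; here they only become known once the paragraph's starting position in the layout is known.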
Sure, that would be great. Then I won't have to access metric files at all. But should I wait for that? I wanted to start with the \showlists output for prototyping. Well, I'll see how fast I progress. Maybe you'll be faster :-).
ok, i know you don't like messing around with the tex source, but i can imagine that this showlist stuff is doable, so if you want, you can provide patches to the web source; we're working with a branch of pdftex anyway;
But concerning the metric files, if I want to treat hyphenation locally, then I also need the kerning and ligature programs. In TeX it is done too early (and then it is taken apart and (wrongly) reconstructed during hyphenation pass). I want to do ligatures and kernings on demand, basically after hyphenation (it's not that simple, but anyway).
how about a font daemon; that one could cache/access font files; we need to go open type anyway, so maybe such a daemon can be built on top of existing (non tex) libraries (port 31415)
NO. It screws up everything, not only taken or potential breaks, but even the potential hyphenation points which are never considered a break. It is also known too late, in the middle of the (atomic) paragraph breaking process.
ok, so that's a dead end
hyphenation points, unless we let tex do a pre break run with a zero hsize so that we get 'm all
No, no, it's much more stupid than you think. TeX first builds the horizontal list with all kernings and ligatures, taking {} (in dif{}ferent) into account. Then it tries the first breaking pass with the \pretolerance. If that fails, then it takes the whole list, tries to hyphenate *all* words in the list, inserts explicit \discretionaries at *every* potential hyphen and reconstructs the kernings and ligatures for the segments between the \discretionaries, losing all ligature preventions and yielding potentially incorrect ligatures and kernings for words which are actually not hyphenated. Then it tries the second (and maybe third) pass, but it loses the originally built list forever. The whole breaking is an atomic operation (happening at \par); you can't do anything between the passes.
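The loss of ligature prevention described above can be shown with a toy model. This is not TeX's actual algorithm, only a sketch of the claim: the two-entry ligature table, the function names, and the hyphen position passed in by hand are all assumptions made for illustration.

```python
# Assumed tiny ligature table; a real font's ligature program is far richer.
LIGATURES = {("f", "f"): "ff", ("f", "i"): "fi"}

def build_hlist(source):
    """First pass: form ligatures, honouring "{}" as a ligature preventer."""
    nodes, blocked, i = [], False, 0
    while i < len(source):
        if source.startswith("{}", i):
            blocked = True          # the next character must not ligate
            i += 2
            continue
        ch = source[i]
        if nodes and not blocked and (nodes[-1], ch) in LIGATURES:
            nodes[-1] = LIGATURES[(nodes[-1], ch)]
        else:
            nodes.append(ch)
        blocked = False
        i += 1
    return nodes

def rebuild_after_hyphenation(nodes, hyphen_positions):
    """Hyphenation pass: take the list apart, insert discretionaries at the
    given letter offsets, and re-form ligatures within each segment."""
    letters = "".join(nodes)        # only letters survive; the {} is gone
    segments, start = [], 0
    for pos in sorted(hyphen_positions):
        segments.append(letters[start:pos])
        start = pos
    segments.append(letters[start:])
    out = []
    for k, segment in enumerate(segments):
        if k:
            out.append("\\-")       # marker for a discretionary hyphen
        out.extend(build_hlist(segment))
    return out

first = build_hlist("dif{}ferent")
# ['d', 'i', 'f', 'f', 'e', 'r', 'e', 'n', 't']
second = rebuild_after_hyphenation(first, [6])   # assumed break: differ-ent
# ['d', 'i', 'ff', 'e', 'r', '\\-', 'e', 'n', 't']
```

In the first list the two f's stay separate, as requested by the {}. After the rebuild the ff ligature is back, even if the word ends up not being broken at all.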
Taco, is that correct, or am I too TeX unfriendly?
-) that's indeed too hard-coded for our purpose, so, next to a font daemon, we need a hyphenation daemon
Maybe we should make a whole new glossary. For example, 'node' is quite OK for everything in the list (char, box, glue, penalty, ...), but 'list' is so ambiguous that there should be something more specific (maybe 'node list'). TeX itself doesn't give clear names (classes) for those objects. I had to give them names in NTS (to name the classes), so maybe we can look into that.
good idea; we indeed need to define proper names and descriptions; can you make a proposal for that based on your nts experiences?
well, it works for english, which was the objective; DEK would rightfully react with: then why did nobody adapt it, replace it, etc -)
High time, huh?
It works for English (does it really, always?), because it is simple, right? I don't know whether it is a real problem in any other language in practice. I just know the code and I think that it is incorrect, inconsistent and illogical.
my impression is that the number of missed/wrong cases for english is so small that it falls within the 'no problem to correct it manually' criterion; languages with compound words, accented characters etc have higher demands

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74
www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------