Synchronization


LAURENS Jérôme

8 Feb 2008

12:44 p.m.

Hi, I'd like to know the opinion of the developer team about synchronization and the patch I sent. Now that xpdf allows both forwards and backwards synchronization, it might be time to raise the question. JL


Taco Hoekwater

8 Feb

1:52 p.m.

LAURENS Jérôme wrote:

...

Hi,

I'd like to know the opinion of the developer team about synchronization and the patch I sent.

I looked at Synchronize4.zip a while back and liked the general idea well enough, but somehow never got around to writing up my feelings.

Here are a few remarks and questions:

* how does this patch relate to the --src-specials switch from web2c, and what editor/previewer combinations understand your files format? (and/or are likely to start doing that in the future)

* I consider the |if s>3| test in get_node() pretty bad style. There may very well be a conflicting extension in the future, and then it may turn out that your code suddenly needs rewriting. It is better to adjust the node-specific functions like new_null_box() etc. This will result in more lines of web changes, but it will be easier to maintain the code by someone that is not you.

* the web glue code in synchronize.c is not very pretty, but I doubt there is a nice solution to that, so just let it be.

* you should use actual C comments, though. Not the C++ // things. (Martin may have something more to say about C indentation style)

* you licensed the C file as GPL2, yes?

* the attachments seem missing in sarovar.org at the moment

Best wishes, Taco

LAURENS Jérôme

9 Feb

2:47 a.m.

Le 8 févr. 08 à 13:52, Taco Hoekwater a écrit :

...

LAURENS Jérôme wrote:

...
Hi, I'd like to know the opinion of the developer team about synchronization and the patch I sent.

I looked at Synchronize4.zip a while back and liked the general idea well enough, but somehow never got around to writing up my feelings.

Here are a few remarks and questions:

* how does this patch relate to the --src-specials switch from web2c,

No common code, better design, stronger design, better result.

...

and what editor/previewer combinations understand your files format? (and/or are likely to start doing that in the future)

I understand your question: you are wondering if this feature would be used by a viewer or an editor. Anyone who ever used synchronization does not ask this question because he knows that such a synchronization is a "must have". But let me give more technical answers.

- editors/viewers currently supporting .pdfsync files will most certainly support this new file format, because from the editor/viewer developer's point of view, supporting this new format is much easier than supporting the old pdfsync.
- Integrated TeX environments like iTeXMac2 or TeXShop on Mac OS X will support this feature because they want to provide the user with a really efficient experience.
- this new synchronization scheme is conceptually well designed, and won't break where src-ltx, src-specials and pdfsync would break
- this synchronization scheme works the same for dvi or pdf

The next question is: how hard is it to support this new feature? There are 2 possibilities: first, we can let things evolve according to nature's law and see what happens. Second, we can help things go the right way.

More precisely, I see 2 possible designs, both relying on a synchronization controller, named "synchronizer". Consider

1 - editor
2 - synchronizer
3 - viewer

forwards sync:

design 1:
- given a source file and a line number, the editor asks the synchronizer to display the output for that location in the source
- the synchronizer then parses the .sync file, finds out the real page and location to display, then asks the viewer to display it

design 2:
- given a source file and a line number, the editor asks the synchronizer for the corresponding page number and location
- the synchronizer then parses the .sync file, finds out the real page and location to display, and returns the result to the editor
- then the editor asks the viewer to display the result

backwards sync:

design 1:
- given a pdf/dvi page and location, the viewer asks the synchronizer to display the input for that location
- the synchronizer then parses the .sync file, finds out the real line number and input source file to edit, then asks the editor to edit it

design 2:
- given a pdf page and location, the viewer asks the synchronizer for the input corresponding to that location
- the synchronizer then parses the .sync file, finds out the real line number and input source file to edit, and returns the result to the viewer
- then the viewer asks the editor to edit the result

The question is: where does the synchronizer live? For integrated TeX environments, this controller should be integrated, just like the editor and the viewer are integrated. But in general, the controller should be an external component (I vote for a ruby script) to which the viewer and the editor would send messages. And this controller would be the same for all editors and viewers! This is nothing but an MVC design: the model is the .sync file, the view is either the viewer or the editor, and neither of them cares about what is really inside the .sync file.

Once the synchronization controller is available as an external script, xpdf (forthcoming 3.0.3) would support synchronization with an appropriate config file. I guess that auctex support would take just a few minutes.
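To make the difference between the two designs concrete, here is a minimal sketch of a design-2 synchronizer. It is written in Python rather than the Ruby script suggested above, and it assumes a hypothetical, already-parsed table of records: the real .sync file format is not described in this thread, so `SyncRecord`, `Synchronizer`, `forward` and `backward` are illustrative names only.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SyncRecord:
    """One hypothetical entry of a parsed .sync file (format assumed, not the real one)."""
    source: str   # input file name
    line: int     # line number in that input file
    page: int     # output page number
    x: float      # horizontal position on the page
    y: float      # vertical position on the page


class Synchronizer:
    """Design 2: answers queries; the caller talks to the editor or viewer."""

    def __init__(self, records: List[SyncRecord]) -> None:
        self.records = records

    def forward(self, source: str, line: int) -> Optional[Tuple[int, float, float]]:
        """Source file and line -> (page, x, y) of the closest recorded line."""
        candidates = [r for r in self.records if r.source == source]
        if not candidates:
            return None
        best = min(candidates, key=lambda r: abs(r.line - line))
        return (best.page, best.x, best.y)

    def backward(self, page: int, x: float, y: float) -> Optional[Tuple[str, int]]:
        """Page and location -> (source file, line) of the closest recorded point."""
        candidates = [r for r in self.records if r.page == page]
        if not candidates:
            return None
        best = min(candidates, key=lambda r: (r.x - x) ** 2 + (r.y - y) ** 2)
        return (best.source, best.line)


records = [
    SyncRecord("main.tex", 10, 1, 72.0, 700.0),
    SyncRecord("main.tex", 25, 1, 72.0, 400.0),
    SyncRecord("chapter.tex", 3, 2, 72.0, 700.0),
]
sync = Synchronizer(records)
print(sync.forward("main.tex", 24))    # the editor would hand this to the viewer
print(sync.backward(2, 80.0, 690.0))   # the viewer would hand this to the editor
```

Under design 1 the same two lookups would be followed by the synchronizer itself sending the result to the viewer or the editor, instead of returning it to the caller.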

...

* I consider the |if s>3| test in get_node() pretty bad style. There may very well be a conflicting extension in the future, and then it may turn out that your code suddenly needs rewriting. It is better to adjust the node-specific functions like new_null_box() etc.

This will result in more lines of web changes, but it will be easier to maintain the code by someone that is not you.

If you are talking about code design, I would follow you if you could explain what kind of extensions you are thinking of. The actual synchronization rules are simple:

- only nodes with size>3 are used for synchronization
- when a node is used for synchronization, the last 2 words are reserved for synchronization
- medium_node_size'd and box_node_size'd nodes are synchronized
- every type of node is a potential candidate for forthcoming enhancements in synchronization

By spreading initialization over many different places, you propose to create a set of rules and code for each type of node, multiplying the synchronization hook points. I have the impression that maintaining such code would be a pain, even for the programmer!

However, if we want to initialize the nodes properly, we can do so in the node-specific functions. Unfortunately, those functions are not always used to create new nodes, and one must take care of other routines. But how can we be sure that all the locations where initialization should take place are found? Any clue?

Another important problem concerns memory management. I would not swear there is no memory leak.
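The single-hook rule described here can be sketched as follows. This is a toy model in Python, not the patch itself: `mem` stands in for TeX's main memory, a bump allocator replaces TeX's real free-list `get_node`, and `current_tag` / `current_line` are hypothetical names for whatever the patch stores in the two reserved words.

```python
from typing import List

SYNC_FIELDS = 2          # the last 2 words of a synchronized node
MIN_SYNCED_SIZE = 4      # "only nodes with size>3 are used for synchronization"

mem: List[int] = []      # toy stand-in for TeX's main memory array
current_tag = 0          # hypothetical: identifies the current input file
current_line = 0         # hypothetical: current input line number


def get_node(size: int) -> int:
    """Allocate `size` words and return the index of the first one.

    A bump allocator is used for brevity; TeX's real get_node manages a free list.
    """
    p = len(mem)
    mem.extend([0] * size)
    if size >= MIN_SYNCED_SIZE:
        # Single hook point: stamp the last two words, whatever the node type.
        mem[p + size - SYNC_FIELDS] = current_tag
        mem[p + size - 1] = current_line
    return p


current_tag, current_line = 1, 42
small = get_node(2)      # too small: left untouched
box = get_node(8)        # large enough: last two words are stamped
print(mem[box + 6], mem[box + 7])
```

The appeal is that there is exactly one place to maintain; the cost is that every node of four or more words is stamped, whether or not it is ever a synchronization candidate.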

...

* the web glue code in synchronize.c is not very pretty, but I doubt there is a nice solution to that, so just let it be.

yes, this is not a strong design. I guess that everything could be done in web, but I don't know how (I don't want to;!), and I think it would have a non-negligible computational cost when typesetting.

...

* you should use actual C comments, though. Not the C++ // things. (Martin may have something more to say about C indentation style)

* you licensed the C file as GPL2, yes?

Open source, whether GPL2 or something else, I don't know for sure because I was not given any hint about pdftex development rules.

...

* the attachments seem missing in sarovar.org at the moment

All the attachments have been removed, there might be a good reason, should I upload once again?

...

Best wishes, Taco

Taco Hoekwater

9:43 a.m.

Hi Jérôme, LAURENS Jérôme wrote:

...

...
* how does this patch relate to the --src-specials switch from web2c,

No common code, better design, stronger design, better result.

Does that mean that eventually we may get rid of --src-specials completely? (That would be good). Right now, we will probably have both at the same time and I was a little worried that they might interfere with each other. From your long explanation below, I now understand that they don't.

...

...
and what editor/previewer combinations understand your files format? (and/or are likely to start doing that in the future)

I understand your question: you are wondering if this feature would be used by a viewer or an editor. Anyone who ever used synchronization does not ask this question because he knows that such a synchronization is a "must have".

Thanks for the long technical answer as well. So far, I have never really used editor synchronization tools in my TeX work. Personally, I always found that the semi-automatic window changes were very distracting. I have never been much of a fan of integrated TeX systems either, for much the same reason (this is a matter of personal taste, so don't take it as a critique on your patch).

...

...
This will result in more lines of web changes, but it will be easier to maintain the code by someone that is not you.

If you are talking about code design, I would follow you if you could explain what kind of extensions you are thinking of.

You made a convincing argument, so just leave it as is. To give you an idea of what I am talking about, here is the list of nodes that have size>3 already, even without your patch, in luatex (you could consider luatex an extension to pdftex):

Core output nodes: disc, box, rule, glyph, margin_kern
Whatsits that are output: pdf_refxform, pdf_refximage, pdf_annot, pdf_dest, pdf_thread
Internal stream nodes: ins_node, page_ins_node
Internal whatsits: dir, open, local_par, pdf_colorstack, user_defined
Temporary math objects (about everything): style, radical_noad, accent_noad, fraction_noad, noad
Alignment stack entries: align_stack
Paragraph breaking: hyphenated, unhyphenated, passive, delta
Weird ones: shape, pseudo_line (variable size)

Some of those are candidates for synchronization, but many are not. A whole bunch of them is never written to the output at all.

Like I said, leave your code as is. But I want to show you how it would also have been possible (luatex pascal web, not pdftex, but it does not really matter):

--
@d box_node_size=8+synchronization_field_size {number of words to allocate for a box node}
--
@p function new_null_box:pointer; {creates a new box node}
var p:pointer; {the new node}
begin p:=new_node(hlist_node,min_quarterword);
+ @;
new_null_box:=p;
end;
--
@p function new_kern(@!w:scaled):pointer;
var p:pointer; {the new node}
begin p:=new_node(kern_node,normal);
width(p):=w;
+ @;
new_kern:=p;
end;
--
... etc
--
+ glue_node: begin
+ @;
+ @;
+ end;
--
+ @ All synchronization stuff is handled in this section
+
+ @d synchronization_field_size==2
+
+ @=
+ begin
+ mem[p+box_node_size].int := synctag;
+ mem[p+box_node_size+1].int := line;
+ end
+ ....
+
+ @ @=
+ sync_glue(p);
+ ...

The main advantages:

* a person changing new_null_box() can immediately see there is something going on with synchronization
* all the actual synchronization code is bundled together, allowing better documentation
* there may be a bit more code, but fewer initialization instructions are actually executed by the application.
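For comparison with the single get_node() hook, the per-constructor approach can be modelled like this. It is a Python toy, not luatex code: the node sizes other than the box size are made-up values, `init_sync_fields` is a hypothetical helper, and it assumes the two synchronization words sit at the end of the node, as in the "last 2 words are reserved" rule quoted earlier in the thread.

```python
from typing import List

SYNCHRONIZATION_FIELD_SIZE = 2
BOX_NODE_SIZE = 8 + SYNCHRONIZATION_FIELD_SIZE   # as in the WEB sketch above
KERN_NODE_SIZE = 2 + SYNCHRONIZATION_FIELD_SIZE  # toy value, an assumption
PENALTY_NODE_SIZE = 2                            # toy value, never synchronized

mem: List[int] = []   # toy stand-in for TeX's main memory
synctag = 7           # hypothetical current input-file tag
line = 120            # hypothetical current input line


def new_node(size: int) -> int:
    """Plain allocator with no synchronization knowledge at all."""
    p = len(mem)
    mem.extend([0] * size)
    return p


def init_sync_fields(p: int, size: int) -> None:
    """All synchronization code lives here; assumed to use the last two words."""
    mem[p + size - 2] = synctag
    mem[p + size - 1] = line


def new_null_box() -> int:
    p = new_node(BOX_NODE_SIZE)
    init_sync_fields(p, BOX_NODE_SIZE)   # visible to anyone editing this function
    return p


def new_kern(width: int) -> int:
    p = new_node(KERN_NODE_SIZE)
    mem[p + 1] = width
    init_sync_fields(p, KERN_NODE_SIZE)
    return p


def new_penalty(value: int) -> int:
    p = new_node(PENALTY_NODE_SIZE)      # no sync fields, no initialization cost
    mem[p + 1] = value
    return p


box, kern, pen = new_null_box(), new_kern(65536), new_penalty(10000)
print(mem[box + 8:box + 10], mem[kern + 2:kern + 4], len(mem))
```

Only the constructors that opt in pay for the extra words and the extra stores, which is the third advantage listed above; the price is one call site per synchronized node type.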

...

...
* you licensed the C file as GPL2, yes?

Open source, whether GPL2 or something else, I don't know for sure because I was not given any hint about pdftex development rules.

GPL2 is fine, just making sure.

...

...
* the attachments seem missing in sarovar.org at the moment

All the attachments have been removed, there might be a good reason, should I upload once again?

I don't know why they were removed in the first place. Martin, is that something you did? Best wishes, Taco

The Thanh Han

11:06 a.m.

I myself never used synchronization, however I believe this is a very useful feature for a lot of users and if this is supported by pdftex/luatex, many editors/viewers will make use of it. so, here is my vote for accepting Jerome's patch. Thanh

Taco Hoekwater

11:11 a.m.

The Thanh Han wrote:

...

so, here is my vote for accepting Jerome's patch.

I never explicitly said so, but I am in favor as well. The // comments really need fixing, though. The language is C, not C++, and pdftex is not always compiled with gcc. Best wishes, Taco

Martin Schröder

1:22 p.m.

2008/2/9, Taco Hoekwater :

...

The Thanh Han wrote:

...
so, here is my vote for accepting Jerome's patch.

I never explicitly said so, but I am in favor as well.

Me too.

...

The // comments really need fixing, though. The language is C, not C++, and pdftex is not always compiled with gcc.

Once I get the code again, I'll polish it up. Jerome, just send me the code per mail or try to attach it to #871 again. Best Martin

LAURENS Jérôme

10 Feb

7:48 p.m.

Hi folks,

I have made a real effort to improve the code. I have removed the C++ comments and tried to follow the English C formatting rules, but this is not the main part.

To follow Taco's suggestion about coding style, I finally took some time to read Knuth's article on literate programming. One of the consequences is that the patch no longer breaks weave. The other consequence is that I took the liberty of gathering all the synchronization code into one section, except the macro definitions of course. This section is numbered 53b because I could not do anything else without breaking those weird *tex.ch. It is less than 3 pages long; I can send the extract off list.

I also added a namespace convention. It appears that referring to "synchronization" or "synchronize" in pdftex.web is ambiguous because there is already a synchronization problem concerning the terminal. So I decided to change the name and call this "Synchronize TeXnology", abbreviated to the keyword "synctex".

Now, if you

find trunk -exec grep "synctex" "{}" \; -print

you will find absolutely all the places where synchronization plays the slightest role.

Of course,
- the external file has been renamed to synctex.c
- the auxiliary file is foo.synctex
- the auxiliary shell script that will help editors and viewers to support synchronization will be named ... synctex. I think that this external helper should be stored on CTAN, in support/synctex

I forgot to say that the command line option is now -synctex=1 and the primitive is \synctex

There are 25 google pages for synctex, and no pages for "synchronize texnology".

I could not upload anything to sarovar; it seems that some limit has been reached. So I am sending the diff to Martin off list. If necessary, I can create a new patch entry at sarovar.

Hans Hagen

8:22 p.m.

LAURENS Jérôme wrote:

...

I forgot to say that the command line option is now -synctex=1 and the primitive is \synctex

the primitive probably will be changed into something \pdfsynchronize=<number>

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74
www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------

LAURENS Jérôme

11 Feb

10:33 a.m.

Le 10 févr. 08 à 20:22, Hans Hagen a écrit :

...

LAURENS Jérôme wrote:

...
I forgot to say that the command line option is now -synctex=1 and the primitive is \synctex

the primitive probably will be changed into something \pdfsynchronize=<number>

I don't think this is a good idea because synctex does not use any pdf related feature, unlike former pdfsync. It would be misleading to include "pdf" in the name. Moreover, one can see in synctex a common technology shared by all tex.web based engines, including xetex. I agree that a name like \pdfsynchronize is more user friendly, but the advantage of synctex is that it can also be used in the CLI... as I said the string "synctex" is also some kind of keyword.

Hans Hagen

11:23 a.m.

LAURENS Jérôme wrote:

...

Le 10 févr. 08 à 20:22, Hans Hagen a écrit :

...
LAURENS Jérôme wrote:

...
I forgot to say that the command line option is now -synctex=1 and the primitive is \synctex

the primitive probably will be changed into something \pdfsynchronize=<number>

I don't think this is a good idea because synctex does not use any pdf related feature, unlike former pdfsync. It would be misleading to include "pdf" in the name. Moreover, one can see in synctex a common technology shared by all tex.web based engines, including xetex.

I agree that a name like \pdfsynchronize is more user friendly, but the advantage of synctex is that it can also be used in the CLI... as I said the string "synctex" is also some kind of keyword.

the policy for pdftex is not to introduce new primitives unless preceded by \pdf ... (not that i care that much because macro packages can deal with it but for latex it was considered a problem as there are many packages out there we don't know of), so such is life ...

anyhow, even if we use a non prefixed primitive, i'd like a more verbose one

\synchronizeoutputstate

or, actually, it belongs to the trace family

\traceoutputpositions

or so; since we're talking tex here, the tex in \synctex is somewhat strange; of course you can call your mechanism as such, but that's a different matter

Hans

Martin Schröder

11:35 a.m.

2008/2/11, Hans Hagen :

...

LAURENS Jérôme wrote:

...
I agree that a name like \pdfsynchronize is more user friendly, but the advantage of synctex is that it can also be used in the CLI... as I said the string "synctex" is also some kind of keyword.

the policy for pdftex is not to introduce new primitives unless preceded by \pdf ... (not that i care that much because macro packages can deal with it but for latex it was considered a problem as there are many packages out there we don't know of), so such is life ...

This isn't really an issue anymore with \pdfprimitive. :-)

...

anyhow, even if we use a non prefixed primitive, i'd like a more verbose one

\synchronizeoutputstate

or, actually, it belongs to the trace family

\traceoutputpositions

or so; since we're talking tex here, the tex in \synctex is somewhat strange; of course you can call your mechanism as such, but that's a different matter

Agreed. And we shouldn't forget to include Jonathan in the discussion, since this should also end up in XeTeX. :-) Best Martin

Hans Hagen

11:43 a.m.

Martin Schröder wrote:

...

2008/2/11, Hans Hagen :

...
...
I agree that a name like \pdfsynchronize is more user friendly, but the advantage of synctex is that it can also be used in the CLI... as I said the string "synctex" is also some kind of keyword.

LAURENS Jérôme wrote: the policy for pdftex is not to introduce new primitives unless preceded by \pdf ... (not that i care that much because macro packages can deal with it but for latex it was considered a problem as there are many packages out there we don't know of), so such is life ...

This isn't really an issue anymore with \pdfprimitive. :-)

huh? that's not so much different from a bunch of \let \mysavedwhatever \whatever and that was rejected (\primitive is not something that end users will use and i never could convince the team to not use the \pdf primitive); also, if so, we should start renaming the existing \pdfwhatevers

...

...
anyhow, even if we use a non prefixed primitive, i'd like a more verbose one

\synchronizeoutputstate

or, actually, it belongs to the trace family

\traceoutputpositions

or so; since we're talking tex here, the tex in \synctex is somewhat strange; of course you can call your mechanism as such, but that's a different matter

Agreed. And we shouldn't forget to include Jonathan in the discussion, since this should also end up in XeTeX. :-)

(btw, is there an indication of the speed penalty and influence on the memory footprint when this feature is not enabled?)

Hans

Taco Hoekwater

11:53 a.m.

Hans Hagen wrote:

...

(btw, is there an indication of the speed penalty and influence on the memory footprint when this feature is not enabled?)

my guesstimate is less than half of a percent speed penalty, and about a 1-2% node memory size increase, for plain tex. Speed penalty for latex/context will be considerably lower still, because they spend much more time interpreting macros. The speed penalty will be higher when enabled, of course, but it makes no difference for memory consumption. Best wishes, Taco

LAURENS Jérôme

7:15 p.m.

Le 11 févr. 08 à 11:23, Hans Hagen a écrit :

...

the policy for pdftex is not to introduce new primitives unless preceded by \pdf ... (not that i care that much because macro packages can deal with it but for latex it was considered a problem as there are many packages out there we don't know of), so such is life ...

anyhow, even if we use a non prefixed primitive, i'd like a more verbose one

\synchronizeoutputstate

this one seems ambiguous

...

or, actually, it belongs to the trace family

\traceoutputpositions

or so; since we're talking tex here, the tex in \synctex is somewhat strange; of course you can call your mechanism as such, but that's a different matter

Couldn't we adopt an intermediate position?

\tracesynctexpositions

where you consider synctex as a whole word, just like mutex, cortex, vortex, vertex and of course latex.

In fact, due to performance reasons, only approximate output positions are traced: the output position unit is tex sp, but the synctex position unit is 8192 sp. I would expect a \traceoutputpositions to really report the output position, not just something approximate.

\tracingsynctex is another choice. More texish, but I would not choose it.

JL

Philip Taylor (Webmaster)

7:22 p.m.

If we /are/ going to abbreviate "synchroni{s|z}e" to "sync", could we please abbreviate it to "synch", otherwise there are going to be a lot of unintentional errors in coding ...

** Phil.
--------
[*] The "ch" group in "synchroni{s|z}e" is actually Greek "chi", whence indivisible.

LAURENS Jérôme

11:44 p.m.

Le 11 févr. 08 à 19:22, Philip Taylor (Webmaster) a écrit :

...

If we /are/ going to abbreviate "synchroni{s|z}e" to "sync", could we please abbreviate it to "synch", otherwise there are going to be a lot of unintentional errors in coding ...

synctex is the way to go

...

** Phil. -------- [*] The "ch" group in "synchroni{s|z}e" is actually Greek "chi", whence indivisible.

Hans Hagen

7:47 p.m.

LAURENS Jérôme wrote:

...

Couldn't we adopt an intermediate position?

\tracesynctexpositions

i see no reason for 'tex' in the name, after all, we are running tex; also, nowadays there is no need to have short names (many new primitives have verbose names)

...

where you consider synctex as a whole word, just like mutex, cortex, vortex, vertex and of course latex In fact, due to performance reasons, only approximate output positions are traced, the output position unit is tex sp, but the synctex position unit is 8192 sp. I would expect a \traceoutputpositions to really report the output position, not just something approximate.

well, i'm more thinking along the line ...

\traceoutputpositions = 0 : disabled
\traceoutputpositions = 1 : low resolution
\traceoutputpositions = 2 : medium resolution (8192 resolution)
\traceoutputpositions = 3 : high resolution

although positions are always approximations (take for instance complex ligatures or generated code) i wonder if we should divert from the full sp resolution; i'd say: output the full number (no division) and provide resolution in the amount of sync points

(if this feature also ends up in luatex, then it will be full resolution if only because all dimensions accessible at the lua end are in scaled points)

Hans

LAURENS Jérôme

11:42 p.m.

Le 11 févr. 08 à 19:47, Hans Hagen a écrit :

...

LAURENS Jérôme wrote:

...
Couldn't we adopt an intermediate position? \tracesynctexpositions

i see no reason for 'tex' in the name, after all, we are running tex; also, nowadays there is no need to have short names (many new primitives have verbose names)

...
where you consider synctex as a whole word, just like mutex, cortex, vortex, vertex and of course latex In fact, due to performance reasons, only approximate output positions are traced, the output position unit is tex sp, but the synctex position unit is 8192 sp. I would expect a \traceoutputpositions to really report the output position, not just something approximate.

well, i'm more thinking along the line ...

\traceoutputpositions = 0 : disabled
\traceoutputpositions = 1 : low resolution
\traceoutputpositions = 2 : medium resolution (8192 resolution)
\traceoutputpositions = 3 : high resolution

Unfortunately, this is not what synctex is designed for. It is more like

\tracesynctexpositions=0:disabled
\tracesynctexpositions=1:math, kern, glue, hbox
\tracesynctexpositions=2:add noads
\tracesynctexpositions=3:add some whatsits

All this is dedicated to the Editor/Viewer synchronization, and it is not advisable to use it for something else. Roughly speaking, here is how synctex works:

1 - some nodes have been extended to -always- store synchronization information
2 - if some flag is enabled, this information is used to help synchronization.

Point 1 belongs to the data model, it is public and can be used by any other tex extension. Point 2 is a controller, which means a piece of code that uses the data; it is -private- to synctex.

The \synctex (or \tracesynctexpositions) primitive controls the behaviour of this synctex controller. So it is dedicated to point 2 and this is why it must be named accordingly.

If someone plans to use point 1 for another tex extension, then it will be possible to use your \traceoutputpositions terminology if this is relevant.
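The split between point 1 (data that is always stored) and point 2 (a controller that decides what to report) can be illustrated with a small Python sketch. Everything here is hypothetical: the node kinds are simplified labels, the levels are assumed to be cumulative (reading "add" as "also report"), and the "kind:tag:line" output is an invented format, not the real .synctex one.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set

# Hypothetical cumulative levels, following the list above ("add" = "also report").
LEVEL_KINDS: Dict[int, Set[str]] = {
    0: set(),
    1: {"math", "kern", "glue", "hbox"},
    2: {"math", "kern", "glue", "hbox", "noad"},
    3: {"math", "kern", "glue", "hbox", "noad", "whatsit"},
}


@dataclass
class Node:
    kind: str
    tag: int    # point 1: always stored, whether or not synchronization is on
    line: int


def controller(nodes: Iterable[Node], level: int) -> List[str]:
    """Point 2: decide which of the stored records actually get reported."""
    kinds = LEVEL_KINDS.get(level, set())
    return [f"{n.kind}:{n.tag}:{n.line}" for n in nodes if n.kind in kinds]


page = [Node("hbox", 1, 10), Node("noad", 1, 11), Node("whatsit", 1, 12)]
for level in range(4):
    print(level, controller(page, level))
```

Note that the node list itself is the same at every level; only the controller's output changes.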

...

although positions are always approximations (take for instance complex ligatures or generated code) i wonder if we should divert from the full sp resolution; i'd say: output the full number (no division) and provide resolution in the amount of sync points

ok, the question is about precision

...

(if this feature also ends up in luatex, then it will be full resolution if only because all dimensions accessible at the lua end are in scaled points)

no, you are confusing the data and the controller. Actually, the dimensions are available in full dimensions, but this is a deliberate choice of the controller to truncate them to a lower precision. If the controller is implemented in lua, it will also decide to truncate simply because it is a matter of efficiency.

The purpose of the lower precision in synctex end is simply to reduce the size of the auxiliary output file. If we use full resolution, then the size increases by 40%, which is -very- significant because I/O operations are involved. 8192 is the best choice for that purpose.

I hope this explanation will shed light on the design of synctex, and make you understand how tex and synctex are tied together.

JL
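As a worked example of what truncating to 8192 sp means: TeX has 65536 sp per point, so one synctex unit is 1/8 pt and the truncation error stays below that. A small Python sketch, where the function names are made up for illustration and non-negative dimensions are assumed:

```python
SP_PER_PT = 65536      # TeX: 1pt = 65536sp
SYNCTEX_UNIT = 8192    # sp per synctex position unit, as stated above


def to_synctex_units(sp: int) -> int:
    """Truncate a dimension in sp to whole synctex units (non-negative input assumed)."""
    return sp // SYNCTEX_UNIT


def max_error_pt() -> float:
    """Largest possible truncation error, expressed in points."""
    return (SYNCTEX_UNIT - 1) / SP_PER_PT


x_sp = 10 * SP_PER_PT + 5000             # 10pt plus a little
units = to_synctex_units(x_sp)
print(units, units * SYNCTEX_UNIT / SP_PER_PT, max_error_pt())
print(len(str(x_sp)), len(str(units)))   # digits written at full vs. reduced precision
```

The second line of output hints at where the file-size saving comes from: the truncated value needs fewer digits than the full sp value. It does not attempt to reproduce the 40% figure, which depends on the actual file contents.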

Hans Hagen

12 Feb

9:26 a.m.

LAURENS Jérôme wrote:

...

\tracesynctexpositions=0:disabled
\tracesynctexpositions=1:math, kern, glue, hbox
\tracesynctexpositions=2:add noads
\tracesynctexpositions=3:add some whatsits

so, variant 2 and 3 are influencing the node lists (I assume that this does not influence the output, i.e. no inhibiting of the \last... primitives etc; sometimes whatsits can get in the way)

(btw, i then assume that there is also an implicit position insertion possible, like \injectsuchapositionhere or so)

...

If someone plans to use point 1 for another tex extension, then it will be possible to use your \traceoutputpositions terminology if this is relevant.

in the case of luatex (as an example) >= 2 would not work out well because then all the user code that operates on a node list should take these extra nodes (whatsits) into account (ignore them and such); so, in the case of luatex we would end up with just case 1 (but then in all nodes). If needed user code can inject additional nodes (like 2/3). At the end the info needed for synchronization can be written to file by a lua function that loops over the resulting lists (or a callback in the backend that produces the output), totally under user control. Of course then, someone can write a sync compatible format.

...

Actually, the dimensions are available in full dimensions, but this is a deliberate choice of the controller to truncate them to a lower precision.

i assumed as much :-)

...

If the controller is implemented in lua, it will also decide to truncate simply because it is a matter of efficiency.

it depends ... a bit of caching and a few more bytes may be cheaper than calculations

...

The purpose of the lower precision in synctex end is simply to reduce the size of the auxiliary output file. If we use full resolution, then the size increases by 40%, which is -very- significant because I/O operations are involved. 8192 is the best choice for that purpose.

(i can even imagine that outputting base points is an alternative because after all, viewers work in such dimensions)

...

I hope this explanation will shed light on the design of synctex, and make you understand how tex and synctex are tied together.

sure, but what we want to keep an eye on (at least for luatex) is that new functionality is open, not interfering, under user control etc

anyhow, pdftex is a good testbed for the sync feature (i suppose that xetex also needs some tweaking when method 2/3 comes into play because it may manipulate node lists differently from pdftex and has additional node types)

Hans

Taco Hoekwater

9:39 a.m.

Hans Hagen wrote:

...

LAURENS Jérôme wrote:

...
\tracesynctexpositions=0:disabled
\tracesynctexpositions=1:math, kern, glue, hbox
\tracesynctexpositions=2:add noads
\tracesynctexpositions=3:add some whatsits

so, variant 2 and 3 are influencing the node lists (I assume that this does not influence the output, i.e. no inhibiting of the \last... primitives etc; sometimes whatsits can get in the way)

I assume they will just be adding more output for other node types (read "add" as an abbreviation for "additionally, produce information for .."). The point of this patch, iiuc, is that there is no need for node list manipulations.

...

...
The purpose of the lower precision in synctex end is simply to reduce the size of the auxiliary output file. If we use full resolution, then the size increases by 40%, which is -very- significant because I/O operations are involved. 8192 is the best choice for that purpose.

It would be interesting to have the full precision as an option. A macro package for two-pass page break optimization may want to use the synctex file also, and such a package would really need full precision to avoid rounding errors. Best wishes, Taco

LAURENS Jérôme

3:08 p.m.

Le 12 févr. 08 à 09:39, Taco Hoekwater a écrit :

...

...
...
The purpose of the lower precision on the synctex end is simply to reduce the size of the auxiliary output file. If we use full resolution, then the size increases by 40%, which is very significant because I/O operations are involved. 8192 is the best choice for that purpose.

It would be interesting to have the full precision as an option. A macro package for two-pass page break optimization may want to use the synctex file also, and such a package would really need full precision to avoid rounding errors.

I am not in favour of adding complexity to synctex beyond its original purpose. I can't see how .synctex would benefit a page breaker. Conversely, basing an extension on the synctex file will add limitations to both the extension and synctex.

Can you elaborate?

Taco Hoekwater

3:23 p.m.

LAURENS Jérôme wrote:

...

Le 12 févr. 08 à 09:39, Taco Hoekwater a écrit :

...
...
...
The purpose of the lower precision on the synctex end is simply to reduce the size of the auxiliary output file. If we use full resolution, then the size increases by 40%, which is very significant because I/O operations are involved. 8192 is the best choice for that purpose.

It would be interesting to have the full precision as an option. A macro package for two-pass page break optimization may want to use the synctex file also, and such a package would really need full precision to avoid rounding errors.

I am not in favour of adding complexity to synctex beyond its original purpose. I can't see how .synctex would benefit a page breaker. Conversely, basing an extension on the synctex file will add limitations to both the extension and synctex.

can you elaborate?

I could, but no time. And it was not important anyway.

Best wishes,
Taco

LAURENS Jérôme

10:53 a.m.

Le 12 févr. 08 à 09:26, Hans Hagen a écrit :

...

LAURENS Jérôme wrote:

...
\tracesynctexpositions=0: disabled
\tracesynctexpositions=1: math, kern, glue, hbox
\tracesynctexpositions=2: add noads
\tracesynctexpositions=3: add some whatsits

so, variant 2 and 3 are influencing the node lists (I assume that this does not influence the output, i.e. no inhibiting of the \last... primitives etc; sometimes whatsits can get in the way)

Actually, absolutely no new node is created, unlike srcltx, src-specials and pdfsync. This is why there won't be compatibility problems with already existing macros.

...

(btw, i then assume that there is also an implicit position insertion possible, like \injectsuchapositionhere or so)

not yet. The idea would be something like \marksynctexposition{blah} or \synctexcomment{blah} where blah is something related to the content, but I do not know yet how it should be designed.

With synctex, you are absolutely sure that no new node has been created. This is what makes the difference with srcltx, src-specials and pdfsync, and this is what ensures that other packages won't break.

...

...
If someone plans to use point 1 for another tex extension, then it will be possible to use your \traceoutputpositions terminology if this is relevant.

in the case of luatex (as an example) >= 2 would not work out well because then all the user code that operates on a node list should take these extra nodes (whatsits) into account (ignore them and such); so, in the case of luatex we would end up with just case 1 (but then in all nodes). If needed user code can inject additional nodes (like 2/3). At the end the info needed for synchronization can be written to file by a lua function that loops over the resulting lists (of a callback in the backend that produces the output), totally under user control. Of course then, someone can write a sync compatible format.

No, this is not how the patch works. The purpose is not to create new nodes nor manipulate node lists, the purpose is only to observe existing nodes. If you extend the size of some whatsit nodes, they will become observable, and there will be more info in the output. I guess the size of the node is not something publicly available in luatex, otherwise there might be problems when changing the node size in the core of the engine.
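The "observe, do not modify" idea can be sketched as follows. This is only an illustration in Python, not the WEB or C code of the patch: the `Node` class, the `observe` function and the level table are hypothetical, and level 3 is simplified to cover every whatsit rather than "some" of them.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical mapping from the \tracesynctexpositions level to the node kinds
# observed at that level, following the list quoted in the thread.
OBSERVED_BY_LEVEL = {
    0: frozenset(),
    1: frozenset({"math", "kern", "glue", "hbox"}),
    2: frozenset({"math", "kern", "glue", "hbox", "noad"}),
    3: frozenset({"math", "kern", "glue", "hbox", "noad", "whatsit"}),
}


@dataclass(frozen=True)
class Node:
    kind: str                   # e.g. "glue", "kern", "char"
    line: Optional[int] = None  # source line kept in the node's extra room, if any


def observe(nodes: List[Node], level: int) -> List[Tuple[str, int]]:
    """Collect (kind, line) for observable nodes; the list itself is only read."""
    kinds = OBSERVED_BY_LEVEL[level]
    return [(n.kind, n.line) for n in nodes if n.kind in kinds and n.line is not None]
```

Raising the level only widens the set of kinds that are reported; the node list passed in is the same before and after the call, which is the property that keeps existing macros unaffected.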

...

...
Actually, the dimensions are available in full precision, but it is a deliberate choice of the controller to truncate them to a lower precision.

i assumed as much -)

...
If the controller is implemented in lua, it will also decide to truncate simply because it is a matter of efficiency.

it depends ... a bit of caching and a few more bytes may be cheaper than calculations

The whole problem lies in inter-process communication. If lua is able to talk directly to the viewer, the editor or the external synctex controller, then truncation is useless. But as long as IPC goes through text files, truncation is a real benefit. And I am afraid it is difficult to avoid text files in that particular situation because we have no control over viewers or editors.

JL
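A rough illustration of why truncation pays off when records travel as decimal text: fewer digits per coordinate means fewer bytes per record. The record layout below is hypothetical (only the "g:" prefix is borrowed from the discussion), and the coordinates are arbitrary sample values in scaled points.

```python
UNIT = 8192  # value quoted in the thread, assumed here to be a divisor in sp


def record(kind: str, line: int, h_sp: int, v_sp: int, unit: int = 1) -> str:
    """Format one position record as text, dividing coordinates by `unit`."""
    return f"{kind}:{line}:{h_sp // unit},{v_sp // unit}\n"


# The same glue position written at full and at reduced precision.
full = record("g", 42, 12345678, 34567890)
coarse = record("g", 42, 12345678, 34567890, UNIT)

saved = len(full) - len(coarse)  # characters saved on this single record
```

The actual saving depends on the real file format and on how large the coordinates are, so this sketch says nothing about the 40% figure itself.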

Hans Hagen

11:10 a.m.

LAURENS Jérôme wrote:

...

Le 12 févr. 08 à 09:26, Hans Hagen a écrit :

...
LAURENS Jérôme wrote:

...
\tracesynctexpositions=0: disabled
\tracesynctexpositions=1: math, kern, glue, hbox
\tracesynctexpositions=2: add noads
\tracesynctexpositions=3: add some whatsits

so, variant 2 and 3 are influencing the node lists (I assume that this does not influence the output, i.e. no inhibiting of the \last... primitives etc; sometimes whatsits can get in the way)

Actually, absolutely no new node is created, unlike srcltx, src-specials and pdfsync. This is why there won't be compatibility problems with already existing macros.

ok, so 2 is not 'add noad's (which are nodes -) but 'add info to more nodes'

...

...
(btw, i then assume that there is also an implicit position insertion possible, like \injectsuchapositionhere or so)

not yet.

maybe it makes sense to add such a primitive

...

The idea would be something like \marksynctexposition{blah} or \synctexcomment{blah} where blah is something related to the content, but I do not know yet how it should be designed

hm, adding labels would add more info to the log file (which you want to keep small)

...

With synctex, you are absolutely sure that no new node has been created. This is what makes the difference with srcltx, src specials and pdfsync and this is what ensures that other packages won't break.

ok, good

...

I guess the size of the node is not something publicly available in luatex, otherwise there might be problems when changing the node size in the core of the engine.

luatex has reimplemented nodes completely so any sync feature in there would work differently: i.e. extra fields in each node, accessible as such (from lua) and even manipulable;

the file location aspect is even a bit more tricky because one can overload the file handler and as such positional information that tracks back to the source also needs to carry (probably indirect) information about that (for instance: input can come from zip files, or input sequences can be transformed into other sequences); in that respect the writing of tracking info to a file may depend on what the macro package does with the input handling (of course this can be hidden from users by appropriate functions)

but there is no need to worry about that now (after all, this is the pdftex dev list)

Hans

LAURENS Jérôme

12:36 p.m.

Le 12 févr. 08 à 11:10, Hans Hagen a écrit :

...

LAURENS Jérôme wrote:

...
Le 12 févr. 08 à 09:26, Hans Hagen a écrit :

...
LAURENS Jérôme wrote:

...
\tracesynctexpositions=0: disabled
\tracesynctexpositions=1: math, kern, glue, hbox
\tracesynctexpositions=2: add noads
\tracesynctexpositions=3: add some whatsits

so, variant 2 and 3 are influencing the node lists (I assume that this does not influence the output, i.e. no inhibiting of the \last... primitives etc; sometimes whatsits can get in the way)

Actually, absolutely no new node is created, unlike srcltx, src-specials and pdfsync. This is why there won't be compatibility problems with already existing macros.

ok, so 2 is not 'add noad's (which are nodes -) but 'add info to more nodes'

yes, I meant "add more nodes to the list of observable nodes"...

...

...
...
(btw, i then assume that there is also an implicit position insertion possible, like \injectsuchapositionhere or so)

not yet.

maybe it makes sense to add such a primitive

...

...
The idea would be something like \marksynctexposition{blah} or \synctexcomment{blah} where blah is something related to the content, but I do not know yet how it should be designed

hm, adding labels would add more info to the log file (which you want to keep small)

The preliminary explanations about synchronization (packed with the early versions of the patch) are available at http://itexmac.sourceforge.net/synchronize+Frames.pdf

If you take a look at pages 5-9, you will see exactly the information the synctex auxiliary file provides. This is far better than former specials or pdfsync. I can't measure for sure, but to give just an idea, I would say that with this information, I can synchronize 99% of the characters, and 99.9% of the lines.

So I am not sure that a \injectsuchapositionhere would be of great help. Maybe adding meta information/comment would help.

Hans Hagen

1:07 p.m.

LAURENS Jérôme wrote:

...

http://itexmac.sourceforge.net/synchronize+Frames.pdf

If you take a look at pages 5-9, you will see exactly the information the synctex auxiliary file provides. This is far better than former specials or pdfsync. I can't measure for sure, but to give just an idea, I would say that with this information, I can synchronize 99% of the characters, and 99.9% of the lines.

btw, this all depends on the kind of document (e.g. xml input where data is filtered from the tree is tricky), but for traditionally edited tex docs this is probably true (apart from tables of contents, lists, generated info, but there one can disable the mechanism)

...

So I am not sure that a \injectsuchapositionhere would be of great help. Maybe adding meta information/comment would help.

it depends; i use \pdfsavepos cum suis a lot (sometimes docs with many thousands of positions, and megabytes of multipass files) but this is different from a sync feature (positional info carries much more info around)

just wondering ... why do you use h: and g: ? i can also imagine that you just use the node id (hlists and glues are just a few types)

LAURENS Jérôme

3:55 p.m.

Le 12 févr. 08 à 13:07, Hans Hagen a écrit :

...

just wondering ... why do you use h: and g: ? i can also imagine that you just use the node id (hlists and glues are just a few types)

hlists are useful because any output text belongs to an hlist. But hlists are not always created at parse time, which means that their associated line number is not accurate. In contrast, glue nodes (like kern and math ones) are created at parse time, which means that their associated line number is accurate. Moreover, glues and kerns are naturally created during the process, no need to force anything. Finally, TeX assumes that kern, math and glue nodes have the same size.

Basically, hlists are used to find where the text lies; the glue, kern and math nodes are used to find the real line number inside the boxes.

JL
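The division of labour described here (boxes locate the text, glue/kern/math records supply the line) could be sketched like this in Python. The `HBox` and `Mark` records and the nearest-mark rule are assumptions made for illustration; they do not reproduce the synctex file format or the viewer-side algorithm.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Mark:
    """A glue, kern or math record: horizontal position and source line."""
    h: int
    line: int


@dataclass
class HBox:
    """An hlist record: bounding rectangle plus the marks found inside it."""
    left: int
    top: int
    width: int
    height: int
    marks: List[Mark] = field(default_factory=list)

    def contains(self, h: int, v: int) -> bool:
        """True if the point (h, v) lies inside the box rectangle."""
        return (self.left <= h <= self.left + self.width
                and self.top <= v <= self.top + self.height)


def line_at(boxes: List[HBox], h: int, v: int) -> Optional[int]:
    """Map a click at (h, v) to a source line, or None if nothing matches."""
    for box in boxes:
        if box.contains(h, v) and box.marks:
            # The box says where the text is; the closest mark says which line.
            nearest = min(box.marks, key=lambda m: abs(m.h - h))
            return nearest.line
    return None
```

The box's own line number is never consulted in this sketch, which mirrors the point that hlist line numbers are not reliable while those of glue, kern and math nodes are.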

Martin Schröder

9 Feb 9 Feb

1:17 p.m.

2008/2/9, Taco Hoekwater :

...

LAURENS Jérôme wrote:

...
Open source, whether GPL2 or something else, I don't know for sure because I was not given any hint about pdftex development rules.

GPL2 is fine, just making sure.

GPL2++, actually.

...

...
...
* the attachments seem missing in sarovar.org at the moment

All the attachments have been removed, there might be a good reason, should I upload once again?

I don't know why they were removed in the first place. Martin, is that something you did?

Seems to be a Sarovar problem. Note that it does not say that there are no attachments; it just doesn't display them. The last line of the page is "Existing Files:" -- wherever they are. :-(

Best
Martin

28 comments

6 participants

participants (6)

Hans Hagen
LAURENS Jérôme
Martin Schröder
Philip Taylor (Webmaster)
Taco Hoekwater
The Thanh Han