[Dev-luatex] Dump sharing - redundant copying/allocations

Hans Hagen j.hagen at xs4all.nl
Wed Jul 21 16:31:29 CEST 2021

On 7/21/2021 4:08 PM, Michal Vlasák wrote:
> On Wed Jul 21, 2021 at 3:57 PM CEST, Hans Hagen wrote:
>> even if one would handle bytes in the lua bytecode, the bytecode itself
>> is not portable (and i'm not even sure how luajit stuff fits in because
>> luajit is even more platform specific)
> LuaJIT is actually really nice in this regard:
>      "The generated bytecode is portable and can be loaded on any
>      architecture that LuaJIT supports, independent of word size or
>      endianess. However the bytecode compatibility versions must match."

Ok, btw, these bytecode compatibility versions are not guaranteed to stay
the same between intermediate updates (so for instance during the 5.4 dev
stage they changed .. i actually took care of that but didn't want to
patch the official code - with a sub number - any more so i dropped that)

>> one observation is that using macros instead of functions for
>> performance makes little sense in a program like tex where one jumps
>> over memory space all the time (compilers are quite okay in optimizing),
>> but there can be differences between versions of e.g. gcc
> I think that modern compilers are good with inlining, one can get more
> especially when functions are marked static. So I incline towards
> functions rather than macros.

also, local optimization is better
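
As a rough illustration of the macro-versus-function point (a minimal
sketch, not code taken from luatex; all names below are hypothetical): a
function-like macro pastes its argument in textually, so an argument with
side effects is evaluated more than once, while a static function
evaluates it once and still leaves the compiler free to inline the call.

```c
/* Macro version: the argument text is substituted twice. */
#define SQUARE_MACRO(x) ((x) * (x))

/* Function version: the argument is evaluated once. Being static, the
   function is local to this file, so the compiler can inline it. */
static inline int square_fn(int x)
{
    return x * x;
}

/* Hypothetical helper with a side effect: it counts its own calls. */
static int calls = 0;

static int next_value(void)
{
    calls++;
    return 3;
}

/* How many times is next_value() called when passed to the macro? */
static int calls_with_macro(void)
{
    calls = 0;
    (void) SQUARE_MACRO(next_value());
    return calls;
}

/* And when passed to the function? */
static int calls_with_function(void)
{
    calls = 0;
    (void) square_fn(next_value());
    return calls;
}
```

Here calls_with_macro() reports two calls and calls_with_function()
reports one, which is one reason to prefer functions when the compiler
can be trusted to inline them.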

>> in general, loss of performance in a tex engine is more due to the way
>> macros are composed (or user styles for that matter)
>> another one is the performance of the console, i.e. kind of font,
>> buffer, refresh delays defaults (i noticed that linux has large delays
>> so that's the fastest, the new windows terminal is also fast) .. now
>> that one is really measurable .. just try to run with piping the log
>> to a file (all understandable) .. squeezing microseconds out of the
>> binary can easily be nullified that way
> Yeah, you are right, even for 18 lines of console output I measure a more
> noticeable difference than with the mallocs and byte swapping.
it has to do with the fact that tex outputs on a char by char basis with 
different criteria for log and console, and most consoles accumulate 
some before flushing, sometimes up to 200 ms
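
To make the accumulate-then-flush idea concrete, here is a minimal
sketch, assuming a small fixed buffer that is flushed at each newline or
when full. It is not how luatex or any particular console implements
this; the names line_buffer, lb_putc, lb_flush and count_writes, and the
buffer size, are made up for the example. Characters still arrive one at
a time, but the number of actual writes drops to roughly one per line.

```c
#include <stdio.h>

#define BUF_SIZE 64            /* assumed size, for illustration only */

typedef struct {
    FILE  *out;                /* where flushed data goes */
    char   data[BUF_SIZE];     /* characters accumulated so far */
    size_t used;               /* number of valid bytes in data */
    size_t writes;             /* how many fwrite calls were made */
} line_buffer;

/* Push everything accumulated so far to the output in one write. */
static void lb_flush(line_buffer *b)
{
    if (b->used > 0) {
        fwrite(b->data, 1, b->used, b->out);
        b->writes++;
        b->used = 0;
    }
}

/* Accept a single character; flush on newline or when the buffer is full. */
static void lb_putc(line_buffer *b, char c)
{
    b->data[b->used++] = c;
    if (c == '\n' || b->used == BUF_SIZE) {
        lb_flush(b);
    }
}

/* Feed text char by char and report how many writes were needed. */
static size_t count_writes(FILE *out, const char *text)
{
    line_buffer b = { out, {0}, 0, 0 };
    for (const char *p = text; *p != '\0'; p++) {
        lb_putc(&b, *p);
    }
    lb_flush(&b);              /* emit any trailing partial line */
    return b.writes;
}
```

For instance, count_writes(stdout, "hello\nworld\n") consumes twelve
characters but issues only two writes; an unbuffered per-char path would
issue twelve.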

the old windows console outputs per char so that one was hurt most, but 
there were plenty of ways around that; on osx fancy font features in a 
console could also work out badly


                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
        tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
