[Dev-luatex] Memory leak in string.explode()?

Hans Hagen pragma at wxs.nl
Wed Nov 7 11:33:48 CET 2012


On 11/7/2012 11:21 AM, Taco Hoekwater wrote:
> On 11/07/2012 11:07 AM, Hans Hagen wrote:
>> On 11/7/2012 10:33 AM, luigi scarso wrote:
>>>
>>>
>>> On Wed, Nov 7, 2012 at 10:07 AM, Hans Hagen <pragma at wxs.nl
>>> <mailto:pragma at wxs.nl>> wrote:
>>>
>>>     On 11/7/2012 1:50 AM, Reinhard Kotucha wrote:
>>>
>>>         What I don't grok is why it's 20 times slower to load a file
>>> at once
>>>         than in little chunks.  But what I'm mostly concerned about is
>>>         memory
>>>         consumption.  In xosview I can see that the garbage collector
>>>         reduces
>>>         the consumed memory from time to time, but it doesn't seem to be
>>>         overly hungry.  Even worse, I have the impression that memory
>>>         consumption grows exponentially with time.  With a slightly
>>> larger
>>>         test file my system (4GB RAM) would certainly run out of memory.
>>>
>>>
>>>     I think 20 times is somewhat off at your end because here I get
>>> this:
>>>
>>> Out of memory here with a test file of 662M
>>> Linux 32-bit, 4 GByte, PAE extension
>>>
>>> # time ./read_blocks.lua
>>>
>>> real    0m2.458s
>>> user    0m0.372s
>>> sys    0m1.084s
>>> # time ./read_whole_file.lua
>>> not enough memory
>>>
>>> real    0m17.125s
>>> user    0m11.737s
>>> sys    0m4.292s
>>>
>>> # texlua -v
>>> This is LuaTeX, Version beta-0.70.1-2012052416 (rev 4277)
>>
>> Indeed not enough mem on my laptop for a 600M+ test.
>>
>> Windows 8, 32 bit:
>>
>> -- all      1.082   34176584    120272.328125
>> -- chunked  0.668   34176584    169908.59667969
>> -- once     1.065   34176584    111757.03710938
>>
>> -- all      7.078   136706337   535063.34863281
>> -- chunked  3.441   136706337   787195.56933594
>> -- once     6.621   136706337   501559.83691406
>>
>> the larger values for 'all' and 'once' still puzzle me.
>
> malloc time, perhaps. The 'fast' *a loader does a couple of fseek()/
> ftell()s to find the file size, then malloc()s the whole string
> before feeding it to Lua, then frees it again. There is a
> fairly large copy in the feed process that I cannot avoid without
> using Lua internals instead of the published API.
>
> Btw, on my SSD disk, there is no noticeable difference between all
> three cases for an 85MB file.

Here (also SSD, but relatively slow SATA, as it's a 6-year-old laptop):

-- all      2.015   85000000    291368
-- chunked  1.140   85000000    268040
-- once     1.997   85000000    291119

Can you explain this:

     collectgarbage("collect")
     local m = collectgarbage("count")
     local t = os.clock()
     local f = io.open(name,'rb')
     local n = f:seek("end")
     f:seek("set",0)
     local d = f:read(n)
     f:close()
     print("once",os.clock()-t,#d,collectgarbage("count")-m)

Is seek slow? Doesn't seem so, as

     local n = lfs.attributes(name,"size")

gives the same timings. So maybe the chunker is also mallocing, but on
smaller chunks.
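For reference, the chunked reader being timed presumably looks something like the following minimal sketch (wrapped in a function here; the 64K block size and the final table.concat are my assumptions, not taken from the actual test scripts):

```lua
-- hypothetical reconstruction of the "chunked" case in the timings above;
-- the 64K block size is an assumption, not the size used in the tests
local function read_chunked(name)
    local f = assert(io.open(name, 'rb'))
    local r = { }
    while true do
        local d = f:read(2^16)   -- read one 64K block
        if not d then break end  -- f:read returns nil at end of file
        r[#r+1] = d
    end
    f:close()
    return table.concat(r)       -- join the blocks into one string
end
```

Timing it the same way as the "once" snippet (collectgarbage, os.clock, collectgarbage("count") before and after) would make the comparison direct; the table.concat at the end still builds one big string, so the win, if any, is in the read loop, not in peak memory.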

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

