[Dev-luatex] Memory leak in string.explode()?

Reinhard Kotucha reinhard.kotucha at web.de
Wed Nov 7 01:50:54 CET 2012


On 2012-11-05 at 10:00:29 +0100, Taco Hoekwater wrote:

 > On 11/05/2012 07:40 AM, minux wrote:
 > >
 > > On Mon, Nov 5, 2012 at 9:32 AM, Reinhard Kotucha
 > > <reinhard.kotucha at web.de> wrote:
 > >
 > >     ---------------------------------
 > >     #! /usr/bin/env texlua
 > >     --*- Lua -*-
 > >
 > >     s='abc def'; t=s.explode(' +')
 > >     s=' abc def'; t=s:explode(' +')
 > >     s='abc def '; t=s:explode(' +')
 > >     ---------------------------------
 > >
 > >     Each of the three lines results in a core dump with rev 4468.  They
 > >     don't crash in current TeX Live, though there are no complaints about
 > >     the syntax error in the first line (s is a string here, not a table).
 > >
 > > Please try the simple patch attached.
 > 
 > Thanks, applied. Sorry, I must have been half sleeping yesterday.

Thanks, also to minux (whoever this is).
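
A note on the first line of the quoted script: as far as I can tell,
Lua accepts it silently because it is valid syntax.  Strings share a
metatable whose __index is the string table, so s.explode merely looks
up the function, and the first explicit argument becomes the subject.
The sketch below shows this with string.upper as a stand-in, on the
assumption that string.explode (a LuaTeX extension) is looked up the
same way:

```lua
-- Sketch: what the dot and colon forms pass to a string method.
-- string.upper stands in for the LuaTeX-only string.explode.
local s = 'abc def'

-- Colon form: s is passed as the first argument.
local with_colon = s:upper()        -- same as string.upper(s)

-- Dot form: s is only used to look up the function; the first
-- explicit argument becomes the subject instead.
local with_dot = s.upper(' +x')     -- same as string.upper(' +x')

print(with_colon)
print(with_dot)
```

So s.explode(' +') would split the string ' +' with the default
separator rather than splitting s.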

It works now, but I encountered another issue which is obviously
inherited from mainstream Lua.

If I read a file in blocks of 8kB and add each block to a table,
memory consumption is quite small, though the table finally contains
the whole file.  Furthermore, loading a file this way is very fast.

If I read the whole file into a string at once with f:read("*all"), an
enormous amount of memory is used and it's incredibly slow.

I would expect reading a file into a string to be faster than adding
small chunks to a table.  I would also expect memory consumption to be
similar, because in the end the file is in memory (only once, hopefully).

However, these files proved me wrong:


read_blocks.lua
-------------------------------------------------------
#! /usr/bin/env texlua
--*- Lua -*-

filename='testfile'

function pause ()
   print('press Return')
   io.read('*line')
end

filecontents={}

fh=assert(io.open(filename, 'r'))
while true do
   local block = fh:read(2^13)
   if not block then break end
   filecontents[#filecontents+1]=block
end
fh:close()

--[[
out=assert(io.open('out', 'w'))
for i,v in ipairs(filecontents) do
   out:write(v)
end
out:close()
--]]

pause()
-------------------------------------------------------




read_whole_file.lua
-------------------------------------------------------
#! /usr/bin/env texlua
--*- Lua -*-

filename='testfile'

function pause ()
   print('press Return')
   io.read('*line')
end

filecontents=''

fh=assert(io.open(filename, 'r'))
filecontents=fh:read('*all')
fh:close()

--[[
out=assert(io.open('out', 'w'))
out:write(filecontents)
out:close()
--]]

pause()
-------------------------------------------------------

The code that is commented out lets you write the memory content back
to a file and compare it with the original, just to be sure that both
scripts do the same thing.
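
The same check can probably be done inside a single script, without
the 'out' file.  This is only a sketch: the temporary file from
os.tmpname() and its 100000-byte content are assumptions for
illustration, and table.concat is used to join the blocks into one
string for the comparison.

```lua
-- Sketch: check that block-wise and whole-file reads give the same bytes.
-- The temporary file and its size are illustrative, not the 600MB testfile.
local filename = os.tmpname()

local out = assert(io.open(filename, 'wb'))
out:write(string.rep('abc def\n', 12500))   -- 8 bytes * 12500 = 100000 bytes
out:close()

-- Read in 8kB blocks, as in read_blocks.lua.
local blocks = {}
local fh = assert(io.open(filename, 'rb'))
while true do
   local block = fh:read(8192)
   if not block then break end
   blocks[#blocks+1] = block
end
fh:close()

-- Read everything at once, as in read_whole_file.lua.
fh = assert(io.open(filename, 'rb'))
local whole = fh:read('*all')
fh:close()
os.remove(filename)

-- Join the blocks and compare with the string read in one go.
local joined = table.concat(blocks)
print(#blocks, #joined, joined == whole)
```

Note that table.concat creates a second copy of the data, so this is
meant for verification with small files only.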

pause() allows you to inspect the memory usage before the program
ends.  Comment it out in order to determine speed.  With a 600MB
testfile I get:

$ time ./read_blocks.lua 
real	0m0.665s
user	0m0.318s
sys	0m0.346s

$ time ./read_whole_file.lua 
real	0m13.794s
user	0m10.733s
sys	0m3.055s
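
For timing a single part of a script rather than the whole process,
os.clock() can be used from within Lua.  A minimal sketch follows; the
helper name timed() and the build() workload are hypothetical and only
stand in for one of the two read loops.

```lua
-- Sketch: time a piece of code from inside the script with os.clock().
-- os.clock() reports an approximation of the CPU time used by the
-- program, so on Linux it corresponds roughly to user+sys above, not
-- to the wall-clock "real" value.
local function timed (label, fn, ...)
   local start = os.clock()
   local result = fn(...)
   local elapsed = os.clock() - start
   print(string.format('%-12s %.3fs CPU', label, elapsed))
   return result, elapsed
end

-- Hypothetical workload standing in for one of the two read loops.
local function build (n)
   local t = {}
   for i = 1, n do t[i] = 'x' end
   return table.concat(t)
end

local s, elapsed = timed('build', build, 100000)
```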

What I don't grok is why it's about 20 times slower to load a file at
once than in little chunks.  But what I'm mostly concerned about is
memory consumption.  In xosview I can see that the garbage collector
reduces the consumed memory from time to time, but it doesn't seem to
be overly eager to do so.  Even worse, I have the impression that
memory consumption grows exponentially with time.  With a slightly
larger test file my system (4GB RAM) would certainly run out of memory.
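
To narrow down whether the growth is garbage on the Lua heap, the heap
size can be queried from within the script.  This is a sketch only:
heap_kb() is a hypothetical helper and the 1MB string is an arbitrary
example.  collectgarbage('count') reports the memory managed by Lua in
kilobytes, which is not necessarily the process size shown by xosview.

```lua
-- Sketch: report the size of the Lua heap around an allocation.
-- collectgarbage('count') returns kilobytes managed by the Lua allocator
-- only; it is not the process size that xosview displays.
local function heap_kb ()
   collectgarbage('collect')          -- full cycle first, for a stable figure
   return collectgarbage('count')
end

local before = heap_kb()
local data = string.rep('x', 1024 * 1024)   -- illustrative 1MB string
local after = heap_kb()

print(string.format('heap grew by about %.0f kB', after - before))
```

Calling heap_kb() before and after fh:read('*all') in
read_whole_file.lua should show whether the Lua heap ends up close to
the file size once a full collection has run.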

Regards,
  Reinhard

-- 
----------------------------------------------------------------------------
Reinhard Kotucha                                      Phone: +49-511-3373112
Marschnerstr. 25
D-30167 Hannover                              mailto:reinhard.kotucha at web.de
----------------------------------------------------------------------------
Microsoft isn't the answer. Microsoft is the question, and the answer is NO.
----------------------------------------------------------------------------

