Dear Sirs,

please try this file:

-----------------------------------------------------------
#! /usr/bin/env texlua
--*- Lua -*-

local testfile = 'testfile'
local use_lpeg = false

function split (s, sep)
  sep = lpeg.P(sep)
  local elem = lpeg.C((1 - sep)^0)
  local p = lpeg.Ct(elem * (sep * elem)^0)
  return lpeg.match(p, s)
end

fh = assert(io.open(testfile, 'r'))

while true do
  local line, rest = fh:read(2^13, '*line')
  if not line then break end
  if rest then line = line..rest end
  if use_lpeg then
    local tab = split(line, '\n')
  else
    local tab = line:explode('\n')
  end
end

fh:close()
-----------------------------------------------------------

Reading the file into memory this way is extremely fast and efficient in
regard to memory consumption.  I have to split the string "line" into
lines.  This can be done either by string.explode() or by split().

When I use the lpeg-based function split(), everything is fine, though
it's slower than string.explode().  But when I use string.explode(), I
see in xosview that memory consumption is steadily growing while the
program is running.

Regards,
  Reinhard

--
Reinhard Kotucha                              Phone: +49-511-3373112
Marschnerstr. 25, D-30167 Hannover            mailto:reinhard.kotucha@web.de
Microsoft isn't the answer.  Microsoft is the question, and the answer is NO.
On 11/04/2012 03:00 AM, Reinhard Kotucha wrote:
When I use the lpeg based function split(), everything is fine, though it's slower than string.explode(). But when I use string.explode(), I see in xosview that memory consumption is steadily growing while the program is running.
Yup, it leaks. Should be fixed now in repository trunk. Best wishes, Taco
On 2012-11-04 at 17:08:34 +0100, Taco Hoekwater wrote:
On 11/04/2012 03:00 AM, Reinhard Kotucha wrote:
When I use the lpeg based function split(), everything is fine, though it's slower than string.explode(). But when I use string.explode(), I see in xosview that memory consumption is steadily growing while the program is running.
Yup, it leaks. Should be fixed now in repository trunk.
Thanks, Taco.  Works properly now.  The svn version is also much faster
than the one in TeX Live; the speedup factor is about sqrt(2).

Regards,
  Reinhard
On 2012-11-04 at 23:42:04 +0100, Reinhard Kotucha wrote:
On 2012-11-04 at 17:08:34 +0100, Taco Hoekwater wrote:
On 11/04/2012 03:00 AM, Reinhard Kotucha wrote:
When I use the lpeg based function split(), everything is fine, though it's slower than string.explode(). But when I use string.explode(), I see in xosview that memory consumption is steadily growing while the program is running.
Yup, it leaks. Should be fixed now in repository trunk.
Thanks, Taco. Works properly now. The svn version is even much faster than the one in TL. Factor is abt. sqrt(2).
Hi Taco,

there's bad news though:

---------------------------------
#! /usr/bin/env texlua
--*- Lua -*-

s='abc def';  t=s.explode(' +')
s=' abc def'; t=s:explode(' +')
s='abc def '; t=s:explode(' +')
---------------------------------

Each of the three lines results in a core dump with rev 4468.  They
don't crash in current TeX Live, though there are no complaints about
the syntax error in the first line (s is a string here, not a table).

Regards,
  Reinhard
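A side note for readers following the thread: the first of those three
lines uses the dot call rather than the colon call, which changes what
explode() receives.  A minimal sketch using only standard string
functions (explode itself is left out, since its exact behaviour is
what is being debugged here):

-- Sketch only: the difference between s:method() and s.method().
local s = 'abc def'

-- s:upper()  is sugar for  string.upper(s)   -- s is passed
-- s.upper()  is            string.upper()    -- nothing is passed

print(s:upper())            --> ABC DEF
print(string.upper(s))      --> ABC DEF (the same call, spelled out)

local ok, err = pcall(s.upper)
print(ok, err)              --> false, "bad argument #1 to 'upper' ..."

-- By analogy, s.explode(' +') hands ' +' to explode as the string to
-- split and passes no separator at all, so the first of the three
-- lines above is not equivalent to the other two.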
On Mon, Nov 5, 2012 at 9:32 AM, Reinhard Kotucha
---------------------------------
#! /usr/bin/env texlua
--*- Lua -*-

s='abc def';  t=s.explode(' +')
s=' abc def'; t=s:explode(' +')
s='abc def '; t=s:explode(' +')
---------------------------------
Each of the three lines results in a core dump with rev 4468. They don't crash in current TeX Live, though there are no complaints about the syntax error in the first line (s is a string here, not a table).
Please try the simple patch attached.
On 11/05/2012 07:40 AM, minux wrote:
On Mon, Nov 5, 2012 at 9:32 AM, Reinhard Kotucha
wrote:

---------------------------------
#! /usr/bin/env texlua
--*- Lua -*-

s='abc def';  t=s.explode(' +')
s=' abc def'; t=s:explode(' +')
s='abc def '; t=s:explode(' +')
---------------------------------
Each of the three lines results in a core dump with rev 4468. They don't crash in current TeX Live, though there are no complaints about the syntax error in the first line (s is a string here, not a table).
Please try the simple patch attached.
Thanks, applied. Sorry, I must have been half asleep yesterday. Best wishes, Taco
On 2012-11-05 at 10:00:29 +0100, Taco Hoekwater wrote:
On 11/05/2012 07:40 AM, minux wrote:
On Mon, Nov 5, 2012 at 9:32 AM, Reinhard Kotucha
wrote:

---------------------------------
#! /usr/bin/env texlua
--*- Lua -*-

s='abc def';  t=s.explode(' +')
s=' abc def'; t=s:explode(' +')
s='abc def '; t=s:explode(' +')
---------------------------------
Each of the three lines results in a core dump with rev 4468. They don't crash in current TeX Live, though there are no complaints about the syntax error in the first line (s is a string here, not a table).
Please try the simple patch attached.
Thanks, applied. Sorry, I must have been half sleeping yesterday.
Thanks, also to minux (whoever this is).  It works now, but I
encountered another issue which is obviously inherited from mainstream
Lua.

If I read a file in blocks of 8kB and add each block to a table,
memory consumption is quite small, though the table finally contains
the whole file.  Furthermore, loading a file this way is very fast.
If I read the whole file into a string at once with f:read("*all"), an
enormous amount of memory is used and it's incredibly slow.

I would expect that reading a file into a string is faster than adding
little chunks to a table.  I would also expect that memory consumption
is similar, because finally the file is in memory (only once,
hopefully).  However, these files proved me wrong:

read_blocks.lua
-------------------------------------------------------
#! /usr/bin/env texlua
--*- Lua -*-

filename='testfile'

function pause ()
  print('press Return')
  io.read('*line')
end

filecontents={}

fh=assert(io.open(filename, 'r'))
while true do
  local block = fh:read(2^13)
  if not block then break end
  filecontents[#filecontents+1]=block
end
fh:close()

--[[
out=assert(io.open('out', 'w'))
for i,v in ipairs(filecontents) do
  out:write(v)
end
out:close()
--]]

pause()
-------------------------------------------------------

read_whole_file.lua
-------------------------------------------------------
#! /usr/bin/env texlua
--*- Lua -*-

filename='testfile'

function pause ()
  print('press Return')
  io.read('*line')
end

filecontents=''

fh=assert(io.open(filename, 'r'))
filecontents=fh:read('*all')
fh:close()

--[[
out=assert(io.open('out', 'w'))
out:write(filecontents)
out:close()
--]]

pause()
-------------------------------------------------------

The stuff commented out allows you to write the memory content back to
a file and compare it with the original, just to be sure that both
scripts do the same thing.  pause() allows you to inspect the memory
usage before the program ends; comment it out in order to measure
speed.

With a 600MB testfile I get:

$ time ./read_blocks.lua

real    0m0.665s
user    0m0.318s
sys     0m0.346s

$ time ./read_whole_file.lua

real    0m13.794s
user    0m10.733s
sys     0m3.055s

What I don't grok is why it's 20 times slower to load a file at once
than in little chunks.  But what I'm mostly concerned about is memory
consumption.  In xosview I can see that the garbage collector reduces
the consumed memory from time to time, but it doesn't seem to be overly
hungry.  Even worse, I have the impression that memory consumption
grows exponentially with time.  With a slightly larger test file my
system (4GB RAM) would certainly run out of memory.

Regards,
  Reinhard
On 11/07/2012 01:50 AM, Reinhard Kotucha wrote:
What I don't grok is why it's 20 times slower to load a file at once than in little chunks.
Traditional Lua reads "*all" file contents in BUFSIZ chunks, which are then concatenated onto the buffer that is being built up internally. The resulting [re|m]alloc()s slow down the reading (a lot). The patched version we use in luatex reads files < 100MB in one block and falls back to the standard Lua behaviour in other cases. The 100MB ceiling is arbitrary, and I could remove the limit. To be honest, I am not quite sure any more why I put that limit in there in the first place.

On the garbage collector: it does not do very well if you increase the memory requirements fast. That is just the way sweeping GCs work. It would be nice if Lua did reference counting instead, but that is a lot of work and quite hard to get right; as far as I know, all attempts at implementing it have been abandoned before reaching production quality.

Best wishes, Taco
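The realloc pattern Taco describes can be seen from plain Lua as well:
growing one string by repeated concatenation copies the whole buffer on
every step, while appending chunks to a table and doing a single
table.concat at the end does not.  A sketch, under the assumption that
a file called 'testfile' exists (any large file will do; the names here
are made up):

#! /usr/bin/env texlua
--*- Lua -*-
-- Sketch only: contrasts the two growth strategies.
local filename = 'testfile'

local function timed(label, f)
  collectgarbage('collect')
  local t = os.clock()
  f()
  print(label, os.clock() - t)
end

timed('string concat', function ()
  local fh = assert(io.open(filename, 'rb'))
  local s = ''
  while true do
    local chunk = fh:read(2^13)
    if not chunk then break end
    s = s .. chunk               -- the whole buffer is copied each time
  end
  fh:close()
end)

timed('table.concat', function ()
  local fh = assert(io.open(filename, 'rb'))
  local t = {}
  while true do
    local chunk = fh:read(2^13)
    if not chunk then break end
    t[#t+1] = chunk              -- only a reference is stored
  end
  fh:close()
  local s = table.concat(t)      -- one final allocation
  print('#bytes', #s)
end)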
On 11/7/2012 1:50 AM, Reinhard Kotucha wrote:
What I don't grok is why it's 20 times slower to load a file at once than in little chunks. But what I'm mostly concerned about is memory consumption. In xosview I can see that the garbage collector reduces the consumed memory from time to time, but it doesn't seem to be overly hungry. Even worse, I have the impression that memory consumption grows exponentially with time. With a slightly larger test file my system (4GB RAM) would certainly run out of memory.
I think 20 times is somewhat off at your end, because here I get this:

all      1.078   34176584   120272.328125
chunked  0.707   34176584   169908.59667969
once     1.129   34176584   111757.03710938

with:

do
  collectgarbage("collect")
  local m = collectgarbage("count")
  local t = os.clock()
  local f = io.open("all.xml",'rb')
  local d = f:read("*all")
  f:close()
  print("all",os.clock()-t,#d,collectgarbage("count")-m)
end

do
  collectgarbage("collect")
  local m = collectgarbage("count")
  local d = { }
  local t = os.clock()
  local f = io.open("all.xml",'rb')
  while true do
    local r = f:read(2^13)
    if not r then
      break
    else
      d[#d+1] = r
    end
  end
  f:close()
  d = table.concat(d)
  print("chunked",os.clock()-t,#d,collectgarbage("count")-m)
end

do
  collectgarbage("collect")
  local m = collectgarbage("count")
  local t = os.clock()
  local f = io.open("all.xml",'rb')
  local n = f:seek("end")
  f:seek("set",0)
  local d = f:read(n)
  f:close()
  print("once",os.clock()-t,#d,collectgarbage("count")-m)
end

When doing such tests, make sure that you do a garbage collection run
and clean up old variables.  If I remember right, luatex has a patched
buffer size (and fast loading when using "all") because we did similar
tests in the beginning.

What I don't understand is that the *all is so much slower.  In fact,
it should be faster, because only one string has to be hashed, unless
deep down Lua collects small snippets and hashes them.  Actually, I
always load with "all" because in the beginning it was way faster, so
something is messed up.  The chunked approach uses more memory.

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | voip: 087 875 68 74
www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------
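Hans's three do...end blocks all repeat the same measurement
scaffolding; if one wanted to reuse it, it could be factored into a
small helper along these lines (a sketch: 'measure' is a made-up name,
collectgarbage("count") reports kilobytes, os.clock() CPU seconds, and
only the "all" variant is shown):

-- Sketch of a reusable wrapper around the measurement pattern above.
local function measure(label, loader)
  collectgarbage("collect")
  local mem0 = collectgarbage("count")
  local t0   = os.clock()
  local data = loader()
  print(label, os.clock() - t0, #data, collectgarbage("count") - mem0)
end

measure("all", function ()
  local f = io.open("all.xml", "rb")
  local d = f:read("*all")
  f:close()
  return d
end)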
On 11/7/2012 10:33 AM, Taco Hoekwater wrote:
On 11/07/2012 10:07 AM, Hans Hagen wrote:
Actually, I always load with "all" because in the beginning it was way faster, so something is messed up. The chunked approach uses more memory.
See my post. You need a file > 100MB to see the slowdown.
Yes, but I still wonder why "*all" is slower for files < 100MB, as the memory can be allocated in one go.

Hans
On Wed, Nov 7, 2012 at 10:07 AM, Hans Hagen
On 11/7/2012 1:50 AM, Reinhard Kotucha wrote:
What I don't grok is why it's 20 times slower to load a file at once
than in little chunks. But what I'm mostly concerned about is memory consumption. In xosview I can see that the garbage collector reduces the consumed memory from time to time, but it doesn't seem to be overly hungry. Even worse, I have the impression that memory consumption grows exponentially with time. With a slightly larger test file my system (4GB RAM) would certainly run out of memory.
I think 20 times is somewhat off at your end because here I get this:
Out of memory here with a testfile of 662M.
Linux 32bit, 4GByte, PAE extension.

# time ./read_blocks.lua

real    0m2.458s
user    0m0.372s
sys     0m1.084s

# time ./read_whole_file.lua
not enough memory

real    0m17.125s
user    0m11.737s
sys     0m4.292s

# texlua -v
This is LuaTeX, Version beta-0.70.1-2012052416 (rev 4277)

(i.e. not the patched one)

-- luigi
On 11/7/2012 10:33 AM, luigi scarso wrote:
On Wed, Nov 7, 2012 at 10:07 AM, Hans Hagen
wrote:

On 11/7/2012 1:50 AM, Reinhard Kotucha wrote:
What I don't grok is why it's 20 times slower to load a file at once than in little chunks. But what I'm mostly concerned about is memory consumption. In xosview I can see that the garbage collector reduces the consumed memory from time to time, but it doesn't seem to be overly hungry. Even worse, I have the impression that memory consumption grows exponentially with time. With a slightly larger test file my system (4GB RAM) would certainly run out of memory.
I think 20 times is somewhat off at your end because here I get this:
Out of memory here with a testfile of 662M.
Linux 32bit, 4GByte, PAE extension.

# time ./read_blocks.lua

real    0m2.458s
user    0m0.372s
sys     0m1.084s

# time ./read_whole_file.lua
not enough memory

real    0m17.125s
user    0m11.737s
sys     0m4.292s

# texlua -v
This is LuaTeX, Version beta-0.70.1-2012052416 (rev 4277)
Indeed, not enough memory on my laptop for a 600M+ test.

Windows 8, 32 bit:

-- all      1.082   34176584    120272.328125
-- chunked  0.668   34176584    169908.59667969
-- once     1.065   34176584    111757.03710938

-- all      7.078   136706337   535063.34863281
-- chunked  3.441   136706337   787195.56933594
-- once     6.621   136706337   501559.83691406

The larger values for 'all' and 'once' still puzzle me.

Hans
On Wed, Nov 7, 2012 at 11:07 AM, Hans Hagen
Out of memory here with a testfile of 662M
Linux 32bit, 4GByte, PAE extension
# time ./read_blocks.lua
real    0m2.458s
user    0m0.372s
sys     0m1.084s

# time ./read_whole_file.lua
not enough memory

real    0m17.125s
user    0m11.737s
sys     0m4.292s

# texlua -v
This is LuaTeX, Version beta-0.70.1-2012052416 (rev 4277)
Indeed not enough mem on my laptop for a 600M+ test.
Windows 8, 32 bit:
-- all      1.082   34176584    120272.328125
-- chunked  0.668   34176584    169908.59667969
-- once     1.065   34176584    111757.03710938

-- all      7.078   136706337   535063.34863281
-- chunked  3.441   136706337   787195.56933594
-- once     6.621   136706337   501559.83691406
the larger values for 'all' and 'once' still puzzle me.
Note that I get "not enough memory" only with read_whole_file.lua; read_blocks.lua is ok (and fast).

-- luigi
On 11/07/2012 11:07 AM, Hans Hagen wrote:
On 11/7/2012 10:33 AM, luigi scarso wrote:
On Wed, Nov 7, 2012 at 10:07 AM, Hans Hagen
wrote:

On 11/7/2012 1:50 AM, Reinhard Kotucha wrote:
What I don't grok is why it's 20 times slower to load a file at once than in little chunks. But what I'm mostly concerned about is memory consumption. In xosview I can see that the garbage collector reduces the consumed memory from time to time, but it doesn't seem to be overly hungry. Even worse, I have the impression that memory consumption grows exponentially with time. With a slightly larger test file my system (4GB RAM) would certainly run out of memory.
I think 20 times is somewhat off at your end because here I get this:
Out of memory here with a testfile of 662M.
Linux 32bit, 4GByte, PAE extension.

# time ./read_blocks.lua

real    0m2.458s
user    0m0.372s
sys     0m1.084s

# time ./read_whole_file.lua
not enough memory

real    0m17.125s
user    0m11.737s
sys     0m4.292s

# texlua -v
This is LuaTeX, Version beta-0.70.1-2012052416 (rev 4277)
Indeed not enough mem on my laptop for a 600M+ test.
Windows 8, 32 bit:
-- all      1.082   34176584    120272.328125
-- chunked  0.668   34176584    169908.59667969
-- once     1.065   34176584    111757.03710938

-- all      7.078   136706337   535063.34863281
-- chunked  3.441   136706337   787195.56933594
-- once     6.621   136706337   501559.83691406
the larger values for 'all' and 'once' still puzzle me.
Malloc time, perhaps. The 'fast' "*a" loader does a couple of fseek()/ftell()s to find the file size, then malloc()s the whole string before feeding it to Lua, then free()s it again. There is a fairly large copy in the feed process that I cannot avoid without using Lua internals instead of the published API.

Btw, on my SSD disk there is no noticeable difference between all three cases for an 85MB file.
On 11/7/2012 11:21 AM, Taco Hoekwater wrote:
On 11/07/2012 11:07 AM, Hans Hagen wrote:
On 11/7/2012 10:33 AM, luigi scarso wrote:
On Wed, Nov 7, 2012 at 10:07 AM, Hans Hagen
wrote:

On 11/7/2012 1:50 AM, Reinhard Kotucha wrote:
What I don't grok is why it's 20 times slower to load a file at once than in little chunks. But what I'm mostly concerned about is memory consumption. In xosview I can see that the garbage collector reduces the consumed memory from time to time, but it doesn't seem to be overly hungry. Even worse, I have the impression that memory consumption grows exponentially with time. With a slightly larger test file my system (4GB RAM) would certainly run out of memory.
I think 20 times is somewhat off at your end because here I get this:
Out of memory here with a testfile of 662M.
Linux 32bit, 4GByte, PAE extension.

# time ./read_blocks.lua

real    0m2.458s
user    0m0.372s
sys     0m1.084s

# time ./read_whole_file.lua
not enough memory

real    0m17.125s
user    0m11.737s
sys     0m4.292s

# texlua -v
This is LuaTeX, Version beta-0.70.1-2012052416 (rev 4277)
Indeed not enough mem on my laptop for a 600M+ test.
Windows 8, 32 bit:
-- all      1.082   34176584    120272.328125
-- chunked  0.668   34176584    169908.59667969
-- once     1.065   34176584    111757.03710938

-- all      7.078   136706337   535063.34863281
-- chunked  3.441   136706337   787195.56933594
-- once     6.621   136706337   501559.83691406
the larger values for 'all' and 'once' still puzzle me.
malloc time, perhaps. The 'fast' *a loader does a couple of fseek() ftell()s to find the file size, then malloc()s the whole string before feeding it to Lua, then free-ing it again. There is a fairly large copy in the feed process that I cannot avoid without using lua internals instead of the published API.
Btw, on my SSD disk, there is no noticeable difference between all three cases for an 85MB file.
Here (also SSD, but relatively slow SATA, as it's a 6-year-old laptop):

-- all      2.015   85000000   291368
-- chunked  1.140   85000000   268040
-- once     1.997   85000000   291119

Can you explain the 'once' case?

collectgarbage("collect")
local m = collectgarbage("count")
local t = os.clock()
local f = io.open(name,'rb')
local n = f:seek("end")
f:seek("set",0)
local d = f:read(n)
f:close()
print("once",os.clock()-t,#d,collectgarbage("count")-m)

Is seek slow? Doesn't seem so, as

local n = lfs.attributes(name,"size")

gives the same timings. So maybe the chunker is also mallocing, but on
smaller chunks.

Hans
On 11/7/2012 11:21 AM, Taco Hoekwater wrote:

At my end, 2^24 is the most efficient (in time) block size.

Hans
Hi Reinhard,

At my end, this works best:

function io.readall(f)
  local size = f:seek("end")
  if size == 0 then
    return ""
  elseif size < 1024*1024 then
    f:seek("set",0)
    return f:read('*all')
  else
    local done = f:seek("set",0)
    if size < 1024*1024 then
      step = 1024 * 1024
    elseif size > 16*1024*1024 then
      step = 16*1024*1024
    else
      step = math.floor(size/(1024*1024)) * 1024 * 1024 / 8
    end
    local data = { }
    while true do
      local r = f:read(step)
      if not r then
        return table.concat(data)
      else
        data[#data+1] = r
      end
    end
  end
end

usage:

local f = io.open(name)
if f then
  data = io.readall(f)
  f:close()
end

Up to 50% faster and often less memory usage.

Hans
On 11/7/2012 1:08 PM, Hans Hagen wrote:
Hi Reinhard,
At my end, this works best:
Btw, speed is not so much an issue (because network speed, disk speed and OS caching play a role too, and often manipulating such large amounts of data takes way more processing time), but the lower memory consumption is a nice side effect.

Hans
On 2012-11-07 at 13:08:33 +0100, Hans Hagen wrote:
Hi Reinhard,
At my end, this works best:
function io.readall(f)
  local size = f:seek("end")
  if size == 0 then
    return ""
  elseif size < 1024*1024 then
    f:seek("set",0)
    return f:read('*all')
  else
    local done = f:seek("set",0)
    if size < 1024*1024 then
      step = 1024 * 1024
    elseif size > 16*1024*1024 then
      step = 16*1024*1024
    else
      step = math.floor(size/(1024*1024)) * 1024 * 1024 / 8
    end
    local data = { }
    while true do
      local r = f:read(step)
      if not r then
        return table.concat(data)
      else
        data[#data+1] = r
      end
    end
  end
end
usage:
local f = io.open(name)
if f then
  data = io.readall(f)
  f:close()
end
upto 50% faster and often less mem usage
Thank you, Hans.  Here it's faster than reading the file at once but
still slower than reading 8kB blocks.  It also consumes as much memory
as reading the file at once (and memory consumption grows
exponentially), but I could reduce memory consumption significantly by
replacing

  return table.concat(data)

with

  return data

table.concat() keeps the file twice in memory, once as a table and once
as a string.
btw, speed is not so much an issue (because network speed, disk speed, os caching plays a role too and often manipulating that large amounts of data takes way more processing time) but the less mem consumption side effect is nice
Yes, memory consumption is a problem on my machine at work.  I'm
running Linux in a virtual machine under 32-bit Windows.  Windows can
only use 3GB of memory and uses 800MB itself.  Though I can assign more
than 3GB to the VM, I suppose that I actually have less than 2.2GB and
the rest is provided by a swap file.  Furthermore, multi-tasking/
multi-user systems can only work if no program assumes that it's the
only one which is running.

Speed is important in many cases.  And I think that if you're writing a
function you want to use in various scripts, it's worthwhile to
evaluate the parameters carefully.

The idea I had was to write a function which allows one to read a text
file efficiently.  It should also be flexible and easy to use.  In Lua
it's convenient to read a file either line-by-line or at once.  Neither
is efficient: the first is extremely slow when lines are short and the
latter consumes a lot of memory.  And in many cases you don't even need
the content of the whole file.

What I have so far is a function which reads a block and [the rest of]
a line within an endless loop.  Each chunk is split into lines.  It
takes two arguments, the file name and a function.  For each chunk, the
function is run on each line.  Thus I'm able to filter the data and not
everything has to be stored in memory.

------------------------------------------------
#! /usr/bin/env texlua
--*- Lua -*-

function readfile (filename, fun)
  local lineno=1
  fh=assert(io.open(filename, 'r'))
  while true do
    local line, rest = fh:read(2^13, '*line')
    if not line then break end
    if rest then line = line..rest end
    local tab = line:explode('\n')
    for i, v in ipairs(tab) do
      fun(v, lineno)
      lineno=lineno+1
    end
  end
  fh:close()
end

function process_line (line, n)
  print(n, line)
end

readfile ('testfile', process_line)
------------------------------------------------

Memory consumption is either 8kB or the length of the longest line,
unless you store lines in a string or table.  Almost no extra memory is
needed if you manipulate each line somehow and write the result to
another file.  The only files I encountered which are really large are
CSV-like files which contain rows and columns of numbers, but the
function process_line() allows me to select only the rows and columns I
want to pass to pgfplots, for example.
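As an illustration of the column filtering described in the last
paragraph, a sketch that builds on the readfile() function above; the
file names, the column numbers, and the assumption of whitespace-
separated columns are made up for the example:

-- Sketch: keep only columns 1 and 3 of a whitespace-separated data
-- file and write them out in a form suitable for pgfplots.
-- Assumes the readfile() function shown above; 'data.csv' and
-- 'plot.dat' are hypothetical names.
local out = assert(io.open('plot.dat', 'w'))

local function select_columns (line, lineno)
  if lineno == 1 then return end           -- skip a header line, say
  local cols = line:explode(' +')          -- split on runs of spaces
  if cols[1] and cols[3] then
    out:write(cols[1], '\t', cols[3], '\n')
  end
end

readfile('data.csv', select_columns)
out:close()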
at my end 2^24 is the most efficient (in time) block size
I found out that 2^13 is most efficient.  But I suppose that the most
important thing is that it's an integer multiple of a filesystem data
block.  Since Taco provided os.type() and os.name(), it's possible to
make the chunk size system dependent.  But I fear that the actual
hardware (SSD vs. magnetic disk) has a bigger impact than the OS.

Regards,
  Reinhard
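For completeness, a sketch of making the chunk size system dependent as
suggested.  Note that in texlua os.type may be exposed as a plain
string rather than a function, depending on the version, so both cases
are handled; the sizes chosen per platform are placeholders, not
measured values:

-- Sketch: choose a read buffer size per platform.
local ostype = os.type
if type(ostype) == 'function' then ostype = ostype() end

local chunksize
if ostype == 'windows' then
  chunksize = 2^16               -- placeholder value
else
  chunksize = 2^13               -- the value found best above on Linux
end

print(ostype, chunksize)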
On 11/8/2012 2:05 AM, Reinhard Kotucha wrote:
Thank you, Hans. Here it's faster than reading the file at once but still slower than reading 8k Blocks. It also consumes as much memory as reading the file at once (and memory consumption grows exponentially), but I could reduce memory consumption significantly replacing
return table.concat(data)
with
return data
table.concat() keeps the file twice in memory, once as a table and once as a string.
But if you want to compare "*all" with blockwise loading you need to do the concat, because otherwise you compare different things; it's the concat that is costly (more than twice as much as the loading).
Yes, memory consumption is a problem on my machine at work. I'm running Linux in a virtual machine under 32-bit Windows. Windows can only use 3GB of memory and uses 800MB itself. Though I can assign more than 3GB to the VM, I suppose that I actually have less than 2.2GB and the rest is provided by a swap file. Furthermore, multi tasking/multi user systems can only work if no program assumes that it's the only one which is running.
Ah, but using a VM makes the comparison problematic, because in many cases a VM's file handling can be faster than on bare metal (TeX uses one core only, but in a VM the second core kicks in for some management tasks).
Speed is important in many cases. And I think that if you're writing a function you want to use in various scripts, it's worthwhile to evaluate the parameters carefully.
Sure, I do lots of speed/efficiency tests.
The idea I had was to write a function which allows to read a text file efficiently. It should also be flexible and easy to use.
Yes, but keep in mind that there are many parameters that influence it, like caching (an initial format generation after a fresh machine startup can, for instance, take 5 times longer than a successive one, and the same is true for this kind of test).
In Lua it's convenient to read a file either line-by-line or at once. Both are not efficient. The first is extremely slow when lines are short and the latter consumes a lot of memory. And in many cases you don't even need the content of the whole file.
Line-based reading needs to parse lines; it's faster to read the whole file with "rb" and loop over the lines with

  for s in string.gmatch("(.-)\n") do

or something similar.
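Spelled out, that could look like the sketch below.  The snippet as
quoted omits the subject string, so the sketch supplies one; 'testfile'
and the trailing-newline handling are assumptions:

-- Sketch of the whole-file-plus-gmatch approach.  The pattern "(.-)\n"
-- would silently drop a final line that lacks a trailing newline, so
-- one is appended if needed.
local f = assert(io.open('testfile', 'rb'))
local data = f:read('*all')
f:close()

if data:sub(-1) ~= '\n' then
  data = data .. '\n'
end

local count = 0
for s in string.gmatch(data, '(.-)\n') do
  count = count + 1
  -- process the line s here
end
print(count .. ' lines')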
What I have so far is a function which reads a block and [the rest of] a line within an endless loop. Each chunk is split into lines. It takes two arguments, the file name and a function. For each chunk, the function is run on each line. Thus I'm able to filter the data and not everything has to be stored in memory.
------------------------------------------------
#! /usr/bin/env texlua
--*- Lua -*-

function readfile (filename, fun)
  local lineno=1
  fh=assert(io.open(filename, 'r'))
  while true do
    local line, rest = fh:read(2^13, '*line')
    if not line then break end
    if rest then line = line..rest end
    local tab = line:explode('\n')
    for i, v in ipairs(tab) do
      fun(v, lineno)
      lineno=lineno+1
    end
  end
  fh:close()
end

function process_line (line, n)
  print(n, line)
end

readfile ('testfile', process_line)
you still store the exploded tab
------------------------------------------------
Memory consumption is either 8kB or the length of the longest line unless you store lines in a string or table. Almost no extra memory
you do store them as the explode splits a max 2^13 chunk into lines
is needed if you manipulate each line somehow and write the result to another file. The only files I encountered which are really large are CSV-like files which contain rows and columns of numbers, but the function process_line() allows me to select only the rows and columns I want to pass to pgfplots, for example.
at my end 2^24 is the most efficient (in time) block size
I found out that 2^13 is most efficient. But I suppose that the most important thing is that it's an integer multiple of a filesystem data block. Since Taco provided os.type() and os.name(), it's possible to make the chunk size system dependent. But I fear that the actual hardware (SSD vs. magnetic disk) has a bigger impact than the OS.
It's not OS dependent but filesystem dependent, and often disk sector
dependent.

Here's one that does not need the split:

local chunksize = 2^13 -- needs to be larger than the last line!
local chunksize = 2^12 -- quite okay

function processlinebyline(filename,action)
  local filehandle = io.open(filename,'rb')
  if not filehandle then
    return
  end
  local linenumber = 0
  local cursor = 0
  local lastcursor = nil
  while true do
    filehandle:seek("set",cursor)
    if lastcursor == cursor then
      -- we can also end up here when a line is too long to fit in the
      -- buffer
      local line = filehandle:read(chunksize)
      if line then
        linenumber = linenumber + 1
        action(line,linenumber)
      end
      filehandle:close()
      return
    else
      local buffer = filehandle:read(chunksize)
      if not buffer then
        filehandle:close()
        return
      end
      local grab = string.gmatch(buffer,"([^\n\r]-)(\r?\n)")
      local line, eoline = grab()
      lastcursor = cursor
      while line do
        local next, eonext = grab()
        if next then
          linenumber = linenumber + 1
          if action(line,linenumber) then
            filehandle:close()
            return
          end
          cursor = cursor + #line + #eoline
          line = next
          eoline = eonext
          lastcursor = nil
        else
          break
        end
      end
    end
  end
end

function processline(line,n)
  if n > 100 and n < 200 then
    print(n,#line,line)
    -- return true -- quits the loop
  end
end

processlinebyline('somefile.txt',processline)

Hans
On 2012-11-08 at 11:36:37 +0100, Hans Hagen wrote:
On 11/8/2012 2:05 AM, Reinhard Kotucha wrote:
Thank you, Hans. Here it's faster than reading the file at once but still slower than reading 8k Blocks. It also consumes as much memory as reading the file at once (and memory consumption grows exponentially), but I could reduce memory consumption significantly replacing
return table.concat(data)
with
return data
table.concat() keeps the file twice in memory, once as a table and once as a string.
but if you want to compare the *all with blockwise loading you need to do the concat because otherwise you compare differen things; it's the concat that is costly (more than twice as much as the loading)
Yes, I removed it in order to confirm that it's responsible for the memory consumption.
Yes, memory consumption is a problem on my machine at work. I'm running Linux in a virtual machine under 32-bit Windows. Windows can only use 3GB of memory and uses 800MB itself. Though I can assign more than 3GB to the VM, I suppose that I actually have less than 2.2GB and the rest is provided by a swap file. Furthermore, multi tasking/multi user systems can only work if no program assumes that it's the only one which is running.
ah, but using a vm is making comparison problematic because in many cases a vm's file handling can be faster than in bare metal (tex uses one core only but in a vm the second core kicks in for some management tasks)
Sorry, forgot to mention that I did all the comparisons on my 64-bit Linux box with 4GB RAM at home. Another problem at work is that I failed to compile xosview under CentOS. So I don't see when the system is swapping, which might happen frequently on the VM.
Speed is important in many cases. And I think that if you're writing a function you want to use in various scripts, it's worthwhile to evaluate the parameters carefully.
sure, i do lots of speed/efficiency tests
I know. However, I just installed Subversion and compiled the latest SVN version of LuaTeX on my Raspberry Pi. If you or anybody else is interested in benchmarks, just send me your test files.
The idea I had was to write a function which allows to read a text file efficiently. It should also be flexible and easy to use.
yes, but keep in mind that there are many parameters that influences it, like caching (an initial make format - fresh machine startup - can for instance take 5 times more time than a successive one and the same is true with this kind of tests)
When using the cache, I usually clear it first and then run the script several times. I also keep an eye on xosview in order to make sure that no other processes interfere. I think that an empty cache is what you have after a fresh startup. And the most important thing is that no web browser is running when doing benchmarks.
In Lua it's convenient to read a file either line-by-line or at once. Both are not efficient. The first is extremely slow when lines are short and the latter consumes a lot of memory. And in many cases you don't even need the content of the whole file.
line based reading needs to parse lines; it's faster to read the whole file with "rb" and loop over lines with
for s in string.gmatch("(.-)\n") do
or something similar
Hmm, something similar is Taco's string.explode() function. It's much faster than regular expressions, so I prefer it. What I didn't consider yet is that the separator can only be either \r or \n and I have to know in advance which linebreaks are used. But I have some ideas how to solve the problem.
What I have so far is a function which reads a block and [the rest of] a line within an endless loop. Each chunk is split into lines. It takes two arguments, the file name and a function. For each chunk, the function is run on each line. Thus I'm able to filter the data and not everything has to be stored in memory.
------------------------------------------------
#! /usr/bin/env texlua
--*- Lua -*-

function readfile (filename, fun)
  local lineno=1
  fh=assert(io.open(filename, 'r'))
  while true do
    local line, rest = fh:read(2^13, '*line')
    if not line then break end
    if rest then line = line..rest end
    local tab = line:explode('\n')
    for i, v in ipairs(tab) do
      fun(v, lineno)
      lineno=lineno+1
    end
  end
  fh:close()
end

function process_line (line, n)
  print(n, line)
end

readfile ('testfile', process_line)
you still store the exploded tab
------------------------------------------------
Memory consumption is either 8kB or the length of the longest line unless you store lines in a string or table. Almost no extra memory
you do store them as the explode splits a max 2^13 chunk into lines
Sure. But as far as I can see it doesn't hurt. The table is overwritten whenever a new chunk is processed. Thus, things don't accumulate. I don't know what happens when I overwrite a table. Maybe the new one allocates new memory and the old one is left to the garbage collector. But if this is the case, then the garbage collector does a pretty good job. The function is very fast and memory consumption isn't even visible in xosview.

BTW, the f:read(BUFFER, '*line') concept can be less efficient if lines are extremely long...
is needed if you manipulate each line somehow and write the result to another file. The only files I encountered which are really large are CSV-like files which contain rows and columns of numbers, but the function process_line() allows me to select only the rows and columns I want to pass to pgfplots, for example.
at my end 2^24 is the most efficient (in time) block size
I found out that 2^13 is most efficient. But I suppose that the most important thing is that it's an integer multiple of a filesystem data block. Since Taco provided os.type() and os.name(), it's possible to make the chunk size system dependent. But I fear that the actual hardware (SSD vs. magnetic disk) has a bigger impact than the OS.
it's not os dependent but filesystem dependent and often disk sector dependent
here's one that does not need the split
Well, it splits the file though:

  string.gmatch(buffer,"([^\n\r]-)(\r?\n)")

I suppose that the most promising approach is to use regexps in order
to determine the linebreak style, abort, and read the file again using
Taco's function.

Anyway, our discussion is obviously off-topic here.  Hans, I'll inform
you about the results by private mail.  If anybody else is interested
in the results, just drop me a line.

Regards,
  Reinhard
On 11/10/2012 12:02 AM, Reinhard Kotucha wrote:
I know. However, I just installed Subversion and copiled the latest SVN version of LuaTeX on my Raspberry Pi. If you or anybody else is interested in benchmarks, just send me your test files.
Interesting (I have one lying around). Did you use a 'real disk' or the small card?
When using the cache, I usually clear it first and then run the script several times. I also keep an eye on xosview in order to make sure that no other processes interfere. I think that an empty cache is what you have after a fresh startup. And the most important thing is that no web browser is running when doing benchmarks.
Indeed, and no Thunderbird either :-)
Hmm, something similar is Taco's string.explode() function. It's much faster than regular expressions, so I prefer it. What I didn't
Right, the reason for introducing string.explode is simple splitting.
Sure. But as far as I can see it doesn't hurt. The table is overwritten whenever a new chunk is processed. Thus, things don't accumulate. I don't know what happens when I overwrite a table.
But you still store the data twice (and I thought that you wanted to limit mem consumption)
Maybe the new one allocates new memory and the old one is left to the garbage collector. But if this is the case, then the garbage collector does a pretty good job. The function is very fast and memory consumption isn't even visible in xosview.
It sometimes helps to do a sweep with collectgarbage("collect"); you could play with doing that after (say) every 5 buffer loads.
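A sketch of what that could look like inside the read loop from earlier
in the thread (the interval of 5 is just the number Hans mentions, and
'testfile' the hypothetical file used throughout):

-- Sketch: force a full garbage collection every 5 buffer loads.
local fh = assert(io.open('testfile', 'rb'))
local loads = 0
while true do
  local block = fh:read(2^13)
  if not block then break end
  loads = loads + 1
  -- ... process the block here ...
  if loads % 5 == 0 then
    collectgarbage('collect')
  end
end
fh:close()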
BTW, the f:read(BUFFER, '*line') concept can be less efficient if lines are extremely long...
inefficient anyway, just as the 'lines' method
Well, it splits the file though:
string.gmatch(buffer,"([^\n\r]-)(\r?\n)")
I suppose that the most promising approach is to use regexps in order to determine the linebreak style, abort, and read the file again using Taco's function.
Yes, that sounds best (you could look at the last few characters of the file, assuming that the file ends each line with a newline).
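A sketch of that probe, under the assumptions Hans states (the file is
non-empty and its last line ends with a newline); the return values are
just descriptive strings:

-- Sketch: inspect the last two bytes of a file and assume the whole
-- file uses that line-ending convention.
local function guess_eol (filename)
  local f = assert(io.open(filename, 'rb'))
  local size = f:seek('end')
  if size == 0 then
    f:close()
    return 'empty'
  end
  f:seek('set', math.max(0, size - 2))
  local tail = f:read(2) or ''
  f:close()
  if tail:sub(-2) == '\r\n' then
    return 'crlf'                -- DOS/Windows
  elseif tail:sub(-1) == '\r' then
    return 'cr'                  -- old Mac style
  else
    return 'lf'                  -- Unix
  end
end

print(guess_eol('testfile'))     -- 'testfile' as in the earlier scripts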
Anyway, our discussion is obviously off-topic here. Hans, I'll inform you about the results by private mail. If anybody else is interested in the results, just drop me a line.
Ok.

Hans
On 2012-11-10 at 00:17:01 +0100, Hans Hagen wrote:
On 11/10/2012 12:02 AM, Reinhard Kotucha wrote:
I know. However, I just installed Subversion and copiled the latest SVN version of LuaTeX on my Raspberry Pi. If you or anybody else is interested in benchmarks, just send me your test files.
Interesting (i have one laying around). Did you use a 'real disk' or the small card?
The system is on a 4GB SD card.  In order to use TeX Live, I simply
mount a 16GB USB stick permanently.  Extending the storage this way is
possible here because I access the Raspberry Pi via ssh.  I could also
attach my SATA docking station to the second USB port.  Another option
is to mount an NFS partition from another machine.

BTW, if you're using the Debian image, it's good to know that ssh is
already enabled and all you have to do is make a port scan on the range
of IP numbers your DHCP server assigns.  Quite useful if you don't have
an HDMI cable and/or USB keyboard at hand.

Regards,
  Reinhard
participants (5):
- Hans Hagen
- luigi scarso
- minux
- Reinhard Kotucha
- Taco Hoekwater