Re: Leaking regexp_replace in 9.3.1 ? (was: [HACKERSUninterruptable regexp_replace in 9.3.1 ?)

From: Sandro Santilli <strk(at)keybit(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: Leaking regexp_replace in 9.3.1 ? (was: [HACKERSUninterruptable regexp_replace in 9.3.1 ?)
Date: 2014-03-19 08:57:02
Message-ID: 20140319085702.GF4042@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

A valgrind view on the matter (over a single call, in --single mode):

==21240== 6,748 (120 direct, 6,628 indirect) bytes in 1 blocks are definitely lost in loss record 203 of 239
==21240== at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240== by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
==21240== by 0x64A4C3: getsubdfa (regexec.c:285)
==21240== by 0x64B80A: cdissect (regexec.c:673)
==21240== by 0x64C802: pg_regexec (regexec.c:473)
==21240== by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240== by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240== by 0x6ECB10: regexp_matches (regexp.c:809)
==21240== by 0x5A641D: ExecMakeFunctionResult (execQual.c:1795)
==21240== by 0x5A8D6C: ExecProject (execQual.c:5261)
==21240== by 0x5BB471: ExecResult (nodeResult.c:155)
==21240== by 0x5A1E77: ExecProcNode (execProcnode.c:373)
==21240==
==21240== 6,748 (120 direct, 6,628 indirect) bytes in 1 blocks are definitely lost in loss record 204 of 239
==21240== at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240== by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
==21240== by 0x64A4C3: getsubdfa (regexec.c:285)
==21240== by 0x64B7EE: cdissect (regexec.c:671)
==21240== by 0x64BDE9: cdissect (regexec.c:692)
==21240== by 0x64C802: pg_regexec (regexec.c:473)
==21240== by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240== by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240== by 0x6ECB10: regexp_matches (regexp.c:809)
==21240== by 0x5A641D: ExecMakeFunctionResult (execQual.c:1795)
==21240== by 0x5A8D6C: ExecProject (execQual.c:5261)
==21240== by 0x5BB471: ExecResult (nodeResult.c:155)
==21240==
==21240== 6,748 (120 direct, 6,628 indirect) bytes in 1 blocks are definitely lost in loss record 205 of 239
==21240== at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240== by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
==21240== by 0x64A4C3: getsubdfa (regexec.c:285)
==21240== by 0x64B80A: cdissect (regexec.c:673)
==21240== by 0x64B46B: cdissect (regexec.c:633)
==21240== by 0x64B8AA: cdissect (regexec.c:689)
==21240== by 0x64BDE9: cdissect (regexec.c:692)
==21240== by 0x64C802: pg_regexec (regexec.c:473)
==21240== by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240== by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240== by 0x6ECB10: regexp_matches (regexp.c:809)
==21240== by 0x5A641D: ExecMakeFunctionResult (execQual.c:1795)
==21240==
==21240== 6,748 (120 direct, 6,628 indirect) bytes in 1 blocks are definitely lost in loss record 206 of 239
==21240== at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240== by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
==21240== by 0x64A4C3: getsubdfa (regexec.c:285)
==21240== by 0x64B7EE: cdissect (regexec.c:671)
==21240== by 0x64BDE9: cdissect (regexec.c:692)
==21240== by 0x64B46B: cdissect (regexec.c:633)
==21240== by 0x64B8AA: cdissect (regexec.c:689)
==21240== by 0x64BDE9: cdissect (regexec.c:692)
==21240== by 0x64C802: pg_regexec (regexec.c:473)
==21240== by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240== by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240== by 0x6ECB10: regexp_matches (regexp.c:809)
==21240==
==21240== 6,748 (120 direct, 6,628 indirect) bytes in 1 blocks are definitely lost in loss record 207 of 239
==21240== at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240== by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
==21240== by 0x64A4C3: getsubdfa (regexec.c:285)
==21240== by 0x64B97C: cdissect (regexec.c:975)
==21240== by 0x64B8AA: cdissect (regexec.c:689)
==21240== by 0x64BDE9: cdissect (regexec.c:692)
==21240== by 0x64B46B: cdissect (regexec.c:633)
==21240== by 0x64B8AA: cdissect (regexec.c:689)
==21240== by 0x64BDE9: cdissect (regexec.c:692)
==21240== by 0x64C802: pg_regexec (regexec.c:473)
==21240== by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240== by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240==
==21240== 16,928 bytes in 1 blocks are definitely lost in loss record 227 of 239
==21240== at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240== by 0x64A41E: newdfa.isra.4 (rege_dfa.c:289)
==21240== by 0x64A4C3: getsubdfa (regexec.c:285)
==21240== by 0x64B80A: cdissect (regexec.c:673)
==21240== by 0x64BDE9: cdissect (regexec.c:692)
==21240== by 0x64C802: pg_regexec (regexec.c:473)
==21240== by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240== by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240== by 0x6ECB10: regexp_matches (regexp.c:809)
==21240== by 0x5A641D: ExecMakeFunctionResult (execQual.c:1795)
==21240== by 0x5A8D6C: ExecProject (execQual.c:5261)
==21240== by 0x5BB471: ExecResult (nodeResult.c:155)
==21240==
==21240== 16,928 bytes in 1 blocks are definitely lost in loss record 228 of 239
==21240== at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240== by 0x64A41E: newdfa.isra.4 (rege_dfa.c:289)
==21240== by 0x64A4C3: getsubdfa (regexec.c:285)
==21240== by 0x64B7EE: cdissect (regexec.c:671)
==21240== by 0x64B46B: cdissect (regexec.c:633)
==21240== by 0x64B8AA: cdissect (regexec.c:689)
==21240== by 0x64BDE9: cdissect (regexec.c:692)
==21240== by 0x64C802: pg_regexec (regexec.c:473)
==21240== by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240== by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240== by 0x6ECB10: regexp_matches (regexp.c:809)
==21240== by 0x5A641D: ExecMakeFunctionResult (execQual.c:1795)
==21240==
==21240== 16,928 bytes in 1 blocks are definitely lost in loss record 229 of 239
==21240== at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240== by 0x64A41E: newdfa.isra.4 (rege_dfa.c:289)
==21240== by 0x64A4C3: getsubdfa (regexec.c:285)
==21240== by 0x64B80A: cdissect (regexec.c:673)
==21240== by 0x64BDE9: cdissect (regexec.c:692)
==21240== by 0x64B46B: cdissect (regexec.c:633)
==21240== by 0x64B8AA: cdissect (regexec.c:689)
==21240== by 0x64BDE9: cdissect (regexec.c:692)
==21240== by 0x64C802: pg_regexec (regexec.c:473)
==21240== by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240== by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240== by 0x6ECB10: regexp_matches (regexp.c:809)
==21240==
==21240== 16,928 bytes in 1 blocks are definitely lost in loss record 230 of 239
==21240== at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240== by 0x64A41E: newdfa.isra.4 (rege_dfa.c:289)
==21240== by 0x64A4C3: getsubdfa (regexec.c:285)
==21240== by 0x64B512: cdissect (regexec.c:903)
==21240== by 0x64BAF8: cdissect (regexec.c:1050)
==21240== by 0x64B8AA: cdissect (regexec.c:689)
==21240== by 0x64BDE9: cdissect (regexec.c:692)
==21240== by 0x64B46B: cdissect (regexec.c:633)
==21240== by 0x64B8AA: cdissect (regexec.c:689)
==21240== by 0x64BDE9: cdissect (regexec.c:692)
==21240== by 0x64C802: pg_regexec (regexec.c:473)
==21240== by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240==
==21240== LEAK SUMMARY:
==21240== definitely lost: 69,064 bytes in 12 blocks
==21240== indirectly lost: 38,143 bytes in 102 blocks
==21240== possibly lost: 0 bytes in 0 blocks
==21240== still reachable: 1,317,683 bytes in 342 blocks

--strk;

On Wed, Mar 19, 2014 at 08:53:59AM +0100, Sandro Santilli wrote:
> On Tue, Mar 18, 2014 at 07:41:32PM -0400, Tom Lane wrote:
> > Sandro Santilli <strk(at)keybit(dot)net> writes:
> > > Tom: I saw you pushed a fix for the interruptability with this:
> > > https://github.com/postgres/postgres/commit/f5f21315d25ffcbfe7c6a3fa6ffaad54d31bcde0
> >
> > > But we also noticed a memory leak in the regepx_replace call, did you notice
> > > that in your tests ? Same regexp as reported. Do you need another testcase ?
> >
> > I see no memory growth while running your original example against HEAD.
>
> This is what I'm running:
>
> SELECT regexp_matches('test', $REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*?\2)+)$REG$, 'g' );
>
> And this is what I see as the process memory:
>
> PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
> 5359 ? Ss 0:00 0 5693 172446 2908 0.0 postgres: strk strk [local] idle
> 5359 ? Ss 0:00 0 5693 172718 4224 0.0 postgres: strk strk [local] idle
> 5359 ? Ss 0:00 0 5693 173250 4488 0.0 postgres: strk strk [local] idle
> 5359 ? Ss 0:00 0 5693 173670 5016 0.0 postgres: strk strk [local] idle
> 5359 ? Ss 0:00 0 5693 174070 5280 0.0 postgres: strk strk [local] idle
> 5359 ? Ss 0:00 0 5693 174494 5544 0.0 postgres: strk strk [local] idle
> 5359 ? Ss 0:00 0 5693 174762 5808 0.0 postgres: strk strk [local] idle
> 5359 ? Ss 0:00 0 5693 175690 6600 0.0 postgres: strk strk [local] idle
> 5359 ? Ss 0:00 0 5693 179006 9240 0.0 postgres: strk strk [local] idle
>
> This is as of commit 63817f86b57fc3d29b57787bca9d786218b7ee25
>
> Or, simpler, from this state:
>
> PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
> 5359 ? Ss 0:00 0 5693 1274962 869832 5.3 postgres: strk strk [local] idle
>
> I run this:
>
> SELECT count(*) from (
> select regexp_matches( 'test', $REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*?\2)+)$REG$, 'g' )
> FROM generate_series(1, 10000)
> ) as f;
>
> And get here:
>
> PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
> 5359 ? Ss 0:00 0 5693 1274962 869832 5.3 postgres: strk strk [local] idle
> 5359 ? Rs 0:00 0 5693 1555894 1090536 6.6 postgres: strk strk [local] SELECT
> 5359 ? Ss 0:00 0 5693 2269682 1650744 10.0 postgres: strk strk [local] idle
>
> The memory is only released on session close.
> This is with generate_series going up to 1e5:
>
> PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
> 5359 ? Ss 0:00 0 5693 2269682 1650744 10.0 postgres: strk strk [local] idle
> 5359 ? Rs 0:00 0 5693 2300726 1675560 10.2 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:01 0 5693 6438626 4923024 30.0 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:02 0 5693 10591778 8182368 50.0 postgres: strk strk [local] SELECT
> 5359 ? Ss 0:03 0 5693 12216566 9457488 57.8 postgres: strk strk [local] idle
>
> Up to 1e6 the story ends with a crash, after:
>
> PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
> 5359 ? Ss 0:03 0 5693 12216566 9457488 57.8 postgres: strk strk [local] idle
> 5359 ? Rs 0:04 0 5693 15666674 12166392 74.3 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:05 0 5693 19403254 15099056 92.3 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:06 0 5693 20284546 15366412 93.9 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:07 1 5693 21665438 15333312 93.7 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:08 1 5693 23065426 15295472 93.5 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:09 1 5693 24468334 15256592 93.2 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:10 1 5693 25869382 15225544 93.0 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:11 1 5693 27280514 15210012 92.9 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:12 1 5693 28684350 15173964 92.7 postgres: strk strk [local] SELECT
> 5359 ? Ds 0:12 1 5693 29683014 15100680 92.3 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:13 1 5693 30298130 15051580 92.0 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:13 1 5693 30979690 15088992 92.2 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:13 1 5693 31758466 15139792 92.5 postgres: strk strk [local] SELECT
> 5359 ? Ds 0:14 1 5693 32406206 15111128 92.3 postgres: strk strk [local] SELECT
> 5359 ? Ds 0:14 1 5693 33045722 15131676 92.5 postgres: strk strk [local] SELECT
> 5359 ? Ds 0:14 1 5693 33479010 15135036 92.5 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:15 1 5693 33847174 15146748 92.6 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:15 1 5693 34186830 15131648 92.5 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:15 1 5693 34339878 15119460 92.4 postgres: strk strk [local] SELECT
> 5359 ? Ds 0:15 1 5693 34502474 15139816 92.5 postgres: strk strk [local] SELECT
> 5359 ? Ds 0:15 1 5693 34625554 15143068 92.5 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:15 1833 0 0 0 0.0 [postgres]
>
> The connection to the server was lost. Attempting reset: Failed.
> !>
>
> The configure switches:
>
> ./configure \
> --prefix=/home/postgresql-9.3.4 \
> --with-libxml \
> --with-python \
> --enable-nls \
> --enable-cassert \
> --enable-debug \
> --with-ossp-uuid
>
> --strk;
>
> ()  ASCII ribbon campaign -- Keep it simple !
> /\  http://strk.keybit.net/rants/ascii_mails.txt

--

()  ASCII ribbon campaign -- Keep it simple !
/\  http://strk.keybit.net/rants/ascii_mails.txt

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Greg Stark 2014-03-19 10:57:10 Re: Leaking regexp_replace in 9.3.1 ? (was: [HACKERSUninterruptable regexp_replace in 9.3.1 ?)
Previous Message Jesse Denardo 2014-03-19 08:20:21 Re: BUG #9616: Materialized view with indexes unable to load from pg_dump

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2014-03-19 09:52:26 Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:
Previous Message Pavel Stehule 2014-03-19 08:45:21 Review: plpgsql.extra_warnings, plpgsql.extra_errors