Re: Leaking regexp_replace in 9.3.1 ? (was: [HACKERSUninterruptable regexp_replace in 9.3.1 ?)

From: Sandro Santilli <strk(at)keybit(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: Leaking regexp_replace in 9.3.1 ? (was: [HACKERSUninterruptable regexp_replace in 9.3.1 ?)
Date: 2014-03-19 07:53:59
Message-ID: 20140319075359.GA4042@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On Tue, Mar 18, 2014 at 07:41:32PM -0400, Tom Lane wrote:
> Sandro Santilli <strk(at)keybit(dot)net> writes:
> > Tom: I saw you pushed a fix for the interruptability with this:
> > https://github.com/postgres/postgres/commit/f5f21315d25ffcbfe7c6a3fa6ffaad54d31bcde0
>
> > But we also noticed a memory leak in the regepx_replace call, did you notice
> > that in your tests ? Same regexp as reported. Do you need another testcase ?
>
> I see no memory growth while running your original example against HEAD.

This is what I'm running:

SELECT regexp_matches('test', $REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*?\2)+)$REG$, 'g' );

And this is what I see as the process memory:

PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
5359 ? Ss 0:00 0 5693 172446 2908 0.0 postgres: strk strk [local] idle
5359 ? Ss 0:00 0 5693 172718 4224 0.0 postgres: strk strk [local] idle
5359 ? Ss 0:00 0 5693 173250 4488 0.0 postgres: strk strk [local] idle
5359 ? Ss 0:00 0 5693 173670 5016 0.0 postgres: strk strk [local] idle
5359 ? Ss 0:00 0 5693 174070 5280 0.0 postgres: strk strk [local] idle
5359 ? Ss 0:00 0 5693 174494 5544 0.0 postgres: strk strk [local] idle
5359 ? Ss 0:00 0 5693 174762 5808 0.0 postgres: strk strk [local] idle
5359 ? Ss 0:00 0 5693 175690 6600 0.0 postgres: strk strk [local] idle
5359 ? Ss 0:00 0 5693 179006 9240 0.0 postgres: strk strk [local] idle

This is as of commit 63817f86b57fc3d29b57787bca9d786218b7ee25

Or, simpler, from this state:

PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
5359 ? Ss 0:00 0 5693 1274962 869832 5.3 postgres: strk strk [local] idle

I run this:

SELECT count(*) from (
select regexp_matches( 'test', $REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*?\2)+)$REG$, 'g' )
FROM generate_series(1, 10000)
) as f;

And get here:

PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
5359 ? Ss 0:00 0 5693 1274962 869832 5.3 postgres: strk strk [local] idle
5359 ? Rs 0:00 0 5693 1555894 1090536 6.6 postgres: strk strk [local] SELECT
5359 ? Ss 0:00 0 5693 2269682 1650744 10.0 postgres: strk strk [local] idle

The memory is only released on session close.
This is with generate_series going up to 1e5:

PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
5359 ? Ss 0:00 0 5693 2269682 1650744 10.0 postgres: strk strk [local] idle
5359 ? Rs 0:00 0 5693 2300726 1675560 10.2 postgres: strk strk [local] SELECT
5359 ? Rs 0:01 0 5693 6438626 4923024 30.0 postgres: strk strk [local] SELECT
5359 ? Rs 0:02 0 5693 10591778 8182368 50.0 postgres: strk strk [local] SELECT
5359 ? Ss 0:03 0 5693 12216566 9457488 57.8 postgres: strk strk [local] idle

Up to 1e6 the story ends with a crash, after:

PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
5359 ? Ss 0:03 0 5693 12216566 9457488 57.8 postgres: strk strk [local] idle
5359 ? Rs 0:04 0 5693 15666674 12166392 74.3 postgres: strk strk [local] SELECT
5359 ? Rs 0:05 0 5693 19403254 15099056 92.3 postgres: strk strk [local] SELECT
5359 ? Rs 0:06 0 5693 20284546 15366412 93.9 postgres: strk strk [local] SELECT
5359 ? Rs 0:07 1 5693 21665438 15333312 93.7 postgres: strk strk [local] SELECT
5359 ? Rs 0:08 1 5693 23065426 15295472 93.5 postgres: strk strk [local] SELECT
5359 ? Rs 0:09 1 5693 24468334 15256592 93.2 postgres: strk strk [local] SELECT
5359 ? Rs 0:10 1 5693 25869382 15225544 93.0 postgres: strk strk [local] SELECT
5359 ? Rs 0:11 1 5693 27280514 15210012 92.9 postgres: strk strk [local] SELECT
5359 ? Rs 0:12 1 5693 28684350 15173964 92.7 postgres: strk strk [local] SELECT
5359 ? Ds 0:12 1 5693 29683014 15100680 92.3 postgres: strk strk [local] SELECT
5359 ? Rs 0:13 1 5693 30298130 15051580 92.0 postgres: strk strk [local] SELECT
5359 ? Rs 0:13 1 5693 30979690 15088992 92.2 postgres: strk strk [local] SELECT
5359 ? Rs 0:13 1 5693 31758466 15139792 92.5 postgres: strk strk [local] SELECT
5359 ? Ds 0:14 1 5693 32406206 15111128 92.3 postgres: strk strk [local] SELECT
5359 ? Ds 0:14 1 5693 33045722 15131676 92.5 postgres: strk strk [local] SELECT
5359 ? Ds 0:14 1 5693 33479010 15135036 92.5 postgres: strk strk [local] SELECT
5359 ? Rs 0:15 1 5693 33847174 15146748 92.6 postgres: strk strk [local] SELECT
5359 ? Rs 0:15 1 5693 34186830 15131648 92.5 postgres: strk strk [local] SELECT
5359 ? Rs 0:15 1 5693 34339878 15119460 92.4 postgres: strk strk [local] SELECT
5359 ? Ds 0:15 1 5693 34502474 15139816 92.5 postgres: strk strk [local] SELECT
5359 ? Ds 0:15 1 5693 34625554 15143068 92.5 postgres: strk strk [local] SELECT
5359 ? Rs 0:15 1833 0 0 0 0.0 [postgres]

The connection to the server was lost. Attempting reset: Failed.
!>

The configure switches:

./configure \
--prefix=/home/postgresql-9.3.4 \
--with-libxml \
--with-python \
--enable-nls \
--enable-cassert \
--enable-debug \
--with-ossp-uuid

--strk;

()  ASCII ribbon campaign -- Keep it simple !
/\  http://strk.keybit.net/rants/ascii_mails.txt

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Jesse Denardo 2014-03-19 08:20:21 Re: BUG #9616: Materialized view with indexes unable to load from pg_dump
Previous Message Markus Nussdorfer 2014-03-19 05:36:02 Re: BUG #8660: RPM installation of 9.2.6 have dependency problem

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2014-03-19 07:59:19 Re: pg_archivecleanup bug
Previous Message Amit Kapila 2014-03-19 05:45:58 Re: B-tree descend for insertion locking