Re: [PATCH] Prevent pg_dump running out of memory

Lists: pgsql-patches
From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: pgsql-patches(at)postgresql(dot)org
Subject: [PATCH] Prevent pg_dump running out of memory
Date: 2001-08-27 14:03:51
Message-ID: 20010828000351.C32309@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Just a patch to clean up a bug in pg_dump whose sole purpose is to confuse
users. Why should -d crash pg_dump just because you have a big table? I
couldn't find this listed anywhere, not even on the TODO list. So if some
change to the library fixed this, I apologise.

This patch replaces the simple SELECT * with a cursor that fetches 1,000 rows
at a time. The 1,000 was chosen because it was small enough to test but I
think realisitically 10,000 wouldn't be too much.

Also, it seems there is no regression test for pg_dump. Is this intentional
or has noone come up with a good way to test it?

http://svana.org/kleptog/pgsql/pgsql-pg_dump.patch (also attached)

Please CC any replies.

P.S. For those people waiting for the timing patch, I'm just dealing with a
little issue involving getting a flag from ExplainOneQuery to ExecInitNode.
I think I may have an answer but it needs testing.
--
Martijn van Oosterhout <kleptog(at)svana(dot)org>
http://svana.org/kleptog/
> It would be nice if someone came up with a certification system that
> actually separated those who can barely regurgitate what they crammed over
> the last few weeks from those who command secret ninja networking powers.

Attachment Content-Type Size
pgsql-pg_dump.patch text/plain 5.1 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: [PATCH] Prevent pg_dump running out of memory
Date: 2001-08-27 14:42:34
Message-ID: 7481.998923354@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> [ use a cursor for pg_dump -d ]

Excellent idea. Thanks!

> Also, it seems there is no regression test for pg_dump. Is this intentional
> or has noone come up with a good way to test it?

The latter. We certainly need one...

regards, tom lane


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: [PATCH] Prevent pg_dump running out of memory
Date: 2001-08-27 15:06:22
Message-ID: 20010828010622.D32309@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

On Mon, Aug 27, 2001 at 10:42:34AM -0400, Tom Lane wrote:
> Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> > [ use a cursor for pg_dump -d ]
>
> Excellent idea. Thanks!

No problems.

> > Also, it seems there is no regression test for pg_dump. Is this intentional
> > or has noone come up with a good way to test it?
>
> The latter. We certainly need one...

The only thing I can think of right now is for other parts of the regression
tests to leave various tables, triggers, etc lying around. Then somewhere
near the end, do a pg_dump regress | psql newdb (for various options of
pg_dump) and then somehow compare the two databases.

The comparison would be the tricky part because you should avoid using
pg_dump since that is what you are testing...

--
Martijn van Oosterhout <kleptog(at)svana(dot)org>
http://svana.org/kleptog/
> It would be nice if someone came up with a certification system that
> actually separated those who can barely regurgitate what they crammed over
> the last few weeks from those who command secret ninja networking powers.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: [PATCH] Prevent pg_dump running out of memory
Date: 2001-08-27 20:36:50
Message-ID: 21677.998944610@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> This patch replaces the simple SELECT * with a cursor that fetches 1,000 rows
> at a time. The 1,000 was chosen because it was small enough to test but I
> think realisitically 10,000 wouldn't be too much.

Applied. However, I went the other way: I reduced the 1000 to 100.
I doubt it'll make much difference in performance, and with TOAST
available I can believe people might have multi-megabyte rows.

I also added a cursor CLOSE command, just to be a neatnik. It's
probably not critical given that the same cursor name is used each
time, but still...

regards, tom lane


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: [PATCH] Prevent pg_dump running out of memory
Date: 2001-08-28 02:15:05
Message-ID: 20010828121504.A2731@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

On Mon, Aug 27, 2001 at 04:36:50PM -0400, Tom Lane wrote:
> Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> > This patch replaces the simple SELECT * with a cursor that fetches 1,000 rows
> > at a time. The 1,000 was chosen because it was small enough to test but I
> > think realisitically 10,000 wouldn't be too much.
>
> Applied. However, I went the other way: I reduced the 1000 to 100.
> I doubt it'll make much difference in performance, and with TOAST
> available I can believe people might have multi-megabyte rows.

Good point. Just as long as you're not getting one row at a time it's a win.

> I also added a cursor CLOSE command, just to be a neatnik. It's
> probably not critical given that the same cursor name is used each
> time, but still...

Doh! I even thought of that while writing it but I couldn't remember the
command. I tend to let cursors die at the end of the transaction.

--
Martijn van Oosterhout <kleptog(at)svana(dot)org>
http://svana.org/kleptog/
> It would be nice if someone came up with a certification system that
> actually separated those who can barely regurgitate what they crammed over
> the last few weeks from those who command secret ninja networking powers.