Out of swap space & memory

Lists: pgsql-general
From: "Kevin Bartz" <kbartz(at)loyaltymatrix(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Out of swap space & memory
Date: 2004-08-03 03:08:45
Message-ID: 20040803031430.DD2F03FCB8@omta16.mta.everyone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

I'm de-duping a 45-million-row table (about 4 gigs) by creating a new table
with the unique rows of the first table (should be about 3 gigs). I'm on a
64-bit SuSE Linux with 2 gigs of memory and another gig of swap space. I
tracked postmaster's use of memory and swap space after I run the query, and
I noticed that as soon as postmaster first uses up all available memory and
swap space, I get this:

2004-08-02 19:35:52 LOG: checkpoint record is at 7/4AA2F590
2004-08-02 19:35:52 LOG: redo record is at 7/4AA2F590; undo record is at
0/0; shutdown TRUE
2004-08-02 19:35:52 LOG: next transaction ID: 740; next OID: 116212470
2004-08-02 19:35:52 LOG: database system is ready
2004-08-02 19:38:04 LOG: server process (PID 25302) was terminated by
signal 9
2004-08-02 19:38:04 LOG: terminating any other active server processes
2004-08-02 19:38:04 LOG: all server processes terminated; reinitializing
2004-08-02 19:38:04 FATAL: the database system is starting up
2004-08-02 19:38:04 LOG: database system was interrupted at 2004-08-02
19:35:52 PDT
2004-08-02 19:38:05 LOG: checkpoint record is at 7/4AA2F590
2004-08-02 19:38:05 LOG: redo record is at 7/4AA2F590; undo record is at
0/0; shutdown TRUE
2004-08-02 19:38:05 LOG: next transaction ID: 740; next OID: 116212470
2004-08-02 19:38:05 LOG: database system was not properly shut down;
automatic recovery in progress
2004-08-02 19:38:05 LOG: redo starts at 7/4AA2F5D8
2004-08-02 19:38:05 LOG: record with zero length at 7/4AA4D5C0
2004-08-02 19:38:05 LOG: redo done at 7/4AA4B560
2004-08-02 19:38:07 LOG: database system is ready

And the query never completes. This happens reproducibly and consistently. I
like to think that two gigs should be enough memory. What is killing the
postmaster process and how can I stop it? And, more importantly, is there
any way I can run this query? Thanks for any help you can provide,

Kevin


From: mike g <mike(at)thegodshalls(dot)com>
To: Kevin Bartz <kbartz(at)loyaltymatrix(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Out of swap space & memory
Date: 2004-08-03 03:39:33
Message-ID: 1091504373.16772.16.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

What version of Postgresql are you running?

I believe this was a problem in 7.4.1 but fixed by 7.4.3 if I remember
correctly.

Mike

On Mon, 2004-08-02 at 22:08, Kevin Bartz wrote:
> I'm de-duping a 45-million-row table (about 4 gigs) by creating a new table
> with the unique rows of the first table (should be about 3 gigs). I'm on a
> 64-bit SuSE Linux with 2 gigs of memory and another gig of swap space. I
> tracked postmaster's use of memory and swap space after I run the query, and
> I noticed that as soon as postmaster first uses up all available memory and
> swap space, I get this:
>
> 2004-08-02 19:35:52 LOG: checkpoint record is at 7/4AA2F590
> 2004-08-02 19:35:52 LOG: redo record is at 7/4AA2F590; undo record is at
> 0/0; shutdown TRUE
> 2004-08-02 19:35:52 LOG: next transaction ID: 740; next OID: 116212470
> 2004-08-02 19:35:52 LOG: database system is ready
> 2004-08-02 19:38:04 LOG: server process (PID 25302) was terminated by
> signal 9
> 2004-08-02 19:38:04 LOG: terminating any other active server processes
> 2004-08-02 19:38:04 LOG: all server processes terminated; reinitializing
> 2004-08-02 19:38:04 FATAL: the database system is starting up
> 2004-08-02 19:38:04 LOG: database system was interrupted at 2004-08-02
> 19:35:52 PDT
> 2004-08-02 19:38:05 LOG: checkpoint record is at 7/4AA2F590
> 2004-08-02 19:38:05 LOG: redo record is at 7/4AA2F590; undo record is at
> 0/0; shutdown TRUE
> 2004-08-02 19:38:05 LOG: next transaction ID: 740; next OID: 116212470
> 2004-08-02 19:38:05 LOG: database system was not properly shut down;
> automatic recovery in progress
> 2004-08-02 19:38:05 LOG: redo starts at 7/4AA2F5D8
> 2004-08-02 19:38:05 LOG: record with zero length at 7/4AA4D5C0
> 2004-08-02 19:38:05 LOG: redo done at 7/4AA4B560
> 2004-08-02 19:38:07 LOG: database system is ready
>
> And the query never completes. This happens reproducibly and consistently. I
> like to think that two gigs should be enough memory. What is killing the
> postmaster process and how can I stop it? And, more importantly, is there
> any way I can run this query? Thanks for any help you can provide,
>
> Kevin
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faqs/FAQ.html


From: Manfred Koizar <mkoi-pg(at)aon(dot)at>
To: "Kevin Bartz" <kbartz(at)loyaltymatrix(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Out of swap space & memory
Date: 2004-08-03 10:04:05
Message-ID: rkoug0dvl8f86s9c6gg17kk7aeus0k2vm4@email.aon.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Mon, 2 Aug 2004 20:08:45 -0700, "Kevin Bartz"
<kbartz(at)loyaltymatrix(dot)com> wrote:
>is there any way I can run this query?

What query? You didn't show us your SQL.

Servus
Manfred


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Bartz" <kbartz(at)loyaltymatrix(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Out of swap space & memory
Date: 2004-08-03 14:39:39
Message-ID: 22881.1091543979@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

"Kevin Bartz" <kbartz(at)loyaltymatrix(dot)com> writes:
> I'm de-duping a 45-million-row table (about 4 gigs) by creating a new table
> with the unique rows of the first table (should be about 3 gigs). I'm on a
> 64-bit SuSE Linux with 2 gigs of memory and another gig of swap space. I
> tracked postmaster's use of memory and swap space after I run the query, and
> I noticed that as soon as postmaster first uses up all available memory and
> swap space, I get this:

> 2004-08-02 19:38:04 LOG: server process (PID 25302) was terminated by
> signal 9

This is the infamous "out of memory kill" that is perpetrated by some
versions of Linux after the kernel realizes that it has given out memory
it does not have. Google for "OOM kill" and you'll find info. It's an
extremely good idea to disable this kernel bug^H^H^Hfeature, as a lot of
the time the process that gets zapped is not the one that was actually
consuming all the RAM, but some innocent bystander. The first time the
OOM killer takes out your postmaster, your mail daemon, or some other
highly critical process, you'll wish you had turned it off. (You turn
it off by adjusting kernel settings so that it won't give out more
memory than it has in the first place.)

However, that doesn't really answer your problem, which is why your
query is consuming unreasonable amounts of RAM and what you can do about
it. What PG version is this, what is the query *exactly*, what does
EXPLAIN show for the query, and what nondefault postgresql.conf settings
are you using?

regards, tom lane