Lists: | pgsql-general |
---|
From: | "Chris Spotts" <rfusca(at)gmail(dot)com> |
---|---|
To: | <pgsql-general(at)postgresql(dot)org> |
Subject: | array_agg crash? |
Date: | 2009-07-21 13:12:19 |
Message-ID: | 00c301ca0a04$e1eccd20$a5c66760$@com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general |
I had one simple query that kept crashing the connection. It crashes after
several minutes.
Tried restarting, it still error'd at the same place.
Tried recreating the table it was selecting from, it still error'd at the
same place.
I rewrote the query with an ARRAY subselect and it finished flawlessly in a
few seconds.
There is about 4 million records in the table its selecting from. No array
ends up with more than 4 elements.
For some reason, the log indicates this is causing an issue with
autovacuum.it says it was -9'd, but it wasn't by or any other physical
person.
Here is the problem query.
create table public.temptrips
as
select trip_id,array_agg(customer_upload_id)
from
trip_ids_to_customer_upload_ids
group by trip_id;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>
LOG: 00000: autovacuum launcher process (PID 10264) was terminated by
signal 9: Killed
2009-07-21 08:44:26 EDT - LOCATION: LogChildExit, postmaster.c:2673
2009-07-21 08:44:26 EDT - LOG: 00000: terminating any other active
server processes
2009-07-21 08:44:26 EDT - LOCATION: HandleChildCrash, postmaster.c:2500
2009-07-21 08:44:26 EDT - WARNING: 57P02: terminating connection because
of crash of another server process
2009-07-21 08:44:26 EDT - DETAIL: The postmaster has commanded this
server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
2009-07-21 08:44:26 EDT - HINT: In a moment you should be able to
reconnect to the database and repeat your command.
2009-07-21 08:44:26 EDT - LOCATION: quickdie, postgres.c:2495
2009-07-21 08:44:27 EDT - postgres postgres /usr/local/pgsql/bin/postmaster
FATAL: 57P03: the database system is in recovery mode
2009-07-21 08:44:27 EDT - postgres postgres /usr/local/pgsql/bin/postmaster
LOCATION: ProcessStartupPacket, postmaster.c:1721
2009-07-21 08:44:26 EDT - postgres postgres startup WARNING: 57P02:
terminating connection because of crash of another server process
2009-07-21 08:44:27 EDT - postgres postgres startup DETAIL: The postmaster
has commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
2009-07-21 08:44:27 EDT - postgres postgres startup HINT: In a moment you
should be able to reconnect to the database and repeat your command.
2009-07-21 08:44:27 EDT - postgres postgres startup LOCATION: quickdie,
postgres.c:2495
2009-07-21 08:44:28 EDT - LOG: 00000: all server processes terminated;
reinitializing
2009-07-21 08:44:28 EDT - LOCATION: PostmasterStateMachine,
postmaster.c:2858
2009-07-21 08:44:28 EDT - LOG: 00000: database system was interrupted;
last known up at 2009-07-21 08:41:32 EDT
2009-07-21 08:44:28 EDT - LOCATION: StartupXLOG, xlog.c:5236
2009-07-21 08:44:28 EDT - LOG: 00000: database system was not properly
shut down; automatic recovery in progress
2009-07-21 08:44:28 EDT - LOCATION: StartupXLOG, xlog.c:5410
2009-07-21 08:44:28 EDT - LOG: 00000: redo starts at 76/4380AC70
2009-07-21 08:44:28 EDT - LOCATION: StartupXLOG, xlog.c:5493
2009-07-21 08:44:29 EDT - LOG: 00000: record with zero length at
76/438869D0
2009-07-21 08:44:29 EDT - LOCATION: ReadRecord, xlog.c:3532
2009-07-21 08:44:29 EDT - LOG: 00000: redo done at 76/438869A0
2009-07-21 08:44:29 EDT - LOCATION: StartupXLOG, xlog.c:5625
2009-07-21 08:44:29 EDT - LOG: 00000: last completed transaction was at
log time 2009-07-21 08:41:49.707423-04
2009-07-21 08:44:29 EDT - LOCATION: StartupXLOG, xlog.c:5629
2009-07-21 08:44:30 EDT - LOG: 00000: autovacuum launcher started
2009-07-21 08:44:30 EDT - LOCATION: AutoVacLauncherMain,
autovacuum.c:529
2009-07-21 08:44:30 EDT - LOG: 00000: database system is ready to accept
connections
2009-07-21 08:44:30 EDT - LOCATION: reaper, postmaster.c:2272
It looks like this is causing the autovacuum to crash, what could cause
this?
postgres=# select version();
version
----------------------------------------------------------------------------
------------------------------------------
PostgreSQL 8.4.0 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
3.4.6 20060404 (Red Hat 3.4.6-8.0.1), 64-bit
(1 row)
Chris
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "Chris Spotts" <rfusca(at)gmail(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: array_agg crash? |
Date: | 2009-07-21 13:55:05 |
Message-ID: | 11654.1248184505@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general |
"Chris Spotts" <rfusca(at)gmail(dot)com> writes:
> LOG: 00000: autovacuum launcher process (PID 10264) was terminated by
> signal 9: Killed
Looks like the OOM killer is loose on your system. Disable memory
overcommit in the kernel and things will get better.
http://www.postgresql.org/docs/8.4/static/kernel-resources.html#AEN24152
regards, tom lane
From: | "Chris Spotts" <rfusca(at)gmail(dot)com> |
---|---|
To: | "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: array_agg crash? |
Date: | 2009-07-21 14:26:25 |
Message-ID: | 00cb01ca0a0f$3bd06220$b3712660$@com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general |
> "Chris Spotts" <rfusca(at)gmail(dot)com> writes:
> > LOG: 00000: autovacuum launcher process (PID 10264) was terminated
> by
> > signal 9: Killed
>
> Looks like the OOM killer is loose on your system. Disable memory
> overcommit in the kernel and things will get better.
> http://www.postgresql.org/docs/8.4/static/kernel-
> resources.html#AEN24152
>
> regards, tom lane
[Spotts, Christopher]
That seems to have fixed it in the sense that it doesn't crash, but I killed
it after 15 minutes and no results...rather I tried to kill it, but it looks
like I'm going to have to -9 it...
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "Chris Spotts" <rfusca(at)gmail(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: array_agg crash? |
Date: | 2009-07-21 15:42:07 |
Message-ID: | 13990.1248190927@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general |
"Chris Spotts" <rfusca(at)gmail(dot)com> writes:
> That seems to have fixed it in the sense that it doesn't crash, but I killed
> it after 15 minutes and no results...rather I tried to kill it, but it looks
> like I'm going to have to -9 it...
How big were the arrays you were trying to push around here? I tried
interrupting a similar query and it came right back; but if you were
building some really enormous arrays I could see the final array
build loop taking awhile, and there's no CHECK_FOR_INTERRUPTS in it...
regards, tom lane
From: | "Chris Spotts" <rfusca(at)gmail(dot)com> |
---|---|
To: | "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: array_agg crash? |
Date: | 2009-07-21 16:57:10 |
Message-ID: | 00e401ca0a24$4b130ac0$e1392040$@com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general |
> killed
> > it after 15 minutes and no results...rather I tried to kill it, but
> it looks
> > like I'm going to have to -9 it...
>
> How big were the arrays you were trying to push around here? I tried
> interrupting a similar query and it came right back; but if you were
> building some really enormous arrays I could see the final array
> build loop taking awhile, and there's no CHECK_FOR_INTERRUPTS in it...
>
> regards, tom lane
[Spotts, Christopher]
Like I'd said originally, there were no arrays that ended up being more 4
elements long - all integers. The vast majority of them were 1 or 2 long.
If it was having trouble allocating the memory for arrays, why doesn't using
a ARRAY(subselect) yield the same issue?
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "Chris Spotts" <rfusca(at)gmail(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: array_agg crash? |
Date: | 2009-07-21 17:02:57 |
Message-ID: | 15291.1248195777@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general |
"Chris Spotts" <rfusca(at)gmail(dot)com> writes:
>>> killed
>>> it after 15 minutes and no results...rather I tried to kill it, but
>>> it looks like I'm going to have to -9 it...
>>
>> How big were the arrays you were trying to push around here?
> Like I'd said originally, there were no arrays that ended up being more 4
> elements long - all integers. The vast majority of them were 1 or 2 long.
Hm, maybe the problem is not so much array size as too many arrays. How
many groups are you expecting in that query? Does the plan for the
array_agg query show hash or group aggregation?
regards, tom lane
From: | "Chris Spotts" <rfusca(at)gmail(dot)com> |
---|---|
To: | "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: array_agg crash? |
Date: | 2009-07-21 17:09:13 |
Message-ID: | 00ef01ca0a25$fa4f7720$eeee6560$@com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general |
> >> How big were the arrays you were trying to push around here?
>
> > Like I'd said originally, there were no arrays that ended up being
> more 4
> > elements long - all integers. The vast majority of them were 1 or 2
> long.
>
> Hm, maybe the problem is not so much array size as too many arrays.
> How
> many groups are you expecting in that query? Does the plan for the
> array_agg query show hash or group aggregation?
>
> regards, tom lane
[Spotts, Christopher]
GroupAggregate
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "Chris Spotts" <rfusca(at)gmail(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: array_agg crash? |
Date: | 2009-07-21 17:16:05 |
Message-ID: | 15511.1248196565@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general |
"Chris Spotts" <rfusca(at)gmail(dot)com> writes:
>> many groups are you expecting in that query? Does the plan for the
>> array_agg query show hash or group aggregation?
> GroupAggregate
Huh, there's no reason it should take much memory then. Maybe you've
found a memory leak. Can you put together a self-contained test case?
regards, tom lane
From: | "Chris Spotts" <rfusca(at)gmail(dot)com> |
---|---|
To: | "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: array_agg crash? |
Date: | 2009-07-21 17:28:54 |
Message-ID: | 00f001ca0a28$b98aee60$2ca0cb20$@com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general |
> -----Original Message-----
> From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
> Sent: Tuesday, July 21, 2009 12:16 PM
> To: Chris Spotts
> Cc: pgsql-general(at)postgresql(dot)org
> Subject: Re: [GENERAL] array_agg crash?
>
> "Chris Spotts" <rfusca(at)gmail(dot)com> writes:
> >> many groups are you expecting in that query? Does the plan for the
> >> array_agg query show hash or group aggregation?
>
> > GroupAggregate
>
> Huh, there's no reason it should take much memory then. Maybe you've
> found a memory leak. Can you put together a self-contained test case?
>
> regards, tom lane
[Spotts, Christopher]
I'll try to put something together tonight and let you know tomorrow.
From: | Chris Spotts <rfusca(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: array_agg crash? |
Date: | 2009-07-22 06:18:26 |
Message-ID: | 4A66AF32.2090506@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general |
Tom Lane wrote:
> "Chris Spotts" <rfusca(at)gmail(dot)com> writes:
>
>>> many groups are you expecting in that query? Does the plan for the
>>> array_agg query show hash or group aggregation?
>>>
>
>
>> GroupAggregate
>>
>
> Huh, there's no reason it should take much memory then. Maybe you've
> found a memory leak. Can you put together a self-contained test case?
>
> regards, tom lane
>
What do you want specifically as far as details for the test case? I
exported just the table that that was reading from. Installed a new
clean virtual machine ubuntu (jaunty) and then installed 8.4.0.
Imported the table and definition. Ran the same query and the same
thing happened.
Table its selecting from is:
Table "public.trip_ids_to_customer_upload_ids"
Column | Type | Modifiers | Storage | Description
--------------------+---------+-----------+---------+-------------
trip_id | bigint | | plain |
customer_upload_id | integer | | plain |
Indexes:
"trips_customer_id" btree (trip_id, customer_upload_id)
Has OIDs: no
There is 3801347 rows in the table. There are 3773039 distinct trip_id
values. So you can see that the vast majority of rows here are just a
single element array.
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Chris Spotts <rfusca(at)gmail(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: array_agg crash? |
Date: | 2009-07-22 12:50:17 |
Message-ID: | 18106.1248267017@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general |
Chris Spotts <rfusca(at)gmail(dot)com> writes:
> What do you want specifically as far as details for the test case? I
> exported just the table that that was reading from. Installed a new
> clean virtual machine ubuntu (jaunty) and then installed 8.4.0.
> Imported the table and definition. Ran the same query and the same
> thing happened.
The table dump and the query would be enough then. Can you send it
to me off-list?
regards, tom lane