Re: NOT IN query takes forever

Lists: pgsql-generalpgsql-performance
From: "Merlin Moncure" <merlin(dot)moncure(at)rcsonline(dot)com>
To: "Marius Andreiana" <mandreiana(at)rdslink(dot)ro>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: NOT IN query takes forever
Date: 2004-08-03 12:05:23
Message-ID: 6EE64EF3AB31D5448D0007DD34EEB34101AF1A@Herge.rcsinc.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-performance

> Trying to run this query:
> EXPLAIN ANALYSE
> select * FROM trans
> WHERE query_id NOT IN (select query_id FROM query)
>
> but it will remain like that forever (cancelled after 30 min).

explain analyze actually runs the query to do timings. Just run explain
and see what you come up with. More than likely there is a nestloop in
there which is causing the long query time.

Try bumping up shared buffers some and sort mem as much as you safely
can.

Merlin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Merlin Moncure" <merlin(dot)moncure(at)rcsonline(dot)com>
Cc: "Marius Andreiana" <mandreiana(at)rdslink(dot)ro>, pgsql-performance(at)postgresql(dot)org
Subject: Re: NOT IN query takes forever
Date: 2004-08-03 14:59:17
Message-ID: 23045.1091545157@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-performance

"Merlin Moncure" <merlin(dot)moncure(at)rcsonline(dot)com> writes:
> Try bumping up shared buffers some and sort mem as much as you safely
> can.

sort_mem is probably the issue here. The only reasonable way to do NOT
IN is with a hash table, and the default setting of sort_mem is probably
too small to support a 137042-element table.

regards, tom lane


From: Marius Andreiana <mandreiana(at)rdslink(dot)ro>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: NOT IN query takes forever
Date: 2004-08-03 16:02:42
Message-ID: 1091548963.6915.2.camel@marte.biciclete.ro
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-performance

On Tue, 2004-08-03 at 08:05 -0400, Merlin Moncure wrote:
> > Trying to run this query:
> > EXPLAIN ANALYSE
> > select * FROM trans
> > WHERE query_id NOT IN (select query_id FROM query)
> >
> > but it will remain like that forever (cancelled after 30 min).
>
> explain analyze actually runs the query to do timings. Just run explain
> and see what you come up with. More than likely there is a nestloop in
> there which is causing the long query time.
>
> Try bumping up shared buffers some and sort mem as much as you safely
> can.
Thank you, that did it!

With
shared_buffers = 3000 # min 16, at least max_connections*2, 8KB each
sort_mem = 128000 # min 64, size in KB

it takes <3 seconds (my hardware is not server-class).

--
Marius Andreiana
Galuna - Solutii Linux in Romania
http://www.galuna.ro


From: Gaetano Mendola <mendola(at)bigfoot(dot)com>
To: Marius Andreiana <mandreiana(at)rdslink(dot)ro>
Subject: Re: NOT IN query takes forever
Date: 2004-08-03 17:28:27
Message-ID: 410FCB3B.5050300@bigfoot.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-performance

Marius Andreiana wrote:

> On Tue, 2004-08-03 at 08:05 -0400, Merlin Moncure wrote:
>
>>>Trying to run this query:
>>>EXPLAIN ANALYSE
>>>select * FROM trans
>>>WHERE query_id NOT IN (select query_id FROM query)
>>>
>>>but it will remain like that forever (cancelled after 30 min).
>>
>>explain analyze actually runs the query to do timings. Just run explain
>>and see what you come up with. More than likely there is a nestloop in
>>there which is causing the long query time.
>>
>>Try bumping up shared buffers some and sort mem as much as you safely
>>can.
>
> Thank you, that did it!
>
> With
> shared_buffers = 3000 # min 16, at least max_connections*2, 8KB each
> sort_mem = 128000 # min 64, size in KB

128 MB for sort_mem is too much, consider that in this way each backend can
use 128 MB for sort operations...
Also shared_buffers = 3000 means 24MB that is not balanced with the 128MB
needed for sort...
Try to bump up 128 MB for shared_buffer ( may be you need to instruct your
OS to allow that ammount of shared memory usage ) and 24MB for sort_mem.

Regards
Gaetano Mendola


From: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
To: Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>
Cc: Marius Andreiana <mandreiana(at)rdslink(dot)ro>, pgsql-performance(at)postgresql(dot)org
Subject: Re: NOT IN query takes forever
Date: 2004-08-04 01:32:53
Message-ID: 41103CC5.7030207@familyhealth.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-performance

> explain analyze actually runs the query to do timings. Just run explain
> and see what you come up with. More than likely there is a nestloop in
> there which is causing the long query time.
>
> Try bumping up shared buffers some and sort mem as much as you safely
> can.

Just use an EXISTS query I suggest.

Chris


From: Marius Andreiana <mandreiana(at)rdslink(dot)ro>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: [PERFORM] NOT IN query takes forever
Date: 2004-08-04 05:40:52
Message-ID: 1091598052.2647.7.camel@marte.biciclete.ro
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-performance

On Tue, 2004-08-03 at 19:28 +0200, Gaetano Mendola wrote:
> > With
> > shared_buffers = 3000 # min 16, at least max_connections*2, 8KB each
> > sort_mem = 128000 # min 64, size in KB
>
> 128 MB for sort_mem is too much, consider that in this way each backend can
> use 128 MB for sort operations...
> Also shared_buffers = 3000 means 24MB that is not balanced with the 128MB
> needed for sort...
> Try to bump up 128 MB for shared_buffer ( may be you need to instruct your
> OS to allow that ammount of shared memory usage ) and 24MB for sort_mem.
Thanks for the advice. I increased shmmax to allow shared_buffers to be
128mb and set sort_mem to 24mb.

--
Marius Andreiana
Galuna - Solutii Linux in Romania
http://www.galuna.ro