Re: slow IN() clause for many cases

From: Andrew - Supernews <andrew+nonews(at)supernews(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: slow IN() clause for many cases
Date: 2005-10-14 07:40:12
Message-ID: slrndkuo6s.2db7.andrew+nonews@trinity.supernews.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2005-10-12, Andrew - Supernews <andrew+nonews(at)supernews(dot)com> wrote:
> On 2005-10-12, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Andrew - Supernews <andrew+nonews(at)supernews(dot)com> writes:
>>> As the number of items in the IN clause increases, the planning time grows
>>> rather radically.
>>
>> I was looking at this yesterday. There is some O(N^2) behavior in
>> create_bitmap_subplan, stemming from trying to remove duplicated qual
>> conditions. That strikes me as a relatively useless activity, and I was
>> thinking the easiest answer might be to just delete that "optimization".
>
> Well, the behaviour is definitely bad.
>
> For comparison, on 8.0 the IN (list of 1000 items) version plans only about
> four times slower than IN (select array...), and again the execution times
> are comparable. But for the case of a real-world app that uses IN () a lot
> (dspam), I've had reports that the performance improvements from switching
> to the array method are even more substantial than my raw timings suggest.

Trying this again, on the same data, with the latest planner change shows
that the plan time for IN (list of 1000 items) is now much better, 47ms
rather than nearly five seconds, but that's still substantially longer than
the execution time in the case where the data is in cache. I'm not sure why,
but the execution time for that query has improved too, it's now about 20%
faster for the bitmap-OR than for the array method.

With everything in cache, selecting 1000 random rows from a 200k row table,
I get:

for IN (list): planning time 47ms, execution time 27ms
array/nestloop: planning time 8ms, execution time 34ms

--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 2005-10-14 08:47:32 Re: Allowed timezone values
Previous Message Thomas Hallgren 2005-10-14 07:30:44 Link problems with HEAD