Re: BUG #7627: Bad query plan when using LIMIT

Lists: pgsql-bugs
From: edward(at)clericare(dot)com
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #7627: Bad query plan when using LIMIT
Date: 2012-10-29 22:51:01
Message-ID: E1TSyAf-0006st-DI@wrigleys.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 7627
Logged by: Edward Faulkner
Email address: edward(at)clericare(dot)com
PostgreSQL version: 9.2.1
Operating system: OSX 10.8.2
Description:

The following two queries differ only by adding "LIMIT 1", but the one with
the limit gets radically worse performance. I've done VACUUM FULL, VACUUM
ANALYZE, and REINDEX DATABASE and there are no modifications since.

EXPLAIN ANALYZE SELECT * FROM commits WHERE id IN (SELECT id FROM commits
ORDER BY tree_high LIMIT 605 ) AND tree_other IS NULL ORDER BY tree_high
DESC;

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=4665.10..4666.59 rows=595 width=173) (actual
time=27.936..28.034 rows=543 loops=1)
Sort Key: public.commits.tree_high
Sort Method: quicksort Memory: 169kB
-> Nested Loop (cost=60.93..4637.68 rows=595 width=173) (actual
time=2.000..26.658 rows=543 loops=1)
-> HashAggregate (cost=60.93..66.98 rows=605 width=28) (actual
time=1.880..2.214 rows=605 loops=1)
-> Subquery Scan on "ANY_subquery" (cost=0.00..59.42
rows=605 width=28) (actual time=0.034..1.231 rows=605 loops=1)
-> Limit (cost=0.00..53.37 rows=605 width=32) (actual
time=0.032..0.941 rows=605 loops=1)
-> Index Scan using commit_tree on commits
(cost=0.00..13481.52 rows=152837 width=32) (actual time=0.031..0.799
rows=605 loops=1)
-> Index Scan using commits_pkey on commits (cost=0.00..7.54
rows=1 width=173) (actual time=0.038..0.039 rows=1 loops=605)
Index Cond: ((id)::text = ("ANY_subquery".id)::text)
Filter: (tree_other IS NULL)
Rows Removed by Filter: 0
Total runtime: 28.210 ms
(13 rows)

EXPLAIN ANALYZE SELECT * FROM commits WHERE id IN (SELECT id FROM commits
ORDER BY tree_high LIMIT 605 ) AND tree_other IS NULL ORDER BY tree_high
DESC LIMIT 1;

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..2314.68 rows=1 width=173) (actual
time=46626.438..46626.439 rows=1 loops=1)
-> Nested Loop Semi Join (cost=0.00..1377233.62 rows=595 width=173)
(actual time=46626.437..46626.437 rows=1 loops=1)
Join Filter: ((public.commits.id)::text =
("ANY_subquery".id)::text)
Rows Removed by Join Filter: 90573339
-> Index Scan Backward using commit_tree on commits
(cost=0.00..13481.52 rows=150269 width=173) (actual time=0.025..406.336
rows=149708 loops=1)
Filter: (tree_other IS NULL)
Rows Removed by Filter: 2525
-> Materialize (cost=0.00..62.44 rows=605 width=28) (actual
time=0.000..0.084 rows=605 loops=149708)
-> Subquery Scan on "ANY_subquery" (cost=0.00..59.42
rows=605 width=28) (actual time=0.027..1.166 rows=605 loops=1)
-> Limit (cost=0.00..53.37 rows=605 width=32) (actual
time=0.026..0.965 rows=605 loops=1)
-> Index Scan using commit_tree on commits
(cost=0.00..13481.52 rows=152837 width=32) (actual time=0.026..0.828
rows=605 loops=1)
Total runtime: 46626.562 ms
(12 rows)

It's possible to work around the problem like this:

EXPLAIN ANALYZE WITH candidates AS (SELECT * FROM commits WHERE id IN
(SELECT id FROM commits ORDER BY tree_high LIMIT 605 ) AND tree_other IS
NULL) SELECT * FROM candidates ORDER BY tree_high DESC LIMIT 1;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=4652.56..4652.56 rows=1 width=436) (actual time=29.369..29.370
rows=1 loops=1)
CTE candidates
-> Nested Loop (cost=60.93..4637.68 rows=595 width=173) (actual
time=2.008..27.454 rows=543 loops=1)
-> HashAggregate (cost=60.93..66.98 rows=605 width=28) (actual
time=1.891..2.271 rows=605 loops=1)
-> Subquery Scan on "ANY_subquery" (cost=0.00..59.42
rows=605 width=28) (actual time=0.032..1.237 rows=605 loops=1)
-> Limit (cost=0.00..53.37 rows=605 width=32)
(actual time=0.031..0.909 rows=605 loops=1)
-> Index Scan using commit_tree on commits
(cost=0.00..13481.52 rows=152837 width=32) (actual time=0.030..0.799
rows=605 loops=1)
-> Index Scan using commits_pkey on commits (cost=0.00..7.54
rows=1 width=173) (actual time=0.039..0.040 rows=1 loops=605)
Index Cond: ((id)::text = ("ANY_subquery".id)::text)
Filter: (tree_other IS NULL)
Rows Removed by Filter: 0
-> Sort (cost=14.88..16.36 rows=595 width=436) (actual
time=29.367..29.367 rows=1 loops=1)
Sort Key: candidates.tree_high
Sort Method: top-N heapsort Memory: 25kB
-> CTE Scan on candidates (cost=0.00..11.90 rows=595 width=436)
(actual time=2.015..28.850 rows=543 loops=1)
Total runtime: 29.562 ms
(16 rows)

So is there something I could be doing wrong, or is this a bug?


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: edward(at)clericare(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #7627: Bad query plan when using LIMIT
Date: 2012-10-30 05:27:12
Message-ID: 2841.1351574832@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

edward(at)clericare(dot)com writes:
> The following two queries differ only by adding "LIMIT 1", but the one with
> the limit gets radically worse performance. I've done VACUUM FULL, VACUUM
> ANALYZE, and REINDEX DATABASE and there are no modifications since.

> EXPLAIN ANALYZE SELECT * FROM commits WHERE id IN (SELECT id FROM commits
> ORDER BY tree_high LIMIT 605 ) AND tree_other IS NULL ORDER BY tree_high
> DESC;

> EXPLAIN ANALYZE SELECT * FROM commits WHERE id IN (SELECT id FROM commits
> ORDER BY tree_high LIMIT 605 ) AND tree_other IS NULL ORDER BY tree_high
> DESC LIMIT 1;

I think what's happening there is that the planner supposes that an
indexscan in "tree_high DESC" order will find rows matching the IN
condition uniformly distributed in the scan order --- but, because of
the construction of the IN clause, they're actually going to be
pessimally located at the very end of that scan order. So it ends up
forming all of the nestloop result, when it had expected to have to
compute only 1/595 of it. We've discussed dialing down the planner's
optimism about limit plans to not assume perfect independence of filter
conditions, but I don't think anyone would advocate for having it assume
the worst possible case, which is what you've got here unfortunately.

I can't help thinking that there's a way to express this problem without
the peculiar self-join, but I'm too tired to think of a good one right
now. The best I can do is a window function:

select last_value(id) over (order by tree_high), last_value(...) ...
from (select * from commits order by tree_high limit 605) ss
where tree_other is null;

but it'd be pretty tedious to write out the last_value construct for
each column you want, and anyway this seems less than elegant even
aside from that objection.

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: edward(at)clericare(dot)com, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #7627: Bad query plan when using LIMIT
Date: 2012-10-30 09:23:17
Message-ID: CA+U5nMK_TatpM9r7Z9OzZ7Kv-6cmE8Pdw_wrRnyxUypDkAFADQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On 30 October 2012 05:27, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> edward(at)clericare(dot)com writes:
>> The following two queries differ only by adding "LIMIT 1", but the one with
>> the limit gets radically worse performance. I've done VACUUM FULL, VACUUM
>> ANALYZE, and REINDEX DATABASE and there are no modifications since.
>
>> EXPLAIN ANALYZE SELECT * FROM commits WHERE id IN (SELECT id FROM commits
>> ORDER BY tree_high LIMIT 605 ) AND tree_other IS NULL ORDER BY tree_high
>> DESC;
>
>> EXPLAIN ANALYZE SELECT * FROM commits WHERE id IN (SELECT id FROM commits
>> ORDER BY tree_high LIMIT 605 ) AND tree_other IS NULL ORDER BY tree_high
>> DESC LIMIT 1;
>
> I think what's happening there is that the planner supposes that an
> indexscan in "tree_high DESC" order will find rows matching the IN
> condition uniformly distributed in the scan order --- but, because of
> the construction of the IN clause, they're actually going to be
> pessimally located at the very end of that scan order. So it ends up
> forming all of the nestloop result, when it had expected to have to
> compute only 1/595 of it. We've discussed dialing down the planner's
> optimism about limit plans to not assume perfect independence of filter
> conditions, but I don't think anyone would advocate for having it assume
> the worst possible case, which is what you've got here unfortunately.

Very bad plans with LIMIT are frequent. This is bad for us because
adding LIMIT usually/is supposed to make queries faster, not slower.

We need to do something.

LIMIT optimistically assumes the very best case, and that best case
gets better the number of rows filtered away.

If we are scanning N rows with LIMIT M we must assume that the action
relates in some way to N, rather than assuming the LIMIT is more and
more effective as N/M increases.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services