Proposed fixes for planner regressions from June to release

Lists: pgsql-patches
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-patches(at)postgreSQL(dot)org
Subject: Proposed fixes for planner regressions from June to release
Date: 2006-12-10 19:57:44
Message-ID: 13943.1165780664@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Arjen van der Meijden reported here:
http://archives.postgresql.org/pgsql-performance/2006-12/msg00041.php
that 8.2rc1 was noticeably slower on his query mix than a CVS pull of
3-June-2006 had been. I've been looking into it with him off-list,
and I believe that the problem is one of poor planning due to
misestimation of costs for queries with large IN-lists (that is,
indexable ScalarArrayOpExpr conditions with large array constants).
Specifically the planner tends to choose bitmapscans when it shouldn't.
This code is all new in 8.2 so it's not exactly surprising that there
might be a few glitches. His development snapshot coincidentally
avoided the problem because that was just before we tweaked the planner
to favor indexscans (including bitmap scans) more on the inside of
nestloops.

The attached proposed patch brings Arjen's query mix back to being as
fast or faster than the 3-June snapshot. There are four elements to the
patch; in order of increasing controversialness they are:

1. Fix a flat-out logic error in genericcostestimate()'s handling of the
CPU cost estimate for a scan involving a ScalarArrayOpExpr indexqual
(which results in multiple physical indexscans within the BitmapIndexScan
plan node). numIndexTuples is per physical scan, so it has to be
multiplied by num_sa_scans to get the correct per-plan-node-execution
cost estimate.

2. Fix another logic error that occurred when a ScalarArrayOpExpr qual
applies to a lower-order index column that doesn't have equality quals
on all the index columns to its left. In this situation the
ScalarArrayOpExpr doesn't contribute to reducing the fraction of the
index that has to be scanned, but genericcostestimate() was still
dividing numIndexTuples by num_sa_scans. This is a case of "premature
optimization is the root of all evil" :-( ... I was trying to avoid
calculating num_sa_scans twice, but in fact it's necessary to compute it
separately in btcostestimate and genericcostestimate, since lower-order
ScalarArrayOpExprs should be included in the count for the latter's
purposes and not the former's. So, do the correction of numIndexTuples
in btcostestimate, and don't let genericcostestimate adjust a passed-in
estimate.

3. In costsize.c, charge a small amount extra per tuple retrieved by a
bitmap indexscan (I set it at 0.1 * cpu_operator_cost), on the grounds
that entering the tuple into the bitmap should cost something. The real
reason for doing this though is that for the case where a nestloop with
inner indexscan expects to retrieve a single tuple from the inner
relation on each iteration, 8.2 release is computing exactly the same
cost (within roundoff error) to do the inner scan as a plain or bitmap
indexscan --- and depending on how the roundoff error goes, it not
infrequently chooses the bitmap scan. This is obviously silly, since a
bitmap scan has more overhead than a plain indexscan, and no way to win
when retrieving a single heap tuple. There is more than one way we
could deal with this problem, but adding an explicit CPU-cost charge for
the extra overhead seems reasonable.

4. In genericcostestimate(), add a CPU-cost charge to reflect the CPU
effort involved in initiating an indexscan (for instance, the analysis
of the index keys that btree does). To some extent this charge also
covers the cost of descending the btree. Before 8.2 we had effectively
priced those costs in by counting an explicit disk access to the btree
metapage, which was clearly too high --- but it seems that zero is too
low, too. I set the number at 100 * cpu_operator_cost, which may sound
high but it seems about right on the strength of Arjen's measurements.
It's possible that we should make it vary with the number of indexquals;
anyone have an opinion about that?

Comments in general? I'm a bit worried that these changes will bias us
against indexscans too much, but there's no sign of that in Arjen's
results.

regards, tom lane

Attachment Content-Type Size
unknown_filename text/plain 6.0 KB

From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-patches(at)postgreSQL(dot)org>
Subject: Re: Proposed fixes for planner regressions from June torelease
Date: 2006-12-11 19:59:38
Message-ID: 1165867179.3816.97.camel@silverbirch.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

On Sun, 2006-12-10 at 14:57 -0500, Tom Lane wrote:
> Comments in general? I'm a bit worried that these changes will bias
> us
> against indexscans too much, but there's no sign of that in Arjen's
> results.

Are Arjen's tests on prepared queries? Does that change anything? We shy
away from indexes on prepared queries too much already, important when
the consequence of doing so is an O(N) seqscan rather than an O(logN)
indexscan.

The cost of initiating an index scan is a cause for concern, but it
seems reasonable to get it accurate. I'd like to perform some of that
work at planning time, not at scan time, when it is possible for us to
do so. Simple indexed, planned queries shouldn't need to pay that cost
repeatedly.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: pgsql-patches(at)postgreSQL(dot)org
Subject: Re: Proposed fixes for planner regressions from June torelease
Date: 2006-12-11 20:33:57
Message-ID: 15646.1165869237@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

"Simon Riggs" <simon(at)2ndquadrant(dot)com> writes:
> Are Arjen's tests on prepared queries?

No.

> Does that change anything? We shy
> away from indexes on prepared queries too much already, important when
> the consequence of doing so is an O(N) seqscan rather than an O(logN)
> indexscan.

Changing that will require far more extensive changes than twiddling a
few cost estimates.

> The cost of initiating an index scan is a cause for concern, but it
> seems reasonable to get it accurate. I'd like to perform some of that
> work at planning time, not at scan time, when it is possible for us to
> do so. Simple indexed, planned queries shouldn't need to pay that cost
> repeatedly.

Isn't this opinion in direct contradiction to your previous paragraph?
If you want the thing to be more flexible about plans involving unknown
parameter values, then you have to push work towards the runtime end,
not towards the planner.

regards, tom lane


From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-patches(at)postgreSQL(dot)org>
Subject: Re: Proposed fixes for planner regressions from Junetorelease
Date: 2006-12-11 21:00:37
Message-ID: 1165870837.3816.113.camel@silverbirch.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

On Mon, 2006-12-11 at 15:33 -0500, Tom Lane wrote:
> "Simon Riggs" <simon(at)2ndquadrant(dot)com> writes:

> > Does that change anything? We shy
> > away from indexes on prepared queries too much already, important when
> > the consequence of doing so is an O(N) seqscan rather than an O(logN)
> > indexscan.
>
> Changing that will require far more extensive changes than twiddling a
> few cost estimates.

Of course and I wasn't expecting those changes here. I should have
raised it on -hackers.

> > The cost of initiating an index scan is a cause for concern, but it
> > seems reasonable to get it accurate. I'd like to perform some of that
> > work at planning time, not at scan time, when it is possible for us to
> > do so. Simple indexed, planned queries shouldn't need to pay that cost
> > repeatedly.
>
> Isn't this opinion in direct contradiction to your previous paragraph?
> If you want the thing to be more flexible about plans involving unknown
> parameter values, then you have to push work towards the runtime end,
> not towards the planner.

In general, perhaps, but there are some things that can be done at
planning time, such as deciding on unique scans and analyzing columns
for the scan.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com