Re: Union within View vs.Union of Views

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Jeff Larsen <jlar310(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Union within View vs.Union of Views
Date: 2007-11-04 01:38:35
Message-ID: 3248.1194140315@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Heikki Linnakangas <heikki(at)enterprisedb(dot)com> writes:
> Jeff Larsen wrote:
>> If it's possible to consider this abstractly, is there any particular
>> reason why there is such a vast difference in performance?

> That's surprising. The planner knows how to push down WHERE conditions
> to parts of a UNION ALL, and should be able to generate the same plan in
> both cases.

There are a bunch of special cases where it can't do that, though.
Look into src/backend/optimizer/path/allpaths.c, particularly
subquery_is_pushdown_safe:

* Conditions checked here:
*
* 1. If the subquery has a LIMIT clause, we must not push down any quals,
* since that could change the set of rows returned.
*
* 2. If the subquery contains EXCEPT or EXCEPT ALL set ops we cannot push
* quals into it, because that would change the results.
*
* 3. For subqueries using UNION/UNION ALL/INTERSECT/INTERSECT ALL, we can
* push quals into each component query, but the quals can only reference
* subquery columns that suffer no type coercions in the set operation.
* Otherwise there are possible semantic gotchas. So, we check the
* component queries to see if any of them have different output types;
* differentTypes[k] is set true if column k has different type in any
* component.

and qual_is_pushdown_safe:

* Conditions checked here:
*
* 1. The qual must not contain any subselects (mainly because I'm not sure
* it will work correctly: sublinks will already have been transformed into
* subplans in the qual, but not in the subquery).
*
* 2. The qual must not refer to the whole-row output of the subquery
* (since there is no easy way to name that within the subquery itself).
*
* 3. The qual must not refer to any subquery output columns that were
* found to have inconsistent types across a set operation tree by
* subquery_is_pushdown_safe().
*
* 4. If the subquery uses DISTINCT ON, we must not push down any quals that
* refer to non-DISTINCT output columns, because that could change the set
* of rows returned. This condition is vacuous for DISTINCT, because then
* there are no non-DISTINCT output columns, but unfortunately it's fairly
* expensive to tell the difference between DISTINCT and DISTINCT ON in the
* parsetree representation. It's cheaper to just make sure all the Vars
* in the qual refer to DISTINCT columns.
*
* 5. We must not push down any quals that refer to subselect outputs that
* return sets, else we'd introduce functions-returning-sets into the
* subquery's WHERE/HAVING quals.
*
* 6. We must not push down any quals that refer to subselect outputs that
* contain volatile functions, for fear of introducing strange results due
* to multiple evaluation of a volatile function.

Idly looking at this, I'm suddenly wondering whether the prohibition on
pushing into an EXCEPT is necessary. If a qual eliminates rows from the
EXCEPT's output, can't we just eliminate those same rows from the
inputs?

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2007-11-04 01:45:37 Re: "MixedCase sensitive quoted" names
Previous Message Whatever Deep 2007-11-03 23:40:42 "MixedCase sensitive quoted" names