Re: Proposed patch for qual pushdown into UNION/INTERSECT

Lists: pgsql-patches
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-patches(at)postgreSQL(dot)org
Subject: Proposed patch for qual pushdown into UNION/INTERSECT
Date: 2002-08-29 07:45:22
Message-ID: 23670.1030607122@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Seems to work in preliminary testing, but I'd like someone else to
give it a try before I decide it works. Patch is against CVS tip.

regards, tom lane

*** src/backend/optimizer/path/allpaths.c.orig Thu Jun 20 16:29:29 2002
--- src/backend/optimizer/path/allpaths.c Thu Aug 29 03:40:36 2002
***************
*** 46,51 ****
--- 46,56 ----
RangeTblEntry *rte);
static RelOptInfo *make_one_rel_by_joins(Query *root, int levels_needed,
List *initial_rels);
+ static bool subquery_is_pushdown_safe(Query *subquery, Query *topquery);
+ static bool recurse_pushdown_safe(Node *setOp, Query *topquery);
+ static void subquery_push_qual(Query *subquery, Index rti, Node *qual);
+ static void recurse_push_qual(Node *setOp, Query *topquery,
+ Index rti, Node *qual);


/*
***************
*** 297,327 ****
* generate a better plan for the subquery than evaluating all the
* subquery output rows and then filtering them.
*
! * There are several cases where we cannot push down clauses:
! *
! * 1. If the subquery contains set ops (UNION/INTERSECT/EXCEPT) we do not
! * push down any qual clauses, since the planner doesn't support quals
! * at the top level of a setop. (With suitable analysis we could try
! * to push the quals down into the component queries of the setop, but
! * getting it right seems nontrivial. Work on this later.)
! *
! * 2. If the subquery has a LIMIT clause or a DISTINCT ON clause, we must
! * not push down any quals, since that could change the set of rows
! * returned. (Actually, we could push down quals into a DISTINCT ON
! * subquery if they refer only to DISTINCT-ed output columns, but
! * checking that seems more work than it's worth. In any case, a
! * plain DISTINCT is safe to push down past.)
! *
! * 3. If the subquery has any functions returning sets in its target list,
! * we do not push down any quals, since the quals
! * might refer to those tlist items, which would mean we'd introduce
! * functions-returning-sets into the subquery's WHERE/HAVING quals.
! * (It'd be sufficient to not push down quals that refer to those
! * particular tlist items, but that's much clumsier to check.)
! *
! * 4. We do not push down clauses that contain subselects, mainly because
! * I'm not sure it will work correctly (the subplan hasn't yet
! * transformed sublinks to subselects).
*
* Non-pushed-down clauses will get evaluated as qpquals of the
* SubqueryScan node.
--- 302,312 ----
* generate a better plan for the subquery than evaluating all the
* subquery output rows and then filtering them.
*
! * There are several cases where we cannot push down clauses.
! * Restrictions involving the subquery are checked by
! * subquery_is_pushdown_safe(). Also, we do not push down clauses that
! * contain subselects, mainly because I'm not sure it will work correctly
! * (the subplan hasn't yet transformed sublinks to subselects).
*
* Non-pushed-down clauses will get evaluated as qpquals of the
* SubqueryScan node.
***************
*** 329,339 ****
* XXX Are there any cases where we want to make a policy decision not to
* push down, because it'd result in a worse plan?
*/
! if (subquery->setOperations == NULL &&
! subquery->limitOffset == NULL &&
! subquery->limitCount == NULL &&
! !has_distinct_on_clause(subquery) &&
! !expression_returns_set((Node *) subquery->targetList))
{
/* OK to consider pushing down individual quals */
List *upperrestrictlist = NIL;
--- 314,321 ----
* XXX Are there any cases where we want to make a policy decision not to
* push down, because it'd result in a worse plan?
*/
! if (rel->baserestrictinfo != NIL &&
! subquery_is_pushdown_safe(subquery, subquery))
{
/* OK to consider pushing down individual quals */
List *upperrestrictlist = NIL;
***************
*** 351,375 ****
}
else
{
! /*
! * We need to replace Vars in the clause (which must refer
! * to outputs of the subquery) with copies of the
! * subquery's targetlist expressions. Note that at this
! * point, any uplevel Vars in the clause should have been
! * replaced with Params, so they need no work.
! */
! clause = ResolveNew(clause, rti, 0,
! subquery->targetList,
! CMD_SELECT, 0);
! subquery->havingQual = make_and_qual(subquery->havingQual,
! clause);
!
! /*
! * We need not change the subquery's hasAggs or
! * hasSublinks flags, since we can't be pushing down any
! * aggregates that weren't there before, and we don't push
! * down subselects at all.
! */
}
}
rel->baserestrictinfo = upperrestrictlist;
--- 333,340 ----
}
else
{
! /* Push it down */
! subquery_push_qual(subquery, rti, clause);
}
}
rel->baserestrictinfo = upperrestrictlist;
***************
*** 547,553 ****
--- 512,694 ----
}

/*****************************************************************************
+ * PUSHING QUALS DOWN INTO SUBQUERIES
+ *****************************************************************************/
+
+ /*
+ * subquery_is_pushdown_safe - is a subquery safe for pushing down quals?
+ *
+ * subquery is the particular component query being checked. topquery
+ * is the top component of a set-operations tree (the same Query if no
+ * set-op is involved).
+ *
+ * Conditions checked here:
+ *
+ * 1. If the subquery has a LIMIT clause or a DISTINCT ON clause, we must
+ * not push down any quals, since that could change the set of rows
+ * returned. (Actually, we could push down quals into a DISTINCT ON
+ * subquery if they refer only to DISTINCT-ed output columns, but
+ * checking that seems more work than it's worth. In any case, a
+ * plain DISTINCT is safe to push down past.)
*
+ * 2. If the subquery has any functions returning sets in its target list,
+ * we do not push down any quals, since the quals
+ * might refer to those tlist items, which would mean we'd introduce
+ * functions-returning-sets into the subquery's WHERE/HAVING quals.
+ * (It'd be sufficient to not push down quals that refer to those
+ * particular tlist items, but that's much clumsier to check.)
+ *
+ * 3. If the subquery contains EXCEPT or EXCEPT ALL set ops we cannot push
+ * quals into it, because that would change the results. For subqueries
+ * using UNION/UNION ALL/INTERSECT/INTERSECT ALL, we can push the quals
+ * into each component query, so long as all the component queries share
+ * identical output types. (That restriction could probably be relaxed,
+ * but it would take much more code to include type coercion code into
+ * the quals, and I'm also concerned about possible semantic gotchas.)
+ */
+ static bool
+ subquery_is_pushdown_safe(Query *subquery, Query *topquery)
+ {
+ SetOperationStmt *topop;
+
+ /* Check points 1 and 2 */
+ if (subquery->limitOffset != NULL ||
+ subquery->limitCount != NULL ||
+ has_distinct_on_clause(subquery) ||
+ expression_returns_set((Node *) subquery->targetList))
+ return false;
+
+ /* Are we at top level, or looking at a setop component? */
+ if (subquery == topquery)
+ {
+ /* Top level, so check any component queries */
+ if (subquery->setOperations != NULL)
+ if (!recurse_pushdown_safe(subquery->setOperations, topquery))
+ return false;
+ }
+ else
+ {
+ /* Setop component must not have more components (too weird) */
+ if (subquery->setOperations != NULL)
+ return false;
+ /* Setop component output types must match top level */
+ topop = (SetOperationStmt *) topquery->setOperations;
+ Assert(topop && IsA(topop, SetOperationStmt));
+ if (!tlist_same_datatypes(subquery->targetList,
+ topop->colTypes,
+ true))
+ return false;
+
+ }
+ return true;
+ }
+
+ /*
+ * Helper routine to recurse through setOperations tree
+ */
+ static bool
+ recurse_pushdown_safe(Node *setOp, Query *topquery)
+ {
+ if (IsA(setOp, RangeTblRef))
+ {
+ RangeTblRef *rtr = (RangeTblRef *) setOp;
+ RangeTblEntry *rte = rt_fetch(rtr->rtindex, topquery->rtable);
+ Query *subquery = rte->subquery;
+
+ Assert(subquery != NULL);
+ return subquery_is_pushdown_safe(subquery, topquery);
+ }
+ else if (IsA(setOp, SetOperationStmt))
+ {
+ SetOperationStmt *op = (SetOperationStmt *) setOp;
+
+ /* EXCEPT is no good */
+ if (op->op == SETOP_EXCEPT)
+ return false;
+ /* Else recurse */
+ if (!recurse_pushdown_safe(op->larg, topquery))
+ return false;
+ if (!recurse_pushdown_safe(op->rarg, topquery))
+ return false;
+ }
+ else
+ {
+ elog(ERROR, "recurse_pushdown_safe: unexpected node %d",
+ (int) nodeTag(setOp));
+ }
+ return true;
+ }
+
+ /*
+ * subquery_push_qual - push down a qual that we have determined is safe
+ */
+ static void
+ subquery_push_qual(Query *subquery, Index rti, Node *qual)
+ {
+ if (subquery->setOperations != NULL)
+ {
+ /* Recurse to push it separately to each component query */
+ recurse_push_qual(subquery->setOperations, subquery, rti, qual);
+ }
+ else
+ {
+ /*
+ * We need to replace Vars in the qual (which must refer
+ * to outputs of the subquery) with copies of the
+ * subquery's targetlist expressions. Note that at this
+ * point, any uplevel Vars in the qual should have been
+ * replaced with Params, so they need no work.
+ *
+ * This step also ensures that when we are pushing into a setop
+ * tree, each component query gets its own copy of the qual.
+ */
+ qual = ResolveNew(qual, rti, 0,
+ subquery->targetList,
+ CMD_SELECT, 0);
+ subquery->havingQual = make_and_qual(subquery->havingQual,
+ qual);
+
+ /*
+ * We need not change the subquery's hasAggs or
+ * hasSublinks flags, since we can't be pushing down any
+ * aggregates that weren't there before, and we don't push
+ * down subselects at all.
+ */
+ }
+ }
+
+ /*
+ * Helper routine to recurse through setOperations tree
+ */
+ static void
+ recurse_push_qual(Node *setOp, Query *topquery,
+ Index rti, Node *qual)
+ {
+ if (IsA(setOp, RangeTblRef))
+ {
+ RangeTblRef *rtr = (RangeTblRef *) setOp;
+ RangeTblEntry *rte = rt_fetch(rtr->rtindex, topquery->rtable);
+ Query *subquery = rte->subquery;
+
+ Assert(subquery != NULL);
+ subquery_push_qual(subquery, rti, qual);
+ }
+ else if (IsA(setOp, SetOperationStmt))
+ {
+ SetOperationStmt *op = (SetOperationStmt *) setOp;
+
+ recurse_push_qual(op->larg, topquery, rti, qual);
+ recurse_push_qual(op->rarg, topquery, rti, qual);
+ }
+ else
+ {
+ elog(ERROR, "recurse_push_qual: unexpected node %d",
+ (int) nodeTag(setOp));
+ }
+ }
+
+ /*****************************************************************************
+ * DEBUG SUPPORT
*****************************************************************************/

#ifdef OPTIMIZER_DEBUG
*** src/backend/optimizer/prep/prepunion.c.orig Fri Aug 2 14:15:06 2002
--- src/backend/optimizer/prep/prepunion.c Thu Aug 29 03:16:12 2002
***************
*** 66,72 ****
static List *generate_append_tlist(List *colTypes, bool flag,
List *input_plans,
List *refnames_tlist);
- static bool tlist_same_datatypes(List *tlist, List *colTypes, bool junkOK);
static Node *adjust_inherited_attrs_mutator(Node *node,
adjust_inherited_attrs_context *context);

--- 66,71 ----
***************
*** 579,585 ****
* Resjunk columns are ignored if junkOK is true; otherwise presence of
* a resjunk column will always cause a 'false' result.
*/
! static bool
tlist_same_datatypes(List *tlist, List *colTypes, bool junkOK)
{
List *i;
--- 578,584 ----
* Resjunk columns are ignored if junkOK is true; otherwise presence of
* a resjunk column will always cause a 'false' result.
*/
! bool
tlist_same_datatypes(List *tlist, List *colTypes, bool junkOK)
{
List *i;
*** src/include/optimizer/prep.h.orig Thu Jun 20 16:29:51 2002
--- src/include/optimizer/prep.h Thu Aug 29 03:16:03 2002
***************
*** 43,46 ****
--- 43,48 ----
Index old_rt_index, Oid old_relid,
Index new_rt_index, Oid new_relid);

+ extern bool tlist_same_datatypes(List *tlist, List *colTypes, bool junkOK);
+
#endif /* PREP_H */


From: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-patches(at)postgreSQL(dot)org>
Subject: Re: Proposed patch for qual pushdown into UNION/INTERSECT
Date: 2002-08-29 14:58:30
Message-ID: 20020829074423.M96919-100000@megazone23.bigpanda.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

On Thu, 29 Aug 2002, Tom Lane wrote:

> Seems to work in preliminary testing, but I'd like someone else to
> give it a try before I decide it works. Patch is against CVS tip.

It seems to work for me as well. I didn't do complicated data, but
some complicated query structures.

I do wonder if queries like
select * from ((select * from bar1 union all select * from bar2
union all (select * from bar1 except select * from bar2)) intersect select
* from bar1) as foo where a>4;

can push the a>4 restriction into the right side of the intersect which it
doesn't seem to right now, but I haven't sat down to really think
ramifiactions all the way through (mostly because I'm getting ready to go
on long weekend vacation). Or for that matter into the left portions of
the union all and union.

Pushing to the parts of the except is not allowed, but I don't think
that means that it can't go to the other parts of the expression when
the output from the except is itself used in an union/intersect.
So instead of something like (if I'm reading the explain correctly)
Subquery (filter a>4)
Intersect
Append
Subquery
Subquery
Except
Subquery
Subquery
Subquery

We could probably do:
Subquery
Intersect
Append
Subquery (passing down a>4)
Subquery (passing down a>4)
<something> (filter a>4)
Except
Subquery
Subquery
Subquery (passing down a>4)
and get the same effect, but I think
the mixed case is probably rare enough
to not worry about for now.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
Cc: pgsql-patches(at)postgreSQL(dot)org
Subject: Re: Proposed patch for qual pushdown into UNION/INTERSECT
Date: 2002-08-29 15:24:59
Message-ID: 3577.1030634699@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com> writes:
> I do wonder if queries like
> select * from ((select * from bar1 union all select * from bar2
> union all (select * from bar1 except select * from bar2)) intersect select
> * from bar1) as foo where a>4;

> can push the a>4 restriction into the right side of the intersect which it
> doesn't seem to right now,

Not without a lot more work, which I haven't got time for now.

> We could probably do:
> Subquery
> Intersect
> Append
> Subquery (passing down a>4)
> Subquery (passing down a>4)
> <something> (filter a>4)
> Except
> Subquery
> Subquery
> Subquery (passing down a>4)
> and get the same effect,

Yeah, the problem is the planner doesn't currently cope with attaching
quals to intermediate levels of a set-op tree, so there's no way to
do the above.

> but I think
> the mixed case is probably rare enough
> to not worry about for now.

Agreed. This is still a step forward.

regards, tom lane


From: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-patches(at)postgreSQL(dot)org>
Subject: Re: Proposed patch for qual pushdown into UNION/INTERSECT
Date: 2002-08-29 15:46:00
Message-ID: 20020829082316.Q97253-100000@megazone23.bigpanda.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

On Thu, 29 Aug 2002, Tom Lane wrote:

Actually, hadn't we figured that pushing to the left side of except was
safe? It's only in pushing to the right that we ran into a problem since
that could make the query return more rows than it should.

I only did a little minimal testing, so I'm not 100% sure about this
patch to the patch, but it still seemed to give the right results for
my test (admittedly simple) queries.

*** t1.patch.old Thu Aug 29 09:32:38 2002
--- t1.patch Thu Aug 29 09:34:58 2002
***************
*** 150,159 ****
+ * particular tlist items, but that's much clumsier to check.)
+ *
+ * 3. If the subquery contains EXCEPT or EXCEPT ALL set ops we cannot push
! + * quals into it, because that would change the results. For subqueries
! + * using UNION/UNION ALL/INTERSECT/INTERSECT ALL, we can push the quals
! + * into each component query, so long as all the component queries share
! + * identical output types. (That restriction could probably be relaxed,
+ * but it would take much more code to include type coercion code into
+ * the quals, and I'm also concerned about possible semantic gotchas.)
+ */
--- 150,159 ----
+ * particular tlist items, but that's much clumsier to check.)
+ *
+ * 3. If the subquery contains EXCEPT or EXCEPT ALL set ops we cannot push
! + * quals into it's right hand side, because that would change the results.
! + * For subqueries using UNION/UNION ALL/INTERSECT/INTERSECT ALL, we can push
! + * the quals into each component query, so long as all the component queries
! + * share identical output types. (That restriction could probably be relaxed,
+ * but it would take much more code to include type coercion code into
+ * the quals, and I'm also concerned about possible semantic gotchas.)
+ */
***************
*** 213,226 ****
+ {
+ SetOperationStmt *op = (SetOperationStmt *) setOp;
+
- + /* EXCEPT is no good */
- + if (op->op == SETOP_EXCEPT)
- + return false;
+ /* Else recurse */
+ if (!recurse_pushdown_safe(op->larg, topquery))
+ return false;
! + if (!recurse_pushdown_safe(op->rarg, topquery))
! + return false;
+ }
+ else
+ {
--- 213,224 ----
+ {
+ SetOperationStmt *op = (SetOperationStmt *) setOp;
+
+ /* Else recurse */
+ if (!recurse_pushdown_safe(op->larg, topquery))
+ return false;
! + if (op->op != SETOP_EXCEPT)
! + if (!recurse_pushdown_safe(op->rarg, topquery))
! + return false;
+ }
+ else
+ {
***************
*** 289,295 ****
+ SetOperationStmt *op = (SetOperationStmt *) setOp;
+
+ recurse_push_qual(op->larg, topquery, rti, qual);
! + recurse_push_qual(op->rarg, topquery, rti, qual);
+ }
+ else
+ {
--- 287,294 ----
+ SetOperationStmt *op = (SetOperationStmt *) setOp;
+
+ recurse_push_qual(op->larg, topquery, rti, qual);
! + if (op->op != SETOP_EXCEPT)
! + recurse_push_qual(op->rarg, topquery, rti, qual);
+ }
+ else
+ {


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
Cc: pgsql-patches(at)postgreSQL(dot)org
Subject: Re: Proposed patch for qual pushdown into UNION/INTERSECT
Date: 2002-08-29 16:02:59
Message-ID: 8152.1030636979@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com> writes:
> Actually, hadn't we figured that pushing to the left side of except was
> safe?

It's not safe for EXCEPT ALL (you might get the wrong number of
duplicates out), and I'm not entirely convinced about EXCEPT.

Also, the point of pushing down is to speed up the subqueries, so I'm
not sure there's much win if you can't push to both sides.

All in all I'd just as soon leave the EXCEPT case alone.

regards, tom lane