Re: Extending constraint exclusion for implied constraints/conditions

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Extending constraint exclusion for implied constraints/conditions
Date: 2014-07-08 04:54:25
Message-ID: CAFjFpRdK27_D9_ahPfLVpLjog1njr+eKix_23sQaVALDXa7G4w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jul 7, 2014 at 7:37 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com> writes:
> > Right now, constraint exclusion code looks only at the provided
> conditions.
> > If we want avoid table scan based on constraints in the above example, it
> > will need to look at the implied conditions as well. E.g. val2 < 30 AND
> val
> > = val2 => val < 30. Then the constraint exclusion can see that val < 30
> AND
> > val > 30 are contradictory and infer that the result is going to be
> empty.
> > We will need to add information about the transitive inferences between
> > operators. Can we do that in PostgreSQL? Will the benefits be worth the
> > code, that needs to be added?
>
> I doubt it. The extra code isn't the problem so much, it's the extra
> planning cycles spent trying to make proofs that will usually fail.
> What you propose will create a combinatorial explosion in the number
> of proof paths to be considered.
>

I can understand that it will create combinatorial explosion in the number
of predicates that need to be examined by the constraint exclusion. I do
not understand where come the paths gets involved here. The constraint
exclusion kicks in before paths are created

220 static void
221 set_rel_size(PlannerInfo *root, RelOptInfo *rel,
222 Index rti, RangeTblEntry *rte)
223 {
224 if (rel->reloptkind == RELOPT_BASEREL &&
225 relation_excluded_by_constraints(root, rel, rte))
226 {
227 /*
228 * We proved we don't need to scan the rel via constraint
exclusion,
229 * so set up a single dummy path for it. Here we only check
this for
230 * regular baserels; if it's an otherrel, CE was already
checked in
231 * set_append_rel_pathlist().
232 *
233 * In this case, we go ahead and set up the relation's path
right away
234 * instead of leaving it for set_rel_pathlist to do. This is
because
235 * we don't have a convention for marking a rel as dummy
except by
236 * assigning a dummy path to it.
237 */
238 set_dummy_rel_pathlist(rel);
239 }

and does not create paths for relations excluded by constraints

295 /*
296 * set_rel_pathlist
297 * Build access paths for a base relation
298 */
299 static void
300 set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
301 Index rti, RangeTblEntry *rte)
302 {
303 if (IS_DUMMY_REL(rel))
304 {
305 /* We already proved the relation empty, so nothing more to do
*/
306 }

Same in the case of join in mark_join_rel()

663 * JOIN_INNER.
664 */
665 switch (sjinfo->jointype)
666 {
667 case JOIN_INNER:
668 if (is_dummy_rel(rel1) || is_dummy_rel(rel2) ||
669 restriction_is_constant_false(restrictlist, false))
670 {
671 mark_dummy_rel(joinrel);
672 break;
673 }
674 add_paths_to_joinrel(root, joinrel, rel1, rel2,
675 JOIN_INNER, sjinfo,
676 restrictlist);
677 add_paths_to_joinrel(root, joinrel, rel2, rel1,
678 JOIN_INNER, sjinfo,
679 restrictlist);
680 break;

BTW, on a side note, I noticed, we use mark_dummy_rel() for joinrels and
set_dummy_rel_pathlist() for baserel. Shouldn't we be using the same
function everywhere for the same functionality (e.g. adding an append path
with no child-paths).

For larger relations, the time spent in constraint exclusion might be
lesser than the time taken by actual table/index scan and that to when such
a scan is not going to produce any rows. So, we might want to apply the
technique only when the estimated number of rows/pages are above a certain
threshold and may be when the GUC is ON (like we do it today).

> > I can see some more benefits. We can use it to eliminate joins where the
> > constraints on joining relations may cause the join to be empty e.g.
>
> ... and applying constraint exclusion to join relations would make that
> problem even worse.
>
>
The same case goes with joins, where potentially, the crossproduct of two
tables is huge.

> regards, tom lane
>

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashoke 2014-07-08 05:22:00 Modifying update_attstats of analyze.c for C Strings
Previous Message Craig Ringer 2014-07-08 04:49:56 Re: [PATCH] Improve bytea error messages