Re: WIP Join Removal

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: List pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: WIP Join Removal
Date: 2008-09-02 16:28:10
Message-ID: 15179.1220372890@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> As discussed on 26 June, "Join Removal/Vertical Partitioning", here's a
> patch to remove joins in certain circumstances.

Some points not made in the thread so far:

> + if (checkrel->rtekind != RTE_RELATION)
> + return;

This isn't right, or at least isn't sufficient, because rtekind is
undefined for joinrels. I think it should be something like

/*
* Since we can't prove anything unless keeprel has a unique
* index, give up immediately if it's a join or not a table.
*/
if (checkrel->reloptkind == RELOPT_JOINREL ||
checkrel->rtekind != RTE_RELATION)
return;

Maybe you should also check for checkrel->indexlist == NIL here, since
there's no point in doing the attr_needed bookkeeping if no indexes.

> + /*
> + * Check that the query targetlist does not contain any non-junk
> + * target entries for the relation. If it does, we cannot remove join.
> + */
> + if (checkrel->min_attr > 0)
> + minattr = checkrel->min_attr;
> +
> + for (attrno = minattr; attrno < checkrel->max_attr; attrno++)
> + {
> + if (!bms_is_empty(checkrel->attr_needed[attrno]))
> + return;
> + }

This part seems pretty wrong. In the first place, you mustn't
discriminate against system columns like that. (Consider for instance
that the only attr being used from the inner rel is its OID column.)
You're failing to check max_attr too --- I believe the loop ought to run
from min_attr to max_attr inclusive. In the second place, I don't see
how the routine ever succeeds at all if it insists on attr_needed being
empty, because whatever attrs are used in the join condition will surely
have nonempty attr_needed. What you want is to see whether there are
any attrs that are needed *above the join*, which would be something
like
if (!bms_is_subset(checkrel->attr_needed[attrno], joinrel->relids))
fail;
The reference to non-junk in the comment is off base too. Attrs are
used or not, there's no concept of a junk attr.

> + * XXX Seems not worth searching partial indexes because those
> + * are only usable with a appropriate restriction, so the
> + * only way they could ever be usable is if there was a
> + * restriction that exactly matched the index predicate,
> + * which is possible but generally unlikely.

I haven't thought this through entirely, but wouldn't a partial index be
okay if it's marked predOK? You might be right that the case is
unlikely, but if it's only one extra line to support it ...

> + if (removable &&
> + joinrel->cheapest_total_path < keeprel->cheapest_total_path)

This test on cheapest_total_path seems pretty wacko --- even granting
that you meant to test the costs and not compare pointer addresses,
isn't it backward? Also I doubt that the joinrel's cheapest_total_path
field is set yet where you're calling this.

In any case I'm unconvinced that join removal could ever not be a win,
so that test seems unnecessary anyhow.

> + {
> + elog(LOG, "join removed");
> + joinrel->pathlist = keeprel->pathlist;
> + joinrel->joininfo = keeprel->baserestrictinfo;
> + }

Huh? What in the world are you doing to joininfo here? This function
should only be manipulating the path list.

regards, tom lane

In response to

Responses

Browse pgsql-patches by date

  From Date Subject
Next Message Tom Lane 2008-09-02 16:35:21 Re: WIP Join Removal
Previous Message Simon Riggs 2008-09-02 16:24:33 Re: rmgr hooks and contrib/rmgr_hook