Re: join removal

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Stark <stark(at)mit(dot)edu>, "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: join removal
Date: 2009-07-20 02:56:52
Message-ID: 15727.1248058612@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Thu, Jul 16, 2009 at 9:02 PM, Greg Stark<stark(at)mit(dot)edu> wrote:
>> I have one big worry though. Currently you're detecting the unique
>> property using the planner's path mechanism. I suppose that works, but
>> it's only an accident of the planner design that the path for the
>> unique index will always be there if it's the join condition. My
>> instinct is that this code should be going back to the raw index info
>> to prove this property.

> I had trouble figuring out where to hook in the logic. In an ideal
> world, it would be nice to detect that the join is removable earlier,
> but it's hard to do that, because it's not until we know the join
> order that we can test whether any attributes from the inner rel are
> used above the level of the join.

Yeah. Ideally this sort of thing would happen in prepjointree.c, but
we don't have nearly enough information at that stage. I think the
approach of treating join removal as a kind of join implementation is
not unreasonable. I think though that we might have to add an actual
"dummy join" path type. The crocks you put into add_path are, well,
crocks.

> But as it is the fact that the join
> can be removed will have to be rediscovered over and over again as
> planning progresses.

Not really. We only consider a given join once.

> As for going back to "the raw index info", that was kind of my
> instinct as well but I couldn't make it work.

You need to work harder --- the way it's being done here is way too
simplistic. It's failing to take any account of whether the index's
opclass has anything to do with the semantics of the join operators.
Even aside from that, I agree with Greg that depending on
the IndexPath to be there is a fatal mistake. Do we want
enable_indexscan = off to disable join removal? Even if we thought
that was okay, it seems entirely likely that the IndexPath could be
discarded on cost grounds before we get to the stage of considering
joins. And it certainly won't scale up to considering removal of
joins above the base level.

I think we want something along the lines of relation_is_distinct_for
with a list of columns and a list of comparison operators, where the
first-cut implementation will be to look for matching indexes.
This will be different from query_is_distinct_for, but it's dealing
with the same sorts of considerations about whether the operator
semantics are the right things.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2009-07-20 03:47:29 Re: Problem with psql backward compatibility
Previous Message Andrew Dunstan 2009-07-20 02:46:43 Re: conditional dropping of columns/constraints