foreign-key inference & join removal

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: foreign-key inference & join removal
Date: 2009-10-19 01:44:28
Message-ID: 603c8f070910181844k21195e8fv6bb4cee2c5cb1d74@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Sep 17, 2009 at 5:14 PM, I wrote:
> I think that the next project in this area should
> be to try to support removal of INNER joins.  (Removal of SEMI, ANTI,
> and FULL joins seems unlikely ever to be interesting.)  That will
> require teaching the planner about foreign key constraints, which
> interestingly opens up some other interesting optimization
> opportunities.  An INNER join that can never match zero rows can
> alternatively be implemented as a LEFT join, which can be reordered in
> some situations where an inner join can't.  e.g. A LJ (B IJ C ON Pbc)
> ON Pab can be implemented as (A LJ B ON Pab) LJ/IJ C ON Pbc if it is
> the case that for every row in B, there will be at least one row in C
> such that Pbc holds.

A few more thoughts on this topic.

Suppose we define a new join type called "inner_or_left_join". This
means that we've proven that every outer row has at least one join
partner, so that we'll get the same results whichever way we implement
it. We can prove this for either inner joins or left joins, or for
right joins with the sides reversed, by checking that:

(1) The inner rel is a baserel with no restriction clauses.
(2) All the join clauses are merge-joinable.
(3) There is a table on the outer side of the join with a foreign key
constraint referencing the inner table, such that the columns of the
foreign key constraint and the chosen equality operators exactly match
up with the join clauses (no extra columns, no extra join clauses).
(4) All the relevant columns of the outer table are NOT NULL.

I am not quite sure how to check for #3 and #4, nor am I sure where
the best place to put this logic is.

Once this infrastructure is in place, it opens up a number of
interesting optimization opportunities.

1. Inner joins that are relabelled as inner_or_left joins can be
removed using the same logic that we already use for left joins.

2. Inner_or_left joins can be reordered more flexibly than either
inner joins or left joins. Consider:

A LJ (B IJ C ON Pbc) ON Pab

If B IJ C is inner_or_left and Pbc is strict, the joins commute.
Further, I believe that if the LJ is inner_or_left and Pab is strict,
we can commute the joins provided that, in the process, we change the
join between A and B to be an inner join (no longer inner_or_left).
But if neither join is inner_or_left then no reordering is possible.

3. Some path types are implemented only for inner joins but not for
left joins; these can also be used for inner_or_left joins. For
example, consider A LJ B ON Pab. If the join can be proven
inner_or_left, we have the option of treating it as an inner join and
using A rather than B as the inner rel.

Thoughts?

...Robert

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message KaiGai Kohei 2009-10-19 03:59:51 Re: Reworks for Access Control facilities (r2363)
Previous Message Andrew Chernow 2009-10-19 01:09:13 Re: postgres 8.3.8 and Solaris 10_x86 64 bit problems?