Re: Allowing NOT IN to use ANTI joins

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Marti Raudsepp <marti(at)juffo(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allowing NOT IN to use ANTI joins
Date: 2014-06-29 10:48:49
Message-ID: CAApHDvpKHhw6a5cCnWE0hG7WLuWVac7bCeXcD3F9ZYDv3NGwdQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jun 27, 2014 at 6:14 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> > If there's no way to tell that the target entry comes from a left join,
> > then would it be a bit too quirky to just do the NOT NULL checking when
> > list_length(query->rtable) == 1 ? or maybe even loop over query->rtable
> and
> > abort if we find an outer join type? it seems a bit hackish, but perhaps
> it
> > would please more people than it would surprise.
>
> Why do you think you can't tell if the column is coming from the wrong
> side of a left join?
>
> It was more of that I couldn't figure out how to do it, rather than
thinking it was not possible.

> I don't recall right now if there is already machinery in the planner for
> this, but we could certainly deconstruct the from-clause enough to tell
> that.
>
> It's not hard to imagine a little function that recursively descends the
> join tree, not bothering to descend into the nullable sides of outer
> joins, and returns "true" if it finds a RangeTableRef for the desired RTE.
> If it doesn't find the RTE in the not-nullable parts of the FROM clause,
> then presumably it's nullable.
>
>
Ok, I've implemented that in the attached. Thanks for the tip.

> The only thing wrong with that approach is if you have to call the
> function many times, in which case discovering the information from
> scratch each time could get expensive.
>
>
I ended up just having the function gather up all RelIds that are on the on
the inner side. So this will just need to be called once per NOT IN clause.

> I believe deconstruct_jointree already keeps track of whether it's
> underneath an outer join, so if we were doing this later than that,
> I'd advocate making sure it saves the needed information. But I suppose
> you're trying to do this before that. It might be that you could
> easily save aside similar information during the earlier steps in
> prepjointree.c. (Sorry for not having examined the patch yet, else
> I'd probably have a more concrete suggestion.)
>
>
This is being done from within pull_up_sublinks, so it's before
deconstruct_jointree.

I've attached an updated version of the patch which seems to now work
properly with outer joins.

I think I'm finally ready for a review again, so I'll update the commitfest
app.

Regards

David Rowley

Attachment Content-Type Size
not_in_anti_join_v0.6.patch application/octet-stream 22.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message MauMau 2014-06-29 11:35:04 Re: [Fwd: Re: proposal: new long psql parameter --on-error-stop]
Previous Message Thomas Munro 2014-06-29 10:11:14 Re: Cluster name in ps output