Re: Patch to support SEMI and ANTI join removal

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jim Nasby <jim(at)nasby(dot)net>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch to support SEMI and ANTI join removal
Date: 2014-08-27 18:23:38
Message-ID: 2261.1409163818@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Jim Nasby <jim(at)nasby(dot)net> writes:
> On 8/26/14, 8:40 AM, Heikki Linnakangas wrote:
>> Just so everyone is on the same page on what kind of queries this helps with, here are some examples from the added regression tests:
>>
> -- Test join removals for semi and anti joins
> CREATE TEMP TABLE b (id INT NOT NULL PRIMARY KEY, val INT);
> CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id INT REFERENCES b(id));
> -- should remove semi join to b
> EXPLAIN (COSTS OFF)
> SELECT id FROM a WHERE b_id IN(SELECT id FROM b);
> <snip>
> SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id = id);

> I also fail to see a use for examples that are that silly *unless* we're talking machine-generated SQL, but I suspect that normally uses JOINS.

> Where I would expect this to be useful is in cases where we can pre-evaluate some other condition in the subqueries to make the subqueries useless (ie: SELECT id FROM b WHERE 1=1), or where the condition could be passed through (ie: SELECT id FROM b WHERE id=42). Another possibility would be if there's a condition in the subquery that could trigger constraint elimination.

Unless I'm misunderstanding something, pretty much *any* WHERE restriction
in the subquery would defeat this optimization, since it would no longer
be certain that there was a match to an arbitrary outer-query row. So
it seems unlikely to me that this would fire in enough real-world cases
to be worth including. I am definitely not a fan of carrying around
deadwood in the planner.

If the majority of the added code is code that will be needed for
less-bogus optimizations, it might be all right; but I'd kind of want to
see the less-bogus optimizations working first.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Arthur Silva 2014-08-27 18:30:22 Re: jsonb format is pessimal for toast compression
Previous Message Jim Nasby 2014-08-27 18:05:45 Re: [RFC, POC] Don't require a NBuffer sized PrivateRefCount array of local buffer pins