teaching Var about NOT NULL

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: teaching Var about NOT NULL
Date: 2014-06-01 14:40:06
Message-ID: CAApHDvpcj9q+7fM6Hu1ACU3W4r9aPmbvLk-x9oA_irdDCbH-YA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

While hacking away at implementing join removal support for ANTI-JOINs I
realised that I couldn't just replace the join with a WHERE false
condition... Let me explain...

With a query such as:

SELECT * FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id = b.id);

Where a.b_id has a foreign key on b(id)

I'm working on a join removal patch which will turn this into:

SELECT * FROM a WHERE b_id IS NULL;

This seemed like a bit of a shame since with my test tables b_id is defined
NOT NULL, but there seemed to be no way to tell if I needed to add a WHERE
FALSE or a IS NULL check to the WHERE clause.

I quickly put together the attached patch which adds a "knownnotnull" bool
field to Var which we can set to true when we're completely sure that the
Var cannot contain any NULL values. I'm populating this from
pg_attribute.attnotnull where I can and setting it to false where I can't
see another way to tell for sure that nulls cannot exist.

The only use of knownnotnull that I've added to the patch is to turn a
query such as:

SELECT * FROM a WHERE b_id IS NULL;

To not scan the table, since id is defined as NOT NULL.

postgres=# alter table a alter column b_id drop not null;
ALTER TABLE
postgres=# explain select id from a where b_id is null;
QUERY PLAN
---------------------------------------------------
Seq Scan on a (cost=0.00..31.40 rows=11 width=4)
Filter: (b_id IS NULL)
Planning time: 0.340 ms
(3 rows)

postgres=# alter table a alter column b_id set not null;
ALTER TABLE
postgres=# explain select id from a where b_id is null;
QUERY PLAN
--------------------------------------------------------
Result (cost=0.00..31.40 rows=1 width=4)
One-Time Filter: false
-> Seq Scan on a (cost=0.00..31.40 rows=1 width=4)
Planning time: 0.402 ms
(4 rows)

Having this extra flag could likely help optimise NOT IN(SELECT notnullcol
FROM table) to allow this to become an ANTI-JOIN. It will also help join
optimise join removals a little more.

The patch is just a few minutes old and there's no regression tests yet.
I'd rather have some feedback before I proceed with it.

Regards

David Rowley

Attachment Content-Type Size
var_not_null_v0.1.patch application/octet-stream 16.1 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Euler Taveira 2014-06-01 15:49:11 Re: Changeset Extraction v7.6.1
Previous Message Marc Mamin 2014-06-01 13:48:56 Re: "pivot aggregation" with a patched intarray