Re: [PATCH] Add support for IS NULL to btree indexes

Lists: pgsql-patches
From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: pgsql-patches(at)postgresql(dot)org
Subject: [PATCH] Add support for IS NULL to btree indexes
Date: 2005-09-19 17:51:07
Message-ID: 20050919175102.GD18456@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Here is a patch that enables IS NULL to use a btree index. After
mentioning it was really hard, I had a brainwave and here are the 80
lines of code necessary to make it work. What surprised me is that in
various important parts most of the work had actually been done already.

Actually, it's two changes:
- In the btree index code, if SK_ISNULL is set, do the right thing
- If the query has col IS NULL, expand that to col = NULL in the index
quals

The bit to convert col = NULL to an appropriate scan key and everything
else was already in place. It's not overhauling the index am code, just
using the facilities already there.

The only possibly controversial addition of a new scankey flag
SK_INDEXFINDNULL which needs to be set for the index to consider the
scankey. This is to distinguish the cases where (a = NULL) has been
derived from (a IS NULL) and (a = col) where col happens to be NULL
this time round.

It's not entirely perfect, if you look at the explain output, it's
still in the filter list. I couldn't work out how to remove it. Before
you ask, it is working, I traced gdb all the way through the btree
index code and it is doing what it's supposed to. It's the same issue
affecting the IS TRUE/FALSE predicates, they're still in the filter
lists too.

Basically, could some people look over the logic. I think I got it
right but this is the first time I've played with the index code so
maybe I've missed something. If it is agreed to be the way to go, I'll
go back and work out the documentation changes.

Download: http://svana.org/kleptog/pgsql/indexnulls.diff

Have a nice day,

test=# explain analyze select * from z order by t;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------
Index Scan using zt on z (cost=0.00..65.58 rows=1144 width=44) (actual time=0.034..4.624 rows=1000 loops=1)
Total runtime: 7.684 ms
(2 rows)

test=# explain analyze select * from z where t is null order by t;
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Index Scan using zt on z (cost=0.00..5.84 rows=6 width=44) (actual time=0.044..1.442 rows=250 loops=1)
Index Cond: (t = NULL::text)
Filter: (t IS NULL)
Total runtime: 2.279 ms
(4 rows)

--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment Content-Type Size
indexnulls.diff text/plain 6.9 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: [PATCH] Add support for IS NULL to btree indexes
Date: 2005-09-19 20:33:27
Message-ID: 29458.1127162007@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> Actually, it's two changes:
> - In the btree index code, if SK_ISNULL is set, do the right thing
> - If the query has col IS NULL, expand that to col =3D NULL in the index
> quals

This is a bad idea, because it translates "x IS NULL" into "x = NULL"
which is under no circumstances the same thing. It might coincidentally
fail to malfunction for btree indexes, depending on the specifics of the
"=" operator in use; but that doesn't make it right. (AFAICS, the
proposed patch simply breaks for non-btree indexes.) Also, it cannot
handle IS NOT NULL.

A proper solution requires explicitly representing IS NULL/IS NOT NULL
as distinct kinds of scankey.

regards, tom lane


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: [PATCH] Add support for IS NULL to btree indexes
Date: 2005-09-20 08:00:12
Message-ID: 20050920075957.GA8586@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

On Mon, Sep 19, 2005 at 04:33:27PM -0400, Tom Lane wrote:
> This is a bad idea, because it translates "x IS NULL" into "x = NULL"
> which is under no circumstances the same thing. It might coincidentally
> fail to malfunction for btree indexes, depending on the specifics of the
> "=" operator in use; but that doesn't make it right. (AFAICS, the
> proposed patch simply breaks for non-btree indexes.) Also, it cannot
> handle IS NOT NULL.

That's point of the extra flag, to distinguish between the two cases.
It works for all types, the actual operator in the operator class is
never invoked (they're strict, the code can't call them with NULL even
if it wanted to). ExecInitIndexScan has always set SK_ISNULL for null
args, even though it never happened in practice. All it does is move to
the point in the index where the NULLs begin and start returning
tuples.

It takes a previously impossible state and makes it return something
useful.

It doesn't handle IS NOT NULL, since I don't think that's worth
indexing anyway, but perhaps for completeness. It the much harder case.

> A proper solution requires explicitly representing IS NULL/IS NOT NULL
> as distinct kinds of scankey.

Well, this patch has a kind for IS NULL. It does have issues with other
indexes, but there's no point working on that until there is a
possibility of acceptance.

The purpose was to demonstrate that it is trivially possible using what
is available.
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.