Re: Patch: add GiST support for BOX @> POINT queries

Lists: pgsql-hackers
From: Andrew Tipton <andrew(dot)t(dot)tipton(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Patch: add GiST support for BOX @> POINT queries
Date: 2011-02-23 23:05:40
Message-ID: AANLkTi=MwZ_UpJ7Kp572fYJ_=vr98bgvcRP8cRmzdvNP@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

While playing around with the BOX and POINT datatypes, I was surprised to
note that BOX @> POINT (and likewise POINT <@ BOX) queries were not using
the GiST index I had created on the BOX column. The attached patch adds a
new strategy @>(BOX,POINT) to the box_ops opclass. Internally,
gist_box_consistent simply transforms the POINT into its corresponding BOX.

This is my first Postgres patch, and I wasn't able to figure out how to go
about creating a regression test for this change. (All existing tests do
pass, but none of them seem to specifically test index behaviour.)

I know it is quite late in the CommitFest, should I add this to CF-Next?

-Andrew

Attachment Content-Type Size
gist-boxpoint-support.patch text/x-patch 1.2 KB

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Andrew Tipton" <andrew(dot)t(dot)tipton(at)gmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch: add GiST support for BOX @> POINT queries
Date: 2011-02-23 23:12:56
Message-ID: 4D654018020000250003AF55@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Tipton <andrew(dot)t(dot)tipton(at)gmail(dot)com> wrote:

> should I add this to CF-Next?

Yes.

-Kevin


From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Andrew Tipton <andrew(dot)t(dot)tipton(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch: add GiST support for BOX @> POINT queries
Date: 2011-06-10 10:16:57
Message-ID: BANLkTi=q60HL9OZNAo9yV3ZMCnKjXm+Gvw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2011/2/24 Andrew Tipton <andrew(dot)t(dot)tipton(at)gmail(dot)com>:
> While playing around with the BOX and POINT datatypes, I was surprised to
> note that BOX @> POINT (and likewise POINT <@ BOX) queries were not using
> the GiST index I had created on the BOX column.  The attached patch adds a
> new strategy @>(BOX,POINT) to the box_ops opclass.  Internally,
> gist_box_consistent simply transforms the POINT into its corresponding BOX.
> This is my first Postgres patch, and I wasn't able to figure out how to go
> about creating a regression test for this change.  (All existing tests do
> pass, but none of them seem to specifically test index behaviour.)

I reviewed the patch and worried about hard-wired magic number as
StrategyNumber. At least you should use #define to indicate the
number's meaning.

In addition, the modified gist_box_consistent() is too dangerous;
q_box is declared in the if block locally and is referenced, which
pointer is passed to the outer process of the block. AFAIK if the
local memory of each block is alive outside if block is
platform-dependent.

Isn't it worth adding new consistent function for those purposes? The
approach in the patch as stands looks kludge to me.

Regards,

--
Hitoshi Harada


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Cc: Andrew Tipton <andrew(dot)t(dot)tipton(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch: add GiST support for BOX @> POINT queries
Date: 2011-06-14 15:25:08
Message-ID: BANLkTinoQAbdftrjw0D1_H2Q2YGMSQNnxQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jun 10, 2011 at 6:16 AM, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> wrote:
> 2011/2/24 Andrew Tipton <andrew(dot)t(dot)tipton(at)gmail(dot)com>:
>> While playing around with the BOX and POINT datatypes, I was surprised to
>> note that BOX @> POINT (and likewise POINT <@ BOX) queries were not using
>> the GiST index I had created on the BOX column.  The attached patch adds a
>> new strategy @>(BOX,POINT) to the box_ops opclass.  Internally,
>> gist_box_consistent simply transforms the POINT into its corresponding BOX.
>> This is my first Postgres patch, and I wasn't able to figure out how to go
>> about creating a regression test for this change.  (All existing tests do
>> pass, but none of them seem to specifically test index behaviour.)
>
> I reviewed the patch and worried about hard-wired magic number as
> StrategyNumber. At least you should use #define to indicate the
> number's meaning.
>
> In addition, the modified gist_box_consistent() is too dangerous;
> q_box is declared in the if block locally and is referenced, which
> pointer is passed to the outer process of the block. AFAIK if the
> local memory of each block is alive outside if block is
> platform-dependent.
>
> Isn't it worth adding new consistent function for those purposes? The
> approach in the patch as stands looks kludge to me.

Andrew - in case it's not clear, we're waiting on you to respond to
Hitoshi's comments or provide an updated patch.

Thanks,

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andrew Tipton <andrew(dot)t(dot)tipton(at)gmail(dot)com>
To: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch: add GiST support for BOX @> POINT queries
Date: 2011-06-17 10:43:47
Message-ID: BANLkTi=h_0TnFKm+HqBCs4ogjV7Xe0Z4BA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jun 10, 2011 at 22:16, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> wrote:
>
> I reviewed the patch and worried about hard-wired magic number as
> StrategyNumber. At least you should use #define to indicate the
> number's meaning.
>
> In addition, the modified gist_box_consistent() is too dangerous;
> q_box is declared in the if block locally and is referenced, which
> pointer is passed to the outer process of the block. AFAIK if the
> local memory of each block is alive outside if block is
> platform-dependent.
>
> Isn't it worth adding new consistent function for those purposes? The
> approach in the patch as stands looks kludge to me.

Thanks for your review. Coming back to this patch after a few months'
time, I have to say it looks pretty hackish to my eyes as well. :)

I've attempted to add a new consistent function,
gist_boxpoint_consistent(), but the GiST subsystem doesn't call it --
it continues to call gist_box_consistent(). My very simple testcase
is:

CREATE TABLE test (key TEXT PRIMARY KEY, boundary BOX NOT NULL);
CREATE INDEX ON test USING gist (boundary);
INSERT INTO test VALUES ('a', '(2,2,5,5)'), ('b', '(4,4,8,8)'), ('c',
'(7,7,11,11)');
SELECT * FROM test WHERE boundary @> '(4,4)'::POINT;

Prior to my patch, this query is executed as a straightforward seqscan.

Once I add a new strategy to pg_amop.h:
+ DATA(insert ( 2593 603 600 7 s 433 783 0 ));

(603 is the BOX oid, 600 is the POINT oid, and 433 is the @> operator oid):
...the plan switches to an index scan and gist_box_consistent() is
called; at this point, the query fails to return the correct results.

But even after adding the new consistent proc to pg_proc.h:
+ DATA(insert OID = 8000 ( gist_boxpoint_consistent PGNSP PGUID 12
1 0 0 f f f t f i 5 0 16 "2281 600 23 26 2281" _null_ _null_ _null_
_null_ gist_boxpoint_consistent _null_ _null_ _null_ ));

And adding it as a new support function in pg_amproc.h:
+ DATA(insert ( 2593 603 600 1 8000 ));
+ DATA(insert ( 2593 603 600 2 2583 ));
+ DATA(insert ( 2593 603 600 3 2579 ));
+ DATA(insert ( 2593 603 600 4 2580 ));
+ DATA(insert ( 2593 603 600 5 2581 ));
+ DATA(insert ( 2593 603 600 6 2582 ));
+ DATA(insert ( 2593 603 600 7 2584 ));

...my gist_boxpoint_consistent() function still doesn't get called.

At this point I'm a bit lost -- while pg_amop.h has plenty of examples
of crosstype comparison operators for btree index methods, there are
none for GiST. Is GiST somehow a special case in this regard?

-Andrew

Attachment Content-Type Size
gist-box_ops-v2.patch text/x-patch 4.0 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Tipton <andrew(dot)t(dot)tipton(at)gmail(dot)com>
Cc: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch: add GiST support for BOX @> POINT queries
Date: 2011-06-17 13:59:40
Message-ID: 15507.1308319180@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Tipton <andrew(dot)t(dot)tipton(at)gmail(dot)com> writes:
> At this point I'm a bit lost -- while pg_amop.h has plenty of examples
> of crosstype comparison operators for btree index methods, there are
> none for GiST. Is GiST somehow a special case in this regard?

AFAIR, GIST doesn't use the concept of a crosstype opclass entry.
It only works with primary opclass entries. You have to set both
amproclefttype and amprocrighttype to the datatype of the indexable
column, regardless of what the other argument actually is.

(I think this implies that you can't have more than one consistent
function per opclass, which means you have to do whatever it is you
have in mind by patching the existing consistent function, not adding
another one alongside it.)

regards, tom lane


From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Andrew Tipton <andrew(dot)t(dot)tipton(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch: add GiST support for BOX @> POINT queries
Date: 2011-06-19 14:30:54
Message-ID: BANLkTikDchU7HbUzw-qAJtm2G3WWZrgPow@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2011/6/17 Andrew Tipton <andrew(dot)t(dot)tipton(at)gmail(dot)com>:
> On Fri, Jun 10, 2011 at 22:16, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> wrote:
>>
>> Isn't it worth adding new consistent function for those purposes? The
>> approach in the patch as stands looks kludge to me.
>
> Thanks for your review.  Coming back to this patch after a few months'
> time, I have to say it looks pretty hackish to my eyes as well. :)
>
> I've attempted to add a new consistent function,
> gist_boxpoint_consistent(), but the GiST subsystem doesn't call it --
> it continues to call gist_box_consistent().  My very simple testcase
> is:
>
> CREATE TABLE test (key TEXT PRIMARY KEY, boundary BOX NOT NULL);
> CREATE INDEX ON test USING gist (boundary);
> INSERT INTO test VALUES ('a', '(2,2,5,5)'), ('b', '(4,4,8,8)'), ('c',
> '(7,7,11,11)');
> SELECT * FROM test WHERE boundary @> '(4,4)'::POINT;
>
> Prior to my patch, this query is executed as a straightforward seqscan.
>
> Once I add a new strategy to pg_amop.h:
> + DATA(insert ( 2593   603 600 7 s  433 783 0 ));
>
> (603 is the BOX oid, 600 is the POINT oid, and 433 is the @> operator oid):
> ...the plan switches to an index scan and gist_box_consistent() is
> called;  at this point, the query fails to return the correct results.
>
> But even after adding the new consistent proc to pg_proc.h:
> + DATA(insert OID = 8000 (  gist_boxpoint_consistent    PGNSP PGUID 12
> 1 0 0 f f f t f i 5 0 16 "2281 600 23 26 2281" _null_ _null_ _null_
> _null_   gist_boxpoint_consistent _null_ _null_ _null_ ));
>
> And adding it as a new support function in pg_amproc.h:
> + DATA(insert ( 2593   603 600 1 8000 ));
> + DATA(insert ( 2593   603 600 2 2583 ));
> + DATA(insert ( 2593   603 600 3 2579 ));
> + DATA(insert ( 2593   603 600 4 2580 ));
> + DATA(insert ( 2593   603 600 5 2581 ));
> + DATA(insert ( 2593   603 600 6 2582 ));
> + DATA(insert ( 2593   603 600 7 2584 ));
>
> ...my gist_boxpoint_consistent() function still doesn't get called.
>
> At this point I'm a bit lost -- while pg_amop.h has plenty of examples
> of crosstype comparison operators for btree index methods, there are
> none for GiST.  Is GiST somehow a special case in this regard?

It was I that was lost. As Tom mentioned, GiST indexes have records in
pg_amop in their specialized way. I found gist_point_consistent has
some kind of hack for that and pg_amop for point_ops records have
multiple crosstype for that. So, if I understand correctly your first
approach modifying gist_box_consistent was the right way, although
trivial issues should be fixed. Also, you may want to follow point_ops
when you are worried if the counterpart operator of commutator should
be registered or not.

Looking around those mechanisms, it occurred to me that you mentioned
only box @> point. Why don't you add circly @> point, poly @> point as
well as box? Is that hard?

Regards,

--
Hitoshi Harada


From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Andrew Tipton <andrew(dot)t(dot)tipton(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch: add GiST support for BOX @> POINT queries
Date: 2011-07-09 14:29:49
Message-ID: CAP7Qgmkb91Dex3Gm6imngvtTYiSWgXJp01a+DaYs1u3w8UQ0CA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2011/6/19 Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>:
> 2011/6/17 Andrew Tipton <andrew(dot)t(dot)tipton(at)gmail(dot)com>:
>>
>> At this point I'm a bit lost -- while pg_amop.h has plenty of examples
>> of crosstype comparison operators for btree index methods, there are
>> none for GiST.  Is GiST somehow a special case in this regard?
>
> It was I that was lost. As Tom mentioned, GiST indexes have records in
> pg_amop in their specialized way. I found gist_point_consistent has
> some kind of hack for that and pg_amop for point_ops records have
> multiple crosstype for that. So, if I understand correctly your first
> approach modifying gist_box_consistent was the right way, although
> trivial issues should be fixed. Also, you may want to follow point_ops
> when you are worried if the counterpart operator of commutator should
> be registered or not.
>
> Looking around those mechanisms, it occurred to me that you mentioned
> only box @> point. Why don't you add circly @> point, poly @> point as
> well as box? Is that hard?
>

It looks like the time to wrap up. I marked "Return with Feedback" on
this patch, since response from author has not come for a while. You
may think the fix was pretty easy and the patch be small, but more
general approach was preferred, I guess. Looking forward to seeing it
in better shape next time!

Thanks,
--
Hitoshi Harada