Quick Links

Re: SP-GiST support for inet datatypes

Lists:	pgsql-hackers

From:	Emre Hasegeli <emre(at)hasegeli(dot)com>
To:	PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	SP-GiST support for inet datatypes
Date:	2016-03-02 20:56:12
Message-ID:	CAE2gYzxtth9qatW_OAqdOjykS0bxq7AYHLuyAQLPgT7H9ZU0Cw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Attached patches add SP-GiST support to the inet datatypes. The operator
class comes with a small change on the SP-GiST framework to allow fixed
number of child nodes.

The index is like prefix tree except that it doesn't bother to split the
addresses into parts as text is split. It also doesn't use labels to know
the part after the prefix, but relies on static node numbers.

The GiST index released with version 9.4 performs really bad with real
world data. SP-GiST works much better with the query posted to the
performance list [1] a while ago:

> hasegeli=# SELECT DISTINCT route INTO hmm FROM routes_view WHERE asn =
2914;
> SELECT 732
>
> hasegeli=# EXPLAIN ANALYZE SELECT routes.route FROM routes JOIN hmm ON
routes.route && hmm.route;
> QUERY PLAN
>
----------------------------------------------------------------------------------------------------------------------------------------
> Nested Loop (cost=0.41..571742.27 rows=2248 width=7) (actual
time=12.643..20474.813 rows=8127 loops=1)
> -> Seq Scan on hmm (cost=0.00..11.32 rows=732 width=7) (actual
time=0.017..0.524 rows=732 loops=1)
> -> Index Only Scan using route_gist on routes (cost=0.41..552.05
rows=22900 width=7) (actual time=4.851..27.948 rows=11 loops=732)
> Index Cond: (route && (hmm.route)::inet)
> Heap Fetches: 8127
> Planning time: 1.507 ms
> Execution time: 20475.605 ms
> (7 rows)
>
> hasegeli=# DROP INDEX route_gist;
> DROP INDEX
>
> hasegeli=# CREATE INDEX route_spgist ON routes USING spgist (route);
> CREATE INDEX
>
> hasegeli=# EXPLAIN ANALYZE SELECT routes.route FROM routes JOIN hmm ON
routes.route && hmm.route;
> QUERY PLAN
>
-----------------------------------------------------------------------------------------------------------------------------------------
> Nested Loop (cost=0.41..588634.27 rows=2248 width=7) (actual
time=0.081..16.961 rows=8127 loops=1)
> -> Seq Scan on hmm (cost=0.00..11.32 rows=732 width=7) (actual
time=0.022..0.079 rows=732 loops=1)
> -> Index Only Scan using route_spgist on routes (cost=0.41..575.13
rows=22900 width=7) (actual time=0.014..0.021 rows=11 loops=732)
> Index Cond: (route && (hmm.route)::inet)
> Heap Fetches: 8127
> Planning time: 1.376 ms
> Execution time: 15.936 ms

[1]
http://www.postgresql.org/message-id/flat/alpine(dot)DEB(dot)2(dot)11(dot)1508251504160(dot)31004(at)pyrite#alpine(dot)DEB(dot)2(dot)11(dot)1508251504160(dot)31004@pyrite

Attachment	Content-Type	Size
spgist-fixed-nnodes.patch	application/octet-stream	10.6 KB
inet-spgist-v1.patch	application/octet-stream	46.0 KB

From:	Oleg Bartunov <obartunov(at)gmail(dot)com>
To:	Emre Hasegeli <emre(at)hasegeli(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc:	PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SP-GiST support for inet datatypes
Date:	2016-03-03 05:51:45
Message-ID:	CAF4Au4wc2npN-_yxf-2v8UBBDZ_rSC7r=93b6XKRbqEmM4m=2A@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Mar 2, 2016 at 11:56 PM, Emre Hasegeli <emre(at)hasegeli(dot)com> wrote:

> Attached patches add SP-GiST support to the inet datatypes. The operator
> class comes with a small change on the SP-GiST framework to allow fixed
> number of child nodes.
>
> The index is like prefix tree except that it doesn't bother to split the
> addresses into parts as text is split. It also doesn't use labels to know
> the part after the prefix, but relies on static node numbers.
>
>
Thanks, Emre for interesting spgist. We are bit busy and will take a look
on your patches when come to our spgist patch.

> The GiST index released with version 9.4 performs really bad with real
> world data. SP-GiST works much better with the query posted to the
> performance list [1] a while ago:
>
> > hasegeli=# SELECT DISTINCT route INTO hmm FROM routes_view WHERE asn =
> 2914;
> > SELECT 732
> >
> > hasegeli=# EXPLAIN ANALYZE SELECT routes.route FROM routes JOIN hmm ON
> routes.route && hmm.route;
> > QUERY PLAN
> >
> ----------------------------------------------------------------------------------------------------------------------------------------
> > Nested Loop (cost=0.41..571742.27 rows=2248 width=7) (actual
> time=12.643..20474.813 rows=8127 loops=1)
> > -> Seq Scan on hmm (cost=0.00..11.32 rows=732 width=7) (actual
> time=0.017..0.524 rows=732 loops=1)
> > -> Index Only Scan using route_gist on routes (cost=0.41..552.05
> rows=22900 width=7) (actual time=4.851..27.948 rows=11 loops=732)
> > Index Cond: (route && (hmm.route)::inet)
> > Heap Fetches: 8127
> > Planning time: 1.507 ms
> > Execution time: 20475.605 ms
> > (7 rows)
> >
> > hasegeli=# DROP INDEX route_gist;
> > DROP INDEX
> >
> > hasegeli=# CREATE INDEX route_spgist ON routes USING spgist (route);
> > CREATE INDEX
> >
> > hasegeli=# EXPLAIN ANALYZE SELECT routes.route FROM routes JOIN hmm ON
> routes.route && hmm.route;
> > QUERY PLAN
> >
> -----------------------------------------------------------------------------------------------------------------------------------------
> > Nested Loop (cost=0.41..588634.27 rows=2248 width=7) (actual
> time=0.081..16.961 rows=8127 loops=1)
> > -> Seq Scan on hmm (cost=0.00..11.32 rows=732 width=7) (actual
> time=0.022..0.079 rows=732 loops=1)
> > -> Index Only Scan using route_spgist on routes (cost=0.41..575.13
> rows=22900 width=7) (actual time=0.014..0.021 rows=11 loops=732)
> > Index Cond: (route && (hmm.route)::inet)
> > Heap Fetches: 8127
> > Planning time: 1.376 ms
> > Execution time: 15.936 ms
>
> [1]
> http://www.postgresql.org/message-id/flat/alpine(dot)DEB(dot)2(dot)11(dot)1508251504160(dot)31004(at)pyrite#alpine(dot)DEB(dot)2(dot)11(dot)1508251504160(dot)31004@pyrite
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>

From:	Oleg Bartunov <obartunov(at)gmail(dot)com>
To:	Emre Hasegeli <emre(at)hasegeli(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc:	PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SP-GiST support for inet datatypes
Date:	2016-03-03 05:55:05
Message-ID:	CAF4Au4wScXgOAwPVasD3TmkmpyEa=hKj52BVH-8U-3qn9AJ-nw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, Mar 3, 2016 at 8:51 AM, Oleg Bartunov <obartunov(at)gmail(dot)com> wrote:

>
>
> On Wed, Mar 2, 2016 at 11:56 PM, Emre Hasegeli <emre(at)hasegeli(dot)com> wrote:
>
>> Attached patches add SP-GiST support to the inet datatypes. The
>> operator class comes with a small change on the SP-GiST framework to allow
>> fixed number of child nodes.
>>
>> The index is like prefix tree except that it doesn't bother to split the
>> addresses into parts as text is split. It also doesn't use labels to know
>> the part after the prefix, but relies on static node numbers.
>>
>>
> Thanks, Emre for interesting spgist. We are bit busy and will take a look
> on your patches when come to our spgist patch.
>
>

Emre, I checked original thread and didn't find sample data. Could you
provide them for testing ?

> The GiST index released with version 9.4 performs really bad with real
>> world data. SP-GiST works much better with the query posted to the
>> performance list [1] a while ago:
>>
>> > hasegeli=# SELECT DISTINCT route INTO hmm FROM routes_view WHERE asn =
>> 2914;
>> > SELECT 732
>> >
>> > hasegeli=# EXPLAIN ANALYZE SELECT routes.route FROM routes JOIN hmm ON
>> routes.route && hmm.route;
>> > QUERY
>> PLAN
>> >
>> ----------------------------------------------------------------------------------------------------------------------------------------
>> > Nested Loop (cost=0.41..571742.27 rows=2248 width=7) (actual
>> time=12.643..20474.813 rows=8127 loops=1)
>> > -> Seq Scan on hmm (cost=0.00..11.32 rows=732 width=7) (actual
>> time=0.017..0.524 rows=732 loops=1)
>> > -> Index Only Scan using route_gist on routes (cost=0.41..552.05
>> rows=22900 width=7) (actual time=4.851..27.948 rows=11 loops=732)
>> > Index Cond: (route && (hmm.route)::inet)
>> > Heap Fetches: 8127
>> > Planning time: 1.507 ms
>> > Execution time: 20475.605 ms
>> > (7 rows)
>> >
>> > hasegeli=# DROP INDEX route_gist;
>> > DROP INDEX
>> >
>> > hasegeli=# CREATE INDEX route_spgist ON routes USING spgist (route);
>> > CREATE INDEX
>> >
>> > hasegeli=# EXPLAIN ANALYZE SELECT routes.route FROM routes JOIN hmm ON
>> routes.route && hmm.route;
>> > QUERY PLAN
>> >
>> -----------------------------------------------------------------------------------------------------------------------------------------
>> > Nested Loop (cost=0.41..588634.27 rows=2248 width=7) (actual
>> time=0.081..16.961 rows=8127 loops=1)
>> > -> Seq Scan on hmm (cost=0.00..11.32 rows=732 width=7) (actual
>> time=0.022..0.079 rows=732 loops=1)
>> > -> Index Only Scan using route_spgist on routes (cost=0.41..575.13
>> rows=22900 width=7) (actual time=0.014..0.021 rows=11 loops=732)
>> > Index Cond: (route && (hmm.route)::inet)
>> > Heap Fetches: 8127
>> > Planning time: 1.376 ms
>> > Execution time: 15.936 ms
>>
>> [1]
>> http://www.postgresql.org/message-id/flat/alpine(dot)DEB(dot)2(dot)11(dot)1508251504160(dot)31004(at)pyrite#alpine(dot)DEB(dot)2(dot)11(dot)1508251504160(dot)31004@pyrite
>>
>>
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>>
>>
>

From:	Emre Hasegeli <emre(at)hasegeli(dot)com>
To:	Oleg Bartunov <obartunov(at)gmail(dot)com>
Cc:	Teodor Sigaev <teodor(at)sigaev(dot)ru>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SP-GiST support for inet datatypes
Date:	2016-03-03 10:45:19
Message-ID:	CAE2gYzxv8YKEd4O+9HUYuQ=QMH4pwt9n9cmU-OchV-=N8Q7yXQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

> Emre, I checked original thread and didn't find sample data. Could you provide them for testing ?

I found it on the Git history:

https://github.com/job/irrexplorer/blob/9e8b5330d7ef0022abbe1af18291257e044eb24b/data/irrexplorer_dump.sql.gz?raw=true

From:	Oleg Bartunov <obartunov(at)gmail(dot)com>
To:	Emre Hasegeli <emre(at)hasegeli(dot)com>
Cc:	Teodor Sigaev <teodor(at)sigaev(dot)ru>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SP-GiST support for inet datatypes
Date:	2016-03-08 22:17:05
Message-ID:	CAF4Au4x1dXVkVLdu88Z8wSOfGvKpS1oA1V_TVAiJ0iTYn_6EDQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, Mar 3, 2016 at 11:45 AM, Emre Hasegeli <emre(at)hasegeli(dot)com> wrote:

> > Emre, I checked original thread and didn't find sample data. Could you
> provide them for testing ?
>
> I found it on the Git history:
>
>
> https://github.com/job/irrexplorer/blob/9e8b5330d7ef0022abbe1af18291257e044eb24b/data/irrexplorer_dump.sql.gz?raw=true
>

Thanks !

spgist index creates 2 times faster than gist, but index size is
noticeably bugger

Spgist index tree is much better than gist - 12149 pages vs 1334760 !

EXPLAIN (ANALYZE, buffers) SELECT routes.route FROM routes JOIN hmm ON
routes.route && hmm.route;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.41..570430.27 rows=2338 width=7) (actual
time=5.730..12085.747 rows=8127 loops=1)
Buffers: shared hit=1334760
-> Seq Scan on hmm (cost=0.00..11.32 rows=732 width=7) (actual
time=0.013..0.528 rows=732 loops=1)
Buffers: shared hit=4
-> Index Only Scan using route_gist on routes (cost=0.41..550.26
rows=22900 width=7) (actual time=2.491..16.503 rows=11 loops=732)
Index Cond: (route && (hmm.route)::inet)
Heap Fetches: 8127
Buffers: shared hit=1334756
Planning time: 0.827 ms
Execution time: 12086.513 ms
(10 rows)

EXPLAIN (ANALYZE, buffers) SELECT routes.route FROM routes JOIN hmm ON
routes.route && hmm.route;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.41..588634.27 rows=2338 width=7) (actual
time=0.043..12.150 rows=8127 loops=1)
Buffers: shared hit=12149
-> Seq Scan on hmm (cost=0.00..11.32 rows=732 width=7) (actual
time=0.013..0.075 rows=732 loops=1)
Buffers: shared hit=4
-> Index Only Scan using route_spgist on routes (cost=0.41..575.13
rows=22900 width=7) (actual time=0.011..0.015 rows=11 loops=732)
Index Cond: (route && (hmm.route)::inet)
Heap Fetches: 8127
Buffers: shared hit=12145
Planning time: 0.779 ms
Execution time: 12.603 ms
(10 rows)

From:	Oleg Bartunov <obartunov(at)gmail(dot)com>
To:	Emre Hasegeli <emre(at)hasegeli(dot)com>
Cc:	Teodor Sigaev <teodor(at)sigaev(dot)ru>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SP-GiST support for inet datatypes
Date:	2016-03-08 22:30:10
Message-ID:	CAF4Au4zWu0-Jpx1zXoe0E8RyFtyywaA1RWM3tGsDtDqSqUcZow@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Mar 8, 2016 at 11:17 PM, Oleg Bartunov <obartunov(at)gmail(dot)com> wrote:

>
>
> On Thu, Mar 3, 2016 at 11:45 AM, Emre Hasegeli <emre(at)hasegeli(dot)com> wrote:
>
>> > Emre, I checked original thread and didn't find sample data. Could you
>> provide them for testing ?
>>
>> I found it on the Git history:
>>
>>
>> https://github.com/job/irrexplorer/blob/9e8b5330d7ef0022abbe1af18291257e044eb24b/data/irrexplorer_dump.sql.gz?raw=true
>>
>
> Thanks !
>
> spgist index creates 2 times faster than gist, but index size is
> noticeably bugger
>
> \di+ route_*
> List of relations
> Schema | Name | Type | Owner | Table | Size | Description
> --------+--------------+-------+----------+--------+--------+-------------
> public | route_gist | index | postgres | routes | 96 MB |
> public | route_spgist | index | postgres | routes | 132 MB |
> (2 rows)
>
> Spgist index tree is much better than gist - 12149 pages vs 1334760 !
>

I also noticed, that spgist is much faster than gist for other inet
operators. I'd like to see in 9.6.

>
>
>
> EXPLAIN (ANALYZE, buffers) SELECT routes.route FROM routes JOIN hmm ON
> routes.route && hmm.route;
> QUERY PLAN
>
> ----------------------------------------------------------------------------------------------------------------------------------------
> Nested Loop (cost=0.41..570430.27 rows=2338 width=7) (actual
> time=5.730..12085.747 rows=8127 loops=1)
> Buffers: shared hit=1334760
> -> Seq Scan on hmm (cost=0.00..11.32 rows=732 width=7) (actual
> time=0.013..0.528 rows=732 loops=1)
> Buffers: shared hit=4
> -> Index Only Scan using route_gist on routes (cost=0.41..550.26
> rows=22900 width=7) (actual time=2.491..16.503 rows=11 loops=732)
> Index Cond: (route && (hmm.route)::inet)
> Heap Fetches: 8127
> Buffers: shared hit=1334756
> Planning time: 0.827 ms
> Execution time: 12086.513 ms
> (10 rows)
>
> EXPLAIN (ANALYZE, buffers) SELECT routes.route FROM routes JOIN hmm ON
> routes.route && hmm.route;
> QUERY PLAN
>
> -----------------------------------------------------------------------------------------------------------------------------------------
> Nested Loop (cost=0.41..588634.27 rows=2338 width=7) (actual
> time=0.043..12.150 rows=8127 loops=1)
> Buffers: shared hit=12149
> -> Seq Scan on hmm (cost=0.00..11.32 rows=732 width=7) (actual
> time=0.013..0.075 rows=732 loops=1)
> Buffers: shared hit=4
> -> Index Only Scan using route_spgist on routes (cost=0.41..575.13
> rows=22900 width=7) (actual time=0.011..0.015 rows=11 loops=732)
> Index Cond: (route && (hmm.route)::inet)
> Heap Fetches: 8127
> Buffers: shared hit=12145
> Planning time: 0.779 ms
> Execution time: 12.603 ms
> (10 rows)
>
>
>
>
>

From:	Emre Hasegeli <emre(at)hasegeli(dot)com>
To:	Oleg Bartunov <obartunov(at)gmail(dot)com>
Cc:	Teodor Sigaev <teodor(at)sigaev(dot)ru>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SP-GiST support for inet datatypes
Date:	2016-03-09 09:00:17
Message-ID:	CAE2gYzxcmz_o9SnGBv2De3gK2ECodxHdsDctntLkMSLsswuYDA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

>> Spgist index tree is much better than gist - 12149 pages vs 1334760 !

I assume this is the reason why it is bigger. IP addresses are very
well suited to SP-GiST. They naturally do not overlap.

> I also noticed, that spgist is much faster than gist for other inet
> operators. I'd like to see in 9.6.

Unfortunately, I missed the deadline of the last commitfest. It is on
the next one:

https://commitfest.postgresql.org/10/571/

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	emre(at)hasegeli(dot)com
Cc:	PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SP-GiST support for inet datatypes
Date:	2016-08-20 21:52:38
Message-ID:	4781.1471729958@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Emre Hasegeli <emre(at)hasegeli(dot)com> writes:
> Attached patches add SP-GiST support to the inet datatypes.

I started to look at this patch. The reported speedup is pretty nice,
but ...

> The operator
> class comes with a small change on the SP-GiST framework to allow fixed
> number of child nodes.

... this part of the patch breaks the existing API for SP-GiST opclasses.
That is a hard sell. It may only matter for one existing opclass in core,
but unless we have reason to think nobody is using any custom SP-GiST
opclasses, that is not a pleasant thing to do. How important is it really
for this opclass? Several of the existing opclasses use fixed numbers of
child nodes, so why does this need something they don't?

regards, tom lane

From:	Emre Hasegeli <emre(at)hasegeli(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SP-GiST support for inet datatypes
Date:	2016-08-21 08:41:50
Message-ID:	CAE2gYzz4ijLDW51LQZycxc3s_jryEWVx5YLg2Jqfqxvt3euFcA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

> ... this part of the patch breaks the existing API for SP-GiST opclasses.
> That is a hard sell. It may only matter for one existing opclass in core,
> but unless we have reason to think nobody is using any custom SP-GiST
> opclasses, that is not a pleasant thing to do. How important is it really
> for this opclass? Several of the existing opclasses use fixed numbers of
> child nodes, so why does this need something they don't?

This addition is required to implement the operator class cleanly. We
could store the id of the child node as a "label", and add nodes when
labels are missing, but this complicates all operator class functions.
Other designs are also possible using the labels, but a simple design
with fixed number of nodes worked best for me.

Currently, SP-GiST framework supports fixed number of child nodes, if
the index is growing by page splits, not by prefix splits. This is
not a fair API. I assume it is designed by only having the prefix
tree in mind. The change makes SP-GiST more reusable.

SP-GiST is not widely adopted in my experience. Breaking this part of
the API would affect prefix tree implementations. I don't think there
are much of them. Supporting the new interface is easy for them. We
can try to support the old interface by side of the new one, but this
wouldn't be very clean either.

I thought we were breaking the C interface in similar ways on major releases.

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	emre(at)hasegeli(dot)com
Cc:	PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject:	Re: SP-GiST support for inet datatypes
Date:	2016-08-21 16:39:14
Message-ID:	17074.1471797554@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Emre Hasegeli <emre(at)hasegeli(dot)com> writes:
>> ... Several of the existing opclasses use fixed numbers of
>> child nodes, so why does this need something they don't?

> Currently, SP-GiST framework supports fixed number of child nodes, if
> the index is growing by page splits, not by prefix splits. This is
> not a fair API.

Hm, I see your point: the spgSplitTuple action only supports forming a
new upper tuple with a single, labeled child node. That is pretty bogus.
You could imagine doing repeated spgAddNode actions to enlarge the new
upper tuple; but that's ridiculously inefficient, and it's also unclear
how you'd avoid confusing searches that descend through a partially-split
upper tuple (either concurrently, or after a crash that prevented
finishing the split).

> SP-GiST is not widely adopted in my experience. Breaking this part of
> the API would affect prefix tree implementations. I don't think there
> are much of them. Supporting the new interface is easy for them. We
> can try to support the old interface by side of the new one, but this
> wouldn't be very clean either.

We could imagine defining a "spgSplitTupleMulti" action alongside the
existing one, but I agree it's a bit of a wart --- nobody would have
designed it that way in a green field. On the other hand, index AM-to-
opclass APIs are things we don't like to break. We went out of our way
to make past GiST and GIN API changes upward-compatible.

Oleg, Teodor, do you have any idea how many people are using custom
SPGiST opclasses that might be affected by an API break here?

regards, tom lane

From:	Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	emre(at)hasegeli(dot)com, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SP-GiST support for inet datatypes
Date:	2016-08-21 21:37:36
Message-ID:	87fupxhkti.fsf@news-spur.riddles.org.uk
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Emre Hasegeli <emre(at)hasegeli(dot)com> writes:
>> Attached patches add SP-GiST support to the inet datatypes.

Tom> I started to look at this patch. The reported speedup is pretty
Tom> nice, but ...

The builtin gist support for inet seems quite surprisingly slow; ip4r
beats it into the ground without difficulty. (I'd be curious how well
this sp-gist version stacks up against ip4r.)

--
Andrew (irc:RhodiumToad)

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc:	emre(at)hasegeli(dot)com, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SP-GiST support for inet datatypes
Date:	2016-08-22 00:00:42
Message-ID:	12344.1471824042@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Tom> I started to look at this patch. The reported speedup is pretty
> Tom> nice, but ...

> The builtin gist support for inet seems quite surprisingly slow; ip4r
> beats it into the ground without difficulty. (I'd be curious how well
> this sp-gist version stacks up against ip4r.)

Yeah ... what I find interesting about this patch is that the opclass's
data splitting rules don't seem all that much different from what the
existing gist opclass knows how to do. I suspect that the speed
differential arises because the gist opclass's penalty and picksplit
algorithms are somehow extremely stupid, which hints that maybe they
could be made better. In particular, IIRC the test cases that we looked
at when making the gist opclass tended to have only one or a few netmask
lengths, whereas in this data set the netmasks are all over the place.
I theorize that that's what's breaking the gist algorithm. Not sure
exactly how to make it better though.

This example also seems to be showing up some lack of intelligence in
SPGiST itself, in that there are an awful lot of pages with a lot of
empty space on them. Not sure why.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	emre(at)hasegeli(dot)com
Cc:	PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SP-GiST support for inet datatypes
Date:	2016-08-23 19:20:05
Message-ID:	5488.1471980005@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

I've pushed this patch with mostly (not entirely) cosmetic adjustments.
I still think it'd be worth looking into why the produced index is larger
than the GiST equivalent, and for that matter exactly why the GiST
equivalent is so much slower to search.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	emre(at)hasegeli(dot)com
Cc:	PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SP-GiST support for inet datatypes
Date:	2016-08-24 14:11:07
Message-ID:	15201.1472047867@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

I wrote:
> I've pushed this patch with mostly (not entirely) cosmetic adjustments.
> I still think it'd be worth looking into why the produced index is larger
> than the GiST equivalent, and for that matter exactly why the GiST
> equivalent is so much slower to search.

I spent some time poking at this, and noticed that although the picksplit
rule is guaranteed to produce a non-degenerate split unless the inputs
are all identical, it's entirely capable of producing a very bad split.
Here's some instrumentation printout showing the numbers of items assigned
to the four subnodes, for the test case we've been working with:

58 inet picksplit (227 tuples): 0 0 127 100
65 inet picksplit (227 tuples): 0 0 223 4
69 inet picksplit (225 tuples): 2 0 126 97
69 inet picksplit (227 tuples): 1 0 226 0
72 inet picksplit (227 tuples): 0 0 224 3
72 inet picksplit (227 tuples): 1 0 132 94
82 inet picksplit (226 tuples): 1 0 127 98
86 inet picksplit (227 tuples): 0 0 130 97
95 inet picksplit (227 tuples): 0 0 2 225
99 inet picksplit (227 tuples): 0 0 225 2
117 inet picksplit (227 tuples): 2 0 126 99
118 inet picksplit (227 tuples): 0 0 128 99
137 inet picksplit (227 tuples): 0 0 226 1
151 inet picksplit (227 tuples): 0 0 1 226
270 inet picksplit (227 tuples): 1 0 127 99
499 inet picksplit (122 tuples): 0 0 64 58

(This is from "sort | uniq -c" output, so the first column is the number
of occurrences of identical split counts.)

Aside from the disturbing frequency of 100-to-1 split ratios, it also
looks like the inclusion of the masklen bit is hardly pulling its weight,
though that might be a artifact of this data set.

The bad splits seem to be a direct contributor to the index's relatively
poor space efficiency; they lead to SPGiST having to move nearly all of
a long leaf chain to another page, and then soon having to split it again,
resulting in another mostly-empty page, lather rinse repeat. They can't
be very helpful for search speed either.

I think that it'd be worth investigating changing to a split rule that
uses the next two or three or four bits of the address, not just the
single next bit, to index into more than four subnodes. It's pretty clear
how to adjust the insertion functions to support that, and a crude hack in
that line suggested that the index does get noticeably smaller. However,
I didn't quite grok how to adjust the search (consistent) functions.
Obviously we could apply the same rules in a loop, considering each
successive address bit, but there may be a better way.

Even though I already committed the code, we're a very long way from
v10 release, so I see no reason not to consider incompatible changes
in the way this opclass works.

I also noticed that the index build wastes some space because SPGIST
remembers only one partly-full page within each of its page categories.
A prototype patch to enlarge the size of that cache helped a good bit
too. I'll clean that up and post it later, but I was hoping you'd
work on improving this opclass's split rules.

regards, tom lane

From:	Emre Hasegeli <emre(at)hasegeli(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SP-GiST support for inet datatypes
Date:	2016-08-24 17:32:36
Message-ID:	CAE2gYzz264-9s3coGAc1Rv8bkN4vAoEALuY0rBMx+NJxZX=+2A@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

> Aside from the disturbing frequency of 100-to-1 split ratios, it also
> looks like the inclusion of the masklen bit is hardly pulling its weight,
> though that might be a artifact of this data set.

I was expecting this. Including masklen bit to decision was something
we could easily do. It doesn't make the index more complicated, even
more simpler. I think it would be useful, when host addresses with
masklen are indexed. IRRExplorer dataset is just networks.

> I think that it'd be worth investigating changing to a split rule that
> uses the next two or three or four bits of the address, not just the
> single next bit, to index into more than four subnodes. It's pretty clear
> how to adjust the insertion functions to support that, and a crude hack in
> that line suggested that the index does get noticeably smaller. However,
> I didn't quite grok how to adjust the search (consistent) functions.
> Obviously we could apply the same rules in a loop, considering each
> successive address bit, but there may be a better way.

I have experimented with such designs before posting this patch, but
couldn't make any of them work. It gets very complicated when more
than one bit is used. When only 2 bits are used, there would be 12
child nodes:

* network bits 00
* network bits 01
* network bits 10
* network bits 11
* network bit 0 and host bit 0
* network bit 0 and host bit 1
* network bit 1 and host bit 0
* network bit 1 and host bit 1
* host bits 00
* host bits 01
* host bits 10
* host bits 11

Another design would be a prefix tree. We could split by using a
byte, and store that byte as label. Then there wouldn't be static
number of nodes. We would need to iterate trough the labels and
re-execute the condition on all of them. Surely this would perform
better for some working sets, but I don't think on all them. I will
experiment with this, if I get the chance.

I belive the current index is useful as it is. The wasted space
depends on the distribution of the keys:

> # create table a3 as select (trunc(random() * 256)::text || '.' || trunc(random() * 8)::text || '.0.0')::inet from generate_series(1, 1000000) as i;
> SELECT 1000000
> # create table a4 as select (trunc(random() * 256)::text || '.' || trunc(random() * 16)::text || '.0.0')::inet from generate_series(1, 1000000) as i;
> SELECT 1000000
> # create table a5 as select (trunc(random() * 256)::text || '.' || trunc(random() * 32)::text || '.0.0')::inet from generate_series(1, 1000000) as i;
> SELECT 1000000
> # create table a6 as select (trunc(random() * 256)::text || '.' || trunc(random() * 64)::text || '.0.0')::inet from generate_series(1, 1000000) as i;
> SELECT 1000000
> # create index on a3 using spgist (inet);
> CREATE INDEX
> # create index on a4 using spgist (inet);
> CREATE INDEX
> # create index on a5 using spgist (inet);
> CREATE INDEX
> # create index on a6 using spgist (inet);
> CREATE INDEX
> # \di+
> List of relations
> Schema | Name | Type | Owner | Table | Size | Description
> -------+-------------+-------+----------+-------+-------+-------------
> public | a3_inet_idx | index | hasegeli | a3 | 42 MB |
> public | a4_inet_idx | index | hasegeli | a4 | 46 MB |
> public | a5_inet_idx | index | hasegeli | a5 | 55 MB |
> public | a6_inet_idx | index | hasegeli | a6 | 56 MB |
> (4 rows)

It also gets better with the number of keys:

> # create table a6 as select (trunc(random() * 256)::text || '.' || trunc(random() * 64)::text || '.0.0')::inet from generate_series(1, 1000000) as i;
> SELECT 1000000
> # create table b6 as select (trunc(random() * 256)::text || '.' || trunc(random() * 64)::text || '.0.0')::inet from generate_series(1, 2000000) as i;
> SELECT 2000000
> # create table c6 as select (trunc(random() * 256)::text || '.' || trunc(random() * 64)::text || '.0.0')::inet from generate_series(1, 4000000) as i;
> SELECT 4000000
> # create table d6 as select (trunc(random() * 256)::text || '.' || trunc(random() * 64)::text || '.0.0')::inet from generate_series(1, 8000000) as i;
> SELECT 8000000
> # create index on a6 using spgist (inet);
> CREATE INDEX
> # create index on b6 using spgist (inet);
> CREATE INDEX
> # create index on c6 using spgist (inet);
> CREATE INDEX
> # create index on d6 using spgist (inet);
> CREATE INDEX
> # \di+
> List of relations
> Schema | Name | Type | Owner | Table | Size | Description
> -------+-------------+-------+----------+-------+--------+-------------
> public | a6_inet_idx | index | hasegeli | a6 | 56 MB |
> public | b6_inet_idx | index | hasegeli | b6 | 109 MB |
> public | c6_inet_idx | index | hasegeli | c6 | 181 MB |
> public | d6_inet_idx | index | hasegeli | a6 | 336 MB |
> (4 rows)

Thank you for committing this.