Re: btree_gin and ranges

Lists: pgsql-hackers
From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: btree_gin and ranges
Date: 2014-10-22 10:55:06
Message-ID: 54478D0A.9050309@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Suggested patch adds GIN support contains operator for ranges over scalar column.

It allows more effective GIN scan. Currently, queries like
SELECT * FROM test_int4 WHERE i <= 1 and i >= 1
will be excuted by GIN with two scans: one is from mines infinity to 1 and
another is from -1 to plus infinity. That's because GIN is "generalized" and it
doesn't know a semantics of operation.

With patch it's possible to rewrite query with ranges
SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range
and GIN index will support this query with single scan from -1 to 1.

Patch provides index support only for existing range types: int4, int8, numeric,
date and timestamp with and without time zone.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

Attachment Content-Type Size
btree_gin_range-1.patch.gz application/x-gzip 7.6 KB

From: Marti Raudsepp <marti(at)juffo(dot)org>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: btree_gin and ranges
Date: 2014-10-22 11:44:16
Message-ID: CABRT9RBLZUKmeUa1twyHZ5O0Jc5zuyv_GxLGH+048EF_1BOjAg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi

On Wed, Oct 22, 2014 at 1:55 PM, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:
> With patch it's possible to rewrite query with ranges
> SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range
> and GIN index will support this query with single scan from -1 to 1.

Shouldn't this be implemented in a more generic manner? An ordinary
btree index could very well support <@ queries too, but your patch
only adds this capability to btree-gin.

The documentation describes btree-gin as providing "GIN operator
classes that implement B-tree equivalent behavior", but now the
behavior diverges.

Regards,
Marti


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Marti Raudsepp <marti(at)juffo(dot)org>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: btree_gin and ranges
Date: 2014-10-22 12:01:15
Message-ID: 54479C8B.3080809@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Shouldn't this be implemented in a more generic manner? An ordinary
Unfortunately I don't see a way for that. GIN is generalized - and it doesn't
know semantic. Actually, we could fix first five strategy numbers for BTree's
strategies, and then we could teach GIN core to use BTRee semantic, but what
about with already existed operator classes?

>
> The documentation describes btree-gin as providing "GIN operator
> classes that implement B-tree equivalent behavior", but now the
> behavior diverges.

Anyway GIN couldn't be used for ORDER BY clause.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Marti Raudsepp <marti(at)juffo(dot)org>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: btree_gin and ranges
Date: 2014-10-24 13:09:35
Message-ID: 544A4F8F.5000801@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10/22/2014 03:01 PM, Teodor Sigaev wrote:
> Anyway GIN couldn't be used for ORDER BY clause.

which would also be nice to fix...

- Heikki


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Marti Raudsepp <marti(at)juffo(dot)org>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: btree_gin and ranges
Date: 2014-10-24 13:11:32
Message-ID: 544A5004.4010606@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10/22/2014 03:01 PM, Teodor Sigaev wrote:
> Anyway GIN couldn't be used for ORDER BY clause.

which would also be nice to fix...

- Heikki


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Marti Raudsepp <marti(at)juffo(dot)org>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: btree_gin and ranges
Date: 2014-10-24 15:27:28
Message-ID: 544A6FE0.3070305@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> which would also be nice to fix..

Of course, agree. With rbtree usage instead of tidbitmap hash and semantic
knowledge about operators in GIN...

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: btree_gin and ranges
Date: 2014-12-17 19:13:40
Message-ID: 5491D5E4.10905@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10/22/2014 01:55 PM, Teodor Sigaev wrote:
> Suggested patch adds GIN support contains operator for ranges over scalar column.
>
> It allows more effective GIN scan. Currently, queries like
> SELECT * FROM test_int4 WHERE i <= 1 and i >= 1
> will be excuted by GIN with two scans: one is from mines infinity to 1 and
> another is from -1 to plus infinity. That's because GIN is "generalized" and it
> doesn't know a semantics of operation.
>
> With patch it's possible to rewrite query with ranges
> SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range
> and GIN index will support this query with single scan from -1 to 1.
>
> Patch provides index support only for existing range types: int4, int8, numeric,
> date and timestamp with and without time zone.

I started to look at this, but very quickly got carried away into
refactoring away the macros. Defining long functions as macros makes
debugging quite difficult, and it's not nice to read or edit the source
code either.

I propose the attached refactoring (it doesn't include your range-patch
yet, just refactoring the existing code). It turns the bulk of the
macros into static functions. GIN_SUPPORT macro still defines the
datatype-specific functions, but they are now very thin wrappers that
just call the corresponding generic static functions.

It's annoying that we need the concept of a leftmost value, for the <
and <= queries. Without that, we could have the support functions look
up the required datatype information from the type cache, based on the
datatype of the passed argument. Then you could easily use the btree_gin
support functions also for user-defined datatypes. But that needs bigger
changes, and this is a step in the right direction anyway.

- Heikki

Attachment Content-Type Size
0001-Turn-much-of-the-btree_gin-macros-into-real-function.patch text/x-diff 15.9 KB

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: btree_gin and ranges
Date: 2014-12-22 01:15:13
Message-ID: CAB7nPqTAUbtMCGYJzt=hBK9GMnY0wBW6Ez-DvEMdvqKnQar61w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Dec 18, 2014 at 4:13 AM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> On 10/22/2014 01:55 PM, Teodor Sigaev wrote:
>>
>> Suggested patch adds GIN support contains operator for ranges over scalar
>> column.
>>
>> It allows more effective GIN scan. Currently, queries like
>> SELECT * FROM test_int4 WHERE i <= 1 and i >= 1
>> will be excuted by GIN with two scans: one is from mines infinity to 1 and
>> another is from -1 to plus infinity. That's because GIN is "generalized"
>> and it
>> doesn't know a semantics of operation.
>>
>> With patch it's possible to rewrite query with ranges
>> SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range
>> and GIN index will support this query with single scan from -1 to 1.
>>
>> Patch provides index support only for existing range types: int4, int8,
>> numeric,
>> date and timestamp with and without time zone.
>
>
> I started to look at this, but very quickly got carried away into
> refactoring away the macros. Defining long functions as macros makes
> debugging quite difficult, and it's not nice to read or edit the source code
> either.
>
> I propose the attached refactoring (it doesn't include your range-patch yet,
> just refactoring the existing code). It turns the bulk of the macros into
> static functions. GIN_SUPPORT macro still defines the datatype-specific
> functions, but they are now very thin wrappers that just call the
> corresponding generic static functions.
>
> It's annoying that we need the concept of a leftmost value, for the < and <=
> queries. Without that, we could have the support functions look up the
> required datatype information from the type cache, based on the datatype of
> the passed argument. Then you could easily use the btree_gin support
> functions also for user-defined datatypes. But that needs bigger changes,
> and this is a step in the right direction anyway.
I just had a look at the refactoring patch and ISTM that this is a
good step forward in terms of readability. Teodor, I am noticing that
your patch cannot apply once the refactoring is done. Could you rebase
your patch once the refactoring is pushed?s
--
Michael


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: btree_gin and ranges
Date: 2014-12-22 15:46:58
Message-ID: 54983CF2.80605@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/22/2014 03:15 AM, Michael Paquier wrote:
> On Thu, Dec 18, 2014 at 4:13 AM, Heikki Linnakangas
> <hlinnakangas(at)vmware(dot)com> wrote:
>> On 10/22/2014 01:55 PM, Teodor Sigaev wrote:
>>>
>>> Suggested patch adds GIN support contains operator for ranges over scalar
>>> column.
>>>
>>> It allows more effective GIN scan. Currently, queries like
>>> SELECT * FROM test_int4 WHERE i <= 1 and i >= 1
>>> will be excuted by GIN with two scans: one is from mines infinity to 1 and
>>> another is from -1 to plus infinity. That's because GIN is "generalized"
>>> and it
>>> doesn't know a semantics of operation.
>>>
>>> With patch it's possible to rewrite query with ranges
>>> SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range
>>> and GIN index will support this query with single scan from -1 to 1.
>>>
>>> Patch provides index support only for existing range types: int4, int8,
>>> numeric,
>>> date and timestamp with and without time zone.
>>
>>
>> I started to look at this, but very quickly got carried away into
>> refactoring away the macros. Defining long functions as macros makes
>> debugging quite difficult, and it's not nice to read or edit the source code
>> either.
>>
>> I propose the attached refactoring (it doesn't include your range-patch yet,
>> just refactoring the existing code). It turns the bulk of the macros into
>> static functions. GIN_SUPPORT macro still defines the datatype-specific
>> functions, but they are now very thin wrappers that just call the
>> corresponding generic static functions.
>>
>> It's annoying that we need the concept of a leftmost value, for the < and <=
>> queries. Without that, we could have the support functions look up the
>> required datatype information from the type cache, based on the datatype of
>> the passed argument. Then you could easily use the btree_gin support
>> functions also for user-defined datatypes. But that needs bigger changes,
>> and this is a step in the right direction anyway.
> I just had a look at the refactoring patch and ISTM that this is a
> good step forward in terms of readability. Teodor, I am noticing that
> your patch cannot apply once the refactoring is done. Could you rebase
> your patch once the refactoring is pushed?s

I've pushed the refactoring patch. Here's a version of Teodor's patch,
rebased over the pushed patch.

Teodor's patch could use some more comments. The
STOP_SCAN/MATCH_SCAN/CONT_SCAN macros are a good idea, but they probably
should go into src/include/access/gin.h so that they can be used in all
compare_partial implementations.

- Heikki

Attachment Content-Type Size
btree_gin_range-2.patch text/x-diff 76.5 KB

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: btree_gin and ranges
Date: 2014-12-26 10:33:26
Message-ID: 549D3976.70606@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Teodor's patch could use some more comments. The STOP_SCAN/MATCH_SCAN/CONT_SCAN
> macros are a good idea, but they probably should go into
> src/include/access/gin.h so that they can be used in all compare_partial
> implementations.

STOP_SCAN/MATCH_SCAN/CONT_SCAN macros are moved to gin's header, and comments
are improved.

Split patch to two: gin and module

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

Attachment Content-Type Size
gin_macros.patch.gz application/x-gzip 1.7 KB
btree_gin_range-3.patch.gz application/x-gzip 5.4 KB

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: btree_gin and ranges
Date: 2015-03-20 03:20:23
Message-ID: 20150320032023.GE6317@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Dec 26, 2014 at 01:33:26PM +0300, Teodor Sigaev wrote:
> >Teodor's patch could use some more comments. The STOP_SCAN/MATCH_SCAN/CONT_SCAN
> >macros are a good idea, but they probably should go into
> >src/include/access/gin.h so that they can be used in all compare_partial
> >implementations.
>
> STOP_SCAN/MATCH_SCAN/CONT_SCAN macros are moved to gin's header, and
> comments are improved.
>
> Split patch to two: gin and module

Where are we on this patch?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: btree_gin and ranges
Date: 2015-03-20 10:55:02
Message-ID: 20150320105502.GP3636@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Teodor Sigaev wrote:
> >Teodor's patch could use some more comments. The STOP_SCAN/MATCH_SCAN/CONT_SCAN
> >macros are a good idea, but they probably should go into
> >src/include/access/gin.h so that they can be used in all compare_partial
> >implementations.
>
> STOP_SCAN/MATCH_SCAN/CONT_SCAN macros are moved to gin's header, and
> comments are improved.
>
> Split patch to two: gin and module

Here you forgot to "git add" the two .sql files for the extension. They
are present in the patch Heikki posted upthread but not here.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services