Re: compress method for spgist - 2

Lists: pgsql-hackers
From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: compress method for spgist - 2
Date: 2014-12-16 17:48:25
Message-ID: 54907069.1030506@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> For some datatypes, the compress method might be useful even if the leaf
> type is the same as the column type. For example, you could allow
> indexing text datums larger than the page size, with a compress function
> that just truncates the input.

Agree, and patch allows to use compress method in this case, see begining of
spgdoinsert()

> Could you find some use for this in one of the built-in or contrib
> types? Just to have something that exercises it as part of the
> regression suite. How about creating an opclass for the built-in polygon
> type that stores the bounding box, like the PostGIS guys are doing?

Will try, but I don't have nice idea. Polygon opclass will have awful
performance until PostGIS guys show the tree structure.

> The documentation needs to be updated.
Added.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

Attachment Content-Type Size
spgist_compress_method-3.patch.gz application/x-gzip 2.9 KB

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: compress method for spgist - 2
Date: 2014-12-23 09:48:38
Message-ID: 54993A76.7050308@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/16/2014 07:48 PM, Teodor Sigaev wrote:
> /*
> * This struct is what we actually keep in index->rd_amcache. It includes
> * static configuration information as well as the lastUsedPages cache.
> */
> typedef struct SpGistCache
> {
> spgConfigOut config; /* filled in by opclass config method */
>
> SpGistTypeDesc attType; /* type of input data and leaf values */
> SpGistTypeDesc attPrefixType; /* type of inner-tuple prefix values */
> SpGistTypeDesc attLabelType; /* type of node label values */
>
> SpGistLUPCache lastUsedPages; /* local storage of last-used info */
> } SpGistCache;

Now that the input data type and leaf data type can be different, which
one is "attType"? It's the leaf data type, as the patch stands. I
renamed that to attLeafType, and went fixing all the references to it.
In most places it's just a matter of search & replace, but what about
the reconstructed datum? In freeScanStackEntry, we assume that
att[Leaf]Type is the datatype for reconstructedValue, but I believe
assume elsewhere that reconstructedValue is of the same data type as the
input. At least if the opclass supports index-only scans.

I think we'll need a separate SpGistTypeDesc for the input type. Or
perhaps a separate SpGistTypeDesc for the reconstructed value and an
optional decompress method to turn the reconstructedValue back into an
actual reconstructed input datum. Or something like that.

Attached is a patch with the kibitzing I did so far.

- Heikki

Attachment Content-Type Size
spgist_compress_method-4-heikki.patch text/x-diff 13.4 KB

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: compress method for spgist - 2
Date: 2014-12-23 13:02:22
Message-ID: 549967DE.1030103@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Now that the input data type and leaf data type can be different, which one is
> "attType"? It's the leaf data type, as the patch stands. I renamed that to
> attLeafType, and went fixing all the references to it. In most places it's just
> a matter of search & replace, but what about the reconstructed datum? In
> freeScanStackEntry, we assume that att[Leaf]Type is the datatype for
> reconstructedValue, but I believe assume elsewhere that reconstructedValue is of
> the same data type as the input. At least if the opclass supports index-only scans.

Agree with rename. I doubt that there is a real-world example of datatype which
can be a) effectivly compressed and b) restored to original form. If so, why
don't store it in compressed state in database? In GiST all compress methods
uses one-way compress. In PostGIS example, polygons are "compressed" into
bounding box, and, obviously, they cannot be restored.

>
> I think we'll need a separate SpGistTypeDesc for the input type. Or perhaps a
> separate SpGistTypeDesc for the reconstructed value and an optional decompress
> method to turn the reconstructedValue back into an actual reconstructed input
> datum. Or something like that.

I suppose that compress and reconstruct are mutual exclusive options.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: compress method for spgist - 2
Date: 2015-01-02 13:52:19
Message-ID: 54A6A293.2090508@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/23/2014 03:02 PM, Teodor Sigaev wrote:
>> >I think we'll need a separate SpGistTypeDesc for the input type. Or perhaps a
>> >separate SpGistTypeDesc for the reconstructed value and an optional decompress
>> >method to turn the reconstructedValue back into an actual reconstructed input
>> >datum. Or something like that.
> I suppose that compress and reconstruct are mutual exclusive options.

I would rather not assume that. You might well want to store something
in the leaf nodes that's different from the original Datum, but
nevertheless contains enough information to reconstruct the original Datum.

- Heikki


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: compress method for spgist - 2
Date: 2015-01-15 07:28:53
Message-ID: CAB7nPqSF2mbxFjJSG2x3_oQou4NfsdZq=G=y4RY+rj+LEX1YHQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Marking this patch as returned with feedback because it is waiting for
input from the author for now a couple of weeks. Heikki, the
refactoring patch has some value, are you planning to push it?
--
Michael


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: compress method for spgist - 2
Date: 2015-01-15 18:36:00
Message-ID: 54B80890.8040607@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/15/2015 09:28 AM, Michael Paquier wrote:
> Marking this patch as returned with feedback because it is waiting for
> input from the author for now a couple of weeks. Heikki, the
> refactoring patch has some value, are you planning to push it?

I think you're mixing up with the other thread, "btree_gin and ranges".
I pushed the refactoring patch I posted to that thread
(http://www.postgresql.org/message-id/54983CF2.80605@vmware.com)
already. I haven't proposed any refactoring related to spgist.

- Heikki


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: compress method for spgist - 2
Date: 2015-02-13 16:17:02
Message-ID: 54DE237E.4000207@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Now that the input data type and leaf data type can be different, which one is
> "attType"? It's the leaf data type, as the patch stands. I renamed that to
> attLeafType, and went fixing all the references to it. In most places it's just
> a matter of search & replace, but what about the reconstructed datum? In

Done. Now there is separate attType and attLeafType which describe input/output
and leaf types.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

Attachment Content-Type Size
spgist_compress_method-5.patch.gz application/x-gzip 4.1 KB

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: compress method for spgist - 2
Date: 2015-02-25 14:13:27
Message-ID: 54EDD887.8030703@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/13/2015 06:17 PM, Teodor Sigaev wrote:
>> Now that the input data type and leaf data type can be different, which one is
>> "attType"? It's the leaf data type, as the patch stands. I renamed that to
>> attLeafType, and went fixing all the references to it. In most places it's just
>> a matter of search & replace, but what about the reconstructed datum? In
>
> Done. Now there is separate attType and attLeafType which describe input/output
> and leaf types.

Thanks.

Did you try finding a use case for this patch in one of the built-in or
contrib datatypes? That would allow writing a regression test for this.

In the original post on this, you mentioned that the PostGIS guys
planned to use this to store polygons, as bounding boxes
(http://www.postgresql.org/message-id/5447B3FF.2080406@sigaev.ru). Any
idea how would that work?

- Heikki


From: Paul Ramsey <pramsey(at)cleverelephant(dot)ca>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: compress method for spgist - 2
Date: 2015-03-04 16:58:33
Message-ID: CACowWR2LhVi4JHEVV=PYzLSyBfnpMd5+bhr2R8t5vzs0o79Ndw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 25, 2015 at 6:13 AM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> In the original post on this, you mentioned that the PostGIS guys planned to
> use this to store polygons, as bounding boxes
> (http://www.postgresql.org/message-id/5447B3FF.2080406@sigaev.ru). Any idea
> how would that work?

Poorly, by hanging boxes that straddled dividing lines off the parent
node in a big linear list. The hope would be that the case was
sufficiently rare compared to the overall volume of data, to not be an
issue. Oddly enough this big hammer has worked in other
implementations at least passable well
(https://github.com/mapserver/mapserver/blob/branch-7-0/maptree.c#L261)

P.


From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Paul Ramsey <pramsey(at)cleverelephant(dot)ca>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: compress method for spgist - 2
Date: 2015-07-23 07:36:04
Message-ID: 55B09964.80707@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/04/2015 06:58 PM, Paul Ramsey wrote:
> On Wed, Feb 25, 2015 at 6:13 AM, Heikki Linnakangas
> <hlinnakangas(at)vmware(dot)com> wrote:
>> In the original post on this, you mentioned that the PostGIS guys planned to
>> use this to store polygons, as bounding boxes
>> (http://www.postgresql.org/message-id/5447B3FF.2080406@sigaev.ru). Any idea
>> how would that work?
>
> Poorly, by hanging boxes that straddled dividing lines off the parent
> node in a big linear list. The hope would be that the case was
> sufficiently rare compared to the overall volume of data, to not be an
> issue. Oddly enough this big hammer has worked in other
> implementations at least passable well
> (https://github.com/mapserver/mapserver/blob/branch-7-0/maptree.c#L261)

Ok, I see, but that's not really what I was wondering. My question is
this: SP-GiST partitions the space into non-overlapping sections. How
can you store polygons - which can overlap - in an SP-GiST index? And
how does the compress method help with that?

- Heikki


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: hlinnaka(at)iki(dot)fi, Paul Ramsey <pramsey(at)cleverelephant(dot)ca>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: compress method for spgist - 2
Date: 2015-07-23 09:18:01
Message-ID: 55B0B149.1010903@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>> Poorly, by hanging boxes that straddled dividing lines off the parent
>> node in a big linear list. The hope would be that the case was
> Ok, I see, but that's not really what I was wondering. My question is this:
> SP-GiST partitions the space into non-overlapping sections. How can you store
> polygons - which can overlap - in an SP-GiST index? And how does the compress
> method help with that?

I believe if we found a way to index boxes then we will need a compress method
to build index over polygons.

BTW, we are working on investigation a index structure for box where 2d-box is
treated as 4d-point.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Paul Ramsey <pramsey(at)cleverelephant(dot)ca>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: compress method for spgist - 2
Date: 2015-08-25 13:05:05
Message-ID: CAB7nPqTTL1YhDieCO+23irywBaUb64k8MFQEF0-8BKdtwg=RWg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jul 23, 2015 at 6:18 PM, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:
>>> Poorly, by hanging boxes that straddled dividing lines off the parent
>>> node in a big linear list. The hope would be that the case was
>>
>> Ok, I see, but that's not really what I was wondering. My question is
>> this:
>> SP-GiST partitions the space into non-overlapping sections. How can you
>> store
>> polygons - which can overlap - in an SP-GiST index? And how does the
>> compress
>> method help with that?
>
>
> I believe if we found a way to index boxes then we will need a compress
> method to build index over polygons.
>
> BTW, we are working on investigation a index structure for box where 2d-box
> is treated as 4d-point.

There has been no activity on this patch for some time now, and a new
patch version has not been submitted, so I am marking it as return
with feedback.
--
Michael


From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Paul Ramsey <pramsey(at)cleverelephant(dot)ca>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: compress method for spgist - 2
Date: 2017-09-18 15:21:44
Message-ID: CAPpHfdt0xmpRR5xt-gm4fnJSyeZGhrBFLpT3XRHMtO5SGVu39w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Aug 25, 2015 at 4:05 PM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:

> On Thu, Jul 23, 2015 at 6:18 PM, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:
> >>> Poorly, by hanging boxes that straddled dividing lines off the parent
> >>> node in a big linear list. The hope would be that the case was
> >>
> >> Ok, I see, but that's not really what I was wondering. My question is
> >> this:
> >> SP-GiST partitions the space into non-overlapping sections. How can you
> >> store
> >> polygons - which can overlap - in an SP-GiST index? And how does the
> >> compress
> >> method help with that?
> >
> >
> > I believe if we found a way to index boxes then we will need a compress
> > method to build index over polygons.
> >
> > BTW, we are working on investigation a index structure for box where
> 2d-box
> > is treated as 4d-point.
>
> There has been no activity on this patch for some time now, and a new
> patch version has not been submitted, so I am marking it as return
> with feedback.
>

There is interest to this patch from PostGIS users. It would be nice to
pickup this patch.
AFAICS, the progress on this patch was suspended because we have no example
for SP-GiST compress method in core/contrib.
However, now we have acdf2a8b committed with 2d to 4d indexing of boxes
using SP-GiST. So, extending this 2d to 4d approach to polygons would be
good example of SP-GiST compress method in core. Would anyone be
volunteering for writing such patch?
If nobody, then I could do it....

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Paul Ramsey <pramsey(at)cleverelephant(dot)ca>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: compress method for spgist - 2
Date: 2017-09-20 12:20:59
Message-ID: CAPpHfdsoy2bJMSTtvQySJ9zN2YJmzWJBLvdHV=ZvMOm6BR0UFg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Sep 18, 2017 at 6:21 PM, Alexander Korotkov <
a(dot)korotkov(at)postgrespro(dot)ru> wrote:

> On Tue, Aug 25, 2015 at 4:05 PM, Michael Paquier <
> michael(dot)paquier(at)gmail(dot)com> wrote:
>
>> On Thu, Jul 23, 2015 at 6:18 PM, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:
>> >>> Poorly, by hanging boxes that straddled dividing lines off the parent
>> >>> node in a big linear list. The hope would be that the case was
>> >>
>> >> Ok, I see, but that's not really what I was wondering. My question is
>> >> this:
>> >> SP-GiST partitions the space into non-overlapping sections. How can you
>> >> store
>> >> polygons - which can overlap - in an SP-GiST index? And how does the
>> >> compress
>> >> method help with that?
>> >
>> >
>> > I believe if we found a way to index boxes then we will need a compress
>> > method to build index over polygons.
>> >
>> > BTW, we are working on investigation a index structure for box where
>> 2d-box
>> > is treated as 4d-point.
>>
>> There has been no activity on this patch for some time now, and a new
>> patch version has not been submitted, so I am marking it as return
>> with feedback.
>>
>
> There is interest to this patch from PostGIS users. It would be nice to
> pickup this patch.
> AFAICS, the progress on this patch was suspended because we have no
> example for SP-GiST compress method in core/contrib.
> However, now we have acdf2a8b committed with 2d to 4d indexing of boxes
> using SP-GiST. So, extending this 2d to 4d approach to polygons would be
> good example of SP-GiST compress method in core. Would anyone be
> volunteering for writing such patch?
> If nobody, then I could do it....
>

Nobody answered yet. And I decided to nail down this long term issue.
Please, find following attached patches.

0001-spgist-compress-method-6.patch

Patch with SP-GiST compress method rebased on current master. Index AM
interface was changed since that time. I've added validation for compress
method: it validates input and output types of compress method. That
required to call config method before. That is controversial solution. In
particular, no collation is provided in config method call. It would be
weird if collation could affect data types in SP-GiST config method output,
but anyway...

0002-spgist-circle-polygon-6.patch

This patch provides example of SP-GiST compress method usage. It adds
SP-GiST indexing for circles and polygons using mapping of their bounding
boxes to 4d. This patch is based on prior work by Nikita Glukhov for
SP-GiST KNN.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
0001-spgist-compress-method-6.patch application/octet-stream 18.1 KB
0002-spgist-circle-polygon-6.patch application/octet-stream 42.4 KB

From: Darafei Praliaskouski <me(at)komzpa(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Fedor Sigaev <teodor(at)sigaev(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
Subject: Re: compress method for spgist - 2
Date: 2017-09-20 19:00:59
Message-ID: 20170920190059.1354.29457.pgcf@coridan.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: not tested
Spec compliant: not tested
Documentation: tested, passed

Hi,

I like the SP-GiST part of the patch. Looking forward to it, so PostGIS can benefit from SP-GiST infrastructure.

I have some questions about the circles example though.

* What is the reason for isnan check and swap of box ordinates for circle? It wasn't in the code previously.
* There are tests for infinities in circles, but checks are for NaNs.
* It seems to me that circle can be implemented without recheck, by making direct exact calculations.
How about removing circle from the scope of this patch, so it is smaller and cleaner?


From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Darafei Praliaskouski <me(at)komzpa(dot)net>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Fedor Sigaev <teodor(at)sigaev(dot)ru>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
Subject: Re: compress method for spgist - 2
Date: 2017-09-20 19:48:20
Message-ID: CAPpHfdv+7QxdaVmR=wM=zAnaR=w1uvaHhmzuN8mM7c9g14QUfQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Sep 20, 2017 at 10:00 PM, Darafei Praliaskouski <me(at)komzpa(dot)net>
wrote:

> The following review has been posted through the commitfest application:
> make installcheck-world: not tested
> Implements feature: not tested
> Spec compliant: not tested
> Documentation: tested, passed
>
> Hi,
>
> I like the SP-GiST part of the patch. Looking forward to it, so PostGIS
> can benefit from SP-GiST infrastructure.
>
> I have some questions about the circles example though.
>
> * What is the reason for isnan check and swap of box ordinates for
> circle? It wasn't in the code previously.
>
* There are tests for infinities in circles, but checks are for NaNs.
>

This code was migrated from original patch by Nikita. I can assume he
means that nan should be treated as greatest possible floating point value
(like float4_cmp_internal() does). However, our current implementation of
geometrical datatypes don't correctly handles all the combinations of infs
as nans. Most of code was written without taking infs and nans into
account. Also, I'm not sure if this code fixes all possible issues with
infs and nans in SP-GiST opclass for circles. This is why I'm going to
remove nans handling from this place.

> * It seems to me that circle can be implemented without recheck, by
> making direct exact calculations.
>

Right. Holding circles in the leafs instead of bounding boxes would both
allow exact calculations and take less space.

> How about removing circle from the scope of this patch, so it is smaller
> and cleaner?

Good point. Polygons are enough for compress function example. Opclass
for circles could be submitted as separate patch.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Darafei Praliaskouski <me(at)komzpa(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Fedor Sigaev <teodor(at)sigaev(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
Subject: Re: compress method for spgist - 2
Date: 2017-09-20 20:07:33
Message-ID: 31999.1505938053@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Darafei Praliaskouski <me(at)komzpa(dot)net> writes:
> I have some questions about the circles example though.

> * What is the reason for isnan check and swap of box ordinates for circle? It wasn't in the code previously.

I hadn't paid any attention to this patch previously, but this comment
excited my curiosity, so I went and looked:

+ bbox->high.x = circle->center.x + circle->radius;
+ bbox->low.x = circle->center.x - circle->radius;
+ bbox->high.y = circle->center.y + circle->radius;
+ bbox->low.y = circle->center.y - circle->radius;
+
+ if (isnan(bbox->low.x))
+ {
+ double tmp = bbox->low.x;
+ bbox->low.x = bbox->high.x;
+ bbox->high.x = tmp;
+ }

Maybe I'm missing something, but it appears to me that it's impossible for
bbox->low.x to be NaN unless circle->center.x and/or circle->radius is a
NaN, in which case bbox->high.x would also have been computed as a NaN,
making the swap entirely useless. Likewise for the Y case. There may be
something useful to do about NaNs here, but this doesn't seem like it.

> How about removing circle from the scope of this patch, so it is smaller and cleaner?

Neither of those patches would be particularly large, and since they'd need
to touch adjacent code in some places, the diffs wouldn't be independent.
I think splitting them is just make-work.

regards, tom lane


From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Darafei Praliaskouski <me(at)komzpa(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Fedor Sigaev <teodor(at)sigaev(dot)ru>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
Subject: Re: compress method for spgist - 2
Date: 2017-09-20 20:19:01
Message-ID: CAPpHfduv0XZ_k+D3NoFMr=wFVM22D=wFZCMMswoLwBe2jAbuJw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Sep 20, 2017 at 11:07 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Darafei Praliaskouski <me(at)komzpa(dot)net> writes:
> > I have some questions about the circles example though.
>
> > * What is the reason for isnan check and swap of box ordinates for
> circle? It wasn't in the code previously.
>
> I hadn't paid any attention to this patch previously, but this comment
> excited my curiosity, so I went and looked:
>
> + bbox->high.x = circle->center.x + circle->radius;
> + bbox->low.x = circle->center.x - circle->radius;
> + bbox->high.y = circle->center.y + circle->radius;
> + bbox->low.y = circle->center.y - circle->radius;
> +
> + if (isnan(bbox->low.x))
> + {
> + double tmp = bbox->low.x;
> + bbox->low.x = bbox->high.x;
> + bbox->high.x = tmp;
> + }
>
> Maybe I'm missing something, but it appears to me that it's impossible for
> bbox->low.x to be NaN unless circle->center.x and/or circle->radius is a
> NaN, in which case bbox->high.x would also have been computed as a NaN,
> making the swap entirely useless. Likewise for the Y case. There may be
> something useful to do about NaNs here, but this doesn't seem like it.
>

Yeah, +1.

> How about removing circle from the scope of this patch, so it is smaller
> and cleaner?
>
> Neither of those patches would be particularly large, and since they'd need
> to touch adjacent code in some places, the diffs wouldn't be independent.
> I think splitting them is just make-work.
>

I've extracted polygon opclass into separate patch (attached). I'll rework
and resubmit circle patch later.
I'm not particularly sure that polygon.sql is a good place for testing
sp-gist opclass for polygons... But we've already done so for box.sql.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
0001-spgist-compress-method-7.patch application/octet-stream 18.1 KB
0002-spgist-polygon-7.patch application/octet-stream 26.5 KB

From: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Darafei Praliaskouski <me(at)komzpa(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Fedor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: compress method for spgist - 2
Date: 2017-09-20 21:57:49
Message-ID: 60e048d3-aa29-8289-5438-3b0c4c1d8d84@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 20.09.2017 23:19, Alexander Korotkov wrote:

> On Wed, Sep 20, 2017 at 11:07 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us
> <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>> wrote:
>
> Darafei Praliaskouski <me(at)komzpa(dot)net <mailto:me(at)komzpa(dot)net>> writes:
> > I have some questions about the circles example though.
>
> >  * What is the reason for isnan check and swap of box ordinates
> for circle? It wasn't in the code previously.
>
> I hadn't paid any attention to this patch previously, but this comment
> excited my curiosity, so I went and looked:
>
> +       bbox->high.x = circle->center.x + circle->radius;
> +       bbox->low.x = circle->center.x - circle->radius;
> +       bbox->high.y = circle->center.y + circle->radius;
> +       bbox->low.y = circle->center.y - circle->radius;
> +
> +       if (isnan(bbox->low.x))
> +       {
> +               double tmp = bbox->low.x;
> +               bbox->low.x = bbox->high.x;
> +               bbox->high.x = tmp;
> +       }
>
> Maybe I'm missing something, but it appears to me that it's
> impossible for
> bbox->low.x to be NaN unless circle->center.x and/or
> circle->radius is a
> NaN, in which case bbox->high.x would also have been computed as a
> NaN,
> making the swap entirely useless.  Likewise for the Y case.  There
> may be
> something useful to do about NaNs here, but this doesn't seem like it.
>
> Yeah, +1.
>

It is possible for bbox->low.x to be NaN when circle->center.x is and
circle->radius are both +Infinity.  Without this float-order-preserving
swapping
one regression test for KNN with ORDER BY index will be totally broken
(you can
try it: https://github.com/glukhovn/postgres/tree/knn). Unfortunately, I
do not
remember exactly why, but most likely because of the incorrect index
structure.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Darafei Praliaskouski <me(at)komzpa(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Fedor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: compress method for spgist - 2
Date: 2017-09-20 22:34:47
Message-ID: 5454.1505946887@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru> writes:
> On 20.09.2017 23:19, Alexander Korotkov wrote:
>> On Wed, Sep 20, 2017 at 11:07 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us
>> <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>> wrote:
>>> Maybe I'm missing something, but it appears to me that it's
>>> impossible for bbox->low.x to be NaN unless circle->center.x and/or
>>> circle->radius is a NaN, in which case bbox->high.x would also have been computed as a NaN,
>>> making the swap entirely useless.

> It is possible for bbox->low.x to be NaN when circle->center.x is and
> circle->radius are both +Infinity.  Without this float-order-preserving
> swapping
> one regression test for KNN with ORDER BY index will be totally broken
> (you can
> try it: https://github.com/glukhovn/postgres/tree/knn).

If that's the reasoning, not having a comment explaining it is
inexcusable. Do you really think people will understand what
the code is doing?

regards, tom lane


From: Darafei "Komяpa" Praliaskouski <me(at)komzpa(dot)net>
To: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Fedor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: compress method for spgist - 2
Date: 2017-09-20 23:06:30
Message-ID: CAC8Q8tJ6jPJd_BJftq9CVrUNTNM__P-H4jbNO9qr5oV=bYJU0w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>
> It is possible for bbox->low.x to be NaN when circle->center.x is and
> circle->radius are both +Infinity.
>

What is rationale behind this circle?
It seems to me that any circle with radius of any Infinity should become a
[-Infinity .. Infinity, -Infinity .. Infinity] box. Then you won't have
NaNs, and index structure shouldn't be broken.

If it happens because NaN > Infinity, then the check should be not for
isnan, but for if (low>high){swap(high, low)}. It probably should be a part
of box, not a part of circle, maths.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Darafei "Komяpa" Praliaskouski <me(at)komzpa(dot)net>
Cc: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Fedor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: compress method for spgist - 2
Date: 2017-09-20 23:20:40
Message-ID: 7349.1505949640@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

=?UTF-8?Q?Darafei_=22Kom=D1=8Fpa=22_Praliaskouski?= <me(at)komzpa(dot)net> writes:
> If it happens because NaN > Infinity, then the check should be not for
> isnan, but for if (low>high){swap(high, low)}.

Yeah, the same idea had occurred to me. It'd still need a comment, but
at least it's slightly more apparent what we're trying to ensure.

regards, tom lane


From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>
Cc: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Fedor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: compress method for spgist - 2
Date: 2017-09-20 23:27:50
Message-ID: CAPpHfdsAruXqE80v_crv=du4GoUqWgx55gXR05ZAhyRp4kxGBA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Sep 21, 2017 at 2:06 AM, Darafei "Komяpa" Praliaskouski <
me(at)komzpa(dot)net> wrote:

> It is possible for bbox->low.x to be NaN when circle->center.x is and
>> circle->radius are both +Infinity.
>>
>
> What is rationale behind this circle?
>

I would prefer to rather forbid any geometries with infs and nans.
However, then upgrade process will suffer. User with such geometries would
get errors during dump/restore, pg_upgraded instances would still contain
invalid values...

> It seems to me that any circle with radius of any Infinity should become a
> [-Infinity .. Infinity, -Infinity .. Infinity] box.Then you won't have
> NaNs, and index structure shouldn't be broken.
>

We probably should produce [-Infinity .. Infinity, -Infinity .. Infinity]
box for any geometry containing inf or nan. That MBR would be founded for
any query, saying: "index can't help you for this kind value, only recheck
can deal with that". Therefore, we would at least guarantee that results
of sequential scan and index scan are the same.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Fedor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: compress method for spgist - 2
Date: 2017-09-21 00:14:45
Message-ID: 9463.1505952885@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alexander Korotkov <aekorotkov(at)gmail(dot)com> writes:
> On Thu, Sep 21, 2017 at 2:06 AM, Darafei "Komяpa" Praliaskouski <
> me(at)komzpa(dot)net> wrote:
>> What is rationale behind this circle?

> I would prefer to rather forbid any geometries with infs and nans.
> However, then upgrade process will suffer. User with such geometries would
> get errors during dump/restore, pg_upgraded instances would still contain
> invalid values...

Yeah, that ship has sailed unfortunately.

>> It seems to me that any circle with radius of any Infinity should become a
>> [-Infinity .. Infinity, -Infinity .. Infinity] box.Then you won't have
>> NaNs, and index structure shouldn't be broken.

> We probably should produce [-Infinity .. Infinity, -Infinity .. Infinity]
> box for any geometry containing inf or nan.

Hm, we can do better in at least some cases, eg for a box ((0,1),(1,inf))
there's no reason to give up our knowledge of finite bounds for the
other three boundaries. But certainly for a NaN circle radius
what you suggest seems the most sensible thing to do.

regards, tom lane


From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Fedor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: compress method for spgist - 2
Date: 2017-09-21 08:27:47
Message-ID: CAPpHfdsAoqxD+=cGyri3iSqHDTc7HrCRWRwgpU86mC+k8moJiA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Sep 21, 2017 at 3:14 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Alexander Korotkov <aekorotkov(at)gmail(dot)com> writes:
> > On Thu, Sep 21, 2017 at 2:06 AM, Darafei "Komяpa" Praliaskouski <
> >> It seems to me that any circle with radius of any Infinity should
> become a
> >> [-Infinity .. Infinity, -Infinity .. Infinity] box.Then you won't have
> >> NaNs, and index structure shouldn't be broken.
>
> > We probably should produce [-Infinity .. Infinity, -Infinity .. Infinity]
> > box for any geometry containing inf or nan.
>
> Hm, we can do better in at least some cases, eg for a box ((0,1),(1,inf))
> there's no reason to give up our knowledge of finite bounds for the
> other three boundaries. But certainly for a NaN circle radius
> what you suggest seems the most sensible thing to do.
>

OK. I'll try implement this for circles.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


From: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Fedor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: compress method for spgist - 2
Date: 2017-09-22 00:03:50
Message-ID: 1499c9d0-075a-3014-d2aa-ba59121b3728@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 21.09.2017 02:27, Alexander Korotkov wrote:

> On Thu, Sep 21, 2017 at 2:06 AM, Darafei "Komяpa" Praliaskouski
> <me(at)komzpa(dot)net <mailto:me(at)komzpa(dot)net>> wrote:
>
> It is possible for bbox->low.x to be NaN when circle->center.x
> is and
> circle->radius are both +Infinity.
>
>
> What is rationale behind this circle?
>
>
> I would prefer to rather forbid any geometries with infs and nans. 
> However, then upgrade process will suffer.  User with such geometries
> would get errors during dump/restore, pg_upgraded instances would
> still contain invalid values...
>
> It seems to me that any circle with radius of any Infinity should
> become a [-Infinity .. Infinity, -Infinity .. Infinity] box.Then
> you won't have NaNs, and index structure shouldn't be broken.
>
>
> We probably should produce [-Infinity .. Infinity, -Infinity ..
> Infinity] box for any geometry containing inf or nan.  That MBR would
> be founded for any query, saying: "index can't help you for this kind
> value, only recheck can deal with that".  Therefore, we would at least
> guarantee that results of sequential scan and index scan are the same.
>

I have looked at the GiST KNN code and found the same errors for NaNs,
infinities and NULLs as in my SP-GiST KNN patch.

Attached two patches:

1. Fix NaN-inconsistencies in circle's bounding boxes computed in GiST
compress
method leading to the failed Assert(box->low.x <= box->high.x) in
computeDistance() from src/backend/access/gist/gistproc.c by
transforming NaNs
into infinities (corresponding test case provided in the second patch).

2. Fix ordering of NULL, NaN and infinite distances by GiST.  This distance
values could be mixed because NULL distances were transformed into
infinities,
and there was no special processing for NaNs in KNN queue's comparison
function.
At first I tried just to set recheck flag for NULL distances, but it did not
work for index-only scans because they do not support rechecking. So I had
to add a special flag for NULL distances.

Should I start a separate thread for this issue and add patches to
commitfest?

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
0001-Fix-circle-bounding-box-inconsistency-in-GiST-compress-method-v01.patch text/x-patch 1.1 KB
0002-Fix-GiST-ordering-by-distance-for-NULLs-and-NaNs-v01.patch text/x-patch 12.0 KB

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Fedor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: [HACKERS] compress method for spgist - 2
Date: 2017-11-30 02:31:53
Message-ID: CAB7nPqQs1sYkBnKQ6HFcwSWEnwj0uZU2590C5dWtqw3TLRz5EQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Sep 22, 2017 at 9:03 AM, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru> wrote:
> Should I start a separate thread for this issue and add patches to
> commitfest?

Yes, please. It would be nice if you could spawn a separate thread for
what looks like a bug, and separate topics should have their own
thread. This will attract more attention from other hackers as this is
unrelated to this thread. Adding an entry in the CF app under the
category "Bug Fix" also avoids losing any items worth fixing.

I can see as well that the patches posted at the beginning of the
thread got reviews but that those did not get answered. The set of
patches also have conflicts with HEAD so they need a rebase. For those
reasons I am marking this entry as returned with feedback.
--
Michael


From: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Fedor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: [HACKERS] compress method for spgist - 2
Date: 2017-12-05 00:02:32
Message-ID: a4678fac-c074-47c0-2bc5-1ed9866f6e66@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 30.11.2017 05:31, Michael Paquier wrote:

> I can see as well that the patches posted at the beginning of the
> thread got reviews but that those did not get answered. The set of
> patches also have conflicts with HEAD so they need a rebase. For those
> reasons I am marking this entry as returned with feedback.

Rebased patches are attached.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
0001-spgist-compress-method-8.patch text/x-patch 14.8 KB
0002-spgist-polygon-8.patch text/x-patch 25.0 KB

From: Darafei Praliaskouski <me(at)komzpa(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Fedor Sigaev <teodor(at)sigaev(dot)ru>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
Subject: Re: compress method for spgist - 2
Date: 2017-12-05 10:14:17
Message-ID: 20171205101417.1138.18005.pgcf@coridan.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: not tested
Spec compliant: not tested
Documentation: tested, passed

I've read the updated patch and see my concerns addressed.

I'm looking forward to SP-GiST compress method support, as it will allow usage of SP-GiST index infrastructure for PostGIS.

The new status of this patch is: Ready for Committer


From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Darafei Praliaskouski <me(at)komzpa(dot)net>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Fedor Sigaev <teodor(at)sigaev(dot)ru>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
Subject: Re: compress method for spgist - 2
Date: 2017-12-05 20:59:33
Message-ID: CAPpHfdtQtqHLEjgdDcML_QuOBtDjcTrQckaBxCGriaaO2HhUKw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Dec 5, 2017 at 1:14 PM, Darafei Praliaskouski <me(at)komzpa(dot)net> wrote:

> The following review has been posted through the commitfest application:
> make installcheck-world: not tested
> Implements feature: not tested
> Spec compliant: not tested
> Documentation: tested, passed
>
> I've read the updated patch and see my concerns addressed.
>
> I'm looking forward to SP-GiST compress method support, as it will allow
> usage of SP-GiST index infrastructure for PostGIS.
>
> The new status of this patch is: Ready for Committer
>

I went trough this patch. I found documentation changes to be not
sufficient. And I've made some improvements.

In particular, I didn't understand why is reconstructedValue claimed to be
of spgConfigOut.leafType while it should be of spgConfigIn.attType both
from general logic and code. I've fixed that. Nikita, correct me if I'm
wrong.

Also, I wonder should we check for existence of compress method when
attType and leafType are not the same in spgvalidate() function? We don't
do this for now.

0002-spgist-polygon-8.patch is OK for me so soon.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
0001-spgist-compress-method-9.patch application/octet-stream 24.7 KB

From: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Darafei Praliaskouski <me(at)komzpa(dot)net>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Fedor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: compress method for spgist - 2
Date: 2017-12-06 15:08:33
Message-ID: afbf0789-50c8-82ac-55b8-898724068622@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 05.12.2017 23:59, Alexander Korotkov wrote:

> On Tue, Dec 5, 2017 at 1:14 PM, Darafei Praliaskouski <me(at)komzpa(dot)net
> <mailto:me(at)komzpa(dot)net>> wrote:
>
> The following review has been posted through the commitfest
> application:
> make installcheck-world:  not tested
> Implements feature:       not tested
> Spec compliant:           not tested
> Documentation:            tested, passed
>
> I've read the updated patch and see my concerns addressed.
>
> I'm looking forward to SP-GiST compress method support, as it will
> allow usage of SP-GiST index infrastructure for PostGIS.
>
> The new status of this patch is: Ready for Committer
>
>
> I went trough this patch.  I found documentation changes to be not
> sufficient.  And I've made some improvements.
>
> In particular, I didn't understand why is reconstructedValue claimed
> to be of spgConfigOut.leafType while it should be of
> spgConfigIn.attType both from general logic and code.  I've fixed
> that.  Nikita, correct me if I'm wrong.

I think we are reconstructing a leaf datum, so documentation was correct
but the code in spgWalk() and freeScanStackEntry() wrongly used attType
instead of attLeafType. Fixed patch is attached.

> Also, I wonder should we check for existence of compress method when
> attType and leafType are not the same in spgvalidate() function?  We
> don't do this for now.
I've added compress method existence check to spgvalidate().

> 0002-spgist-polygon-8.patch is OK for me so soon.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
0001-spgist-compress-method-10.patch text/x-patch 20.5 KB

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
Cc: Darafei Praliaskouski <me(at)komzpa(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Fedor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: compress method for spgist - 2
Date: 2017-12-06 18:49:53
Message-ID: CAPpHfdt0PEL0z1HwYj5Rb=77YuB+tKtrdSM7RMktMPTjnsHPVw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Dec 6, 2017 at 6:08 PM, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
wrote:

> On 05.12.2017 23:59, Alexander Korotkov wrote:
>
> On Tue, Dec 5, 2017 at 1:14 PM, Darafei Praliaskouski <me(at)komzpa(dot)net>
> wrote:
>
>> The following review has been posted through the commitfest application:
>> make installcheck-world: not tested
>> Implements feature: not tested
>> Spec compliant: not tested
>> Documentation: tested, passed
>>
>> I've read the updated patch and see my concerns addressed.
>>
>> I'm looking forward to SP-GiST compress method support, as it will allow
>> usage of SP-GiST index infrastructure for PostGIS.
>>
>> The new status of this patch is: Ready for Committer
>>
>
> I went trough this patch. I found documentation changes to be not
> sufficient. And I've made some improvements.
>
> In particular, I didn't understand why is reconstructedValue claimed to be
> of spgConfigOut.leafType while it should be of spgConfigIn.attType both
> from general logic and code. I've fixed that. Nikita, correct me if I'm
> wrong.
>
>
> I think we are reconstructing a leaf datum, so documentation was correct
> but the code in spgWalk() and freeScanStackEntry() wrongly used attType
> instead of attLeafType. Fixed patch is attached.
>

Reconstructed datum is used for index-only scan. Thus, it should be
original indexed datum of attType, unless we have decompress method and
pass reconstructed datum through it.

> Also, I wonder should we check for existence of compress method when
> attType and leafType are not the same in spgvalidate() function? We don't
> do this for now.
>
> I've added compress method existence check to spgvalidate().
>

It would be nice to evade double calling of config method. Possible option
could be to memorize difference between attribute type and leaf type in
high bit of functionset, which is guaranteed to be free.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


From: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Darafei Praliaskouski <me(at)komzpa(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Fedor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: compress method for spgist - 2
Date: 2017-12-07 00:17:58
Message-ID: ede7be56-793c-6ab4-bb42-67e8e591c96d@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 06.12.2017 21:49, Alexander Korotkov wrote:
> On Wed, Dec 6, 2017 at 6:08 PM, Nikita Glukhov
> <n(dot)gluhov(at)postgrespro(dot)ru <mailto:n(dot)gluhov(at)postgrespro(dot)ru>> wrote:
>
> On 05.12.2017 23:59, Alexander Korotkov wrote:
>
>> On Tue, Dec 5, 2017 at 1:14 PM, Darafei Praliaskouski
>> <me(at)komzpa(dot)net <mailto:me(at)komzpa(dot)net>> wrote:
>>
>> The following review has been posted through the commitfest
>> application:
>> make installcheck-world:  not tested
>> Implements feature:       not tested
>> Spec compliant:           not tested
>> Documentation:            tested, passed
>>
>> I've read the updated patch and see my concerns addressed.
>>
>> I'm looking forward to SP-GiST compress method support, as it
>> will allow usage of SP-GiST index infrastructure for PostGIS.
>>
>> The new status of this patch is: Ready for Committer
>>
>>
>> I went trough this patch.  I found documentation changes to be
>> not sufficient.  And I've made some improvements.
>>
>> In particular, I didn't understand why is reconstructedValue
>> claimed to be of spgConfigOut.leafType while it should be of
>> spgConfigIn.attType both from general logic and code.  I've fixed
>> that.  Nikita, correct me if I'm wrong.
>
> I think we are reconstructing a leaf datum, so documentation was
> correct but the code in spgWalk() and freeScanStackEntry() wrongly
> used attType instead of attLeafType. Fixed patch is attached.
>
>
> Reconstructed datum is used for index-only scan.  Thus, it should be
> original indexed datum of attType, unless we have decompress method
> and pass reconstructed datum through it.
But if we have compress method and do not have decompress method then
index-only scan seems to be impossible.

>> Also, I wonder should we check for existence of compress method
>> when attType and leafType are not the same in spgvalidate()
>> function?  We don't do this for now.
> I've added compress method existence check to spgvalidate().
>
> It would be nice to evade double calling of config method.  Possible
> option could be to memorize difference between attribute type and leaf
> type in high bit of functionset, which is guaranteed to be free.
I decided to simply set compress method's bit in functionset when this
method it is not required.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
0001-spgist-compress-method-11.patch text/x-patch 20.5 KB

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
Cc: Darafei Praliaskouski <me(at)komzpa(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Fedor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: compress method for spgist - 2
Date: 2017-12-07 12:46:01
Message-ID: CAPpHfds3Wuco3iyNmQaP724iqiaRNSEW4qR17KOFqr6mqzLGYQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Dec 7, 2017 at 3:17 AM, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
wrote:

> On 06.12.2017 21:49, Alexander Korotkov wrote:
>
> On Wed, Dec 6, 2017 at 6:08 PM, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
> wrote:
>
>> On 05.12.2017 23:59, Alexander Korotkov wrote:
>>
>> On Tue, Dec 5, 2017 at 1:14 PM, Darafei Praliaskouski <me(at)komzpa(dot)net>
>> wrote:
>>
>>> The following review has been posted through the commitfest application:
>>> make installcheck-world: not tested
>>> Implements feature: not tested
>>> Spec compliant: not tested
>>> Documentation: tested, passed
>>>
>>> I've read the updated patch and see my concerns addressed.
>>>
>>> I'm looking forward to SP-GiST compress method support, as it will allow
>>> usage of SP-GiST index infrastructure for PostGIS.
>>>
>>> The new status of this patch is: Ready for Committer
>>>
>>
>> I went trough this patch. I found documentation changes to be not
>> sufficient. And I've made some improvements.
>>
>> In particular, I didn't understand why is reconstructedValue claimed to
>> be of spgConfigOut.leafType while it should be of spgConfigIn.attType both
>> from general logic and code. I've fixed that. Nikita, correct me if I'm
>> wrong.
>>
>>
>> I think we are reconstructing a leaf datum, so documentation was correct
>> but the code in spgWalk() and freeScanStackEntry() wrongly used attType
>> instead of attLeafType. Fixed patch is attached.
>>
>
> Reconstructed datum is used for index-only scan. Thus, it should be
> original indexed datum of attType, unless we have decompress method and
> pass reconstructed datum through it.
>
> But if we have compress method and do not have decompress method then
> index-only scan seems to be impossible.
>

Sorry, I didn't realize that we don't return reconstructed value
immediately, but return only leafValue provided by leafConsistent. In this
case, leafConsistent is responsible to convert value from
spgConfigOut.leafType to spgConfigIn.attType.

TBH, practical example of SP-GiST opclass with both compress method and
index-only scan support doesn't come to my mind, because compress method is
typically needed when we have lossy representation of index keys. For
example, in GiST all the opclasses where compress method do useful work use
lossy representation of keys. Nevertheless, it's good to not cut
possibility of index-only scans when spgConfigOut.leafType !=
spgConfigIn.attType. And to lay responsibility for conversion on
leafConsistent seems like elegant way to do this. So, that's OK for me.

Also, I wonder should we check for existence of compress method when
>> attType and leafType are not the same in spgvalidate() function? We don't
>> do this for now.
>>
>> I've added compress method existence check to spgvalidate().
>>
> It would be nice to evade double calling of config method. Possible
> option could be to memorize difference between attribute type and leaf type
> in high bit of functionset, which is guaranteed to be free.
>
> I decided to simply set compress method's bit in functionset when this
> method it is not required.
>

Looks good for me.

Now, this patch is ready for committer from my point of view.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
Cc: Darafei Praliaskouski <me(at)komzpa(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: compress method for spgist - 2
Date: 2017-12-25 16:02:00
Message-ID: 33b254aa-710d-794b-ca02-d1e6873a0cef@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


>
> Now, this patch is ready for committer from my point of view.

Thank you, pushed

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: ilmari(at)ilmari(dot)org (Dagfinn Ilmari =?utf-8?Q?Manns=C3=A5ker?=)
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, Darafei Praliaskouski <me(at)komzpa(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: compress method for spgist - 2
Date: 2018-01-03 22:17:22
Message-ID: d8j8tdevb7x.fsf@dalvik.ping.uio.no
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:

>>
>> Now, this patch is ready for committer from my point of view.
>
> Thank you, pushed

This patch added two copies of the poly_ops row to the "Built-in SP-GiST
Operator Classes" table in spgist.sgml. The attached patched removes
one of them.

- ilmari
--
- Twitter seems more influential [than blogs] in the 'gets reported in
the mainstream press' sense at least. - Matt McLeod
- That'd be because the content of a tweet is easier to condense down
to a mainstream media article. - Calle Dybedahl

Attachment Content-Type Size
0001-Remove-duplicate-poly_ops-row-from-SP-GiST-opclass-t.patch text/x-diff 1.3 KB

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Dagfinn Ilmari Mannsåker <ilmari(at)ilmari(dot)org>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, Darafei Praliaskouski <me(at)komzpa(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: compress method for spgist - 2
Date: 2018-01-03 22:21:57
Message-ID: CAPpHfduLWb9sagdFEm=cer2F_LcRMXNVKibw__fiOwJztEuGyg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 4, 2018 at 1:17 AM, Dagfinn Ilmari Mannsåker <ilmari(at)ilmari(dot)org>
wrote:

> Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
>
> >>
> >> Now, this patch is ready for committer from my point of view.
> >
> > Thank you, pushed
>
> This patch added two copies of the poly_ops row to the "Built-in SP-GiST
> Operator Classes" table in spgist.sgml.

Right.

> The attached patched removes
> one of them.
>

Thank for fixing this! I'm sure that Teodor will push this after end of
New Year holidays in Russia.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Dagfinn Ilmari Mannsåker <ilmari(at)ilmari(dot)org>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, Darafei Praliaskouski <me(at)komzpa(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: compress method for spgist - 2
Date: 2018-02-28 01:37:07
Message-ID: 8FF57C39-D72C-4347-8370-B23C36BB1906@yesql.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> On 04 Jan 2018, at 06:17, Dagfinn Ilmari Mannsåker <ilmari(at)ilmari(dot)org> wrote:
>
> Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
>
>>> Now, this patch is ready for committer from my point of view.
>>
>> Thank you, pushed
>
> This patch added two copies of the poly_ops row to the "Built-in SP-GiST
> Operator Classes" table in spgist.sgml. The attached patched removes
> one of them.

Patch looks good, marked as Ready for Committer in the CF app.

cheers ./daniel


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Dagfinn Ilmari Mannsåker <ilmari(at)ilmari(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, Darafei Praliaskouski <me(at)komzpa(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: compress method for spgist - 2
Date: 2018-02-28 23:56:08
Message-ID: 10321.1519862168@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru> writes:
> On Thu, Jan 4, 2018 at 1:17 AM, Dagfinn Ilmari Mannsåker <ilmari(at)ilmari(dot)org>
> wrote:
>> This patch added two copies of the poly_ops row to the "Built-in SP-GiST
>> Operator Classes" table in spgist.sgml.

> Thank for fixing this! I'm sure that Teodor will push this after end of
> New Year holidays in Russia.

He didn't ... so I did.

regards, tom lane