Re: Index-only scans for multicolumn GIST

Lists: pgsql-hackers
From: Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Index-only scans for multicolumn GIST
Date: 2014-07-21 19:47:26
Message-ID: CAP4vRV58m0=U8Qn07PvzcAbidjqtoCpvRZ=+Pi1djdigcBQMAQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi, hackers!
There are new results of my work on GSoC project "Index-only scans for
GIST".
Previous post is here:
http://postgresql.1045698.n5.nabble.com/Index-only-scans-for-GIST-td5804892.html

Repository is
https://github.com/lubennikovaav/postgres/tree/indexonlygist2
Patch is in attachments.
It includes indexonlyscan for multicolumn GiST. It works correctly -
results are the same with another scan plans.
Fetch() method is realized for box and circle opclasses
Improvement for circle opclass is not such distinct as for box opclass,
because of recheck.

I remember that all "elog" and other bad comments must be fixed before this
patch can be committed.

Any comments are welcome
--
Best regards,
Lubennikova Anastasia

Attachment Content-Type Size
indexonlygist_2.1.patch application/octet-stream 31.2 KB

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Index-only scans for multicolumn GIST
Date: 2014-07-22 06:55:07
Message-ID: 53CE0ACB.9030104@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 07/21/2014 10:47 PM, Anastasia Lubennikova wrote:
> Hi, hackers!
> There are new results of my work on GSoC project "Index-only scans for
> GIST".
> Previous post is here:
> http://postgresql.1045698.n5.nabble.com/Index-only-scans-for-GIST-td5804892.html
>
> Repository is
> https://github.com/lubennikovaav/postgres/tree/indexonlygist2
> Patch is in attachments.
> It includes indexonlyscan for multicolumn GiST. It works correctly -
> results are the same with another scan plans.
> Fetch() method is realized for box and circle opclasses
> Improvement for circle opclass is not such distinct as for box opclass,
> because of recheck.

For a circle, the GiST index stores a bounding box of the circle. The
new fetch function reverses that, calculating the radius and center of
the circle from the bounding box.

Those conversions lose some precision due to rounding. Are we okay with
that? Floating point math is always subject to rounding errors, but
there's a good argument to be made that it's not acceptable to get a
different value back when the user didn't explicitly invoke any math
functions.

As an example:

create table circle_tbl (c circle);
create index circle_tbl_idx on circle_tbl using gist (c);
insert into circle_tbl values ('1.23456789012345,1.23456789012345,1e300');

postgres=# set enable_seqscan=off; set enable_bitmapscan=off; set
enable_indexonlyscan=on;
SET
SET
SET
postgres=# select * from circle_tbl ;
c
----------------
<(0,0),1e+300>
(1 row)

postgres=# set enable_indexonlyscan=off;
SET
postgres=# select * from circle_tbl ;
c
----------------------------------------------
<(1.23456789012345,1.23456789012345),1e+300>
(1 row)

- Heikki


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Index-only scans for multicolumn GIST
Date: 2014-07-22 14:06:01
Message-ID: 21665.1406037961@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> writes:
> For a circle, the GiST index stores a bounding box of the circle. The
> new fetch function reverses that, calculating the radius and center of
> the circle from the bounding box.

> Those conversions lose some precision due to rounding. Are we okay with
> that?

That seems like a nonstarter :-(. Index-only scans don't have a license
to return approximations. If we drop the behavior for circles, how much
functionality do we have left?

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Index-only scans for multicolumn GIST
Date: 2014-07-22 14:06:03
Message-ID: CA+TgmoaOrXjV3YzSnEyg1Z2hyNKm2BD9HbBzeuJNKOdENVSEfg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jul 22, 2014 at 2:55 AM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> On 07/21/2014 10:47 PM, Anastasia Lubennikova wrote:
>>
>> Hi, hackers!
>> There are new results of my work on GSoC project "Index-only scans for
>> GIST".
>> Previous post is here:
>>
>> http://postgresql.1045698.n5.nabble.com/Index-only-scans-for-GIST-td5804892.html
>>
>> Repository is
>> https://github.com/lubennikovaav/postgres/tree/indexonlygist2
>> Patch is in attachments.
>> It includes indexonlyscan for multicolumn GiST. It works correctly -
>> results are the same with another scan plans.
>> Fetch() method is realized for box and circle opclasses
>> Improvement for circle opclass is not such distinct as for box opclass,
>> because of recheck.
>
>
> For a circle, the GiST index stores a bounding box of the circle. The new
> fetch function reverses that, calculating the radius and center of the
> circle from the bounding box.
>
> Those conversions lose some precision due to rounding. Are we okay with
> that? Floating point math is always subject to rounding errors, but there's
> a good argument to be made that it's not acceptable to get a different value
> back when the user didn't explicitly invoke any math functions.
>
> As an example:
>
> create table circle_tbl (c circle);
> create index circle_tbl_idx on circle_tbl using gist (c);
> insert into circle_tbl values ('1.23456789012345,1.23456789012345,1e300');
>
> postgres=# set enable_seqscan=off; set enable_bitmapscan=off; set
> enable_indexonlyscan=on;
> SET
> SET
> SET
> postgres=# select * from circle_tbl ;
> c
> ----------------
> <(0,0),1e+300>
> (1 row)
>
> postgres=# set enable_indexonlyscan=off;
> SET
> postgres=# select * from circle_tbl ;
> c
> ----------------------------------------------
> <(1.23456789012345,1.23456789012345),1e+300>
> (1 row)

I really don't think it's ever OK for a query to produce different
answers depending on which plan is chosen.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Emre Hasegeli <emre(at)hasegeli(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Index-only scans for multicolumn GIST
Date: 2014-07-23 17:28:02
Message-ID: 20140723172802.GA76818@hasegeli-2.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> That seems like a nonstarter :-(. Index-only scans don't have a license
> to return approximations. If we drop the behavior for circles, how much
> functionality do we have left?

It should work with exact operator classes, box_ops, point_ops,
range_ops, inet_ops.