Re: Fix picksplit with nan values

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix picksplit with nan values
Date: 2013-09-08 08:27:02
Message-ID: CAPpHfdvFMv8mQFo9TLS8OkRB1J-ZjbXTH9UY_kvFxhKMOnpLqQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Sep 7, 2013 at 1:47 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Alexander Korotkov <aekorotkov(at)gmail(dot)com> writes:
> > PostGIS spotted that picksplit algorithm freezes in infinite loop when
> > dealing with nan values. I discovered same bug is present in core
> > opclasses. Attached patch fixes this issue interpreting nan as value
> > greater than infinity like btree comparison function does.
>
> Hm. Good point, but it seems like some of these hunks are only taking
> care of a subset of the possible combinations of input NaNs. If you're
> certain the other combinations are impossible, there should be code
> comments explaining why.
>

I had some more insight into NaNs and geometrical datatypes. I didn't find
functions dealing with geometrical datatypes to take care about NaN.
Behaviour seems pretty weird. For example, box constructor takes care about
order of coordinates but no in case of NaN:

test=# select box('(0.0,0.0)'::point,'(1.0,1.0)'::point);
box
-------------
(1,1),(0,0)
(1 row)

test=# select box('(1.0,1.0)'::point,'(0.0,0.0)'::point);
box
-------------
(1,1),(0,0)
(1 row)

test=# select box('(nan,nan)'::point,'(0.0,0.0)'::point);
box
-----------------
(0,0),(nan,nan)
(1 row)

test=# select box('(0.0,0.0)'::point,'(nan,nan)'::point);
box
-----------------
(nan,nan),(0,0)
(1 row)

Since comparison with NaN always returns false, bool returning operators
dealing with NaN-containing values mostly return false. Exceptions are
operators dealing with only some of coordinates like << and >>.

I thinks proper way to fix it is to prohibit NaNs in geometrical datatypes
at all. However, it could be a hard decision like it is about fuzzy float
comparison. I would like to fix GiST index by now.

I wrote attached patch by following principles:
1) NaN coordinates shouldn't crash or hang GiST.
2) NaN coordinates should be processed in GiST index scan like in
sequential scan.
3) NaN coordinates shouldn't lead to significant slowdown.

That could be illustrated on following test-case:

create table test1 as (select point(random(), random()) as p from
generate_series(1,1000000));
create index test1_idx on test1 using gist(p);
create table temp as (select * from test1);
insert into temp (select point('nan'::float8, 'nan'::float8) from
generate_series(1,1000));
insert into temp (select point('nan'::float8, random()) from
generate_series(1,1000));
insert into temp (select point(random(), 'nan'::float8) from
generate_series(1,1000));
create table test2 as (select * from temp order by random());
create index test2_idx on test2 using gist(p);
drop table temp;

You can't observe performance degradation of index:

test=# explain (analyze, buffers) select * from test1 where p <@
box(point(0.25,0.6),point(0.30,0.68));
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on test1 (cost=48.03..2593.37 rows=1000 width=16)
(actual time=3.137..8.315 rows=4057 loops=1)
Recheck Cond: (p <@ '(0.3,0.68),(0.25,0.6)'::box)
Buffers: shared hit=2901
-> Bitmap Index Scan on test1_idx (cost=0.00..47.78 rows=1000 width=0)
(actual time=2.281..2.281 rows=4057 loops=1)
Index Cond: (p <@ '(0.3,0.68),(0.25,0.6)'::box)
Buffers: shared hit=59
Total runtime: 9.648 ms
(7 rows)

test=# explain (analyze, buffers) select * from test2 where p <@
box(point(0.25,0.6),point(0.30,0.68));
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on test2 (cost=48.06..2601.55 rows=1003 width=16)
(actual time=3.121..17.717 rows=4057 loops=1)
Recheck Cond: (p <@ '(0.3,0.68),(0.25,0.6)'::box)
Buffers: shared hit=70 read=2830
-> Bitmap Index Scan on test2_idx (cost=0.00..47.81 rows=1003 width=0)
(actual time=2.246..2.246 rows=4057 loops=1)
Index Cond: (p <@ '(0.3,0.68),(0.25,0.6)'::box)
Buffers: shared hit=56
Total runtime: 18.233 ms
(7 rows)

NaN results of queries seems to be same with sequential scan:

test=# explain (analyze, buffers) select * from test2 where p <^
'(nan,0.5)'::point and p::text like '%nan%';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on test2 (cost=4389.54..11817.54 rows=4012 width=16)
(actual time=138.879..1096.506 rows=492 loops=1)
Recheck Cond: (p <^ '(nan,0.5)'::point)
Filter: ((p)::text ~~ '%nan%'::text)
Rows Removed by Filter: 499961
Buffers: shared hit=6287 read=3703 written=60
-> Bitmap Index Scan on test2_idx (cost=0.00..4388.53 rows=100300
width=0) (actual time=136.003..136.003 rows=500453 loops=1)
Index Cond: (p <^ '(nan,0.5)'::point)
Buffers: shared hit=4568
Total runtime: 1096.720 ms
(9 rows)

test=# explain (analyze, buffers) select * from test2 where p <^
'(nan,0.5)'::point and p::text like '%nan%';
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Seq Scan on test2 (cost=0.00..25482.02 rows=4012 width=16) (actual
time=1.156..1118.252 rows=492 loops=1)
Filter: ((p <^ '(nan,0.5)'::point) AND ((p)::text ~~ '%nan%'::text))
Rows Removed by Filter: 1002508
Buffers: shared hit=5422
Total runtime: 1118.404 ms
(5 rows)

test=# explain (analyze, buffers) select * from test2 where p >^
'(nan,0.5)'::point and p::text like '%nan%';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on test2 (cost=4389.54..11817.54 rows=4012 width=16)
(actual time=138.041..1084.822 rows=508 loops=1)
Recheck Cond: (p >^ '(nan,0.5)'::point)
Filter: ((p)::text ~~ '%nan%'::text)
Rows Removed by Filter: 500036
Buffers: shared hit=8602 read=1442
-> Bitmap Index Scan on test2_idx (cost=0.00..4388.53 rows=100300
width=0) (actual time=135.415..135.415 rows=500544 loops=1)
Index Cond: (p >^ '(nan,0.5)'::point)
Buffers: shared hit=4560 read=62
Total runtime: 1085.028 ms
(9 rows)

test=# explain (analyze, buffers) select * from test2 where p >^
'(nan,0.5)'::point and p::text like '%nan%';
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Seq Scan on test2 (cost=0.00..25482.02 rows=4012 width=16) (actual
time=2.786..1121.685 rows=508 loops=1)
Filter: ((p >^ '(nan,0.5)'::point) AND ((p)::text ~~ '%nan%'::text))
Rows Removed by Filter: 1002492
Buffers: shared hit=5422
Total runtime: 1121.815 ms
(5 rows)

test=# explain (analyze, buffers) select * from test2 where p <<
'(0.5,nan)'::point and p::text like '%nan%';
explain (analyze, buffers) select * from test2 where p >>
'(0.5,nan)'::point and p::text like '%nan%';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on test2 (cost=4389.54..11817.54 rows=4012 width=16)
(actual time=135.344..1072.619 rows=502 loops=1)
Recheck Cond: (p << '(0.5,nan)'::point)
Filter: ((p)::text ~~ '%nan%'::text)
Rows Removed by Filter: 500000
Buffers: shared hit=9992 read=31
-> Bitmap Index Scan on test2_idx (cost=0.00..4388.53 rows=100300
width=0) (actual time=132.975..132.975 rows=500502 loops=1)
Index Cond: (p << '(0.5,nan)'::point)
Buffers: shared hit=4570 read=31
Total runtime: 1072.820 ms
(9 rows)

test=# explain (analyze, buffers) select * from test2 where p <<
'(0.5,nan)'::point and p::text like '%nan%';
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Seq Scan on test2 (cost=0.00..25482.02 rows=4012 width=16) (actual
time=2.154..1110.458 rows=502 loops=1)
Filter: ((p << '(0.5,nan)'::point) AND ((p)::text ~~ '%nan%'::text))
Rows Removed by Filter: 1002498
Buffers: shared hit=5422
Total runtime: 1110.587 ms
(5 rows)

test=# explain (analyze, buffers) select * from test2 where p >>
'(0.5,nan)'::point and p::text like '%nan%';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on test2 (cost=4389.54..11817.54 rows=4012 width=16)
(actual time=135.071..1076.240 rows=498 loops=1)
Recheck Cond: (p >> '(0.5,nan)'::point)
Filter: ((p)::text ~~ '%nan%'::text)
Rows Removed by Filter: 499997
Buffers: shared hit=9989 read=25
-> Bitmap Index Scan on test2_idx (cost=0.00..4388.53 rows=100300
width=0) (actual time=132.679..132.679 rows=500495 loops=1)
Index Cond: (p >> '(0.5,nan)'::point)
Buffers: shared hit=4567 read=25
Total runtime: 1076.434 ms
(9 rows)

test=# explain (analyze, buffers) select * from test2 where p >>
'(0.5,nan)'::point and p::text like '%nan%';
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Seq Scan on test2 (cost=0.00..25482.02 rows=4012 width=16) (actual
time=2.473..1121.940 rows=498 loops=1)
Filter: ((p >> '(0.5,nan)'::point) AND ((p)::text ~~ '%nan%'::text))
Rows Removed by Filter: 1002502
Buffers: shared hit=5422
Total runtime: 1122.073 ms
(5 rows)

------
With best regards,
Alexander Korotkov.

Attachment Content-Type Size
picksplit-nan-fix-2.patch application/octet-stream 7.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel =?ISO-8859-1?Q?V=E9rit=E9?= 2013-09-08 18:00:58 file_fdw target file ownership
Previous Message Atri Sharma 2013-09-08 05:22:02 Re: [rfc] overhauling pgstat.stat