Crash in gist insertion on pathological box data

Lists: pgsql-hackers
From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Crash in gist insertion on pathological box data
Date: 2009-03-26 14:39:05
Message-ID: 8763hwl56e.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

A user on IRC reported a crash (backend segfault) in GiST insertion
(in 8.3.5 but I can reproduce this in today's HEAD) that turns out
to be due to misbehaviour of gist_box_picksplit.

The nature of the problem is this: if gist_box_picksplit doesn't find
a good disposition on the first try, then it tries to split the data
again based on the positions of the box centers. But there's a problem
here with floating-point rounding; it's possible for the average of N
floating-point values to be strictly greater (or less) than all of the
values individually, and the function then returns with, for example,
all the entries assigned to the left node, and nothing in the right
node. This causes gistSplit to try and split the left node again, with
predictable results.

Here is a test case:

file of floating-point values here (999 lines):
http://www.rhodiumtoad.org.uk/junk/badfloats.txt

create table floats3(x float8, y float8);
\copy floats3 from 'badfloats.txt'
create table boxes1 (b box);
create index boxes1_idx on boxes1 using gist (b);
insert into boxes1 select box(point(x,x),point(y,y)) as b from floats3;
[crash]

I'm not sure what the best fix is. I would think that it would make
sense for gistUserPickSplit to error out if the user's split function
returned an empty left or right node, since that would seem to
guarantee this problem. Certainly gist_box_picksplit also needs some
sort of fix to try and split sensibly in the presence of data of this
type.

--
Andrew (irc:RhodiumToad)


From: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Crash in gist insertion on pathological box data
Date: 2009-03-26 15:40:38
Message-ID: c3a7de1f0903260840s1bf2a76du53dc7d7d62d8270e@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Mar 26, 2009 at 5:39 PM, Andrew Gierth
<andrew(at)tao11(dot)riddles(dot)org(dot)uk> wrote:
> A user on IRC reported a crash (backend segfault) in GiST insertion
> (in 8.3.5 but I can reproduce this in today's HEAD) that turns out
> to be due to misbehaviour of gist_box_picksplit.
>

Andrew, thank you for the test case and report.

p.s. The user Andrew mentioned above is me and if you have a question
to me I am ready to answer it.

--
Regards,
Sergey Konoplev
--
PostgreSQL articles in english & russian
http://gray-hemp.blogspot.com/search/label/postgresql/


From: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Crash in gist insertion on pathological box data
Date: 2009-03-27 13:04:06
Message-ID: c3a7de1f0903270604v5ee8e5c2s67b464c5bd86a340@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Mar 26, 2009 at 5:39 PM, Andrew Gierth
<andrew(at)tao11(dot)riddles(dot)org(dot)uk> wrote:
> A user on IRC reported a crash (backend segfault) in GiST insertion
> (in 8.3.5 but I can reproduce this in today's HEAD) that turns out
> to be due to misbehaviour of gist_box_picksplit.
>
> The nature of the problem is this: if gist_box_picksplit doesn't find
> a good disposition on the first try, then it tries to split the data
> again based on the positions of the box centers. But there's a problem
> here with floating-point rounding; it's possible for the average of N
> floating-point values to be strictly greater (or less) than all of the
> values individually, and the function then returns with, for example,
> all the entries assigned to the left node, and nothing in the right
> node. This causes gistSplit to try and split the left node again, with
> predictable results.
>

I probably have a workaround. As I understand the problem it touches
gist indexes with one box type field only. After googling picksplit
and reading some info I supposed that If another (distinctive) field
would be appended to the index (after the box field) then another
(old) picksplit functionality would be started instead of new (buggy)
one. Andrew approved my assumption on IRC. So I found all the indexes
(gist) with one box field and recreated them with extra column (bigint
PK field). Well on this moment our DB has been working for a 22 hour
without crashes and errors.

Of course not being pg-hacker I can't guaranty that my assumption is
absolutely correct and I welcome your criticism.

--
Regards,
Sergey Konoplev
--
PostgreSQL articles in english & russian
http://gray-hemp.blogspot.com/search/label/postgresql/


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Crash in gist insertion on pathological box data
Date: 2009-03-28 11:48:34
Message-ID: 20090328114834.GA8762@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Mar 26, 2009 at 02:39:05PM +0000, Andrew Gierth wrote:
> A user on IRC reported a crash (backend segfault) in GiST insertion
> (in 8.3.5 but I can reproduce this in today's HEAD) that turns out
> to be due to misbehaviour of gist_box_picksplit.
>
> The nature of the problem is this: if gist_box_picksplit doesn't find
> a good disposition on the first try, then it tries to split the data
> again based on the positions of the box centers. But there's a problem
> here with floating-point rounding; it's possible for the average of N
> floating-point values to be strictly greater (or less) than all of the
> values individually, and the function then returns with, for example,
> all the entries assigned to the left node, and nothing in the right
> node. This causes gistSplit to try and split the left node again, with
> predictable results.

ISTM the simplest solution here is detect that everything has been put
in one node (left or right) and in that case just split the list
straight down the middle (since clearly it doesn't matter on which side
they appear.).

Or switch algorithms, but that's more than just a bugfix.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Crash in gist insertion on pathological box data
Date: 2009-03-28 14:28:24
Message-ID: 877i29en7b.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>>>>> "Martijn" == Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:

>> The nature of the problem is this: if gist_box_picksplit doesn't
>> find a good disposition on the first try, then it tries to split
>> the data again based on the positions of the box centers. But
>> there's a problem here with floating-point rounding; it's possible
>> for the average of N floating-point values to be strictly greater
>> (or less) than all of the values individually, and the function
>> then returns with, for example, all the entries assigned to the
>> left node, and nothing in the right node. This causes gistSplit to
>> try and split the left node again, with predictable results.

Martijn> ISTM the simplest solution here is detect that everything
Martijn> has been put in one node (left or right) and in that case
Martijn> just split the list straight down the middle (since clearly
Martijn> it doesn't matter on which side they appear.).

It's not quite so simple; we know that not all the values are equal,
since that's checked for earlier in the code (if they're all actually
equal they just get split down the middle). In this specific case the
values could actually be quite different, since we're only looking
at the centers; and with, say, a large number of very different size
boxes all centered on the same point, it would be preferable to split
them so that at least one node has a smaller union.

I'm interested to see what Oleg's solution (see other thread) is.

--
Andrew.


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Crash in gist insertion on pathological box data
Date: 2009-04-02 18:12:26
Message-ID: 49D5000A.2090402@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> The nature of the problem is this: if gist_box_picksplit doesn't find
> a good disposition on the first try, then it tries to split the data
> again based on the positions of the box centers. But there's a problem
> here with floating-point rounding; it's possible for the average of N

Look at the patch, it fixes the problem by comparing for equality by FPeq()
macros which is used everywhere in geometry calculation.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

Attachment Content-Type Size
gist.patch.gz application/x-tar 401 bytes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Crash in gist insertion on pathological box data
Date: 2009-04-02 18:33:36
Message-ID: 13749.1238697216@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
> Look at the patch, it fixes the problem by comparing for equality by FPeq()
> macros which is used everywhere in geometry calculation.

Ick. FPeq() is a crock; I'd like to see us get rid of it, not spread it
even further. And what confidence do you have that this change
eliminates all forms of the problem, anyway?

regards, tom lane


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Crash in gist insertion on pathological box data
Date: 2009-04-02 18:52:12
Message-ID: 49D5095C.5050204@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> even further. And what confidence do you have that this change
> eliminates all forms of the problem, anyway?

Yes, I think. Because that part of code ( if (IS_BADRATIO) {...} ) is a corner
case itself. In example from Andrew, all boxes are placed to one page because of
floating-point rounding.

We could check IS_BADRATIO again and if it's just put one half of all boxes on
one page and another half to the another page as it does if all boxes are equal.
But FPeq() seemed to me a simpler solution and FP* comparisons are widely used
in geometry.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us (Tom Lane), Teodor Sigaev <teodor(at)sigaev(dot)ru>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Crash in gist insertion on pathological box data
Date: 2009-04-02 19:13:02
Message-ID: 873acq27k1.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
>> Look at the patch, it fixes the problem by comparing for equality
>> by FPeq() macros which is used everywhere in geometry calculation.

Tom> Ick. FPeq() is a crock; I'd like to see us get rid of it, not
Tom> spread it even further. And what confidence do you have that
Tom> this change eliminates all forms of the problem, anyway?

Here is a test case that crashes even with the patch:

create table floats3(x float8, y float8);
-- same badfloats.txt data as before
\copy floats3 from 'badfloats.txt'
update floats3 set x = x * pow(2::float8,33), y = y * pow(2::float8,33);
create table boxes1 (b box);
create index boxes1_idx on boxes1 using gist (b);
insert into boxes1 select box(point(x,x),point(y,y)) as b from floats3;
[crash]

--
Andrew (irc:RhodiumToad)


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: teodor(at)sigaev(dot)ru (Teodor Sigaev), pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Crash in gist insertion on pathological box data
Date: 2009-04-02 19:58:37
Message-ID: 87vdpmzv2q.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>>>>> "Teodor" == Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:

>> even further. And what confidence do you have that this change
>> eliminates all forms of the problem, anyway?

Teodor> Yes, I think. Because that part of code ( if (IS_BADRATIO)
Teodor> {...} ) is a corner case itself. In example from Andrew, all
Teodor> boxes are placed to one page because of floating-point
Teodor> rounding.

Yes, it's a corner case, but it arose in real-world data (the test
data set is contrived, but that's simply because it was the easiest
way to demonstrate the bug without access to the real data, which
had a much larger variation in box sizes).

Teodor> We could check IS_BADRATIO again and if it's just put one
Teodor> half of all boxes on one page and another half to the another
Teodor> page as it does if all boxes are equal. But FPeq() seemed to
Teodor> me a simpler solution and FP* comparisons are widely used in
Teodor> geometry.

I think that not only does there need to be another IS_BADRATIO check,
but also there needs to be some sort of backstop in gistSplit or
gistUserPicksplit to either recover or (as a last resort) error out
cleanly rather than crash the entire db in cases that would result in
infinite recursion.

--
Andrew (irc:RhodiumToad)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: teodor(at)sigaev(dot)ru (Teodor Sigaev), pgsql-hackers(at)postgresql(dot)org
Subject: Re: Crash in gist insertion on pathological box data
Date: 2009-04-02 20:05:29
Message-ID: 22855.1238702729@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> I think that not only does there need to be another IS_BADRATIO check,
> but also there needs to be some sort of backstop in gistSplit or
> gistUserPicksplit to either recover or (as a last resort) error out
> cleanly rather than crash the entire db in cases that would result in
> infinite recursion.

+1. This is not just a problem in one picksplit method, it's a generic
hazard for all of them. The core code should be defending against a
pathological split.

regards, tom lane


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Crash in gist insertion on pathological box data
Date: 2009-04-03 13:08:33
Message-ID: 49D60A51.1020109@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Here is a test case that crashes even with the patch:
I was too optimistic :(

Attached patch contains:
- changes in R-tree picksplit methods. Now it checks bad ratio and if so then
use simple split: one half of entries to one page, and another part - to
another page.
- protection from buggy picksplit method: GiST will emit an error if picksplit
of first column has that bug. For second and next column it could be a desired
behaviour, because picksplit may take in attention result of picksplit of
previous column.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

Attachment Content-Type Size
gist.patch-1.gz application/x-tar 1.5 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Crash in gist insertion on pathological box data
Date: 2009-04-03 15:38:03
Message-ID: 12519.1238773083@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
>> Here is a test case that crashes even with the patch:

> I was too optimistic :(

> Attached patch contains:
> - changes in R-tree picksplit methods. Now it checks bad ratio and if so then
> use simple split: one half of entries to one page, and another part - to
> another page.
> - protection from buggy picksplit method: GiST will emit an error if picksplit
> of first column has that bug. For second and next column it could be a desired
> behaviour, because picksplit may take in attention result of picksplit of
> previous column.

I don't like throwing an error there; I wish there were a way for the
generic code to apply the fallbackSplit code instead. I see that
in this particular formulation it's dependent on the datatype ---
can we get around that, by having it invoke the union method?

regards, tom lane


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Crash in gist insertion on pathological box data
Date: 2009-04-05 14:02:29
Message-ID: 49D8B9F5.5050102@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> I don't like throwing an error there; I wish there were a way for the
> generic code to apply the fallbackSplit code instead. I see that
> in this particular formulation it's dependent on the datatype ---
> can we get around that, by having it invoke the union method?

Done. rtree.patch.gz contains patch for gistproc.c, genericsplit.patch.gz adds
simple genericPickSplit to gistsplit.c to workaround bug of user-defined picksplit.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

Attachment Content-Type Size
rtree.patch.gz application/x-tar 1.2 KB
genericsplit.patch.gz application/x-tar 1.5 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Crash in gist insertion on pathological box data
Date: 2009-04-06 01:29:02
Message-ID: 7290.1238981342@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
>> I don't like throwing an error there; I wish there were a way for the
>> generic code to apply the fallbackSplit code instead. I see that
>> in this particular formulation it's dependent on the datatype ---
>> can we get around that, by having it invoke the union method?

> Done. rtree.patch.gz contains patch for gistproc.c, genericsplit.patch.gz adds
> simple genericPickSplit to gistsplit.c to workaround bug of user-defined picksplit.

This looks good to me. I tested it to the extent of verifying that
either patch individually would prevent the originally-reported failure.

The only question I have is whether we want this nag message or not:

! ereport(LOG,
! (errcode(ERRCODE_INTERNAL_ERROR),
! errmsg("Picksplit method for %d column of index \"%s\" failed",
! attno+1, RelationGetRelationName(r)),
! errhint("Index is not optimal, to optimize it contact developer or try to use the column as a second one in create index command")));

I'd be inclined to keep it but reduce it to level DEBUG1 or so.

regards, tom lane


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Crash in gist insertion on pathological box data
Date: 2009-04-06 15:00:16
Message-ID: 49DA1900.2050504@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> This looks good to me. I tested it to the extent of verifying that
> either patch individually would prevent the originally-reported failure.
> I'd be inclined to keep it but reduce it to level DEBUG1 or so.

Committed in HEAD, 8.3 and 8.2. Previous versions need to be patched too but
codebase is different, so I'll patch them a bit later.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Subject: Re: Crash in gist insertion on pathological box data
Date: 2009-04-11 16:52:31
Message-ID: 200904111952.32434.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Monday 06 April 2009 04:29:02 Tom Lane wrote:
> The only question I have is whether we want this nag message or not:
>
> ! ereport(LOG,
> ! (errcode(ERRCODE_INTERNAL_ERROR),
> ! errmsg("Picksplit method for %d column of index \"%s\"
> failed", ! attno+1, RelationGetRelationName(r)),
> ! errhint("Index is not optimal, to optimize it contact
> developer or try to use the column as a second one in create index
> command")));
>
> I'd be inclined to keep it but reduce it to level DEBUG1 or so.

If it is supposed to be DEBUG1, would you mind if I convert it to an elog so
it doesn't appear in the translation roster?

Not sure how many responses you will get if you hide the message like that.
Probably almost anyone would not turn on DEBUG1 unless asked by a developer in
the first place.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Subject: Re: Crash in gist insertion on pathological box data
Date: 2009-04-11 17:07:43
Message-ID: 25659.1239469663@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> On Monday 06 April 2009 04:29:02 Tom Lane wrote:
>> The only question I have is whether we want this nag message or not:

> If it is supposed to be DEBUG1, would you mind if I convert it to an elog so
> it doesn't appear in the translation roster?

Okay with me. You could use errmsg_internal but we have no equivalent
of errhint_internal, so I suppose it's got to be smashed to a single
message anyway if you don't want to translate it.

regards, tom lane