Bug in spg_range_quad_inner_consistent for adjacent operator (was Re: Add a filed to PageHeaderData)

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Soroosh Sardari <soroosh(dot)sardari(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Bug in spg_range_quad_inner_consistent for adjacent operator (was Re: Add a filed to PageHeaderData)
Date: 2014-06-24 20:22:06
Message-ID: 53A9DDEE.3030507@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 06/24/2014 08:48 PM, Pavan Deolasee wrote:
> FWIW I can reproduce this on HEAD with the attached patch. I could
> reproduce this on a 64-bit Ubuntu as well as 64-bit Mac OSX. Very confusing
> it is because I tried with various values for N in char[N] array and it
> fails for N=20. Other values I tried are 4, 12, 22, 24 and the test passes
> for all of them. The logic for trying other values is to see if pd_linp[]
> starting on un-aligned boundary can trigger the issue. But there seem to be
> no correlation.
>
> postgres=# select version();
>
> PostgreSQL 9.5devel on x86_64-apple-darwin13.2.0, compiled by Apple LLVM
> version 5.1 (clang-503.0.38) (based on LLVM 3.4svn), 64-bit
>
> postgres=# -- test SP-GiST index that's been built incrementally
>
> postgres=# create table test_range_spgist(ir int4range);
> postgres=# create index test_range_spgist_idx on test_range_spgist using
> spgist (ir);
> postgres=# insert into test_range_spgist select int4range(g, g+10) from
> generate_series(1,586) g;
> INSERT 0 586
>
> postgres=# SET enable_seqscan = t;
> postgres=# SET enable_indexscan = f;
> postgres=# SET enable_bitmapscan = f;
>
> postgres=# select * from test_range_spgist where ir -|- int4range(100,500);
> ir
> -----------
> [90,100)
> [500,510)
> (2 rows)
>
> postgres=# SET enable_seqscan = f;
> postgres=# select * from test_range_spgist where ir -|- int4range(100,500);
> ir
> -----------
> [90,100)
> [500,510)
> (2 rows)
>
> At this point, both rows are visible via index scan as well as seq scan.
>
> postgres=# insert into test_range_spgist select int4range(g, g+10) from
> generate_series(587,587) g;
> INSERT 0 1
>
> postgres=# select * from test_range_spgist where ir -|- int4range(100,500);
> ir
> ----------
> [90,100)
> (1 row)
>
> Ouch. The second row somehow disappeared.
>
> postgres=# SET enable_seqscan = t;
> postgres=# select * from test_range_spgist where ir -|- int4range(100,500);
> ir
> -----------
> [90,100)
> [500,510)
> (2 rows)
>
> So the last INSERT suddenly makes one row disappear via the index scan
> though its still reachable via seq scan. I tried looking at the SP-Gist
> code but clearly I don't understand it a whole lot to figure out the issue,
> if one exists.

Yeah, I can reproduce this. It doesn't seem to be related to the padding
or alignment at all. The padding just happens to move tuples around so
that [500, 510) is picked as an SP-GiST inner node.

The real bug is in spg_range_quad_inner_consistent(), for the adjacent
operator. Things go wrong when:

The scan key is [100, 500)
The prev centroid is [500, 510)
The current centroid is [544, 554).

The row that should match but isn't returned, [500, 510) is equal to the
previous centroid. It's in quadrant 3 from the current centroid, but
spg_range_quad_inner_consistent() incorrectly concludes that it doesn't
need to scan that quadrant.

The function compares the scan key's upper bound with the the previous
centroid's lower bound and the current centroid's lower bound:

> /*
> * Check if upper bound of argument is not in a
> * quadrant we visited in the previous step.
> */
> cmp1 = range_cmp_bounds(typcache, &upper, &prevLower);
> cmp2 = range_cmp_bounds(typcache, &centroidLower,
> &prevLower);
> if ((cmp2 < 0 && cmp1 > 0) || (cmp2 > 0 && cmp1 < 0))
> which2 = 0;

The idea is that if the scan key's upper bound doesn't fall between the
prev and current centroid's lower bounds, there is no match.

* * *
PL X CL

X = scan key's upper bound: 500)
PL = prev centroid's lower bound [500
CL = current centroid's lower bound [500

This is wrong. X < PL, but it's still nevertheless adjacent to it.

I'll take a closer look tomorrow...

(The "if (which2) ..." block after the code I quoted above also looks
wrong - it seems to be comparing the argument's lower bound when it
should be comparing the upper bound according to the comment. )

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2014-06-24 21:17:05 Re: API change advice: Passing plan invalidation info from the rewriter into the planner?
Previous Message Piotr Stefaniak 2014-06-24 20:01:26 Keepalive-related socket options under FreeBSD 9, 10