Re: Strange issue with GiST index scan taking far too long

Lists: pgsql-hackers
From: Mark Cave-Ayland <mark(dot)cave-ayland(at)siriusit(dot)co(dot)uk>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Strange issue with GiST index scan taking far too long
Date: 2008-06-09 13:18:50
Message-ID: 484D2DBA.9040508@siriusit.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi there,

I'm currently looking at a bug report in PostGIS where we are getting
extremely long index scan times querying an index in one case, but the
same scan can take much less time if the input geometry is calculated as
the result of another function.

First of all, I include the EXPLAIN ANALYZE of the basic query which
looks like this:

postgis=# explain analyze select count(*) from geography where centroid
&& (select the_geom from geography where id=69495);

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=7157.29..7157.30 rows=1 width=0) (actual
time=2691.783..2691.784 rows=1 loops=1)
InitPlan
-> Seq Scan on geography (cost=0.00..7149.00 rows=1 width=4559)
(actual time=60.987..61.373 rows=1 loops=1)
Filter: (id = 69495::numeric)
-> Index Scan using geography_geom_centroid_idx on geography
(cost=0.00..8.28 rows=1 width=0) (actual time=79.241..2645.722
rows=32880 loops=1)
Index Cond: (centroid && $0)
Filter: (centroid && $0)
Total runtime: 2692.288 ms
(8 rows)

The only real thing to know about the query is that the id field within
the geography table is a primary key, and hence only a single geometry
is being returned from within the subselect. Note that most of the time
is disappearing into the index scan.

Where things start getting really strange is when we add an extra
function called force_2d() into the mix. All this function does is scan
through the single geometry returned from the subselect and remove any
3rd dimension coordinates. Now the resulting EXPLAIN ANALYZE for this
query looks like this:

postgis=# explain analyze select count(*) from geography where centroid
&& (select force_2d(the_geom) from geography where id=69495);

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=7157.29..7157.30 rows=1 width=0) (actual
time=343.004..343.005 rows=1 loops=1)
InitPlan
-> Seq Scan on geography (cost=0.00..7149.00 rows=1 width=4559)
(actual time=48.714..49.016 rows=1 loops=1)
Filter: (id = 69495::numeric)
-> Index Scan using geography_geom_centroid_idx on geography
(cost=0.00..8.28 rows=1 width=0) (actual time=49.367..235.296 rows=32880
loops=1)
Index Cond: (centroid && $0)
Filter: (centroid && $0)
Total runtime: 343.084 ms
(8 rows)

So by adding in an extra function around the subselect result, we have
speeded up the index lookup by several orders of magnitude, and the
speedup appears to be coming from somewhere within the index scan?! I've
spent a little bit of time playing with this and it seems even writing a
function to return a copy of the input function is enough. Here is my
test function below:

Datum LWGEOM_mcatest(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(LWGEOM_mcatest);
Datum LWGEOM_mcatest(PG_FUNCTION_ARGS)
{
PG_LWGEOM *pgl = (PG_LWGEOM *)
PG_DETOAST_DATUM(PG_GETARG_DATUM(0));
void *mem;

/* Copy somewhere else */
mem = palloc(VARSIZE(pgl));
memcpy(mem, pgl, VARSIZE(pgl)-VARHDRSZ);

PG_RETURN_POINTER(mem);
}

CREATE OR REPLACE FUNCTION mcatest(geometry)
RETURNS geometry
AS '$libdir/lwpostgis','LWGEOM_mcatest'
LANGUAGE 'C';

And then here is the resulting EXPLAIN ANALYZE:

postgis=# explain analyze select count(*) from geography where centroid
&& (select mcatest(the_geom) from geography where id=69495);

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=7157.29..7157.30 rows=1 width=0) (actual
time=283.126..283.127 rows=1 loops=1)
InitPlan
-> Seq Scan on geography (cost=0.00..7149.00 rows=1 width=4559)
(actual time=48.712..49.040 rows=1 loops=1)
Filter: (id = 69495::numeric)
-> Index Scan using geography_geom_centroid_idx on geography
(cost=0.00..8.28 rows=1 width=0) (actual time=49.321..215.524 rows=32880
loops=1)
Index Cond: (centroid && $0)
Filter: (centroid && $0)
Total runtime: 283.221 ms
(8 rows)

Unfortunately I can't seem to work out why the extra time is
disappearing into the index scan when my extra mcatest() function is not
present, especially as sprof doesn't seem to want to run at the moment
:( I'm wondering if it's related to either excess TOAST/palloc/pfree
somewhere in the code, but I'd definitely appreciate any pointers.

All these tests were done using PostgreSQL 8.3.1 and the latest PostGIS SVN.

Many thanks,

Mark.

--
Mark Cave-Ayland
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk
T: +44 870 608 0063


From: "A(dot) Kretschmer" <andreas(dot)kretschmer(at)schollglas(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Strange issue with GiST index scan taking far too long
Date: 2008-06-09 14:24:08
Message-ID: 20080609142408.GA9409@a-kretschmer.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

am Mon, dem 09.06.2008, um 14:18:50 +0100 mailte Mark Cave-Ayland folgendes:
> Hi there,
>
> I'm currently looking at a bug report in PostGIS where we are getting
> extremely long index scan times querying an index in one case, but the
> same scan can take much less time if the input geometry is calculated as
> the result of another function.
>
> First of all, I include the EXPLAIN ANALYZE of the basic query which
> looks like this:
>
>
> postgis=# explain analyze select count(*) from geography where centroid
> && (select the_geom from geography where id=69495);
>
> QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------------------
> Aggregate (cost=7157.29..7157.30 rows=1 width=0) (actual
> time=2691.783..2691.784 rows=1 loops=1)
> InitPlan
> -> Seq Scan on geography (cost=0.00..7149.00 rows=1 width=4559)
> (actual time=60.987..61.373 rows=1 loops=1)
> Filter: (id = 69495::numeric)
> -> Index Scan using geography_geom_centroid_idx on geography
> (cost=0.00..8.28 rows=1 width=0) (actual time=79.241..2645.722
> rows=32880 loops=1)
> Index Cond: (centroid && $0)
> Filter: (centroid && $0)
> Total runtime: 2692.288 ms
> (8 rows)
>

There are a BIG difference between estimated rows and real rows (1 vs.
32880). Why? Do you have recent statistics?

>
> The only real thing to know about the query is that the id field within
> the geography table is a primary key, and hence only a single geometry

Sure? I can't believe this because the rows=32880.

> is being returned from within the subselect. Note that most of the time
> is disappearing into the index scan.
>
> Where things start getting really strange is when we add an extra
> function called force_2d() into the mix. All this function does is scan
> through the single geometry returned from the subselect and remove any
> 3rd dimension coordinates. Now the resulting EXPLAIN ANALYZE for this
> query looks like this:
>
>
> postgis=# explain analyze select count(*) from geography where centroid
> && (select force_2d(the_geom) from geography where id=69495);
>
> QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------------------------------------
> Aggregate (cost=7157.29..7157.30 rows=1 width=0) (actual
> time=343.004..343.005 rows=1 loops=1)
> InitPlan
> -> Seq Scan on geography (cost=0.00..7149.00 rows=1 width=4559)
> (actual time=48.714..49.016 rows=1 loops=1)
> Filter: (id = 69495::numeric)
> -> Index Scan using geography_geom_centroid_idx on geography
> (cost=0.00..8.28 rows=1 width=0) (actual time=49.367..235.296 rows=32880
> loops=1)
> Index Cond: (centroid && $0)
> Filter: (centroid && $0)
> Total runtime: 343.084 ms
> (8 rows)
>
>
> So by adding in an extra function around the subselect result, we have
> speeded up the index lookup by several orders of magnitude, and the

Wild guess: you have a big cache.

But i'm not a PostGIS-Expert.

Regards, Andreas
--
Andreas Kretschmer
Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Mark Cave-Ayland <mark(dot)cave-ayland(at)siriusit(dot)co(dot)uk>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Strange issue with GiST index scan taking far too long
Date: 2008-06-09 14:36:07
Message-ID: 1213022167.12046.109.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Mon, 2008-06-09 at 14:18 +0100, Mark Cave-Ayland wrote:
> Unfortunately I can't seem to work out why the extra time is
> disappearing into the index scan when my extra mcatest() function is
> not present,

Hmmm, perhaps implicit casting?

Try this to see if it works better also

select count(*) from geography where centroid
&& (select the_geom::geometry from geography where id=69495);

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Mark Cave-Ayland <mark(dot)cave-ayland(at)siriusit(dot)co(dot)uk>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Strange issue with GiST index scan taking far too long
Date: 2008-06-09 14:44:52
Message-ID: 484D41E4.5050903@siriusit.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs wrote:

> Hmmm, perhaps implicit casting?
>
> Try this to see if it works better also
>
> select count(*) from geography where centroid
> && (select the_geom::geometry from geography where id=69495);

Hi Simon,

Unfortunately that seems to take the slow runtime path too. I did
initially think about casting being involved (since the underlying index
storage type is actually box2d rather than geometry), however my
mcatest() function is also declared as returning geometry too.

Interesting enough, forcing a cast to box2d instead of geometry seems to
take the faster path, i.e:

postgis=# explain analyze select count(*) from geography where centroid
&& (select the_geom::box2d from geography where id=69495);

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=7157.29..7157.30 rows=1 width=0) (actual
time=376.033..376.034 rows=1 loops=1)
InitPlan
-> Seq Scan on geography (cost=0.00..7149.00 rows=1 width=4559)
(actual time=42.853..43.051 rows=1 loops=1)
Filter: (id = 69495::numeric)
-> Index Scan using geography_geom_centroid_idx on geography
(cost=0.00..8.28 rows=1 width=0) (actual time=43.218..286.535 rows=32880
loops=1)
Index Cond: (centroid && ($0)::geometry)
Filter: (centroid && ($0)::geometry)
Total runtime: 376.117 ms
(8 rows)

ATB,

Mark.

--
Mark Cave-Ayland
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk
T: +44 870 608 0063


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark Cave-Ayland <mark(dot)cave-ayland(at)siriusit(dot)co(dot)uk>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Strange issue with GiST index scan taking far too long
Date: 2008-06-09 15:49:12
Message-ID: 20209.1213026552@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Cave-Ayland <mark(dot)cave-ayland(at)siriusit(dot)co(dot)uk> writes:
> So by adding in an extra function around the subselect result, we have
> speeded up the index lookup by several orders of magnitude, and the
> speedup appears to be coming from somewhere within the index scan?!

Is the value you are fetching from the geography table large enough to
be toasted? I'm thinking you might be looking at the cost of repeated
de-toastings.

BTW, that mcatest function is buggy --- it's not copying the last
4 bytes of the source value. I don't know enough about PostGIS
data types to know what effect that would have.

regards, tom lane


From: Mark Cave-Ayland <mark(dot)cave-ayland(at)siriusit(dot)co(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Strange issue with GiST index scan taking far too long
Date: 2008-06-09 15:59:26
Message-ID: 484D535E.5020000@siriusit.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:

> Is the value you are fetching from the geography table large enough to
> be toasted? I'm thinking you might be looking at the cost of repeated
> de-toastings.

Yeah, it's a fairly large geometry field so it will definitely be
getting toasted. So is it a case of with the mcatest function in place,
we're effectively caching the de-TOASTED value?

> BTW, that mcatest function is buggy --- it's not copying the last
> 4 bytes of the source value. I don't know enough about PostGIS
> data types to know what effect that would have.

It would probably lose that last point in the coordinate sequence, but
nothing major. I've removed the -VARHDRSZ part just to check and it
doesn't make any difference.

Many thanks,

Mark.

--
Mark Cave-Ayland
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk
T: +44 870 608 0063


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark Cave-Ayland <mark(dot)cave-ayland(at)siriusit(dot)co(dot)uk>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Strange issue with GiST index scan taking far too long
Date: 2008-06-09 16:03:46
Message-ID: 20497.1213027426@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Cave-Ayland <mark(dot)cave-ayland(at)siriusit(dot)co(dot)uk> writes:
> Tom Lane wrote:
>> Is the value you are fetching from the geography table large enough to
>> be toasted? I'm thinking you might be looking at the cost of repeated
>> de-toastings.

> Yeah, it's a fairly large geometry field so it will definitely be
> getting toasted. So is it a case of with the mcatest function in place,
> we're effectively caching the de-TOASTED value?

Well, yeah, because the first thing it does is pg_detoast_datum.
Just as a cross-check, try changing it to copy the value without
forcibly detoasting --- I'll bet it's still slow then.

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Mark Cave-Ayland <mark(dot)cave-ayland(at)siriusit(dot)co(dot)uk>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Strange issue with GiST index scan taking far too long
Date: 2008-06-09 16:26:32
Message-ID: 1213028792.12046.133.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Mon, 2008-06-09 at 11:49 -0400, Tom Lane wrote:
> Mark Cave-Ayland <mark(dot)cave-ayland(at)siriusit(dot)co(dot)uk> writes:
> > So by adding in an extra function around the subselect result, we have
> > speeded up the index lookup by several orders of magnitude, and the
> > speedup appears to be coming from somewhere within the index scan?!
>
> Is the value you are fetching from the geography table large enough to
> be toasted? I'm thinking you might be looking at the cost of repeated
> de-toastings.

So you are saying it is de-toasted 32880 times, in this case? If not,
where are the repeated de-toastings happening?

Sounds like we need some stats on how well toast is working for us.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Mark Cave-Ayland <mark(dot)cave-ayland(at)siriusit(dot)co(dot)uk>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Strange issue with GiST index scan taking far too long
Date: 2008-06-09 17:25:14
Message-ID: 21579.1213032314@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> On Mon, 2008-06-09 at 11:49 -0400, Tom Lane wrote:
>> Is the value you are fetching from the geography table large enough to
>> be toasted? I'm thinking you might be looking at the cost of repeated
>> de-toastings.

> So you are saying it is de-toasted 32880 times, in this case? If not,
> where are the repeated de-toastings happening?

Inside the index support functions. I'm thinking we could fix this by
forcibly detoasting values passed as index scan keys, but it's not quite
clear where's the best place to do that.

regards, tom lane


From: Mark Cave-Ayland <mark(dot)cave-ayland(at)siriusit(dot)co(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Strange issue with GiST index scan taking far too long
Date: 2008-06-10 08:57:18
Message-ID: 484E41EE.1060906@siriusit.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:

> Well, yeah, because the first thing it does is pg_detoast_datum.
> Just as a cross-check, try changing it to copy the value without
> forcibly detoasting --- I'll bet it's still slow then.

Yeah, that appears to be exactly the case. After some grepping through
various header files I came up with this alternative:

PG_FUNCTION_INFO_V1(LWGEOM_mcatest);
Datum LWGEOM_mcatest(PG_FUNCTION_ARGS)
{
struct varlena *pgl = PG_GETARG_RAW_VARLENA_P(0);
void *mem;

// Copy somewhere else
mem = palloc(VARSIZE_EXTERNAL(pgl));
memcpy(mem, pgl, VARSIZE_EXTERNAL(pgl));

PG_RETURN_POINTER(mem);
}

...and of course, this now takes the slower runtime of 2.5s.

ATB,

Mark.

--
Mark Cave-Ayland
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk
T: +44 870 608 0063


From: Mark Cave-Ayland <mark(dot)cave-ayland(at)siriusit(dot)co(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Strange issue with GiST index scan taking far too long
Date: 2008-06-10 09:16:51
Message-ID: 484E4683.7030501@siriusit.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:

>> So you are saying it is de-toasted 32880 times, in this case? If not,
>> where are the repeated de-toastings happening?
>
> Inside the index support functions. I'm thinking we could fix this by
> forcibly detoasting values passed as index scan keys, but it's not quite
> clear where's the best place to do that.

Ouch. This is rapidly getting out of my sphere of knowledge, but I'd
guess you'd want to do this either just before you start the index scan,
or cache the results within the AM after the first deTOASTing.

In terms of PostGIS, we tend to do a lot of index queries against large
geometries so we see cases like this frequently - so optimising them
would be good.

I did think of another idea though: at the moment all members of the
GiST opclass for geometry objects are declared using the geometry type
(which contains the entire geometry), whereas individual entries are
stored within the index as box2d objects representing just their
bounding box.

Would it make sense to rework the GiST routines so that instead of
accepting geometry <op> geometry, they accept box2d <op> box2d? Then
surely if we add a CAST from geometry to box2d then the geometry would
get converted to its bounding box (which would not require deTOASTing)
before being used as an index scan key.

ATB,

Mark.

--
Mark Cave-Ayland
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk
T: +44 870 608 0063