Re: [PATCH] Improve EXPLAIN ANALYZE overhead by sampling

Lists: pgsql-patches
From: "Rocco Altier" <RoccoA(at)Routescape(dot)com>
To: "Martijn van Oosterhout" <kleptog(at)svana(dot)org>, <pgsql-patches(at)postgresql(dot)org>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Subject: Re: [PATCH] Improve EXPLAIN ANALYZE overhead by sampling
Date: 2006-05-09 21:16:57
Message-ID: 6E0907A94904D94B99D7F387E08C4F57010A5E8C@FALCON.INSIGHT

> - To get this close it needs to get an estimate of the sampling
> overhead. It does this by a little calibration loop that is run
> once per backend. If you don't do this, you end up assuming all
> tuples take the same time as tuples with the overhead, resulting in
> nodes apparently taking longer than their parent nodes. Incidentally,
> I measured the overhead to be about 3.6us per tuple per node on my
> (admittedly slightly old) machine.

Could this be deferred until the first explain analyze? So that we
aren't paying the overhead of the calibration in all backends, even the
ones that won't be explaining?

-rocco


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Rocco Altier <RoccoA(at)Routescape(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: [PATCH] Improve EXPLAIN ANALYZE overhead by sampling
Date: 2006-05-09 21:38:54
Message-ID: 20060509213854.GK29652@svana.org
Lists: pgsql-patches

On Tue, May 09, 2006 at 05:16:57PM -0400, Rocco Altier wrote:
> > - To get this close it needs to get an estimate of the sampling
> > overhead. It does this by a little calibration loop that is run
> > once per backend. If you don't do this, you end up assuming all
> > tuples take the same time as tuples with the overhead, resulting in
> > nodes apparently taking longer than their parent nodes. Incidentally,
> > I measured the overhead to be about 3.6us per tuple per node on my
> > (admittedly slightly old) machine.
>
> Could this be deferred until the first explain analyze? So that we
> aren't paying the overhead of the calibration in all backends, even the
> ones that won't be explaining?

If you look, it's only done on the first call to InstrAlloc(), which
should be when you run EXPLAIN ANALYZE for the first time.

In any case, the calibration is limited to half a millisecond (that's
500 microseconds), and it'll be less on fast machines.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
From each according to his ability. To each according to his ability to litigate.


From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Martijn van Oosterhout" <kleptog(at)svana(dot)org>, "Rocco Altier" <RoccoA(at)Routescape(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org, "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Subject: Re: [PATCH] Improve EXPLAIN ANALYZE overhead by sampling
Date: 2006-05-11 04:16:43
Message-ID: C08808BB.232A6%llonergan@greenplum.com
Lists: pgsql-patches

Nice one Martijn - we have immediate need for this, as one of our sizeable
queries under experimentation took 3 hours without EXPLAIN ANALYZE, then
over 20 hours with it...

- Luke

On 5/9/06 2:38 PM, "Martijn van Oosterhout" <kleptog(at)svana(dot)org> wrote:

> On Tue, May 09, 2006 at 05:16:57PM -0400, Rocco Altier wrote:
>>> - To get this close it needs to get an estimate of the sampling
>>> overhead. It does this by a little calibration loop that is run
>>> once per backend. If you don't do this, you end up assuming all
>>> tuples take the same time as tuples with the overhead, resulting in
>>> nodes apparently taking longer than their parent nodes. Incidentally,
>>> I measured the overhead to be about 3.6us per tuple per node on my
>>> (admittedly slightly old) machine.
>>
>> Could this be deferred until the first explain analyze? So that we
>> aren't paying the overhead of the calibration in all backends, even the
>> ones that won't be explaining?
>
> If you look, it's only done on the first call to InstrAlloc(), which
> should be when you run EXPLAIN ANALYZE for the first time.
>
> In any case, the calibration is limited to half a millisecond (that's
> 500 microseconds), and it'll be less on fast machines.
>
> Have a nice day,


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Luke Lonergan <llonergan(at)greenplum(dot)com>
Cc: Rocco Altier <RoccoA(at)Routescape(dot)com>, pgsql-patches(at)postgresql(dot)org, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: [PATCH] Improve EXPLAIN ANALYZE overhead by sampling
Date: 2006-05-11 07:17:36
Message-ID: 20060511071736.GA30113@svana.org
Lists: pgsql-patches

On Wed, May 10, 2006 at 09:16:43PM -0700, Luke Lonergan wrote:
> Nice one Martijn - we have immediate need for this, as one of our sizeable
> queries under experimentation took 3 hours without EXPLAIN ANALYZE, then
> over 20 hours with it...

Did you test it? There are some cases where this might still leave some
noticeable overhead (high loop count). I'm just not sure if they occur
all that often in practice...

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
From each according to his ability. To each according to his ability to litigate.


From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Martijn van Oosterhout" <kleptog(at)svana(dot)org>
Cc: "Rocco Altier" <RoccoA(at)Routescape(dot)com>, pgsql-patches(at)postgresql(dot)org, "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Subject: Re: [PATCH] Improve EXPLAIN ANALYZE overhead by sampling
Date: 2006-05-11 21:42:32
Message-ID: C088FDD8.2338C%llonergan@greenplum.com
Lists: pgsql-patches

Martijn,

On 5/11/06 12:17 AM, "Martijn van Oosterhout" <kleptog(at)svana(dot)org> wrote:

> Did you test it? There are some cases where this might still leave some
> noticeable overhead (high loop count). I'm just not sure if they occur
> all that often in practice...

I've sent it to our team for testing; let's see if we get some info to
forward.

We're running the 10TB TPC-H case and I'm asking for EXPLAIN ANALYZE runs
that might take days to complete, so we certainly have some test cases for
this ;-)

- Luke