Re: Large time difference between explain analyze and normal run

Lists: pgsql-performance
From: Chris Kratz <chris(dot)kratz(at)vistashare(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Large time difference between explain analyze and normal run
Date: 2005-02-10 18:34:40
Message-ID: 200502101334.40341.chris.kratz@vistashare.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Does anyone have any idea why there be over a 4s difference between running
the statement directly and using explain analyze? Multiple runs give the
same result and I've tested on several servers.

db=# \timing
Timing is on.
db=# select count(*) from answer;
count
--------
530576
(1 row)

Time: 358.805 ms
db=# explain analyze select count(*) from answer;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Aggregate (cost=9848.12..9848.12 rows=1 width=0) (actual
time=4841.231..4841.235 rows=1 loops=1)
-> Seq Scan on answer (cost=0.00..8561.29 rows=514729 width=0) (actual
time=0.011..2347.762 rows=530576 loops=1)
Total runtime: 4841.412 ms
(3 rows)

Time: 4855.712 ms

---

Postgresql 7.4.5 running on Linux 2.6.8.1


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Chris Kratz <chris(dot)kratz(at)vistashare(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Large time difference between explain analyze and normal run
Date: 2005-02-10 18:58:37
Message-ID: 28553.1108061917@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Chris Kratz <chris(dot)kratz(at)vistashare(dot)com> writes:
> Does anyone have any idea why there be over a 4s difference between running
> the statement directly and using explain analyze?

> Aggregate (cost=9848.12..9848.12 rows=1 width=0) (actual
> time=4841.231..4841.235 rows=1 loops=1)
> -> Seq Scan on answer (cost=0.00..8561.29 rows=514729 width=0) (actual
> time=0.011..2347.762 rows=530576 loops=1)
> Total runtime: 4841.412 ms

EXPLAIN ANALYZE's principal overhead is two gettimeofday() kernel calls
per plan node execution, so 1061154 such calls here. I infer that
gettimeofday takes about 4 microseconds on your hardware ... which seems
a bit slow for modern machines. What sort of box is it?

regards, tom lane


From: Chris Kratz <chris(dot)kratz(at)vistashare(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Large time difference between explain analyze and normal run
Date: 2005-02-10 19:05:46
Message-ID: 200502101405.46581.chris.kratz@vistashare.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On Thursday 10 February 2005 01:58 pm, Tom Lane wrote:
> Chris Kratz <chris(dot)kratz(at)vistashare(dot)com> writes:
> > Does anyone have any idea why there be over a 4s difference between
> > running the statement directly and using explain analyze?
> >
> > Aggregate (cost=9848.12..9848.12 rows=1 width=0) (actual
> > time=4841.231..4841.235 rows=1 loops=1)
> > -> Seq Scan on answer (cost=0.00..8561.29 rows=514729 width=0)
> > (actual time=0.011..2347.762 rows=530576 loops=1)
> > Total runtime: 4841.412 ms
>
> EXPLAIN ANALYZE's principal overhead is two gettimeofday() kernel calls
> per plan node execution, so 1061154 such calls here. I infer that
> gettimeofday takes about 4 microseconds on your hardware ... which seems
> a bit slow for modern machines. What sort of box is it?
>
> regards, tom lane

OK, that makes sense.

Athlon XP 3000+
1.5G Mem

Is there a way to test the gettimeofday() directly?


From: Darcy Buskermolen <darcy(at)wavefire(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Chris Kratz <chris(dot)kratz(at)vistashare(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Large time difference between explain analyze and normal run
Date: 2005-02-10 20:09:42
Message-ID: 200502101209.42079.darcy@wavefire.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On February 10, 2005 10:58 am, Tom Lane wrote:
> Chris Kratz <chris(dot)kratz(at)vistashare(dot)com> writes:
> > Does anyone have any idea why there be over a 4s difference between
> > running the statement directly and using explain analyze?
> >
> > Aggregate (cost=9848.12..9848.12 rows=1 width=0) (actual
> > time=4841.231..4841.235 rows=1 loops=1)
> > -> Seq Scan on answer (cost=0.00..8561.29 rows=514729 width=0)
> > (actual time=0.011..2347.762 rows=530576 loops=1)
> > Total runtime: 4841.412 ms
>
> EXPLAIN ANALYZE's principal overhead is two gettimeofday() kernel calls
> per plan node execution, so 1061154 such calls here. I infer that
> gettimeofday takes about 4 microseconds on your hardware ... which seems
> a bit slow for modern machines. What sort of box is it?

dvl reported the same thing on #postgresql some months back, and neilc
was/is/did looking into it. I belive he came up with a way to move the
function call outside of the loop with no ill effects to the rest of the
expected behavior.

>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster

--
Darcy Buskermolen
Wavefire Technologies Corp.
ph: 250.717.0200
fx: 250.763.1759
http://www.wavefire.com


From: Chris Kratz <chris(dot)kratz(at)vistashare(dot)com>
To: Darcy Buskermolen <darcy(at)wavefire(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Large time difference between explain analyze and normal run
Date: 2005-02-10 20:25:09
Message-ID: 200502101525.09165.chris.kratz@vistashare.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On Thursday 10 February 2005 03:09 pm, Darcy Buskermolen wrote:
> On February 10, 2005 10:58 am, Tom Lane wrote:
> > Chris Kratz <chris(dot)kratz(at)vistashare(dot)com> writes:
> > > Does anyone have any idea why there be over a 4s difference between
> > > running the statement directly and using explain analyze?
> > >
> > > Aggregate (cost=9848.12..9848.12 rows=1 width=0) (actual
> > > time=4841.231..4841.235 rows=1 loops=1)
> > > -> Seq Scan on answer (cost=0.00..8561.29 rows=514729 width=0)
> > > (actual time=0.011..2347.762 rows=530576 loops=1)
> > > Total runtime: 4841.412 ms
> >
> > EXPLAIN ANALYZE's principal overhead is two gettimeofday() kernel calls
> > per plan node execution, so 1061154 such calls here. I infer that
> > gettimeofday takes about 4 microseconds on your hardware ... which seems
> > a bit slow for modern machines. What sort of box is it?
>
> dvl reported the same thing on #postgresql some months back, and neilc
> was/is/did looking into it. I belive he came up with a way to move the
> function call outside of the loop with no ill effects to the rest of the
> expected behavior.

That's interesting to know. It's not a big deal, we were just curious as to
why the difference. Tom's explanation makes good sense. We run into the
same situation with using a profiler on an application, ie measuring incurs
overhead.

-Chris