Re: EXPLAIN BUFFERS

From: Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Euler Taveira de Oliveira <euler(at)timbira(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: EXPLAIN BUFFERS
Date: 2009-12-14 03:00:06
Message-ID: 20091214120006.8A81.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> I have a question about the comment in InstrStopNode(), which reads:
> "Adds delta of buffer usage to node's count and resets counter to
> start so that the counters are not double counted by parent nodes."
> It then calls BufferUsageAccumDiff(), but that function doesn't
> actually "reset" anything, so it seems like the comment is wrong.

Oops, it's wrong. It just does "Adds delta of buffer usage to node's count."

> Two other thoughts:
>
> 1. It doesn't appear that there is any provision to ever zero
> pgBufferUsage. Shouldn't we do this, say, once per explain, just to
> avoid the possibility of overflowing the counters?

I think the overflowing will not be a problem because we only use
the differences of values. The delta is always corrent unless we use
2^32 buffer accesses during one execution of a node.

> 2. We seem to do all the work associated with pgBufferUsage even when
> the "buffers" option is not passed to explain. The overhead of
> incrementing the counters is probably negligible (and we were paying
> the equivalent overhead before anyway) but I'm not sure whether saving
> the starting counters and accumulating the deltas might be enough to
> slow down EXPLAIN ANALYZE. That's sorta slow already so I'd hate to
> whack it any more - have you benchmarked this at all?

There are 5% of overheads in the worst cases. The difference will be
little if we have more complex operations or some disk I/Os.

Adding Instrumentation->count_bufusage flag could reduce the overheads.
if (instr->count_bufusage)
BufferUsageAccumDiff(...)

Should I add countBufferUsage boolean arguments to all places
doInstrument booleans are currently used? This requires several
minor modifications of codes in many places.

[without patch]
=# EXPLAIN (ANALYZE) SELECT * FROM pgbench_accounts;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Seq Scan on pgbench_accounts (cost=0.00..263935.00 rows=10000000 width=97) (actual time=0.003..571.794 rows=10000000 loops=1)
Total runtime: 899.427 ms

[with patch]
=# EXPLAIN (ANALYZE) SELECT * FROM pgbench_accounts;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Seq Scan on pgbench_accounts (cost=0.00..263935.00 rows=10000000 width=97) (actual time=0.003..585.885 rows=10000000 loops=1)
Total runtime: 955.280 ms

- shared_buffers = 1500MB
- pgbench -i -s100
- Read all pages in the pgbench_accounts into shared buffers before runs.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2009-12-14 03:15:15 Re: EXPLAIN BUFFERS
Previous Message Tom Lane 2009-12-14 02:38:55 Re: Streaming replication and non-blocking I/O