Re: stats for network traffic WIP

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Nigel Heron <nheron(at)querymetrics(dot)com>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Mike Blackwell <mike(dot)blackwell(at)rrd(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Stephen Frost <sfrost(at)snowman(dot)net>, PgHacker <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: stats for network traffic WIP
Date: 2013-12-10 20:39:06
Message-ID: CA+Tgmoba0Dy7qv_xq11z5zoCdKqbASYaSt4LYY4PgHDVOtO3tQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Dec 10, 2013 at 12:29 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Tue, Dec 10, 2013 at 6:56 AM, Nigel Heron <nheron(at)querymetrics(dot)com> wrote:
>> On Sat, Dec 7, 2013 at 1:17 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>>
>>> Could you share the performance numbers? I'm really concerned about
>>> the performance overhead caused by this patch.
>>>
>>
>> I've tried pgbench in select mode with small data sets to avoid disk
>> io and didn't see any difference. That was on my old core2duo laptop
>> though .. I'll have to retry it on some server class multi core
>> hardware.
>
> When I ran four instances of pgbench -i -s 100 in parallel, I saw a
> performance difference between master and the patched build. I ran the
> following commands:
>
> psql -c "checkpoint"
> for i in $(seq 1 4); do time pgbench -i -s100 -q db$i & done
>
> The results are:
>
> * Master
> 10000000 of 10000000 tuples (100%) done (elapsed 13.91 s, remaining 0.00 s).
> 10000000 of 10000000 tuples (100%) done (elapsed 14.03 s, remaining 0.00 s).
> 10000000 of 10000000 tuples (100%) done (elapsed 14.01 s, remaining 0.00 s).
> 10000000 of 10000000 tuples (100%) done (elapsed 14.13 s, remaining 0.00 s).
>
> It took about 14.0 seconds to store 10000000 tuples.
>
> * Patched
> 10000000 of 10000000 tuples (100%) done (elapsed 14.90 s, remaining 0.00 s).
> 10000000 of 10000000 tuples (100%) done (elapsed 15.05 s, remaining 0.00 s).
> 10000000 of 10000000 tuples (100%) done (elapsed 15.42 s, remaining 0.00 s).
> 10000000 of 10000000 tuples (100%) done (elapsed 15.70 s, remaining 0.00 s).
>
> It took about 15.3 seconds on average to store 10000000 tuples.
>
> Thus, I'm afraid that enabling network statistics would cause serious
> performance degradation. Thoughts?
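
For scale, the quoted timings can be reduced to a single relative figure. The snippet below is only a back-of-the-envelope sketch: the eight elapsed times are copied from the pgbench output above, while the helper name relative_overhead is made up for illustration. With four runs per build it says nothing about run-to-run variance.

```python
from statistics import mean

# Elapsed seconds copied from the quoted "pgbench -i -s100" output.
master = [13.91, 14.03, 14.01, 14.13]
patched = [14.90, 15.05, 15.42, 15.70]


def relative_overhead(baseline, candidate):
    """Fractional slowdown of the candidate mean relative to the baseline mean.

    Hypothetical helper for this example only.
    """
    base = mean(baseline)
    return (mean(candidate) - base) / base


print(f"master mean:  {mean(master):.2f} s")
print(f"patched mean: {mean(patched):.2f} s")
print(f"overhead:     {relative_overhead(master, patched):.1%}")
```

That works out to a slowdown of roughly 9% on a bulk load, a workload that is not even dominated by small socket operations.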

Yes, I think the overhead of this patch is far, far too high to
contemplate applying it. It sends a stats collector message after
*every socket operation*. Once per transaction would likely be too
much overhead already (think: pgbench -S) but once per socket op is
insane.
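
To make the message-count argument concrete, here is a toy model of the alternative: keep the counters in backend-local memory and send one message at a rate-limited boundary rather than after each socket operation. This is a sketch under stated assumptions, not PostgreSQL code. The class TrafficCounters, its methods, and the 500 ms interval are all hypothetical, and time is modelled as integer milliseconds to keep the example deterministic.

```python
class TrafficCounters:
    """Hypothetical per-backend byte counters (illustration only)."""

    def __init__(self, send_message, flush_interval_ms=500):
        self.send_message = send_message            # stand-in for sending a stats message
        self.flush_interval_ms = flush_interval_ms  # assumed minimum gap between messages
        self.bytes_sent = 0
        self.bytes_received = 0
        self.last_flush_ms = None

    def record(self, sent=0, received=0):
        # Runs after every socket operation: two local additions, no message.
        self.bytes_sent += sent
        self.bytes_received += received

    def maybe_flush(self, now_ms):
        # Runs at a natural boundary such as end of transaction.
        if self.bytes_sent == 0 and self.bytes_received == 0:
            return False  # nothing pending
        if (self.last_flush_ms is not None
                and now_ms - self.last_flush_ms < self.flush_interval_ms):
            return False  # too soon since the last message
        self.send_message((self.bytes_sent, self.bytes_received))
        self.bytes_sent = 0
        self.bytes_received = 0
        self.last_flush_ms = now_ms
        return True


# Simulate 1000 short transactions, 1 ms apart, two socket operations each.
messages = []
counters = TrafficCounters(messages.append)
now_ms = 0
socket_ops = 0
for _ in range(1000):
    counters.record(received=64)   # read the query
    counters.record(sent=256)      # write the result
    socket_ops += 2
    now_ms += 1
    counters.maybe_flush(now_ms)

# Final flush once the backend has been idle for a full interval.
counters.maybe_flush(now_ms + counters.flush_interval_ms)

print(f"socket operations: {socket_ops}")
print(f"messages sent:     {len(messages)}")
```

In this toy run the 2000 socket operations produce 3 messages while the byte totals are preserved. That only addresses message volume; it does nothing about the size of the statsfile.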

Moreover, even if we found some way to reduce that overhead to an
acceptable level, I think a lot of people would be unhappy about the
statsfile bloat. Unfortunately, the bottom line here is that, until
someone overhauls the stats collector infrastructure to make
incremental updates to the statsfile cheap, we really can't afford to
add much of anything in the way of new statistics. So I fear this
patch is doomed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
