Re: [PATCH] pgbench --throttle (submission 7 - with lag measurement)

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] pgbench --throttle (submission 7 - with lag measurement)
Date: 2013-06-08 19:31:13
Message-ID: 51B38681.70509@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 6/1/13 5:00 AM, Fabien COELHO wrote:
> Question 1: should it report the maximum lang encountered?

I haven't found the lag measurement to be very useful yet, outside of
debugging the feature itself. Accordingly I don't see a reason to add
even more statistics about the number outside of testing the code. I'm
seeing some weird lag problems that this will be useful for though right
now, more on that a few places below.

> Question 2: the next step would be to have the current lag shown under
> option --progress, but that would mean having a combined --throttle
> --progress patch submission, or maybe dependencies between patches.

This is getting too far ahead. Let's get the throttle part nailed down
before introducing even more moving parts into this. I've attached an
updated patch that changes a few things around already. I'm not done
with this yet and it needs some more review before commit, but it's not
too far away from being ready.

This feature works quite well. On a system that will run at 25K TPS
without any limit, I did a run with 25 clients and a rate of 400/second,
aiming at 10,000 TPS, and that's what I got:

number of clients: 25
number of threads: 1
duration: 60 s
number of transactions actually processed: 599620
average transaction lag: 0.307 ms
tps = 9954.779317 (including connections establishing)
tps = 9964.947522 (excluding connections establishing)

I never thought of implementing the throttle like this before, but it
seems to work out well so far. Check out tps.png to see the smoothness
of the TPS curve (the graphs came out of pgbench-tools. There's a
little more play outside of the target than ideal for this case. Maybe
it's worth tightening the Poisson curve a bit around its center?

The main implementation issue I haven't looked into yet is why things
can get weird at the end of the run. See the latency.png graph attached
and you can see what I mean.

I didn't like the naming on this option or all of the ways you could
specify the delay. None of those really added anything, since you can
get every other behavior by specifying a non-integer TPS. And using the
word "throttle" inside the code is fine, but I didn't like exposing that
implementation detail more than it had to be.

What I did instead was think of this as a transaction rate target, which
makes the help a whole lot simpler:

-R SPEC, --rate SPEC
target rate per client in transactions per second

Made the documentation easier to write too. I'm not quite done with
that yet, the docs wording in this updated patch could still be better.

I personally would like this better if --rate specified a *total* rate
across all clients. However, there are examples of both types of
settings in the program already, so there's no one precedent for which
is right here. -t is per-client and now -R is too; I'd prefer it to be
like -T instead. It's not that important though, and the code is
cleaner as it's written right now. Maybe this is better; I'm not sure.

I did some basic error handling checks on this and they seemed good, the
program rejects target rates of <=0.

On the topic of this weird latency spike issue, I did see that show up
in some of the results too. Here's one where I tried to specify a rate
higher than the system can actually handle, 80000 TPS total on a
SELECT-only test

$ pgbench -S -T 30 -c 8 -j 4 -R10000tps pgbench
starting vacuum...end.
transaction type: SELECT only
scaling factor: 100
query mode: simple
number of clients: 8
number of threads: 4
duration: 30 s
number of transactions actually processed: 761779
average transaction lag: 10298.380 ms
tps = 25392.312544 (including connections establishing)
tps = 25397.294583 (excluding connections establishing)

It was actually limited by the capabilities of the hardware, 25K TPS.
10298 ms of lag per transaction can't be right though.

Some general patch submission suggestions for you as a new contributor:

-When re-submitting something with improvements, it's a good idea to add
a version number to the patch so reviewers can tell them apart easily.
But there is no reason to change the subject line of the e-mail each
time. I followed that standard here. If you updated this again I would
name the file pgbench-throttle-v9.patch but keep the same e-mail subject.

-There were some extra carriage return characters in your last
submission. Wasn't a problem this time, but if you can get rid of those
that makes for a better patch.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

Attachment Content-Type Size
image/png 4.0 KB
image/png 4.4 KB
pgbench-throttle-v8.patch text/plain 7.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Cédric Villemain 2013-06-08 20:01:22 Re: Proposed patch: remove hard-coded limit MAX_ALLOCATED_DESCS
Previous Message Jeff Janes 2013-06-08 19:17:27 Re: Hard limit on WAL space used (because PANIC sucks)