[PATCH] add --throttle option to pgbench

Lists: pgsql-hackers
From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: [PATCH] add --throttle option to pgbench
Date: 2013-04-29 12:42:40
Message-ID: alpine.DEB.2.02.1304291411110.24306@localhost6.localdomain6


Hello,

Please find attached a small patch to add a throttling capability to
pgbench, that is pgbench aims at a given client transaction rate instead
of maximizing the load. The throttling relies on Poisson-distributed
delays inserted after each transaction.
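As a rough illustration of the idea (a sketch, not the code in the attached patch): events form a Poisson process when the gaps between them are exponentially distributed, so a per-client delay can be drawn as below. The function name `poisson_delay_us` and the microsecond unit are assumptions made for this example only.

```python
import math
import random

def poisson_delay_us(target_tps: float, rng: random.Random) -> int:
    """Draw one delay in microseconds whose mean is 1e6 / target_tps.

    Hypothetical helper for illustration; exponential gaps between
    events yield Poisson-distributed event counts per unit of time.
    """
    mean_us = 1_000_000.0 / target_tps
    u = 1.0 - rng.random()          # u lies in (0, 1], so log(u) is defined
    return int(-math.log(u) * mean_us)

rng = random.Random(42)
delays = [poisson_delay_us(10.0, rng) for _ in range(100_000)]
mean_delay = sum(delays) / len(delays)
print(f"mean delay: {mean_delay:.0f} us (target 100000 us for 10 tps)")
```

Individual delays vary widely, but their average approaches the target interval, which is what lets the client aim at a rate rather than at a fixed pause.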

I wanted that to test the impact of various load levels, and for
functional tests on my laptop which should not drain the battery.

sh> ./pgbench -T 10 -c 2 --throttle 10tps test
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 2
number of threads: 1
duration: 10 s
number of transactions actually processed: 214
tps = 21.054216 (including connections establishing)
tps = 21.071253 (excluding connections establishing)

--
Fabien.

Attachment: pgbench-throttle.patch (text/x-diff, 7.3 KB)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] add --throttle option to pgbench
Date: 2013-04-29 15:27:00
Message-ID: 3113.1367249220@sss.pgh.pa.us

Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> writes:
> Please find attached a small patch to add a throttling capability to
> pgbench, that is pgbench aims at a given client transaction rate instead
> of maximizing the load. The throttling relies on Poisson-distributed
> delays inserted after each transaction.

I'm having a hard time understanding the use-case for this feature.
Surely, if pgbench is throttling its transaction rate, you're going
to just end up measuring the throttle rate.

> I wanted that to test the impact of various load levels, and for
> functional tests on my laptop which should not drain the battery.

How does causing a test to take longer result in reduced battery drain?
You still need the same number of transactions if you want an honest
test, so it seems to me the machine would have to be on longer and thus
you'd eat *more* battery to get an equivalently trustworthy result.

regards, tom lane


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] add --throttle option to pgbench
Date: 2013-04-29 16:56:28
Message-ID: CAMkU=1xEW69gP3RYE37iXgh7q4imTBA5Yq0fdu1VV6hHQkzgrg@mail.gmail.com

On Mon, Apr 29, 2013 at 8:27 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> writes:
> > Please find attached a small patch to add a throttling capability to
> > pgbench, that is pgbench aims at a given client transaction rate instead
> > of maximizing the load. The throttling relies on Poisson-distributed
> > delays inserted after each transaction.
>
> I'm having a hard time understanding the use-case for this feature.
> Surely, if pgbench is throttling its transaction rate, you're going
> to just end up measuring the throttle rate.
>

While I don't understand the part about his laptop battery, I think that
there is a good use case for this. If you are looking at latency
distributions or spikes, you probably want to see what they are like with a
load which is like the one you expect to have, not the load which is the
highest possible. Although for this use case you would almost surely be
using custom transaction files, not default ones, so I think you could just
use \sleep. However, I don't know if there is an easy way to dynamically
adjust the sleep value by subtracting off the overhead time and randomizing
it a bit, like is done here.

It does seem to me that we should Poissonize the throttle time, then
subtract the average overhead, rather than Poissonizing the difference.
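The two orderings can be compared with a small sketch. Everything here is an assumption for illustration: the function names, the 100 ms target interval, the 20 ms average overhead, and in particular the clamping of negative sleeps to zero, which the thread does not specify.

```python
import math
import random

def exp_sample(mean: float, rng: random.Random) -> float:
    """Exponentially distributed value with the given mean."""
    return -math.log(1.0 - rng.random()) * mean

def sleep_randomize_difference(interval, overhead, rng):
    # Randomize (interval - overhead): the sleep itself is exponential.
    return exp_sample(max(interval - overhead, 0.0), rng)

def sleep_randomize_then_subtract(interval, overhead, rng):
    # Randomize the whole interval, then subtract the average overhead.
    # Negative results are clamped to zero (an assumption of this sketch).
    return max(exp_sample(interval, rng) - overhead, 0.0)

rng = random.Random(1)
interval, overhead, n = 0.100, 0.020, 50_000   # seconds; hypothetical values
mean_diff = sum(sleep_randomize_difference(interval, overhead, rng)
                for _ in range(n)) / n
mean_sub = sum(sleep_randomize_then_subtract(interval, overhead, rng)
               for _ in range(n)) / n
print(f"randomize difference:    mean sleep {mean_diff:.4f} s")
print(f"randomize then subtract: mean sleep {mean_sub:.4f} s")
```

With the second ordering, the time from one transaction start to the next follows the exponential law whenever the drawn interval exceeds the overhead; draws shorter than the overhead cannot be honored and are the cases where the clamp applies.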

Cheers,

Jeff


From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] add --throttle option to pgbench
Date: 2013-04-29 17:45:18
Message-ID: alpine.DEB.2.02.1304291944460.7344@localhost6.localdomain6


Hello Tom,

> I'm having a hard time understanding the use-case for this feature.
> Surely, if pgbench is throttling its transaction rate, you're going
> to just end up measuring the throttle rate.

Indeed, I do not want to measure the tps if I throttle it.

The point is to generate a continuous but not necessarily maximal load,
and to test other things under such load such as possibly cascading
replication, failover, various dump strategies, whatever.

>> I wanted that to test the impact of various load levels, and for
>> functional tests on my laptop which should not drain the battery.
>
> How does causing a test to take longer result in reduced battery drain?

If I test a replication setup on my laptop at maximum load, I can see the
battery draining in a few seconds by looking at the effect on the time
left widget. This remark is mostly for functional tests, not for
performance tests.

If I want to test the maximum load of a setup, obviously I will not do
that on my laptop, and I will not use --throttle...

--
Fabien.


From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] add --throttle option to pgbench
Date: 2013-04-29 18:08:40
Message-ID: alpine.DEB.2.02.1304291945540.7344@localhost6.localdomain6


Hello Jeff,

> While I don't understand the part about his laptop battery, I think that
> there is a good use case for this. If you are looking at latency
> distributions or spikes, you probably want to see what they are like with a
> load which is like the one you expect to have, not the load which is the
> highest possible. Although for this use case you would almost surely be
> using custom transaction files, not default ones, so I think you could just
> use \sleep. However, I don't know if there is an easy way to dynamically
> adjust the sleep value by subtracting off the overhead time and randomizing
> it a bit, like is done here.

Indeed, my thoughts :-) Having regularly spaced (\sleep n) or uniformly
distributed (\sleep :random_value) delays is not very realistic, and I
would have to take some measurements to find the right value for a target
load.

> It does seem to me that we should Poissonize the throttle time, then
> subtract the average overhead, rather than Poissonizing the difference.

I actually thought about doing it the way you suggested, because it was
"right". However I did not do it, because if the Poisson gives, possibly
quite frequently, a time below the transaction time, one ends up with an
artificial sequence of stuck transactions, as a client cannot start the
second transaction while the previous one is not finished, and this does
not seem realistic. To really do that more cleanly, it would require
distributing the events between clients, so having some kind of
coordination between clients, which would really be another test
application. Having an approximation of that seemed good enough for my
purpose.

--
Fabien.


From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] add --throttle option to pgbench
Date: 2013-04-29 19:14:42
Message-ID: alpine.DEB.2.02.1304292059120.8239@localhost6.localdomain6


> I'm having a hard time understanding the use-case for this feature.

Here is an example functional use case I had in mind.

Let us say I'm teaching a practice session about administering
replication. Students have a desktop computer on which they can install
several instances of PostgreSQL, or possibly use virtual machines. I'd
like them to set up one server, put it under a continuous load, then create
a first slave, then a second, and things like that. What I do not
want is for the poor desktop and its hard drive to run at maximum speed for
the whole afternoon during the session, making it hard to do anything
else on the host. So I want something both realistic (the database is
under a load, the WAL is advancing, let us dump it, base backup it,
replicate it, monitor it, update it, whatever...), but gentle all the
same.

Using pgbench with --throttle basically provides the adjustable continuous
load I need. I understand that this is not at all the intent for which it
was developed.

Note that I will probably propose another patch to provide a heartbeat
while things are going on, but I thought that one patch at a time was
enough.

--
Fabien.


From: Jim Nasby <jim(at)nasby(dot)net>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] add --throttle option to pgbench
Date: 2013-04-29 19:44:24
Message-ID: 517ECD98.80000@nasby.net

On 4/29/13 1:08 PM, Fabien COELHO wrote:
>
>> While I don't understand the part about his laptop battery, I think that
>> there is a good use case for this. If you are looking at latency
>> distributions or spikes, you probably want to see what they are like with a
>> load which is like the one you expect to have, not the load which is the
>> highest possible. Although for this use case you would almost surely be
>> using custom transaction files, not default ones, so I think you could just
>> use \sleep. However, I don't know if there is an easy way to dynamically
>> adjust the sleep value by subtracting off the overhead time and randomizing
>> it a bit, like is done here.
>
> Indeed, my thoughts :-) Having regularly spaced (\sleep n) or uniformly
> distributed (\sleep :random_value) delays is not very realistic, and I
> would have to take some measurements to find the right value for a target
> load.

+1 to being able to throttle to make latency measurements.

I'm also wondering if it would be useful to be able to set a latency target and have something adjust concurrency to see how well you can hit it. Certainly feature creep for the proposed patch; I only bring it up because there may be enough similarity to consider that use case at this time, even if we don't implement it yet.
--
Jim C. Nasby, Data Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net


From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] add --throttle option to pgbench [patch 2]
Date: 2013-04-29 22:39:44
Message-ID: alpine.DEB.2.02.1304300026540.17961@localhost6.localdomain6


> It does seem to me that we should Poissonize the throttle time, then
> subtract the average overhead, rather than Poissonizing the difference.

After thinking again about Jeff's point and failing to sleep, I think that
doing exactly that is better because:
- it is "right"
- the code is simpler and shorter
- my transaction stuck sequence issue is not that big an issue anyway

Here is a patch to schedule transactions along Poisson-distributed events.
This patch replaces my previous proposal.

Note that there is no reference to the current time after the stochastic
process is initiated. This is necessary, and means that if transactions lag
behind the throttle at some point they will try to catch up later. Neither
a good nor a bad thing, mostly a feature.
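The catch-up behaviour can be seen in a small simulation. This is a sketch under stated assumptions, not the code of the attached patch: the function `simulate_client`, the simulated clock, and the chosen transaction durations are all hypothetical. The schedule is advanced only by the random draws and never re-anchored to the current time.

```python
import math
import random

def simulate_client(target_tps, tx_durations, rng):
    """Simulate one client; return a list of (scheduled, started) times in seconds."""
    mean_interval = 1.0 / target_tps
    clock = 0.0     # simulated wall clock
    trigger = 0.0   # next scheduled start; never reset from the clock
    log = []
    for duration in tx_durations:
        # Advance the schedule by an exponentially distributed gap.
        trigger += -math.log(1.0 - rng.random()) * mean_interval
        if clock < trigger:
            clock = trigger        # ahead of schedule: sleep until the trigger
        # Otherwise the client is late and starts immediately (catching up).
        log.append((trigger, clock))
        clock += duration          # the transaction runs
    return log

rng = random.Random(7)
# One slow 0.5 s transaction followed by 200 fast 1 ms transactions, at 100 tps.
log = simulate_client(100.0, [0.5] + [0.001] * 200, rng)
lags = [started - scheduled for scheduled, started in log]
print(f"lag just after the slow transaction: {lags[1]:.3f} s")
print(f"transactions started on schedule in the tail: "
      f"{sum(1 for lag in lags[100:] if lag == 0.0)} of {len(lags[100:])}")
```

After the slow transaction the client runs back to back until the schedule is ahead of the clock again, which is the catching up described above.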

--
Fabien

Attachment: pgbench-throttle-2.patch (text/x-diff, 6.6 KB)