[PATCH] add --throttle to pgbench (submission 3)

Lists: pgsql-hackers
From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: [PATCH] add --throttle to pgbench (submission 3)
Date: 2013-05-01 08:57:50
Message-ID: alpine.DEB.2.02.1305011048580.25330@localhost6.localdomain6


Add --throttle to pgbench

Each client is throttled to the specified rate, which can be expressed in
tps or in time (s, ms, us). Throttling is achieved by scheduling
transactions along a Poisson distribution.
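
As an illustration only (this is not code from the patch, which is written in C), Poisson scheduling is commonly done by drawing the gap between consecutive transactions from an exponential distribution whose mean is the inverse of the target rate. The function name `poisson_schedule` and its parameters are assumptions made for this sketch.

```python
import random


def poisson_schedule(rate_tps, n, seed=None):
    """Return n transaction start times (in seconds) for one client.

    Gaps between starts are exponentially distributed with mean
    1/rate_tps, so arrivals form a Poisson process whose long-run
    average rate is rate_tps.
    """
    rng = random.Random(seed)
    now = 0.0
    starts = []
    for _ in range(n):
        now += rng.expovariate(rate_tps)  # gap with mean 1/rate_tps seconds
        starts.append(now)
    return starts


if __name__ == "__main__":
    starts = poisson_schedule(rate_tps=10.0, n=10000, seed=1)
    print(f"observed rate: {len(starts) / starts[-1]:.2f} tps")
```

Individual gaps vary widely, but the average rate converges to the target, which is what makes the load resemble independent clients rather than a fixed metronome.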

This is an update of the previous proposal which fixes a typo in the SGML
documentation.

The use case for the option is to generate a continuous gentle
load for functional tests, e.g. in a practice session with students or for
testing features on a laptop.

--
Fabien.

Attachment Content-Type Size
pgbench-throttle-2.patch text/x-diff 6.5 KB

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] add --throttle to pgbench (submission 3)
Date: 2013-05-01 16:56:06
Message-ID: 51814926.7030907@2ndQuadrant.com

On 5/1/13 4:57 AM, Fabien COELHO wrote:
> The use case of the option is to be able to generate a continuous gentle
> load for functional tests, eg in a practice session with students or for
> testing features on a laptop.

If you add this to
https://commitfest.postgresql.org/action/commitfest_view?id=18 I'll
review it next month. I have a lot of use cases for a pgbench that
doesn't just run at 100% all the time. I had tried to simulate
something with simple sleep calls, but I realized it was going to take a
stronger math basis to do the job well.

The situations where I expect this to be useful all require collecting
latency data and then both plotting it and doing some statistical
analysis. pgbench-tools computes worst-case and 90th percentile latency
for example, along with the graph over time. There's a useful concept
that some of the official TPC tests have: how high can you get the
throughput while still keeping the latency within certain parameters.
Right now we have no way to simulate that. What we see with write-heavy
pgbench is that latency goes crazy (>60 second commits sometimes) if all
you do is hit the server with maximum throughput. That's interesting,
but it's not necessarily relevant in many cases.
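
To make the latency criterion concrete, here is a minimal sketch of the kind of summary described above: worst-case and 90th percentile latency, plus a check against a latency limit. It does not reproduce how pgbench-tools computes these values; the nearest-rank percentile and the names `percentile`, `latency_summary` and `meets_target` are assumptions made for the example.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def latency_summary(latencies_ms):
    """Return the 90th percentile and worst-case latency of a run."""
    return {"p90": percentile(latencies_ms, 90), "worst": max(latencies_ms)}


def meets_target(latencies_ms, p90_limit_ms):
    """True if the run kept 90th percentile latency within the limit."""
    return percentile(latencies_ms, 90) <= p90_limit_ms
```

With a throttled pgbench, one could repeat runs at increasing rates and keep the highest rate for which `meets_target` still holds.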

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Greg Smith <greg(at)2ndQuadrant(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] add --throttle to pgbench (submission 3)
Date: 2013-05-02 08:25:52
Message-ID: alpine.DEB.2.02.1305021016530.27669@localhost6.localdomain6


Hello Greg,

> If you add this to
> https://commitfest.postgresql.org/action/commitfest_view?id=18 I'll review it
> next month.

Ok. Thanks. I just did that.

> I have a lot of use cases for a pgbench that doesn't just run at 100%
> all the time. I had tried to simulate something with simple sleep
> calls, but I realized it was going to take a stronger math basis to do
> the job well.
>
> The situations where I expect this to be useful all require collecting
> latency data and then both plotting it and doing some statistical analysis.
> pgbench-tools computes worst-case and 90th percentile latency for example,
> along with the graph over time. There's a useful concept that some of the
> official TPC tests have: how high can you get the throughput while still
> keeping the latency within certain parameters. Right now we have no way to
> simulate that. What we see with write-heavy pgbench is that latency goes
> crazy (>60 second commits sometimes) if all you do is hit the server with
> maximum throughput. That's interesting, but it's not necessarily relevant in
> many cases.

Indeed. It is a good thing that my proposed feature can help in more
situations than my particular need.

--
Fabien.


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Greg Smith <greg(at)2ndQuadrant(dot)com>
Cc: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] add --throttle to pgbench (submission 3)
Date: 2013-05-28 02:19:08
Message-ID: 51A4141C.5030701@2ndquadrant.com

On 05/02/2013 12:56 AM, Greg Smith wrote:
> On 5/1/13 4:57 AM, Fabien COELHO wrote:
>> The use case of the option is to be able to generate a continuous gentle
>> load for functional tests, eg in a practice session with students or for
>> testing features on a laptop.
>
> If you add this to
> https://commitfest.postgresql.org/action/commitfest_view?id=18 I'll
> review it next month. I have a lot of use cases for a pgbench that
> doesn't just run at 100% all the time.
As do I - in particular, if time permits I'll merge this patch into my
working copy of pgbench so I can find the steady-state transaction rate
where BDR replication's lag is stable and doesn't increase continually.
Right now I don't really have any way of doing that, only measuring how
long it takes to catch up once the test run completes.
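
A rough sketch of that search, assuming a throttled pgbench can be driven at a chosen rate: bisect on the rate, using a caller-supplied check that reports whether lag stayed stable. The callback `lag_is_stable` is hypothetical (it would have to run the benchmark at the given rate and watch the lag), and the sketch assumes stability is monotonic in the rate.

```python
def max_stable_rate(lag_is_stable, low_tps, high_tps, tolerance_tps=1.0):
    """Bisect for the highest rate at which lag_is_stable(rate) is True.

    Assumes lag is stable at low_tps and that once a rate is unstable,
    every higher rate is unstable too.
    """
    if not lag_is_stable(low_tps):
        raise ValueError("lag is already unstable at low_tps")
    if lag_is_stable(high_tps):
        return high_tps  # never saturated within the tested range
    while high_tps - low_tps > tolerance_tps:
        mid = (low_tps + high_tps) / 2.0
        if lag_is_stable(mid):
            low_tps = mid   # still keeping up: search higher
        else:
            high_tps = mid  # lag grows: search lower
    return low_tps
```

Each probe is a full benchmark run, so the tolerance trades precision against total test time.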

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Greg Smith <greg(at)2ndQuadrant(dot)com>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] add --throttle to pgbench (submission 3)
Date: 2013-05-28 08:13:46
Message-ID: alpine.DEB.2.02.1305281011310.12479@localhost6.localdomain6


>>> The use case of the option is to be able to generate a continuous gentle
>>> load for functional tests, eg in a practice session with students or for
>>> testing features on a laptop.
>>
>> If you add this to
>> https://commitfest.postgresql.org/action/commitfest_view?id=18 I'll
>> review it next month. I have a lot of use cases for a pgbench that
>> doesn't just run at 100% all the time.
> As do I - in particular, if time permits I'll merge this patch into my
> working copy of pgbench so I can find the steady-state transaction rate
> where BDR replication's lag is stable and doesn't increase continually.
> Right now I don't really have any way of doing that, only measuring how
> long it takes to catch up once the test run completes.

You could try, and help improve, the --progress option from another patch
submission, which shows how things are going.

--
Fabien.


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Greg Smith <greg(at)2ndQuadrant(dot)com>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] add --throttle to pgbench (submission 3)
Date: 2013-05-28 08:38:11
Message-ID: 51A46CF3.7080400@2ndquadrant.com

On 05/28/2013 04:13 PM, Fabien COELHO wrote:
>
> You can try to use and improve the --progress option in another patch
> submission which shows how things are going.
That'll certainly be useful, but won't solve this issue. The thing is
that with asynchronous replication you need to know how long it takes
until all nodes are back in sync, with no replication lag.

I can probably do it with a custom pgbench script, but I'm tempted to
add support for timing that part separately with a "wait command" to run
at the end of the benchmark.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Greg Smith <greg(at)2ndQuadrant(dot)com>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] add --throttle to pgbench (submission 3)
Date: 2013-05-28 11:52:51
Message-ID: alpine.DEB.2.02.1305281344500.12479@localhost6.localdomain6


>> You can try to use and improve the --progress option in another patch
>> submission which shows how things are going.

> That'll certainly be useful, but won't solve this issue. The thing is
> that with asynchronous replication you need to know how long it takes
> until all nodes are back in sync, with no replication lag.

> I can probably do it with a custom pgbench script, but I'm tempted to
> add support for timing that part separately with a "wait command" to run
> at the end of the benchmark.

ISTM that a separate process, unrelated to pgbench, should monitor
the master-slave async lag, as that is interesting information anyway...

However I'm not sure that pg_stat_replication currently has the necessary
information on either side to measure the lag (in time, but then
how do I know when a transaction was committed? or in number of
transactions?).
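
A third measure, which needs neither commit times nor transaction counts, is lag in bytes of WAL. The sketch below assumes LSNs are printed as two hexadecimal halves of a plain 64-bit position (as in PostgreSQL 9.3 and later) and that the two strings come from the master's current position and a replica's reported position; the helper names are made up for the example.

```python
def lsn_to_int(lsn):
    """Convert an LSN such as '16/B374D848' to a 64-bit integer.

    Assumes the text form is 'high/low', each half in hexadecimal.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(master_lsn, replica_lsn):
    """Bytes of WAL the replica still has to process."""
    return lsn_to_int(master_lsn) - lsn_to_int(replica_lsn)


if __name__ == "__main__":
    print(lag_bytes("0/3000060", "0/3000000"))
```

Byte lag says how far behind a replica is, not for how long, so it complements the time-based measure rather than replacing it.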

--
Fabien.


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Greg Smith <greg(at)2ndQuadrant(dot)com>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] add --throttle to pgbench (submission 3)
Date: 2013-05-30 07:10:06
Message-ID: 51A6FB4E.1010608@2ndquadrant.com

On 05/28/2013 07:52 PM, Fabien COELHO wrote:
>
> However I'm not sure that pg_stat_replication currently has the
> necessary information on either side to measure the lag (in time
> transactions, but how do I know when a transaction was committed? or
> number of transactions?).

The BDR codebase now has a handy function to report when a transaction
was committed, pg_get_transaction_committime(xid) .

It also adds pg_xlog_wait_remote_apply and pg_xlog_wait_remote_receive
that can be used with pg_current_xlog_location() to wait until one or
all replicas have caught up, or with LSNs from pg_stat_replication to
(say) wait until all replicas have caught up with the most up-to-date one.

I don't think these depend on anything BDR-specific, though Andres or
Álvaro would be able to say for sure. Take a look in:

git://git.postgresql.org/git/users/andresfreund/postgres.git

on the 'bdr' branch. Be aware that it is rebased regularly, though the
'0.4' tag applied earlier today will remain constant and contains the
functions of interest.

I hope this helps.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Greg Smith <greg(at)2ndQuadrant(dot)com>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] add --throttle to pgbench (submission 3)
Date: 2013-05-30 07:54:01
Message-ID: 51A70599.7070108@2ndquadrant.com

On 05/30/2013 03:10 PM, Craig Ringer wrote:
> On 05/28/2013 07:52 PM, Fabien COELHO wrote:
>> However I'm not sure that pg_stat_replication currently has the
>> necessary information on either side to measure the lag (in time
>> transactions, but how do I know when a transaction was committed? or
>> number of transactions?).
> The BDR codebase now has a handy function to report when a transaction
> was committed, pg_get_transaction_committime(xid) .
>
> It also adds pg_xlog_wait_remote_apply and pg_xlog_wait_remote_receive
> that can be used with pg_current_xlog_location() to wait until one or
> all replicas have caught up, or with LSNs from pg_stat_replication to
> (say) wait until all replicas have caught up with the most up-to-date one.
>
> I don't think these depend on anything BDR-specific
They do, however, require changes to Pg core. These aren't functions you
can just borrow and add to an extension, they require additional changes
to core to collect the data they use.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, Greg Smith <greg(at)2ndQuadrant(dot)com>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] add --throttle to pgbench (submission 3)
Date: 2013-05-30 07:58:49
Message-ID: 20130530075849.GE4201@awork2.anarazel.de

On 2013-05-30 15:54:01 +0800, Craig Ringer wrote:
> On 05/30/2013 03:10 PM, Craig Ringer wrote:
> > On 05/28/2013 07:52 PM, Fabien COELHO wrote:
> >> However I'm not sure that pg_stat_replication currently has the
> >> necessary information on either side to measure the lag (in time
> >> transactions, but how do I know when a transaction was committed? or
> >> number of transactions?).
> > The BDR codebase now has a handy function to report when a transaction
> > was committed, pg_get_transaction_committime(xid) .
> >
> > It also adds pg_xlog_wait_remote_apply and pg_xlog_wait_remote_receive
> > that can be used with pg_current_xlog_location() to wait until one or
> > all replicas have caught up, or with LSNs from pg_stat_replication to
> > (say) wait until all replicas have caught up with the most up-to-date one.
> >
> > I don't think these depend on anything BDR-specific
> They do, however, require changes to Pg core. These aren't functions you
> can just borrow and add to an extension, they require additional changes
> to core to collect the data they use.

pg_xlog_wait_remote_receive() doesn't require changes afaics and should
be easily packable as an extension. We might want to make it use the
sync commit infrastructure at some point instead of essentially busy
waiting, but...

'committs' - the mapping of xids to timestamps - certainly does though.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Greg Smith <greg(at)2ndQuadrant(dot)com>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] add --throttle to pgbench (submission 3)
Date: 2013-05-31 07:41:09
Message-ID: alpine.DEB.2.02.1305310938440.31253@localhost6.localdomain6


>> However I'm not sure that pg_stat_replication currently has the
>> necessary information on either side to measure the lag (in time
>> transactions, but how do I know when a transaction was committed? or
>> number of transactions?).
>
> The BDR codebase now has a handy function to report when a transaction
> was committed, pg_get_transaction_committime(xid) .

This looks handy for monitoring a replication setup.
It should really be in core...

Any plans? Or are there other ways to get this kind of information in core?

--
Fabien.


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Greg Smith <greg(at)2ndQuadrant(dot)com>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] add --throttle to pgbench (submission 3)
Date: 2013-06-09 09:50:13
Message-ID: 51B44FD5.4000200@2ndquadrant.com

On 05/31/2013 03:41 PM, Fabien COELHO wrote:
>
>>> However I'm not sure that pg_stat_replication currently has the
>>> necessary information on either side to measure the lag (in time
>>> transactions, but how do I know when a transaction was committed? or
>>> number of transactions?).
>>
>> The BDR codebase now has a handy function to report when a transaction
>> was committed, pg_get_transaction_committime(xid) .
>
> This looks handy for monitoring a replication setup.
> It should really be in core...
>
> Any plans? Or is there other ways to get this kind of information in
> core?

Yes, it's my understanding that the idea is to eventually get all the
BDR functionality merged, piece by piece, including the commit time
tracking feature.

pg_get_transaction_committime isn't trivial to just add to core because
it requires a commit time to be recorded with commit records in the
transaction logs, among other changes.

I don't know if Andres or any of the others involved are planning on
trying to get this particular feature merged in 9.4, but I wouldn't be
too surprised since (AFAIK) it's fairly self-contained and would be
useful for monitoring streaming replication setups as well.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, Greg Smith <greg(at)2ndQuadrant(dot)com>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] add --throttle to pgbench (submission 3)
Date: 2013-06-09 13:25:54
Message-ID: 20130609132554.GA1456@alap2.anarazel.de

On 2013-06-09 17:50:13 +0800, Craig Ringer wrote:
> On 05/31/2013 03:41 PM, Fabien COELHO wrote:
> >
> >>> However I'm not sure that pg_stat_replication currently has the
> >>> necessary information on either side to measure the lag (in time
> >>> transactions, but how do I know when a transaction was committed? or
> >>> number of transactions?).
> >>
> >> The BDR codebase now has a handy function to report when a transaction
> >> was committed, pg_get_transaction_committime(xid) .
> >
> > This looks handy for monitoring a replication setup.
> > It should really be in core...
> >
> > Any plans? Or is there other ways to get this kind of information in
> > core?

> pg_get_transaction_committime isn't trivial to just add to core because
> it requires a commit time to be recorded with commit records in the
> transaction logs, among other changes.

The commit records actually already have that information available
(cf. xl_xact_commit(_compact) in xact.h); the problem is having a
data structure which collects all that.
That's why committs (written by Alvaro) adds an SLRU mapping xids
to timestamps. And yes, we want to submit that sometime.

The pg_xlog_wait_remote_apply(), pg_xlog_wait_remote_receive() functions
however don't need any additional infrastructure, so I think those are
easier and less controversial to add.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] add --throttle to pgbench (submission 3)
Date: 2013-06-19 13:51:31
Message-ID: 51C1B763.9010807@Yahoo.com

On 05/01/13 04:57, Fabien COELHO wrote:
>
> Add --throttle to pgbench
>
> Each client is throttled to the specified rate, which can be expressed in
> tps or in time (s, ms, us). Throttling is achieved by scheduling
> transactions along a Poisson-distribution.
>
> This is an update of the previous proposal which fix a typo in the sgml
> documentation.
>
> The use case of the option is to be able to generate a continuous gentle
> load for functional tests, eg in a practice session with students or for
> testing features on a laptop.

Why does this need two option formats (-H and --throttle)?

Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin


From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] add --throttle to pgbench (submission 3)
Date: 2013-06-19 18:34:10
Message-ID: alpine.DEB.2.02.1306192025330.25404@localhost6.localdomain6


>> The use case of the option is to be able to generate a continuous gentle
>> load for functional tests, eg in a practice session with students or for
>> testing features on a laptop.
>
> Why does this need two option formats (-H and --throttle)?

In the latest version it is --rate and -R.

Because you may want to put something very readable and understandable in
a script and prefer long options, or have to type it interactively every day
in a terminal and prefer short ones. Most UNIX commands include both kinds.
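
For illustration only, here is how a paired short and long option looks in a small Python command-line tool. pgbench itself is written in C, so this is an analogy rather than the patch's code; the program name `bench` and the help text are invented for the sketch.

```python
import argparse


def build_parser():
    """Build a parser where -R and --rate are two spellings of one option."""
    parser = argparse.ArgumentParser(prog="bench")
    parser.add_argument(
        "-R", "--rate",
        type=float,
        default=None,
        help="target rate in transactions per second (unthrottled if omitted)",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.rate)
```

Both spellings store into the same attribute, so supporting the pair costs nothing beyond the declaration.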

--
Fabien.


From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] add --throttle to pgbench (submission 3)
Date: 2013-06-19 18:53:31
Message-ID: 51C1FE2B.6060102@Yahoo.com

On 06/19/13 14:34, Fabien COELHO wrote:
>
>>> The use case of the option is to be able to generate a continuous gentle
>>> load for functional tests, eg in a practice session with students or for
>>> testing features on a laptop.
>>
>> Why does this need two option formats (-H and --throttle)?
>
> On the latest version it is --rate and -R.
>
> Because you may want to put something very readable and understandable in
> a script and like long options, or have to type it interactively every day
> in a terminal and like short ones. Most UNIX commands include both kind.
>

Would it make sense then to add long versions for all the other standard
options too?

Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin


From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] add --throttle to pgbench (submission 3)
Date: 2013-06-19 20:18:52
Message-ID: alpine.DEB.2.02.1306192217100.25404@localhost6.localdomain6


>> Because you may want to put something very readable and understandable in
>> a script and like long options, or have to type it interactively every day
>> in a terminal and like short ones. Most UNIX commands include both kind.
>
> Would it make sense then to add long versions for all the other standard
> options too?

Yep. It is really a stylistic (pedantic?) matter. For pgbench, see:

https://commitfest.postgresql.org/action/patch_view?id=1106

--
Fabien.