Re: pgbench --startup option

Lists: pgsql-hackers
From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: pgbench --startup option
Date: 2013-02-10 23:27:49
Message-ID: CAMkU=1xV3tYKoHD8U2mQzfC5Kbn_bdcVf8br-EnUvy-6Z=B47w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

While looking at some proposed patches and pondering some questions on
performance, I realized I desperately needed ways to run benchmarks with
different settings without needing to edit postgresql.conf and
restart/reload the server each time.

Most commonly, I want to run with synchronous_commit on or off at whim. I
thought of writing a patch specifically for that, but decided a more
general approach would be better. The attached patch allows any arbitrary
command to be executed upon the start up of each connection intended for
benchmarking (as opposed to utility connections, intended for either -i or
for counting the rows in in pgbench_branches and for vacuuming), and so
covers the need for changing any session-changeable GUCs, plus doing other
things as well.

I created doBenchMarkConnect() to segregate bench-marking connections from
utility connections. At first I thought of adding the startup code to only
the normal path and leaving support for -C in the wind, but decided that
was just lazy.

Will add to commitfest-next.

Cheers,

Jeff

Attachment Content-Type Size
pgbench_startup-v1.patch text/x-patch 5.8 KB

From: Peter Geoghegan <peter(dot)geoghegan86(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgbench --startup option
Date: 2013-02-10 23:45:19
Message-ID: CAEYLb_Uv7kCgMqRCK4FZJJUcZt2Q0YxOEYQjD+LUfhaEAbW35A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10 February 2013 23:27, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> While looking at some proposed patches and pondering some questions on
> performance, I realized I desperately needed ways to run benchmarks with
> different settings without needing to edit postgresql.conf and
> restart/reload the server each time.

I'd thought about hacking this capability into pgbench-tools, so that
there was a new outer-most loop that iterates through different
postgresql.conf files. Come to think of it, the need to vary settings
per test set (that is, per series of benchmarks, each of which is
plotted as a different color) generally only exists for one or two
settings, so it is probably better to pursue the approach that you
propose here.

I guess what I've outlined could still be useful for PGC_POSTMASTER
gucs, like shared_buffers.

--
Regards,
Peter Geoghegan


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgbench --startup option
Date: 2013-04-01 18:00:07
Message-ID: CAMkU=1z81FbL8mfihN1oQP0R9WPx6fvBRfjG4cKfjCzjFso3dQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Feb 10, 2013 at 3:27 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:

...

> Will add to commitfest-next.
>
> Cheers,
>
> Jeff
>

I realized the original patch was broken for multithreaded use due to an
errant "static".

Also, I've cleaned up some carelessness in where I did the copy-and-paste
for parameter parsing.

Attachment Content-Type Size
pgbench_startup-v2.patch application/octet-stream 5.7 KB

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgbench --startup option
Date: 2013-05-02 18:25:30
Message-ID: alpine.DEB.2.02.1305022022470.9292@localhost6.localdomain6
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


I've just done a quick review of the source, as I've been hacking in
pgbench myself.

I think that the feature makes sense.

About the details of the patch:

(1) Some changes in the patch are unrelated to the purpose of the patch
(e.g. spacing changes, error message...), and should be removed?

(2) Instead adding a new function, I would suggest to modify the existing
one with an added argument, which would be ignored when NULL is passed.

(3) I'm not sure of the behavior of the feature. What if two statements
are required, should it not be able to handle multiple --startup
specifications, say with an array instead of a scalar?

(4) C style: there is no need to put a ";" after "}".

(5) In the documentation, other options do not put a "=" sign between the
option and its argument, although it is also accepted.

Have a nice day,

--
Fabien.


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgbench --startup option
Date: 2013-05-28 00:11:22
Message-ID: 51A3F62A.6010707@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/11/2013 07:27 AM, Jeff Janes wrote:
> I created doBenchMarkConnect() to segregate bench-marking connections from
> utility connections. At first I thought of adding the startup code to only
> the normal path and leaving support for -C in the wind, but decided that
> was just lazy.

That sounds very useful and would've eased some recent pgbench work I've
been doing too.

I've put some patches together to make pgbench capable of talking to
multiple servers. I needed it for benchmarking work on bidirectional
replication, but it's also useful if you want to benchmark a group of
hot standbys in read-only mode, and it may be useful with various 3rd
pty replication solutions. As written it uses one or more threads per
server, with all clients managed by a given thread using the same
server. Multiple servers are specified by using connstring style syntax, eg:

pgbench -T 600 -j 4 -c 64 "host=server1 user=postgres"
"host=server2 user=postgres port=5433"

It isn't ready for a commitfest yet as I need to tidy up a few things
and I still haven't added an extra set of timings to measure how long
the DB takes to return to a steady state after the pgbench run, but once
that's done I'll send it in. The after-run timings are intended for
things like measuring how much lag an asynchronous replica has built up
and how long it takes to catch up after the write flood stops, or how
long a CHECKPOINT takes after the pgbench run.

I also have a patch that adds a flag to force a CHECKPOINT after vacuum
and before running its tests. This makes pgbench results _vastly_ more
stable over short runs.

The work is currently lurking in the 'multibench' branch of
git://github.com/ringerc/postgres.git ; see
https://github.com/ringerc/postgres/tree/multibench. Only pgbench.c is
actually changed. Comments appreciated.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgbench --startup option
Date: 2013-06-17 04:42:21
Message-ID: CAMkU=1zrsQUuB=2nGc1N_k7TW1ROscxzVmS4m1v7Nr1VqPaxnQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, May 2, 2013 at 11:25 AM, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> wrote:

>
> I've just done a quick review of the source, as I've been hacking in
> pgbench myself.
>
> I think that the feature makes sense.
>
> About the details of the patch:
>
> (1) Some changes in the patch are unrelated to the purpose of the patch
> (e.g. spacing changes, error message...), and should be removed?
>

I did fix the indenting of the -c and -t options in the usage message,
which was inconsistent before and was bugging me. (The bad indenting was
just in the source, did not makes its way to the usage output.) Perhaps I
shouldn't have done that in this patch, at least not without mentioning it.

I made the error message more verbose, in case someone passes invalid
syntax to the startup command, the original message was rather confusing I
thought. I'm not sure if the error handling I do is suitable or not, it is
rather minimal, simply piggy backing on what was already there. Do you
have any thoughts on that?

>
> (2) Instead adding a new function, I would suggest to modify the existing
> one with an added argument, which would be ignored when NULL is passed.
>

I considered passing a boolean to doConnect, but I often find that hard to
follow because:

doConnect(false);

doesn't tell you what is false without having to go hunt down the comments
for function definition (maybe using a IDE rather than vi would help me out
there). There are plenty of places in the existing code which do exactly
that, but those places usually have the action depending on the boolean
take place in the middle of the function. Here the dependent action takes
place at the end, so it is easy to have two functions without code
duplication because one function just calls the other and then takes one
more action. So, I don't know; mostly a matter of taste.

But your suggestion would be to pass in either char *startup or NULL to
doConnect? That is at least self-explanatory in the case where what you
pass in is startup rather than NULL. It is a bit weird to pass a
file-scoped global variable into a function which is in the same file, but
I don't see a clean way of reducing the scope without introducing a lot of
noise.

(3) I'm not sure of the behavior of the feature. What if two statements are
> required, should it not be able to handle multiple --startup
> specifications, say with an array instead of a scalar?
>

I've just been putting a semicolon in the string to separate the statements
(see example at end of post). I could use a linked list of strings, but
that is pretty elaborate for something so simple. Are the core LL routines
easily available to contrib?

>
> (4) C style: there is no need to put a ";" after "}".
>

Thanks, old habits die hard.

> (5) In the documentation, other options do not put a "=" sign between the
> option and its argument, although it is also accepted.
>

The convention seems to be to use a space for short options and an equal
sign for long options in the documentation.

Thanks for doing the review. I'm not sure what things to go change without
further feedback/discussion, except point 4. I'll wait a day to see if I
get more feedback on the other issues and submit a new patch.

By the way, I have two more examples of using this beyond the
syncronous_commit one:

To probe the overhead of beginning a transaction.

pgbench -S -T 300 -c4 -j4 --startup='BEGIN'
pgbench -S -T 300 -c4 -j4

At a scale of 10 (all in shaerd_buffers), tps excluding connections are
respectively (median of 10 alternating runs):
38623.2636090.69

So doing it all within one transaction gives a 7% improvement.

One the other hand, preemptively locking the table gives no detectable
further improvement, which doesn't surprise me because every select runs in
a different resource owner, so it still needs to obtain the lock for itself
despite a higher resource owner already owning it (median of many hundreds
alternating runs):

pgbench -S -T 30 -c4 -j4 --startup='BEGIN;LOCK TABLE pgbench_accounts in
access share mode'
pgbench -S -T 30 -c4 -j4 --startup='BEGIN;'

38645.2538681.52

Cheers,

Jeff


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgbench --startup option
Date: 2013-06-20 17:46:38
Message-ID: CAMkU=1zVbjsQFARCdVUgGjb+k+Xw1cedTCwB_DhXVqnyKHLbxA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Jun 16, 2013 at 9:42 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:

> On Thu, May 2, 2013 at 11:25 AM, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>wrote:
>
>>
>>

> Thanks for doing the review. I'm not sure what things to go change
> without further feedback/discussion, except point 4. I'll wait a day to
> see if I get more feedback on the other issues and submit a new patch.
>

I've fixed a conflict, and I've removed extraneous semicolons from the C.

I've left in the fixing of some existing bad indenting in the existing
code, which is not strictly related to my change.

I hope my defenses of the other points were persuasive.

Cheers,

Jeff

Attachment Content-Type Size
pgbench_startup-v3.patch application/octet-stream 6.1 KB

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgbench --startup option
Date: 2013-06-20 19:09:50
Message-ID: alpine.DEB.2.02.1306202013280.29119@localhost6.localdomain6
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> I've fixed a conflict, and I've removed extraneous semicolons from the C.
>
> I've left in the fixing of some existing bad indenting in the existing
> code, which is not strictly related to my change.

There are still unrelated changes : spacing on -c and -t options' help.
The "pgindent" command is passed on the sources from time to time, so
there should be no reason to change this in this commit.

The updated string for PQerrorMessage does not bring much, and the message
does not seem an improvement. "Command failed with ERROR", indeed.

Command failed with ERROR: syntax error at or near ";"
LINE 1: set synchronous_commit=on;set synchronous_;

The preceding result seems bother simpler and fine:

ERROR: syntax error at or near ";"
LINE 1: set synchronous_commit=on;set synchronous_;

Otherwise I've tested the patch with one "set", two "set"s and a syntax
error, and it worked as expected.

--
Fabien.


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgbench --startup option
Date: 2013-06-25 15:47:08
Message-ID: CA+TgmoYJektX7vd9FyveQLuq+4SW9C7wzZbLL3=da-712uxrCA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jun 20, 2013 at 1:46 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> I've fixed a conflict, and I've removed extraneous semicolons from the C.
>
> I've left in the fixing of some existing bad indenting in the existing code,
> which is not strictly related to my change.

OK, I like this idea a lot, but I have a question. Right now, to use
this, you have to supply the startup SQL on the command line. And
that could definitely be useful. But ISTM that you might also want to
take the startup SQL from a file, and indeed you might well want to
include metacommands in there. Maybe that's getting greedy, but the
rate at which people are adding features to pgbench suggests to me
that it won't be long before this isn't enough.

Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgbench --startup option
Date: 2013-06-25 18:21:59
Message-ID: alpine.DEB.2.02.1306252016350.27845@localhost6.localdomain6
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> OK, I like this idea a lot, but I have a question. Right now, to use
> this, you have to supply the startup SQL on the command line. And
> that could definitely be useful. But ISTM that you might also want to
> take the startup SQL from a file, and indeed you might well want to
> include metacommands in there. Maybe that's getting greedy, but the
> rate at which people are adding features to pgbench suggests to me
> that it won't be long before this isn't enough.
>
> Thoughts?

My 0.02€:

I thought about this point while reviewing the patch, but I did not add it
to the review because ISTM that it would require a lot of changes, and
pgbench is already kind of a mess, IMHO. Also, possibly you would like to
use a script under different assumptions to test them, that would require
the startup string to be out of the script so as to change it easily. So
it would not be that useful.

--
Fabien.


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgbench --startup option
Date: 2013-06-26 04:54:08
Message-ID: CAMkU=1xDo9QpcKw9NexHZDsUHujsz8p+-Ci-wRYute-ZVi5a2Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tuesday, June 25, 2013, Robert Haas wrote:

> On Thu, Jun 20, 2013 at 1:46 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com<javascript:;>>
> wrote:
> > I've fixed a conflict, and I've removed extraneous semicolons from the C.
> >
> > I've left in the fixing of some existing bad indenting in the existing
> code,
> > which is not strictly related to my change.
>
> OK, I like this idea a lot, but I have a question. Right now, to use
> this, you have to supply the startup SQL on the command line. And
> that could definitely be useful. But ISTM that you might also want to
> take the startup SQL from a file, and indeed you might well want to
> include metacommands in there.

I had not previously considered making the argument to --startup be the
name of a file rather than the command itself.

I had at first wanted to make a new metacommand named \startup which could
be added into regular "-f" files, but realized that that would make it
harder to work with the default supplied transactions, and would be
substantially harder to implement. (And it is somewhat unclear what to do
when there are multiple -f given--take the UNION ALL of them in
command-line order, I guess)

> Maybe that's getting greedy, but the
> rate at which people are adding features to pgbench suggests to me
> that it won't be long before this isn't enough.
>
> Thoughts?
>

On a related note, I have also occasionally wished for a variant of -f
which would read the argument as the contents rather than as the name of a
file supplying the contents. That way a shell script to run a custom
transaction could be self-contained (both -c and -j etc as well the
contents of -f being in a single file, and so nothing to get out of sync or
misplaced).

So thinking about this as a more general problem now, I see that I can use
a bash trick to get what I want, if given what you want (example written in
terms of -f not --startup, as -f already exists in the needed form)

pgbench -f <(echo -e 'select count(*) from pgbench_accounts')

It is a little less clean, but workable.

So I now think --startup should take a file, as that is more consistent
with what other pgbench options take, and still allows me to do what I want
fairly easily.

I'll look into re-writing it to work in this way. But probably not during
this commitfest.

Thanks,

Jeff