Re: pgbench --startup option

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgbench --startup option
Date: 2013-06-17 04:42:21
Message-ID: CAMkU=1zrsQUuB=2nGc1N_k7TW1ROscxzVmS4m1v7Nr1VqPaxnQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, May 2, 2013 at 11:25 AM, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> wrote:

>
> I've just done a quick review of the source, as I've been hacking in
> pgbench myself.
>
> I think that the feature makes sense.
>
> About the details of the patch:
>
> (1) Some changes in the patch are unrelated to the purpose of the patch
> (e.g. spacing changes, error message...), and should be removed?
>

I did fix the indenting of the -c and -t options in the usage message,
which was inconsistent before and was bugging me. (The bad indenting was
just in the source, did not makes its way to the usage output.) Perhaps I
shouldn't have done that in this patch, at least not without mentioning it.

I made the error message more verbose, in case someone passes invalid
syntax to the startup command, the original message was rather confusing I
thought. I'm not sure if the error handling I do is suitable or not, it is
rather minimal, simply piggy backing on what was already there. Do you
have any thoughts on that?

>
> (2) Instead adding a new function, I would suggest to modify the existing
> one with an added argument, which would be ignored when NULL is passed.
>

I considered passing a boolean to doConnect, but I often find that hard to
follow because:

doConnect(false);

doesn't tell you what is false without having to go hunt down the comments
for function definition (maybe using a IDE rather than vi would help me out
there). There are plenty of places in the existing code which do exactly
that, but those places usually have the action depending on the boolean
take place in the middle of the function. Here the dependent action takes
place at the end, so it is easy to have two functions without code
duplication because one function just calls the other and then takes one
more action. So, I don't know; mostly a matter of taste.

But your suggestion would be to pass in either char *startup or NULL to
doConnect? That is at least self-explanatory in the case where what you
pass in is startup rather than NULL. It is a bit weird to pass a
file-scoped global variable into a function which is in the same file, but
I don't see a clean way of reducing the scope without introducing a lot of
noise.

(3) I'm not sure of the behavior of the feature. What if two statements are
> required, should it not be able to handle multiple --startup
> specifications, say with an array instead of a scalar?
>

I've just been putting a semicolon in the string to separate the statements
(see example at end of post). I could use a linked list of strings, but
that is pretty elaborate for something so simple. Are the core LL routines
easily available to contrib?

>
> (4) C style: there is no need to put a ";" after "}".
>

Thanks, old habits die hard.

> (5) In the documentation, other options do not put a "=" sign between the
> option and its argument, although it is also accepted.
>

The convention seems to be to use a space for short options and an equal
sign for long options in the documentation.

Thanks for doing the review. I'm not sure what things to go change without
further feedback/discussion, except point 4. I'll wait a day to see if I
get more feedback on the other issues and submit a new patch.

By the way, I have two more examples of using this beyond the
syncronous_commit one:

To probe the overhead of beginning a transaction.

pgbench -S -T 300 -c4 -j4 --startup='BEGIN'
pgbench -S -T 300 -c4 -j4

At a scale of 10 (all in shaerd_buffers), tps excluding connections are
respectively (median of 10 alternating runs):
38623.2636090.69

So doing it all within one transaction gives a 7% improvement.

One the other hand, preemptively locking the table gives no detectable
further improvement, which doesn't surprise me because every select runs in
a different resource owner, so it still needs to obtain the lock for itself
despite a higher resource owner already owning it (median of many hundreds
alternating runs):

pgbench -S -T 30 -c4 -j4 --startup='BEGIN;LOCK TABLE pgbench_accounts in
access share mode'
pgbench -S -T 30 -c4 -j4 --startup='BEGIN;'

38645.2538681.52

Cheers,

Jeff

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jon Nelson 2013-06-17 04:53:21 Re: fallocate / posix_fallocate for new WAL file creation (etc...)
Previous Message Robins Tharakan 2013-06-17 03:36:09 Re: Patch to add support of "IF NOT EXISTS" to others "CREATE" statements