Re: too much pgbench init output

From: Tomas Vondra <tv(at)fuzzy(dot)cz>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: too much pgbench init output
Date: 2012-09-16 22:26:10
Message-ID: 50565202.5010906@fuzzy.cz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 5.9.2012 06:17, Robert Haas wrote:
> On Tue, Sep 4, 2012 at 11:31 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>> On Tue, 2012-09-04 at 23:14 -0400, Robert Haas wrote:
>>> Actually, this whole things seems like a solution in search of a
>>> problem to me. We just reduced the verbosity of pgbench -i tenfold in
>>> the very recent past - I would have thought that enough to address
>>> this problem. But maybe not.
>>
>> The problem is that
>>
>> a) It blasts out too much output and everything scrolls off the screen,
>> and
>>
>> b) There is no indication of where the end is.
>>
>> These are independent problems, and I'd be happy to address them
>> separately if there are such specific concerns attached to this.
>>
>> Speaking of tenfold, we could reduce the output frequency tenfold to
>> once every 1000000, which would alleviate this problem for a while
>> longer.
>
> Well, I wouldn't object to displaying a percentage on each output
> line. But I don't really like the idea of having them less frequent
> than they already are, because if you run into a situation that makes
> pgbench -i run slowly, as I occasionally do, it's marginal to tell the
> difference between "slow" and "completely hung" even with the current
> level of verbosity.
>
> However, we could add a -q flag to run more quietly, or something like
> that. Actually, I'd even be fine with making the default quieter,
> though we can't use -v for verbose since that's already taken. But
> I'd like to preserve the option of getting the current amount of
> output because sometimes I need that to troubleshoot problems.
> Actually it'd be nice to even get a bit more output: say, a timestamp
> on each line, and a completion percentage... but now I'm getting
> greedy.

Hi,

I've been thinking about this a bit more, and do propose to use an
option that determines "logging step" i.e. number of items (either
directly or as a percentage) between log lines.

The attached patch defines a new option "--logging-step" that accepts
either integers or percents. For example if you want to print a line
each 1000 lines, you can to this

$ pgbench -i -s 1000 --logging-step 1000 testdb

and if you want to print a line each 5%, you can do this

$ pgbench -i -s 1000 --logging-step 5% testdb

and that's it.

Moreover the patch adds a record of elapsed an estimate of remaining
time. So for example with 21% you may get this:

creating tables...
21000 of 100000 tuples (21%) done (elapsed 1.56 s, remaining 5.85 s).
42000 of 100000 tuples (42%) done (elapsed 3.15 s, remaining 4.35 s).
63000 of 100000 tuples (63%) done (elapsed 4.73 s, remaining 2.78 s).
84000 of 100000 tuples (84%) done (elapsed 6.30 s, remaining 1.20 s).
100000 of 100000 tuples (100%) done (elapsed 8.17 s, remaining 0.00 s).
vacuum...
set primary keys...

Now, I've had a hard time with the patch - no matter what I do, I do get
"invalid option" error whenever I try to run that from command line for
some reason. But when I run it from gdb, it works just fine.

kind regards
Tomas

Attachment Content-Type Size
pgbench-logging-v1.diff text/plain 3.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2012-09-16 22:35:32 Re: [PATCH 2/2] Add a new function pg_relation_by_filenode to lookup up a relation given the tablespace and the filenode OIDs
Previous Message Andres Freund 2012-09-16 22:15:37 [PATCH 2/2] Add a new function pg_relation_by_filenode to lookup up a relation given the tablespace and the filenode OIDs