Re: review: pgbench - aggregation of info written into log

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: review: pgbench - aggregation of info written into log
Date: 2012-11-27 13:55:59
Message-ID: CAFj8pRC-rZ6zSw4k6PETcdieMSz5hb+FMr783U3brGFqNFQ=dA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello

2012/11/12 Tomas Vondra <tv(at)fuzzy(dot)cz>:
> Hi,
>
> attached is a v4 of the patch. There are not many changes, mostly some
> simple tidying up, except for handling the Windows.
>
> After a bit more investigation, it seems to me that we can't really get
> the same behavior as in other systems - basically the timestamp is
> unavailable so we can't log the interval start. So it seems to me the
> best we can do is to disable this option on Windows (and this is done in
> the patch). So when you try to use "--aggregate-interval" on Win it will
> complain it's an unknown option.
>
> Now that I think of it, there might be a better solution that would not
> mimic the Linux/Unix behaviour exactly - we do have elapsed time since
> the start of the benchmark, so maybe we should use this instead of the
> timestamp? I mean on systems with reasonable gettimeofday we'd get
>
> 1345828501 5601 1542744 483552416 61 2573
> 1345828503 7884 1979812 565806736 60 1479
> 1345828505 7208 1979422 567277552 59 1391
> 1345828507 7685 1980268 569784714 60 1398
> 1345828509 7073 1979779 573489941 236 1411
>
> but on Windows we'd get
>
> 0 5601 1542744 483552416 61 2573
> 2 7884 1979812 565806736 60 1479
> 4 7208 1979422 567277552 59 1391
> 6 7685 1980268 569784714 60 1398
> 8 7073 1979779 573489941 236 1411
>
> i.e. the first column is "seconds since start of the test". Hmmmm, and
> maybe we could call 'gettimeofday' at the beginning, to get the
> timestamp of the test start (AFAIK the issue on Windows is the
> resolution, but it should be enough). Then we could add it up with the
> elapsed time and we'd get the same output as on Linux.
>
> And maybe this could be done in regular pgbench execution too? But I
> really need help from someone who knows Windows and can test this.
>
> Comments regarding Pavel's reviews are below:
>
> On 2.10.2012 20:08, Pavel Stehule wrote:
>> Hello
>>
>> I did review of pgbench patch
>> http://archives.postgresql.org/pgsql-hackers/2012-09/msg00737.php
>>
>> * this patch is cleanly patched
>>
>> * I had a problem with doc
>>
>> make[1]: Entering directory `/home/pavel/src/postgresql/doc/src/sgml'
>> openjade -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D .
>> -c /usr/share/sgml/docbook/dsssl-stylesheets/catalog -d stylesheet.dsl
>> -t sgml -i output-html -V html-index postgres.sgml
>> openjade:pgbench.sgml:767:8:E: document type does not allow element
>> "TITLE" here; missing one of "ABSTRACT", "AUTHORBLURB", "MSGSET",
>> "CALLOUTLIST", "ITEMIZEDLIST", "ORDEREDLIST", "SEGMENTEDLIST",
>> "VARIABLELIST", "CAUTION", "IMPORTANT", "NOTE", "TIP", "WARNING",
>> "FORMALPARA", "BLOCKQUOTE", "EQUATION", "EXAMPLE", "FIGURE", "TABLE",
>> "PROCEDURE", "SIDEBAR", "QANDASET", "REFSECT3" start-tag
>> make[1]: *** [HTML.index] Error 1
>> make[1]: *** Deleting file `HTML.index'
>> make[1]: Leaving directory `/home/pavel/src/postgresql/doc/src/sgml
>
> Hmmm, I've never got the docs to build at all, all I do get is a heap of
> some SGML/DocBook related issues. Can you investigate a bit more where's
> the issue? I don't remember messing with the docs in a way that might
> cause this ... mostly copy'n'paste edits.
>
>> * pgbench is compiled without warnings
>>
>> * there is a few issues in source code
>>
>> +
>> + /* should we aggregate the results or not? */
>> + if (use_log_agg)
>> + {
>> +
>> + /* are we still in the same interval? if yes, accumulate the
>> + * values (print them otherwise) */
>> + if (agg->start_time + use_log_agg >= INSTR_TIME_GET_DOUBLE(now))
>> + {
>> +
>
> Errr, so what are the issues here?

using a use_log_agg variable - bad name for variable - it is fixed

>
>>
>> * a real time range for aggregation is dynamic (pgbench is not
>> realtime application), so I am not sure if following constraint has
>> sense
>>
>> + if ((duration > 0) && (use_log_agg > 0) && (duration % use_log_agg != 0)) {
>> + fprintf(stderr, "duration (%d) must be a multiple of aggregation
>> interval (%d)\n", duration, use_log_agg);
>> + exit(1);
>> + }
>
> I'm not sure what might be the issue here either. If the test duration
> is set (using "-T" option), then this kicks in and says that the
> duration should be a multiple of aggregation interval. Seems like a
> sensible assumption to me. Or do you think this is check should be removed?
>
>> * it doesn't flush last aggregated data (it is mentioned by Tomas:
>> "Also, I've realized the last interval may not be logged at all - I'll
>> take a look into this in the next version of the patch."). And it can
>> be significant for higher values of "use_log_agg"
>
> Yes, and I'm still not sure how to fix this in practice. But I do have
> this on my TODO.
>
>>
>> * a name of variable "use_log_agg" is not good - maybe "log_agg_interval" ?
>
> I've changed it to agg_interval.

ok

issues:

* empty lines with invisible chars (tabs) + and sometimes empty lines
after and before {}

* adjustment of start_time

+ * the desired interval */
+ while (agg->start_time
+ agg_interval < INSTR_TIME_GET_DOUBLE(now))
+
agg->start_time = agg->start_time + agg_interval;

can "skip" one interval - so when transaction time will be larger or
similar to agg_interval - then results can be strange. We have to know
real length of interval

Regards

Pavel

>
> regards
> Tomas
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2012-11-27 14:07:34 Re: Materialized views WIP patch
Previous Message Michael Paquier 2012-11-27 13:45:05 Re: [WIP] pg_ping utility