Re: Uh, I change my mind about commit_delay + commit_siblings (sort of)

From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Uh, I change my mind about commit_delay + commit_siblings (sort of)
Date: 2012-05-29 16:47:50
Message-ID: CAEYLb_XKr5E15ico3Mbq4JWyz1xn1UusUK5wAKumd7y5GVHKNQ@mail.gmail.com
Lists: pgsql-hackers

On 29 May 2012 17:10, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> This is a clever idea,

Thanks.

> but I think it needs some fine-tuning: as
> written, this will sleep for any flush, not just a flush of a commit
> record.  One idea might be to add a flag to XLogFlush indicating
> whether a commit-delay sleep is permissible.  Callers other than
> RecordTransactionCommit() could pass false; RecordTransactionCommit()
> could pass true.
>
> The comments need some updating, too.

Uh, yeah, I posted that at 2am, which was about an hour after I
initially had the idea. Attached patch revises the comments and
documentation in a way that I think is appropriate, though it is still
essentially the same patch. I did consider the fact that the delay
might occur from one of a number of XLogFlush() callsites that did not
previously delay. However, even if I do what you suggest, it is just
dumb luck as to whether or not that makes any difference: those other
sites may well be exactly as delayed, because they still have to wait
behind the WALWriteLock, which the leader now holds while sleeping
anyway. I imagined that the fact that those callsites
were excluded before had something to do with ameliorating historic
commit_delay problems, but those problems have already been
significantly reduced.

Why do you think that doing this for all XLogFlush() callsites might
be problematic?
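
To make the flag suggestion quoted above concrete, here is a toy sketch in C.
It is not the attached patch and not actual PostgreSQL code: the names
toy_xlog_flush, should_delay, sleeps_taken and the variable values are
hypothetical stand-ins, and the real XLogFlush() signature differs. It only
shows how a caller-supplied flag would gate the sleep.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the commit_delay / commit_siblings settings (values are illustrative). */
static int commit_delay_us = 1000;
static int commit_siblings = 5;

/* Counts how many times the toy flush decided to sleep. */
static int sleeps_taken = 0;

/* Hypothetical helper: should the flush leader sleep before flushing? */
static bool
should_delay(bool allow_commit_delay, int active_siblings)
{
    assert(commit_delay_us >= 0);
    return allow_commit_delay &&
           commit_delay_us > 0 &&
           active_siblings >= commit_siblings;
}

/* Toy model of a flush that takes the proposed extra flag. */
static void
toy_xlog_flush(bool allow_commit_delay, int active_siblings)
{
    /*
     * In the approach discussed in this thread the leader sleeps while
     * holding WALWriteLock, so callers passing false would still queue
     * behind a sleeping leader; the flag only stops them from starting
     * a sleep themselves.
     */
    if (should_delay(allow_commit_delay, active_siblings))
        sleeps_taken++;         /* real code would sleep for commit_delay here */

    /* ... writing and fsyncing WAL would happen here ... */
}
```

In this sketch a commit path would call toy_xlog_flush(true, n) and every
other caller toy_xlog_flush(false, n); whether that distinction matters in
practice is exactly the question raised above.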

> Like Heikki, I find the test results that you posted pretty hard to
> understand.  This is sort of a general beef I have with pgbench-tools:
> there are no explanations anywhere about what the graphs actually
> mean. The first three graphs seem fairly useless, and the third one
> doesn't even appear to contain any data.  The fourth and fifth graphs
> seem like the most useful part of the output, but the lack of an
> explanation of what's being graphed really hinders understanding.
> Apparently, the fourth graph is TPS vs. scale factor (at an
> unspecified number of clients) and the fifth graph is TPS vs. clients
> (at an unspecified scale factor).  Maybe we're just averaging across
> the unspecified parameter in each case, but if so that doesn't seem
> very useful.

I'm sure that Greg will be happy to consider a GitHub pull request to
improve the tool in these areas. I myself am used to interpreting
pgbench-tools output, but I suppose it is a little confusing in
various ways. As you probably realise, I was directing your attention
towards clients-set.png. I am in the habit of hacking pgbench-tools
to output much bigger gnuplot images.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Attachment Content-Type Size
move_delay_2012_05_29.v2.patch application/octet-stream 6.7 KB