Re: checkpoint patches

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Greg Smith <greg(at)2ndquadrant(dot)com>
Subject: Re: checkpoint patches
Date: 2012-03-23 14:24:43
Message-ID: 20120323142443.GH3938@tamriel.snowman.net
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> Well, how do you want to look at it?

I thought the last graph you provided was a useful way to view the
results. It was my intent to make that clear in my prior email; my
apologies if that didn't come through.

> Here's the data from 80th
> percentile through 100th percentile - percentile, patched, unpatched,
> difference - for the same two runs I've been comparing:
[...]
> 98 12100 24645 -12545
> 99 186043 201309 -15266
> 100 9513855 9074161 439694

Those are the areas that I think we want to be looking at/for: the
outliers.
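
For anyone re-deriving the difference column from the three tail rows
quoted above, a minimal sketch follows. The helper name `tail_deltas` and
the relative-change column are illustrative additions, not part of the
original test output, and the quoted excerpt does not state the latency
units.

```python
# Rows quoted above: (percentile, patched, unpatched) latency values.
# Units are not stated in the quoted excerpt.
rows = [
    (98, 12100, 24645),
    (99, 186043, 201309),
    (100, 9513855, 9074161),
]


def tail_deltas(rows):
    """Return (percentile, patched - unpatched, relative change in %) per row."""
    out = []
    for pct, patched, unpatched in rows:
        diff = patched - unpatched
        rel = 100.0 * diff / unpatched  # negative means the patched run was faster
        out.append((pct, diff, rel))
    return out


for pct, diff, rel in tail_deltas(rows):
    print(f"p{pct}: diff={diff:+d} ({rel:+.1f}% vs unpatched)")
```

This covers only the single pair of runs quoted here; a negative
difference means the patched run had lower latency at that percentile.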

> By the way, I reran the tests on master with checkpoint_timeout=16min,
> and here are the tps results: 2492.966759, 2588.750631, 2575.175993.
> So it seems like not all of the tps gain from this patch comes from
> the fact that it increases the time between checkpoints. Comparing
> the median of three results between the different sets of runs,
> applying the patch and setting a 3s delay between syncs gives you
> about a 5.8% increase in throughput, but also adds 30-40 seconds between
> checkpoints. If you don't apply the patch but do increase time
> between checkpoints by 1 minute, you get about a 5.0% increase in
> throughput. That certainly means that the patch is doing something -
> because 5.8% for 30-40 seconds is better than 5.0% for 60 seconds -
> but it's a pretty small effect.

That doesn't surprise me too much. As I mentioned before, and Greg
please correct me if I'm wrong, but I thought this patch was intended to
reduce the latency spikes that we suffer from under some workloads,
which can often be attributed back to i/o related contention. I don't
believe it's intended or expected to seriously increase throughput.
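
The throughput comparison quoted above can be reproduced roughly as
follows. This is only a sketch: "gain per second of added checkpoint
interval" is one assumed way of reading Robert's "5.8% for 30-40 seconds
is better than 5.0% for 60 seconds" remark, and `gain_per_second` is a
hypothetical helper, not something from the test scripts.

```python
from statistics import median

# tps from the three reruns on master with checkpoint_timeout=16min (quoted above).
master_16min_tps = [2492.966759, 2588.750631, 2575.175993]
print("median tps:", median(master_16min_tps))


def gain_per_second(gain_pct, added_seconds):
    """Throughput gain (%) per second added between checkpoints."""
    return gain_pct / added_seconds


# Patch with a 3s sync delay: about 5.8% gain for 30-40 s of added interval.
patched_low = gain_per_second(5.8, 40)   # least favorable end of the range
patched_high = gain_per_second(5.8, 30)  # most favorable end of the range

# No patch, checkpoint_timeout raised by 1 minute: about 5.0% gain.
timeout_only = gain_per_second(5.0, 60)

print(f"patched: {patched_low:.3f} to {patched_high:.3f} % per second")
print(f"timeout only: {timeout_only:.3f} % per second")
```

Even at the least favorable end of the 30-40 second range, the patched
configuration comes out ahead on this measure, which matches the quoted
conclusion that the patch is doing something, though the effect is small.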

> The picture looks similar here. Increasing checkpoint_timeout isn't
> *quite* as good as spreading out the fsyncs, but it's pretty darn
> close. For example, looking at the median of the three 98th
> percentile numbers for each configuration, the patch bought us a 28%
> improvement in 98th percentile latency. But increasing
> checkpoint_timeout by a minute bought us a 15% improvement in 98th
> percentile latency. So it's still not clear to me that the patch is
> doing anything on this test that you couldn't get just by increasing
> checkpoint_timeout by a few more minutes. Granted, it lets you keep
> your inter-checkpoint interval slightly smaller, but that's not that
> exciting. That having been said, I don't have a whole lot of trouble
> believing that there are other cases where this is more worthwhile.

I could certainly see the checkpoint_timeout parameter, along with the
others, as being sufficient to address this, in which case we likely
don't need the patch. They're both more-or-less intended to do the same
thing, and it's just a question of whether being more granular ends up
helping or not.

Thanks,

Stephen
