Re: Postgres 8.4 memory related parameters

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <sthomas(at)peak6(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>, "Claire Chang" <yenhsiac(at)yahoo(dot)com>
Subject: Re: Postgres 8.4 memory related parameters
Date: 2011-08-05 14:58:46
Message-ID: 4E3BBED6020000250003FB6D@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Shaun Thomas <sthomas(at)peak6(dot)com> wrote:
> On 08/05/2011 09:00 AM, Kevin Grittner wrote:
>
>> optimal pool size = ((2 * actual core count) + effective spindle
>> count)
>
> How does that work? If your database fits in memory, your optimal
> TPS is only constrained by CPU. Any fetches from disk reduce your
> throughput from IO Waits.

I think you're misunderstanding the purpose of the formula. I'm not
saying that causing processes to wait for disk speeds things up;
clearly things will run faster if the active data set is cached.
What I'm saying is that if processes are blocked waiting for disk
they are not going to be using CPU, and there is room for that many
additional processes to be useful, as the CPUs and other drives
would otherwise be sitting idle.

> How do you account for SSDs/PCIe cards which act as an
> effective spindle multiplier?

The "effective spindle count" is basically about "how many resources
are reads typically waiting on with this hardware and workload".
Perhaps a better name for that could be chosen, but it's the best
I've come up with.

> I'd be more inclined to believe this:
>
> optimal pool size = 3*cores - cores/spindles
>
> Then, as your spindles increase, you're subtracting less and less
> until you reach optimal 3x.

Well, to take an extreme example in another direction, let's
hypothesize a machine with one CPU and 24 drives, where random disk
access is the bottleneck for the workload. My formula would have 26
processes, which would typically be running with 26 blocked waiting
on a read for a cache miss, while the other two processes would be
serving up responses for cache hits and getting requests for the
next actual disk reads ready. Your formula would have two processes
struggling to get reads going on 24 spindles while also serving up
cached data.

Just because disk speed sucks doesn't mean you don't want to do your
disk reads in parallel; quite the opposite!

-Kevin

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Kevin Grittner 2011-08-05 15:05:43 Re: Postgres 8.4 memory related parameters
Previous Message Shaun Thomas 2011-08-05 14:21:26 Re: Postgres 8.4 memory related parameters