Re: Simple postgresql.conf wizard

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: jd(at)commandprompt(dot)com, Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Simple postgresql.conf wizard
Date: 2008-12-04 05:56:20
Message-ID: Pine.GSO.4.64.0812040000270.21937@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 4 Dec 2008, Gregory Stark wrote:

> My point was more that you could have a data warehouse on a
> non-dedicated machine, you could have a web server on a non-dedicated
> machine, or you could have a mixed server on a non-dedicated machine.

I should just finish the documentation, where there will be a big
disclaimer saying "THESE SETTINGS ASSUME A SERVER DEDICATED TO
POSTGRESQL!" That's the context here. Why, after you follow my tuning
instructions, you're lucky if the server will run anything but the
database afterwards.

> Josh's logic is impeccable -- for the specific use case he's describing of a
> truly dedicated server with enough disk space for a major production database.
> But not every install is going to have gigabytes of space reserved for it and
> not every admin is going to realize that he really should set aside gigabytes
> of space even though he only expects his database to be a few megabytes.

It's really quite simple. Josh and I don't care directly about disk space
used by the WAL for people with trivial databases. At all. Whatsoever.
Maybe once, long ago, when we were young and frugal and skinny[1]; not
now, or probably ever again the future. If that's your concern, maybe
there can be some companion utility named pgmiser that lowers parameters
back down again. Your mascot can be some sort of animal that efficiently
lives off small scraps of food or something.[2]

The context here is pgtune, which is aiming to make a fat elephant of a
server faster so that there's an answer to people who say "My benchmarks
are all running really slow, is this because my system with 16PT of RAM is
only using 32MB of it for the database? This sucks, I'm going back to
Oracle which used all my RAM." If there are people who instead think,
"hey, I'll run this tuning utility to make my database faster, then it
will also be a lot smaller!", maybe we can find a class about space/time
tradeoffs in algorithm design to send them to or something.[3]

There are exactly two important things here. The first is how large
checkpoint_settings needs to be in order to for the considerable overhead
of checkpoints to be bearable. That drives the setting up. Our super-fat
DW application gets set to at least 64 so that when you bulk-load another
TB of data into it, that doesn't get bottlenecked dumping gigabytes of
dirty buffers every few seconds. If the database crashes and recovery
reads or writes a bunch of data, who cares about random writes because
your SAN has a 4GB write cache on it and dozens of drives slaving away.

Driving the setting down is knowing how much time you'll have to wait for
recovery to happen, which is really a measure of what your tolerance for
downtime is. We're thinking that someone who picks the Desktop tuning may
have no tolerance for the database to be sluggish coming back up after
Windows crashed and they rebooted, so tiny setting for them to make
recovery super fast.

Everybody else in our sample profiles fall in the middle of those two
extremes, which is why the values curve the way they do. Web app?
Probably not a lot of write volume, probably trouble if it's down a long
time; how about 8, on the low side, but it gives checkpoints more time to
spread out their I/O so worst-case latency isn't as bad. That's the sort
of analysis those numbers come from. Do performance tuning and juggle
these trade-offs for long enough for new people all the time, you get a
gut feel for the right ballpark an app should start at based on its type.
The whole idea behind this tool is that we're taking some of that hard-won
knowledge and trying to automate the distribution of it.

> It's great that Postgres has such great documentation but whenever we have the
> chance to replace something with an option which doesn't need any
> documentation that would be even better. I'm just exploring whether that's an
> option here.

I would be glad to have a post-CommitFest discussion of this very topic as
it's quite a pain to me in its current form. Just not right now because
it's too late to touch it.

> Nobody's even tried to do this side of things before. They always got
> bogged down in trying to parse config files and such.

It's actually because most of them were working in Perl, which encourages
deviant behavior where people delight in converting useful ideas into
illegible punctuation rather than actually getting anything done. Except
for that other Greg around here who's not involved in this discussion, his
Perl is pretty good.

[1] Josh is being aggressively bulked up right now for his next sumo
match.

[2] Like a rat, which would give you an excuse to add the long overdue
PL/Ratfor.

[3] This wouldn't actually help them learn anything, but it would make
their heads explode at which point all their problems are gone.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2008-12-04 06:01:45 Re: Simple postgresql.conf wizard
Previous Message Robert Haas 2008-12-04 05:54:50 Re: Simple postgresql.conf wizard