Quick Links

Re: [HACKERS] Changing the default configuration (was Re:

Lists:	pgsql-advocacypgsql-hackerspgsql-performance

From:	"Merlin Moncure" <merlin(dot)moncure(at)rcsonline(dot)com>
To:	"Greg Copeland" <greg(at)CopelandConsulting(dot)Net>
Cc:	"PostgresSQL Hackers Mailing List" <pgsql-hackers(at)postgresql(dot)org>, <pgsql-advocacy(at)postgresql(dot)org>
Subject:	Re: [HACKERS] PostgreSQL Benchmarks
Date:	2003-02-11 15:44:07
Message-ID:	303E00EBDD07B943924382E153890E5434A909@cuthbert.rcsinc.local
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

I've tested all the win32 versions of postgres I can get my hands on
(cygwin and not), and my general feeling is that they have problems with
insert performance with fsync() turned on, probably the fault of the os.
Select performance is not so much affected.

This is easily solved with transactions and other such things. Also
Postgres benefits from pl just like oracle.

May I make a suggestion that maybe it is time to start thinking about
tuning the default config file, IMHO its just a little bit too
conservative, and its hurting you in benchmarks being run by idiots, but
its still bad publicity. Any real database admin would know his test
are synthetic and not meaningful without having to look at the #s.

This is irritating me so much that I am going to put together a
benchmark of my own, a real world one, on (publicly available) real
world data. Mysql is a real dog in a lot of situations. The FCC
publishes a database of wireless transmitters that has tables with 10
million records in it. I'll pump that into pg, run some benchmarks,
real world queries, and we'll see who the faster database *really* is.
This is just a publicity issue, that's all. Its still annoying though.

I'll even run an open challenge to database admin to beat query
performance of postgres in such datasets, complex multi table joins,
etc. I'll even throw out the whole table locking issue and analyze
single user performance.

Merlin

_____________
How much of the performance difference is from the RDBMS, from the
middleware, and from the quality of implementation in the middleware.

While I'm not surprised that the the cygwin version of PostgreSQL is
slow, those results don't tell me anything about the quality of the
middleware interface between PHP and PostgreSQL. Does anyone know if we
can rule out some of the performance loss by pinning it to bad
middleware implementation for PostgreSQL?

Regards,

--
Greg Copeland <greg(at)copelandconsulting(dot)net>
Copeland Computer Consulting

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Merlin Moncure" <merlin(dot)moncure(at)rcsonline(dot)com>
Cc:	"PostgresSQL Hackers Mailing List" <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Changing the default configuration (was Re: [HACKERS] PostgreSQL Benchmarks)
Date:	2003-02-11 16:20:14
Message-ID:	26004.1044980414@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

"Merlin Moncure" <merlin(dot)moncure(at)rcsonline(dot)com> writes:
> May I make a suggestion that maybe it is time to start thinking about
> tuning the default config file, IMHO its just a little bit too
> conservative,

It's a lot too conservative. I've been thinking for awhile that we
should adjust the defaults.

The original motivation for setting shared_buffers = 64 was so that
Postgres would start out-of-the-box on machines where SHMMAX is 1 meg
(64 buffers = 1/2 meg, leaving 1/2 meg for our other shared data
structures). At one time SHMMAX=1M was a pretty common stock kernel
setting. But our other data structures blew past the 1/2 meg mark
some time ago; at default settings the shmem request is now close to
1.5 meg. So people with SHMMAX=1M have already got to twiddle their
postgresql.conf settings, or preferably learn how to increase SHMMAX.
That means there is *no* defensible reason anymore for defaulting to
64 buffers.

We could retarget to try to stay under SHMMAX=4M, which I think is
the next boundary that's significant in terms of real-world platforms
(isn't that the default SHMMAX on some BSDen?). That would allow us
350 or so shared_buffers, which is better, but still not really a
serious choice for production work.

What I would really like to do is set the default shared_buffers to
1000. That would be 8 meg worth of shared buffer space. Coupled with
more-realistic settings for FSM size, we'd probably be talking a shared
memory request approaching 16 meg. This is not enough RAM to bother
any modern machine from a performance standpoint, but there are probably
quite a few platforms out there that would need an increase in their
stock SHMMAX kernel setting before they'd take it.

So what this comes down to is making it harder for people to get
Postgres running for the first time, versus making it more likely that
they'll see decent performance when they do get it running.

It's worth noting that increasing SHMMAX is not nearly as painful as
it was back when these decisions were taken. Most people have moved
to platforms where it doesn't even take a kernel rebuild, and we've
acquired documentation that tells how to do it on all(?) our supported
platforms. So I think it might be okay to expect people to do it.

The alternative approach is to leave the settings where they are, and
to try to put more emphasis in the documentation on the fact that the
factory-default settings produce a toy configuration that you *must*
adjust upward for decent performance. But we've not had a lot of
success spreading that word, I think. With SHMMMAX too small, you
do at least get a pretty specific error message telling you so.

Comments?

regards, tom lane

From:	Greg Copeland <greg(at)CopelandConsulting(dot)Net>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-11 16:42:54
Message-ID:	1044981773.25889.165.camel@mouse.copelandconsulting.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Tue, 2003-02-11 at 10:20, Tom Lane wrote:
> "Merlin Moncure" <merlin(dot)moncure(at)rcsonline(dot)com> writes:
> > May I make a suggestion that maybe it is time to start thinking about
> > tuning the default config file, IMHO its just a little bit too
> > conservative,
>
> It's a lot too conservative. I've been thinking for awhile that we
> should adjust the defaults.
>
> The original motivation for setting shared_buffers = 64 was so that
> Postgres would start out-of-the-box on machines where SHMMAX is 1 meg
> (64 buffers = 1/2 meg, leaving 1/2 meg for our other shared data
> structures). At one time SHMMAX=1M was a pretty common stock kernel
> setting. But our other data structures blew past the 1/2 meg mark
> some time ago; at default settings the shmem request is now close to
> 1.5 meg. So people with SHMMAX=1M have already got to twiddle their
> postgresql.conf settings, or preferably learn how to increase SHMMAX.
> That means there is *no* defensible reason anymore for defaulting to
> 64 buffers.
>
> We could retarget to try to stay under SHMMAX=4M, which I think is
> the next boundary that's significant in terms of real-world platforms
> (isn't that the default SHMMAX on some BSDen?). That would allow us
> 350 or so shared_buffers, which is better, but still not really a
> serious choice for production work.
>
> What I would really like to do is set the default shared_buffers to
> 1000. That would be 8 meg worth of shared buffer space. Coupled with
> more-realistic settings for FSM size, we'd probably be talking a shared
> memory request approaching 16 meg. This is not enough RAM to bother
> any modern machine from a performance standpoint, but there are probably
> quite a few platforms out there that would need an increase in their
> stock SHMMAX kernel setting before they'd take it.
>
> So what this comes down to is making it harder for people to get
> Postgres running for the first time, versus making it more likely that
> they'll see decent performance when they do get it running.
>
> It's worth noting that increasing SHMMAX is not nearly as painful as
> it was back when these decisions were taken. Most people have moved
> to platforms where it doesn't even take a kernel rebuild, and we've
> acquired documentation that tells how to do it on all(?) our supported
> platforms. So I think it might be okay to expect people to do it.
>
> The alternative approach is to leave the settings where they are, and
> to try to put more emphasis in the documentation on the fact that the
> factory-default settings produce a toy configuration that you *must*
> adjust upward for decent performance. But we've not had a lot of
> success spreading that word, I think. With SHMMMAX too small, you
> do at least get a pretty specific error message telling you so.
>
> Comments?

I'd personally rather have people stumble trying to get PostgreSQL
running, up front, rather than allowing the lowest common denominator
more easily run PostgreSQL only to be disappointed with it and move on.

After it's all said and done, I would rather someone simply say, "it's
beyond my skill set", and attempt to get help or walk away. That seems
better than them being able to run it and say, "it's a dog", spreading
word-of-mouth as such after they left PostgreSQL behind. Worse yet,
those that do walk away and claim it performs horribly are probably
doing more harm to the PostgreSQL community than expecting someone to be
able to install software ever can.

Nutshell:
"Easy to install but is horribly slow."

"Took a couple of minutes to configure and it rocks!"

Seems fairly cut-n-dry to me. ;)

Regards,

--
Greg Copeland <greg(at)copelandconsulting(dot)net>
Copeland Computer Consulting

From:	Patrick Welche <prlw1(at)newn(dot)cam(dot)ac(dot)uk>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: Changing the default configuration (was Re: [HACKERS] PostgreSQL Benchmarks)
Date:	2003-02-11 16:44:34
Message-ID:	20030211164434.B15688@quartz.newn.cam.ac.uk
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Tue, Feb 11, 2003 at 11:20:14AM -0500, Tom Lane wrote:
...
> We could retarget to try to stay under SHMMAX=4M, which I think is
> the next boundary that's significant in terms of real-world platforms
> (isn't that the default SHMMAX on some BSDen?).
...

Assuming 1 page = 4k, and number of pages is correct in GENERIC kernel configs,
SHMMAX=4M for NetBSD (8M for i386, x86_64)

Cheers,

Patrick

From:	Jason Hihn <jhihn(at)paytimepayroll(dot)com>
To:
Cc:	PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-11 17:03:13
Message-ID:	NGBBLHANMLKMHPDGJGAPGEALCCAA.jhihn@paytimepayroll.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

>Nutshell:
> "Easy to install but is horribly slow."
>
> or
>
> "Took a couple of minutes to configure and it rocks!"

Since when is it easy to install on win32?
The easiest way I know of is through Cygwin, then you have to worry about
installing the IPC service (an getting the right version too!) I've
installed versions 6.1 to 7.1, but I almost gave up on the windows install.
At least in 6.x you had very comprehensive installation guide with a TOC.

Versus the competition which are you going to choose if you're a wanna-be
DBA? The one with all he hoops to jump through, or the one that comes with a
setup.exe?

Now I actually am in support of making it more aggressive, but it should
wait until we too have a setup.exe for the native windows port. (Changing it
on *n*x platforms is of little benefit because most benchmarks seem to run
it on w32 anyway :-( )

Just my $.02. I reserve the right to be wrong.
-J

From:	Justin Clift <justin(at)postgresql(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-11 17:08:22
Message-ID:	3E492E06.2030702@postgresql.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Tom Lane wrote:
<snip>
> What I would really like to do is set the default shared_buffers to
> 1000. That would be 8 meg worth of shared buffer space. Coupled with
> more-realistic settings for FSM size, we'd probably be talking a shared
> memory request approaching 16 meg. This is not enough RAM to bother
> any modern machine from a performance standpoint, but there are probably
> quite a few platforms out there that would need an increase in their
> stock SHMMAX kernel setting before they'd take it.
<snip>

Totally agree with this. We really, really, really, really need to get
the default to a point where we have _decent_ default performance.

> The alternative approach is to leave the settings where they are, and
> to try to put more emphasis in the documentation on the fact that the
> factory-default settings produce a toy configuration that you *must*
> adjust upward for decent performance. But we've not had a lot of
> success spreading that word, I think. With SHMMMAX too small, you
> do at least get a pretty specific error message telling you so.
>
> Comments?

Yep.

Here's an *unfortunately very common* scenario, that again
unfortunately, a _seemingly large_ amount of people fall for.

a) Someone decides to "benchmark" database XYZ vs PostgreSQL vs other
databases

b) Said benchmarking person knows very little about PostgreSQL, so they
install the RPM's, packages, or whatever, and "it works". Then they run
whatever benchmark they've downloaded, or designed, or whatever

c) PostgreSQL, being practically unconfigured, runs at the pace of a
slow, mostly-disabled snail.

d) Said benchmarking person gets better performance from the other
databases (also set to their default settings) and thinks "PostgreSQL
has lots of features, and it's free, but it's Too Slow".

Yes, this kind of testing shouldn't even _pretend_ to have any real
world credibility.

e) Said benchmarking person tells everyone they know, _and_ everyone
they meet about their results. Some of them even create nice looking or
profesional looking web pages about it.

f) People who know even _less_ than the benchmarking person hear about
the test, or read the result, and don't know any better than to believe
it at face value. So, they install whatever system was recommended.

g) Over time, the benchmarking person gets the hang of their chosen
database more and writes further articles about it, and doesn't
generally look any further afield than it for say... a couple of years.
By this time, they've already influenced a couple of thousand people
in the non-optimal direction.

h) Arrgh. With better defaults, our next release would _appear_ to be a
lot faster to quite a few people, just because they have no idea about
tuning.

So, as sad as this scenario is, better defaults will probably encourage
a lot more newbies to get involved, and that'll eventually translate
into a lot more experienced users, and a few more coders to assist. ;-)

Personally I'd be a bunch happier if we set the buffers so high that we
definitely have decent performance, and the people that want to run
PostgreSQL are forced to make the choice of either:

1) Adjust their system settings to allow PostgreSQL to run properly, or

2) Manually adjust the PostgreSQL settings to run memory-constrained

This way, PostgreSQL either runs decently, or they are _aware_ that
they're limiting it. That should cut down on the false benchmarks
(hopefully).

:-)

Regards and best wishes,

Justin Clift

> regards, tom lane

--
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
- Indira Gandhi

From:	Steve Crawford <scrawford(at)pinpointresearch(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Merlin Moncure" <merlin(dot)moncure(at)rcsonline(dot)com>
Cc:	"PostgresSQL Hackers Mailing List" <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: Changing the default configuration (was Re: [HACKERS] PostgreSQL Benchmarks)
Date:	2003-02-11 17:10:48
Message-ID:	20030211171048.579E1103F6@polaris.pinpointresearch.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

A quick-'n'-dirty first step would be more comments in postgresql.conf. Most
of the lines are commented out which would imply "use the default" but the
default is not shown. (I realize this has the difficulty of defaults that
change depending upon how PostgreSQL was configured/compiled but perhaps
postgresql.conf could be built by the make process based on the configuration
options.)

If postgresql.conf were commented with recommendations it would probably be
all I need though perhaps a recommendation to edit that file should be
displayed at the conclusion of "make install".

Cheers,
Steve

On Tuesday 11 February 2003 8:20 am, Tom Lane wrote:
> "Merlin Moncure" <merlin(dot)moncure(at)rcsonline(dot)com> writes:
> > May I make a suggestion that maybe it is time to start thinking about
> > tuning the default config file, IMHO its just a little bit too
> > conservative,
>
> It's a lot too conservative. I've been thinking for awhile that we
> should adjust the defaults.
>
> The original motivation for setting shared_buffers = 64 was so that
> Postgres would start out-of-the-box on machines where SHMMAX is 1 meg
> (64 buffers = 1/2 meg, leaving 1/2 meg for our other shared data
> structures). At one time SHMMAX=1M was a pretty common stock kernel
> setting. But our other data structures blew past the 1/2 meg mark
> some time ago; at default settings the shmem request is now close to
> 1.5 meg. So people with SHMMAX=1M have already got to twiddle their
> postgresql.conf settings, or preferably learn how to increase SHMMAX.
> That means there is *no* defensible reason anymore for defaulting to
> 64 buffers.
>
> We could retarget to try to stay under SHMMAX=4M, which I think is
> the next boundary that's significant in terms of real-world platforms
> (isn't that the default SHMMAX on some BSDen?). That would allow us
> 350 or so shared_buffers, which is better, but still not really a
> serious choice for production work.
>
> What I would really like to do is set the default shared_buffers to
> 1000. That would be 8 meg worth of shared buffer space. Coupled with
> more-realistic settings for FSM size, we'd probably be talking a shared
> memory request approaching 16 meg. This is not enough RAM to bother
> any modern machine from a performance standpoint, but there are probably
> quite a few platforms out there that would need an increase in their
> stock SHMMAX kernel setting before they'd take it.
>
> So what this comes down to is making it harder for people to get
> Postgres running for the first time, versus making it more likely that
> they'll see decent performance when they do get it running.
>
> It's worth noting that increasing SHMMAX is not nearly as painful as
> it was back when these decisions were taken. Most people have moved
> to platforms where it doesn't even take a kernel rebuild, and we've
> acquired documentation that tells how to do it on all(?) our supported
> platforms. So I think it might be okay to expect people to do it.
>
> The alternative approach is to leave the settings where they are, and
> to try to put more emphasis in the documentation on the fact that the
> factory-default settings produce a toy configuration that you *must*
> adjust upward for decent performance. But we've not had a lot of
> success spreading that word, I think. With SHMMMAX too small, you
> do at least get a pretty specific error message telling you so.
>
> Comments?
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org

From:	mlw <pgsql(at)mohawksoft(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-11 17:12:04
Message-ID:	3E492EE4.1080502@mohawksoft.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Tom Lane wrote:

>"Merlin Moncure" <merlin(dot)moncure(at)rcsonline(dot)com> writes:
>
>
>>May I make a suggestion that maybe it is time to start thinking about
>>tuning the default config file, IMHO its just a little bit too
>>conservative,
>>
>>
>
>It's a lot too conservative. I've been thinking for awhile that we
>should adjust the defaults.
>
>
>
One of the things I did on my Windows install was to have a number of
default configuration files, postgresql.conf.small,
postgresql.conf.medium, postgresql.conf.large.

Rather than choose one, in the "initdb" script, ask for or determine the
mount of shared memory, memory, etc.

Another pet peeve I have is forcing the configuration files to be in the
database directory. We had this argument in 7.1 days, and I submitted a
patch that allowed a configuration file to be specified as a command
line parameter. One of the things that Oracle does better is separating
the "configuration" from the data.

It is an easy patch to allow PostgreSQL to use a separate configuration
directory, and specify the data directory within the configuration file
(The way any logical application works), and, NO, symlinks are not a
solution, they are a kludge.

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Justin Clift <justin(at)postgresql(dot)org>
Cc:	Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: Changing the default configuration (was Re: [HACKERS] PostgreSQL Benchmarks)
Date:	2003-02-11 17:18:10
Message-ID:	26500.1044983890@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Justin Clift <justin(at)postgresql(dot)org> writes:
> Personally I'd be a bunch happier if we set the buffers so high that we
> definitely have decent performance, and the people that want to run
> PostgreSQL are forced to make the choice of either:
> 1) Adjust their system settings to allow PostgreSQL to run properly, or
> 2) Manually adjust the PostgreSQL settings to run memory-constrained
> This way, PostgreSQL either runs decently, or they are _aware_ that
> they're limiting it.

Yeah, that is the subtext here. If you can't increase SHMMAX then you
can always trim the postgresql.conf parameters --- but theoretically,
at least, you should then have a clue that you're running a
badly-configured setup ...

regards, tom lane

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	Justin Clift <justin(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-11 17:18:48
Message-ID:	200302110918.48614.josh@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Tom, Justin,

> > What I would really like to do is set the default shared_buffers to
> > 1000. That would be 8 meg worth of shared buffer space. Coupled with
> > more-realistic settings for FSM size, we'd probably be talking a shared
> > memory request approaching 16 meg. This is not enough RAM to bother
> > any modern machine from a performance standpoint, but there are probably
> > quite a few platforms out there that would need an increase in their
> > stock SHMMAX kernel setting before they'd take it.

What if we supplied several sample .conf files, and let the user choose which
to copy into the database directory? We could have a "high read
performance" profile, and a "transaction database" profile, and a
"workstation" profile, and a "low impact" profile. We could even supply a
Perl script that would adjust SHMMAX and SHMMALL on platforms where this can
be done from the command line.

--
Josh Berkus
Aglio Database Solutions
San Francisco

From:	Rod Taylor <rbt(at)rbt(dot)ca>
To:	Steve Crawford <scrawford(at)pinpointresearch(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, Postgresql Advocacy <pgsql-advocacy(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-11 17:21:13
Message-ID:	1044984072.79087.6.camel@jester
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Tue, 2003-02-11 at 12:10, Steve Crawford wrote:
> A quick-'n'-dirty first step would be more comments in postgresql.conf. Most

This will not solve the issue with the large number of users who have no
interest in looking at the config file -- but are interested in
publishing their results.

--
Rod Taylor <rbt(at)rbt(dot)ca>

PGP Key: http://www.rbt.ca/rbtpub.asc

From:	mlw <pgsql(at)mohawksoft(dot)com>
To:	Greg Copeland <greg(at)CopelandConsulting(dot)Net>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: [HACKERS] Changing the default configuration (was Re:
Date:	2003-02-11 17:23:42
Message-ID:	3E49319E.80206@mohawksoft.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Greg Copeland wrote:

>
>
>I'd personally rather have people stumble trying to get PostgreSQL
>running, up front, rather than allowing the lowest common denominator
>more easily run PostgreSQL only to be disappointed with it and move on.
>
>After it's all said and done, I would rather someone simply say, "it's
>beyond my skill set", and attempt to get help or walk away. That seems
>better than them being able to run it and say, "it's a dog", spreading
>word-of-mouth as such after they left PostgreSQL behind. Worse yet,
>those that do walk away and claim it performs horribly are probably
>doing more harm to the PostgreSQL community than expecting someone to be
>able to install software ever can.
>
<RANT>

And that my friends is why PostgreSQL is still relatively obscure.

This attitude sucks. If you want a product to be used, you must put the
effort into making it usable.

It is a no-brainer to make the default configuration file suitable for
the majority of users. It is lunacy to create a default configuration
which provides poor performance for over 90% of the users, but which
allows the lowest common denominator to work.

A product must not perform poorly out of the box, period. A good product
manager would choose one of two possible configurations, (a) a high
speed fairly optimized system from the get-go, or (b) it does not run
unless you create the configuration file. Option (c) out of the box it
works like crap, is not an option.

This is why open source gets such a bad reputation. Outright contempt
for the user who may not know the product as well as those developing
it. This attitude really sucks and it turns people off. We want people
to use PostgreSQL, to do that we must make PostgreSQL usable. Usability
IS important.
</RANT>

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Josh Berkus <josh(at)agliodbs(dot)com>
Cc:	Justin Clift <justin(at)postgresql(dot)org>, Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-11 17:26:05
Message-ID:	26582.1044984365@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> What if we supplied several sample .conf files, and let the user choose which
> to copy into the database directory? We could have a "high read
> performance" profile, and a "transaction database" profile, and a
> "workstation" profile, and a "low impact" profile.

Uh ... do we have a basis for recommending any particular sets of
parameters for these different scenarios? This could be a good idea
in the abstract, but I'm not sure I know enough to fill in the details.

A lower-tech way to accomplish the same result is to document these
alternatives in postgresql.conf comments and encourage people to review
that file, as Steve Crawford just suggested. But first we need the raw
knowledge.

regards, tom lane

From:	Kaare Rasmussen <kar(at)kakidata(dot)dk>
To:	PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration (was Re: [pgsql-advocacy]
Date:	2003-02-11 17:26:46
Message-ID:	200302111826.46674.kar@kakidata.dk
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

> What if we supplied several sample .conf files, and let the user choose
> which to copy into the database directory? We could have a "high read

Exactly my first thought when reading the proposal for a setting suited for
performance tests.

> performance" profile, and a "transaction database" profile, and a
> "workstation" profile, and a "low impact" profile. We could even supply a

And a .benchmark profile :-)

> Perl script that would adjust SHMMAX and SHMMALL on platforms where this
> can be done from the command line.

Or maybe configuration could be adjusted with ./configure if SHMMAX can be
determined at that point?

--
Kaare Rasmussen --Linux, spil,-- Tlf: 3816 2582
Kaki Data tshirts, merchandize Fax: 3816 2501
Howitzvej 75 Åben 12.00-18.00 Email: kar(at)kakidata(dot)dk
2000 Frederiksberg Lørdag 12.00-16.00 Web: www.suse.dk

From:	Justin Clift <justin(at)postgresql(dot)org>
To:	Josh Berkus <josh(at)agliodbs(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-11 17:29:19
Message-ID:	3E4932EF.3070604@postgresql.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Josh Berkus wrote:
> Tom, Justin,
<snip>
>
> What if we supplied several sample .conf files, and let the user choose which
> to copy into the database directory? We could have a "high read
> performance" profile, and a "transaction database" profile, and a
> "workstation" profile, and a "low impact" profile. We could even supply a
> Perl script that would adjust SHMMAX and SHMMALL on platforms where this can
> be done from the command line.

This might have value as the next step in the process of:

a) Are we going to have better defaults?

b) Let's stick with the current approach.

If we decide to go with better (changed) defaults, we may also be able
to figure out a way of having profiles that could optionally be chosen from.

As a longer term thought, it would be nice if the profiles weren't just
hard-coded example files, but more of:

pg_autotune --setprofile=xxx

Or similar utility, and it did all the work. Named profiles being one
capability, and other tuning measurements (i.e. cpu costings, disk
performance profiles, etc) being the others.

Regards and best wishes,

Justin Clift

From:	Justin Clift <justin(at)postgresql(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Josh Berkus <josh(at)agliodbs(dot)com>, Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-11 17:34:13
Message-ID:	3E493415.9090004@postgresql.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Tom Lane wrote:
<snip>
> Uh ... do we have a basis for recommending any particular sets of
> parameters for these different scenarios? This could be a good idea
> in the abstract, but I'm not sure I know enough to fill in the details.
>
> A lower-tech way to accomplish the same result is to document these
> alternatives in postgresql.conf comments and encourage people to review
> that file, as Steve Crawford just suggested. But first we need the raw
> knowledge.

Without too much hacking around, you could pretty easily adapt the
pg_autotune code to do proper profiles of a system with different settings.

i.e. increment one setting at a time, run pgbench on it with some decent
amount of transactions and users, stuff the results into a different
database. Aggregate data over time kind of thing. Let it run for a
week, etc.

If it's helpful, there's a 100% spare Althon 1.6Ghz box around with
(choose your OS) + Adaptec 29160 + 512MB RAM + 2 x 9GB Seagate Cheetah
10k rpm drives hanging around. No stress to set that up and let it run
any long terms tests you'd like plus send back results.

Regards and best wishes,

Justin Clift

> regards, tom lane

From:	Greg Copeland <greg(at)CopelandConsulting(dot)Net>
To:	mlw <pgsql(at)mohawksoft(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: [HACKERS] Changing the default configuration (was Re:
Date:	2003-02-11 17:36:17
Message-ID:	1044984976.2518.191.camel@mouse.copelandconsulting.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Tue, 2003-02-11 at 11:23, mlw wrote:
> Greg Copeland wrote:
>
> >
> >
> >I'd personally rather have people stumble trying to get PostgreSQL
> >running, up front, rather than allowing the lowest common denominator
> >more easily run PostgreSQL only to be disappointed with it and move on.
> >
> >After it's all said and done, I would rather someone simply say, "it's
> >beyond my skill set", and attempt to get help or walk away. That seems
> >better than them being able to run it and say, "it's a dog", spreading
> >word-of-mouth as such after they left PostgreSQL behind. Worse yet,
> >those that do walk away and claim it performs horribly are probably
> >doing more harm to the PostgreSQL community than expecting someone to be
> >able to install software ever can.
> >
> <RANT>
>
> And that my friends is why PostgreSQL is still relatively obscure.
>
> This attitude sucks. If you want a product to be used, you must put the
> effort into making it usable.
>

Ah..okay....

> It is a no-brainer to make the default configuration file suitable for
> the majority of users. It is lunacy to create a default configuration
> which provides poor performance for over 90% of the users, but which
> allows the lowest common denominator to work.
>

I think you read something into my email which I did not imply. I'm
certainly not advocating a default configuration file assuming 512M of
share memory or some such insane value.

Basically, you're arguing that they should keep doing exactly what they
are doing. It's currently known to be causing problems and propagating
the misconception that PostgreSQL is unable to perform under any
circumstance. I'm arguing that who cares if 5% of the potential user
base has to learn to properly install software. Either they'll read and
learn, ask for assistance, or walk away. All of which are better than
Jonny-come-lately offering up a meaningless benchmark which others are
happy to eat with rather large spoons.

> A product must not perform poorly out of the box, period. A good product
> manager would choose one of two possible configurations, (a) a high
> speed fairly optimized system from the get-go, or (b) it does not run
> unless you create the configuration file. Option (c) out of the box it
> works like crap, is not an option.
>

That's the problem. Option (c) is what we currently have. I'm amazed
that you even have a problem with option (a), as that's what I'm
suggesting. The problem is, potentially for some minority of users, it
may not run out of the box. As such, I'm more than happy with this
situation than 90% of the user base being stuck with a crappy default
configuration.

Oddly enough, your option (b) is even worse than what you are ranting at
me about. Go figure.

> This is why open source gets such a bad reputation. Outright contempt
> for the user who may not know the product as well as those developing
> it. This attitude really sucks and it turns people off. We want people
> to use PostgreSQL, to do that we must make PostgreSQL usable. Usability
> IS important.
> </RANT>

There is no contempt here. Clearly you've read your own bias into this
thread. If you go back and re-read my posting, I think it's VERY clear
that it's entirely about usability.

Regards,

--
Greg Copeland <greg(at)copelandconsulting(dot)net>
Copeland Computer Consulting

From:	"Jon Griffin" <jon(at)jongriffin(dot)com>
To:	<pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration (was Re: [pgsql-advocacy] PostgreSQL Benchmarks)
Date:	2003-02-11 17:38:18
Message-ID:	49391.192.168.1.12.1044985098.squirrel@mail.mayuli.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

FYI, my stock linux 2.4.19 gentoo kernel has:
kernel.shmall = 2097152
kernel.shmmax = 33554432

sysctl -a

So it appears that linux at least is way above your 8 meg point, unless I
am missing something.

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	Justin Clift <justin(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-11 17:48:39
Message-ID:	200302110948.39283.josh@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Tom, Justin,

> > Uh ... do we have a basis for recommending any particular sets of
> > parameters for these different scenarios? This could be a good idea
> > in the abstract, but I'm not sure I know enough to fill in the details.

Sure.
Mostly-Read database, few users, good hardware, complex queries:
= High shared buffers and sort mem, high geqo and join collapse thresholds,
moderate fsm settings, defaults for WAL.
Same as above with many users and simple queries (webserver) =
same as above, except lower sort mem and higher connection limit
High-Transaction Database =
Moderate shared buffers and sort mem, high FSM settings, increase WAL files
and buffers.
Workstation =
Moderate to low shared buffers and sort mem, moderate FSM, defaults for WAL,
etc.
Low-Impact server = current defaults, more or less.

While none of these settings will be *perfect* for anyone, they will be
considerably better than what's shipping with postgresql. And, based on my
"Learning Perl" knowledge, I'm pretty sure I could write the program.

All we'd need to do is argue out, on the PERFORMANCE list, what's a good value
for each profile. That's the tough part. The Perl script is easy.

> > A lower-tech way to accomplish the same result is to document these
> > alternatives in postgresql.conf comments and encourage people to review
> > that file, as Steve Crawford just suggested. But first we need the raw
> > knowledge.

That's also not a bad approach ... the CONF file should be more heavily
commented, period, regardless of what approach we take. I volunteer to work
on this with other participants.

> Without too much hacking around, you could pretty easily adapt the
> pg_autotune code to do proper profiles of a system with different settings.

No offense, Justin, but I don't know anyone else who's gotten your pg_autotune
script to run other than you. And pg_bench has not been useful performance
measure for any real database server I have worked on so far.

I'd be glad to help improve pg_autotune, with two caveats:
1) We will still need to figure out the "profiles" above so that we have
decent starting values.
2) I suggest that we do pg_autotune in Perl or Python or another higher-level
language. This would enable several performance buffs who don't do C to
contribute to it, and a performance-tuning script is a higher-level-language
sort of function, anyway.

--
Josh Berkus
Aglio Database Solutions
San Francisco

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Justin Clift <justin(at)postgresql(dot)org>
Cc:	Josh Berkus <josh(at)agliodbs(dot)com>, Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-11 17:52:55
Message-ID:	26850.1044985975@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Justin Clift <justin(at)postgresql(dot)org> writes:
> Tom Lane wrote:
>> Uh ... do we have a basis for recommending any particular sets of
>> parameters for these different scenarios? This could be a good idea
>> in the abstract, but I'm not sure I know enough to fill in the details.

> Without too much hacking around, you could pretty easily adapt the
> pg_autotune code to do proper profiles of a system with different settings.

> i.e. increment one setting at a time, run pgbench on it with some decent
> amount of transactions and users, stuff the results into a different
> database.

If I thought that pgbench was representative of anything, or even
capable of reliably producing repeatable numbers, then I might subscribe
to results derived this way. But I have little or no confidence in
pgbench. Certainly I don't see how you'd use it to produce
recommendations for a range of application scenarios, when it's only
one very narrow scenario itself.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	mlw <pgsql(at)mohawksoft(dot)com>
Cc:	Greg Copeland <greg(at)CopelandConsulting(dot)Net>, Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: [HACKERS] Changing the default configuration (was Re:
Date:	2003-02-11 17:54:37
Message-ID:	26874.1044986077@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

mlw <pgsql(at)mohawksoft(dot)com> writes:
> This attitude sucks. If you want a product to be used, you must put the
> effort into making it usable.
> [snip]

AFAICT, you are flaming Greg for recommending the exact same thing you
are recommending. Please calm down and read again.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Jon Griffin" <jon(at)jongriffin(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Changing the default configuration (was Re: [pgsql-advocacy] PostgreSQL Benchmarks)
Date:	2003-02-11 18:01:13
Message-ID:	26924.1044986473@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

"Jon Griffin" <jon(at)jongriffin(dot)com> writes:
> So it appears that linux at least is way above your 8 meg point, unless I
> am missing something.

Yeah, AFAIK all recent Linuxen are well above the range of parameters
that I was suggesting (and even if they weren't, Linux is particularly
easy to change the SHMMAX setting on). It's other Unixoid platforms
that are likely to have a problem. Particularly the ones where you
have to rebuild the kernel to change SHMMAX; people may be afraid to
do that.

Does anyone know whether cygwin has a setting comparable to SHMMAX,
and if so what is its default value? How about the upcoming native
Windows port --- any issues there?

regards, tom lane

From:	Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
To:	PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration (was Re: [pgsql-advocacy]
Date:	2003-02-11 18:03:45
Message-ID:	1044986625.12931.27.camel@camel
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Tue, 2003-02-11 at 12:08, Justin Clift wrote:
> b) Said benchmarking person knows very little about PostgreSQL, so they
> install the RPM's, packages, or whatever, and "it works". Then they run
> whatever benchmark they've downloaded, or designed, or whatever
>

Out of curiosity, how feasible is it for the rpm/package/deb/exe
maintainers to modify their supplied postgresql.conf settings when
building said distribution? AFAIK the minimum default SHHMAX setting on
Red Hat 8.0 is 32MB, seems like bumping shared buffers to work with that
amount would be acceptable inside the 8.0 rpm's.

Robert Treat

From:	mlw <pgsql(at)mohawksoft(dot)com>
To:	Greg Copeland <greg(at)copelandconsulting(dot)net>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: [HACKERS] Changing the default configuration (was Re:
Date:	2003-02-11 18:27:19
Message-ID:	3E494087.2010308@mohawksoft.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Apology

After Mark calms down and, in fact, sees that Greg was saying the right
thing after all, chagrin is the only word.

I'm sorry.

Greg Copeland wrote:

>On Tue, 2003-02-11 at 11:23, mlw wrote:
>
>
>>Greg Copeland wrote:
>>
>>
>>
>>>
>>>
>>>I'd personally rather have people stumble trying to get PostgreSQL
>>>running, up front, rather than allowing the lowest common denominator
>>>more easily run PostgreSQL only to be disappointed with it and move on.
>>>
>>>After it's all said and done, I would rather someone simply say, "it's
>>>beyond my skill set", and attempt to get help or walk away. That seems
>>>better than them being able to run it and say, "it's a dog", spreading
>>>word-of-mouth as such after they left PostgreSQL behind. Worse yet,
>>>those that do walk away and claim it performs horribly are probably
>>>doing more harm to the PostgreSQL community than expecting someone to be
>>>able to install software ever can.
>>>
>>>
>>>
>><RANT>
>>
>>And that my friends is why PostgreSQL is still relatively obscure.
>>
>>This attitude sucks. If you want a product to be used, you must put the
>>effort into making it usable.
>>
>>
>>
>
>
>Ah..okay....
>
>
>
>
>>It is a no-brainer to make the default configuration file suitable for
>>the majority of users. It is lunacy to create a default configuration
>>which provides poor performance for over 90% of the users, but which
>>allows the lowest common denominator to work.
>>
>>
>>
>
>I think you read something into my email which I did not imply. I'm
>certainly not advocating a default configuration file assuming 512M of
>share memory or some such insane value.
>
>Basically, you're arguing that they should keep doing exactly what they
>are doing. It's currently known to be causing problems and propagating
>the misconception that PostgreSQL is unable to perform under any
>circumstance. I'm arguing that who cares if 5% of the potential user
>base has to learn to properly install software. Either they'll read and
>learn, ask for assistance, or walk away. All of which are better than
>Jonny-come-lately offering up a meaningless benchmark which others are
>happy to eat with rather large spoons.
>
>
>
>
>>A product must not perform poorly out of the box, period. A good product
>>manager would choose one of two possible configurations, (a) a high
>>speed fairly optimized system from the get-go, or (b) it does not run
>>unless you create the configuration file. Option (c) out of the box it
>>works like crap, is not an option.
>>
>>
>>
>
>That's the problem. Option (c) is what we currently have. I'm amazed
>that you even have a problem with option (a), as that's what I'm
>suggesting. The problem is, potentially for some minority of users, it
>may not run out of the box. As such, I'm more than happy with this
>situation than 90% of the user base being stuck with a crappy default
>configuration.
>
>Oddly enough, your option (b) is even worse than what you are ranting at
>me about. Go figure.
>
>
>
>>This is why open source gets such a bad reputation. Outright contempt
>>for the user who may not know the product as well as those developing
>>it. This attitude really sucks and it turns people off. We want people
>>to use PostgreSQL, to do that we must make PostgreSQL usable. Usability
>>IS important.
>></RANT>
>>
>>
>
>
>There is no contempt here. Clearly you've read your own bias into this
>thread. If you go back and re-read my posting, I think it's VERY clear
>that it's entirely about usability.
>
>
>Regards,
>
>
>

From:	"scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To:	PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration (was Re: [pgsql-advocacy]
Date:	2003-02-11 18:34:32
Message-ID:	Pine.LNX.4.33.0302111132030.29884-100000@css120.ihs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

My other pet peeve is the default max connections setting. This should be
higher if possible, but of course, there's always the possibility of
running out of file descriptors.

Apache has a default max children of 150, and if using PHP or another
language that runs as an apache module, it is quite possible to use up all
the pgsql backend slots before using up all the apache child slots.

Is setting the max connections to something like 200 reasonable, or likely
to cause too many problems?

From:	"Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>
To:	pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject:	Re: Changing the default configuration (was Re: [pgsql-advocacy]
Date:	2003-02-11 18:53:51
Message-ID:	1044989631.32185.151.camel@zeutrh80
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Tue, 2003-02-11 at 13:01, Tom Lane wrote:
> "Jon Griffin" <jon(at)jongriffin(dot)com> writes:
> > So it appears that linux at least is way above your 8 meg point, unless I
> > am missing something.
>
> Yeah, AFAIK all recent Linuxen are well above the range of parameters
> that I was suggesting (and even if they weren't, Linux is particularly
> easy to change the SHMMAX setting on). It's other Unixoid platforms
> that are likely to have a problem. Particularly the ones where you
> have to rebuild the kernel to change SHMMAX; people may be afraid to
> do that.

The issue as I see it is:
Better performing vs. More Compatible Out of the box Defaults.

Perhaps a compromise (hack?):
Set the default to some default value that performs well, a value we all
agree is not too big (16M? 32M?). On startup, if the OS can't give us
what we want, instead of failing, we can try again with a smaller
amount, perhaps half the default, if that fails try again with half
until we reach some bottom threshold (1M?).

The argument against this might be: When I set shared_buffers=X, I want
X shared buffers. I don't want it to fail silently and give me less than
what I need / want. To address this we might want to add a guc option
that controls this behavior. So we ship postgresql.conf with 32M of
shared memory and auto_shared_mem_reduction = true. With a comment that
the administrator might want to turn this off for production.

Thoughts?

I think this will allow most uninformed users get decent performing
defaults as most systems will accommodate this larger value.

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
Cc:	PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration (was Re: [pgsql-advocacy]
Date:	2003-02-11 18:55:29
Message-ID:	27232.1044989729@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

"scott.marlowe" <scott(dot)marlowe(at)ihs(dot)com> writes:
> Is setting the max connections to something like 200 reasonable, or likely
> to cause too many problems?

That would likely run into number-of-semaphores limitations (SEMMNI,
SEMMNS). We do not seem to have as good documentation about changing
that as we do about changing the SHMMAX setting, so I'm not sure I want
to buy into the "it's okay to expect people to fix this before they can
start Postgres the first time" argument here.

Also, max-connections doesn't silently skew your testing: if you need
to raise it, you *will* know it.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Changing the default configuration (was Re: [pgsql-advocacy] PostgreSQL Benchmarks)
Date:	2003-02-11 19:06:32
Message-ID:	27295.1044990392@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

"Matthew T. O'Connor" <matthew(at)zeut(dot)net> writes:
> ... So we ship postgresql.conf with 32M of
> shared memory and auto_shared_mem_reduction = true. With a comment that
> the administrator might want to turn this off for production.

This really doesn't address Justin's point about clueless benchmarkers,
however. In fact I fear it would make that problem worse: if Joe Blow
says he got horrible performance, who knows whether he was running with
a reasonable number of buffers or not? Especially when you ask him
"did you have lots of shared buffers" and he responds "yes, of course,
it says 32M right here".

We've recently been moving away from the notion that it's okay to
silently lose functionality in order to run on a given system. For
example, if you want to install without readline, you now have to
explicitly tell configure that, because we heard "why don't I have
history in psql" way too often from people who just ran configure
and paid no attention to what it told them.

I think that what this discussion is really leading up to is that we
are going to decide to apply the same principle to performance. The
out-of-the-box settings ought to give reasonable performance, and if
your system can't handle it, you should have to take explicit action
to acknowledge the fact that you aren't going to get reasonable
performance.

regards, tom lane

From:	Greg Copeland <greg(at)CopelandConsulting(dot)Net>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-11 19:16:15
Message-ID:	1044990974.2501.207.camel@mouse.copelandconsulting.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Tue, 2003-02-11 at 12:55, Tom Lane wrote:
> "scott.marlowe" <scott(dot)marlowe(at)ihs(dot)com> writes:
> > Is setting the max connections to something like 200 reasonable, or likely
> > to cause too many problems?
>
> That would likely run into number-of-semaphores limitations (SEMMNI,
> SEMMNS). We do not seem to have as good documentation about changing
> that as we do about changing the SHMMAX setting, so I'm not sure I want
> to buy into the "it's okay to expect people to fix this before they can
> start Postgres the first time" argument here.
>
> Also, max-connections doesn't silently skew your testing: if you need
> to raise it, you *will* know it.
>

Besides, I'm not sure that it makes sense to let other product needs
dictate the default configurations for this one. It would be one thing
if the vast majority of people only used PostgreSQL with Apache. I know
I'm using it in environments in which no way relate to the web. I'm
thinking I'm not alone.

Regards,

--
Greg Copeland <greg(at)copelandconsulting(dot)net>
Copeland Computer Consulting

From:	Jeff Hoffmann <jeff(at)propertykey(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Changing the default configuration (was Re: [pgsql-advocacy]
Date:	2003-02-11 19:36:05
Message-ID:	3E4950A5.1050905@propertykey.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Tom Lane wrote:

> I think that what this discussion is really leading up to is that we
> are going to decide to apply the same principle to performance. The
> out-of-the-box settings ought to give reasonable performance, and if
> your system can't handle it, you should have to take explicit action
> to acknowledge the fact that you aren't going to get reasonable
> performance.

What I don't understand is why this is such a huge issue. Set it to a
reasonable level (be it 4M or whatever the concensus is) & let the
packagers worry about it if that's not appropriate. Isn't it their job
to have a good out-of-the-package experience? Won't they have better
knowledge of what the system limits are for the packages they develop
for? Worst case, couldn't they have a standard conf package & a special
"high-performance" conf package in addition to all the base packages?
After all, it's the users of the RPMs that are the real problem, not
usually the people that compile it on their own. If you were having
problems with the "compile-it-yourself" audience, couldn't you just hit
them over the head three or four times (configure, install, initdb &
failed startup to name a few) reminding them to change it if it wasn't
appropriate. What more can you really do? At some point, the end user
has to bear some responsibility...

Jeff Hoffmann
PropertyKey.com

From:	"scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration (was Re: [pgsql-advocacy]
Date:	2003-02-11 19:54:06
Message-ID:	Pine.LNX.4.33.0302111224240.29989-100000@css120.ihs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Tue, 11 Feb 2003, Tom Lane wrote:

> "scott.marlowe" <scott(dot)marlowe(at)ihs(dot)com> writes:
> > Is setting the max connections to something like 200 reasonable, or likely
> > to cause too many problems?
>
> That would likely run into number-of-semaphores limitations (SEMMNI,
> SEMMNS). We do not seem to have as good documentation about changing
> that as we do about changing the SHMMAX setting, so I'm not sure I want
> to buy into the "it's okay to expect people to fix this before they can
> start Postgres the first time" argument here.
>
> Also, max-connections doesn't silently skew your testing: if you need
> to raise it, you *will* know it.

True, but unfortunately, the time you usually learn that the first time is
when your web server starts issuing error messages about not being able to
connect to the database. i.e. it fails at the worst possible time.

OK. I just did some very simple testing in RH Linux 7.2 and here's what I
found about file handles: default max appears to be 8192 now, not 4096.

With max file handles set to 4096, I run out of handles when opening about
450 or more simultaneous connections. At 8192, the default for RH72, I
pretty much run out of memory on a 512 Meg box and start swapping
massively long before I can exhaust the file handle pool.

At 200 connections, I use about half of all my file descriptors out of
4096, which seems pretty safe to me.

Note that setting the max connections to 200 in the conf does NOT result
in huge allocations of file handles right away, but only while the
database is under load, so this leads us to the other possible problem,
that the database will exhaust file handles if we set this number too
high, as opposed to not being able to connect because it's too low.

I'm guessing that 200 or less is pretty safe on most modern flavors of
Unix, but I'm not one of those folks who keeps the older flavors happy
really, so I can't speak for them.

Back in the day, a P100 with 30 or 40 connections was a heavy load,
nowadays, a typical workstation has 512 Meg ram or more, and a 1.5+GHz
CPU, so I can see increasing this setting too. I'd rather the only issue
for the user be adjusting their kernel than having to up the connection
limit in postgresql. I can up the max file handles in Linux on the fly,
with no one noticeing it, I have to stop and restart postgresql to make
the max backends take affect, so that's another reason not to have too low
a limit.

Is there a place on the web somewhere that lists the default settings for
most major unixes for file handles, inodes, and shared memory?

From:	"scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To:	Greg Copeland <greg(at)CopelandConsulting(dot)Net>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-11 20:10:17
Message-ID:	Pine.LNX.4.33.0302111300160.29989-100000@css120.ihs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On 11 Feb 2003, Greg Copeland wrote:

> On Tue, 2003-02-11 at 12:55, Tom Lane wrote:
> > "scott.marlowe" <scott(dot)marlowe(at)ihs(dot)com> writes:
> > > Is setting the max connections to something like 200 reasonable, or likely
> > > to cause too many problems?
> >
> > That would likely run into number-of-semaphores limitations (SEMMNI,
> > SEMMNS). We do not seem to have as good documentation about changing
> > that as we do about changing the SHMMAX setting, so I'm not sure I want
> > to buy into the "it's okay to expect people to fix this before they can
> > start Postgres the first time" argument here.
> >
> > Also, max-connections doesn't silently skew your testing: if you need
> > to raise it, you *will* know it.
> >
>
> Besides, I'm not sure that it makes sense to let other product needs
> dictate the default configurations for this one. It would be one thing
> if the vast majority of people only used PostgreSQL with Apache. I know
> I'm using it in environments in which no way relate to the web. I'm
> thinking I'm not alone.

True, but even so, 32 max connections is a bit light. I have more
pgsql databases than that on my box now. My point in my previous answer
to Tom was that you HAVE to shut down postgresql to change this. It
doesn't allocate tons of semaphores on startup, just when the child
processes are spawned, and I'd rather have the user adjust their OS to
meet the higher need than have to shut down and restart postgresql as
well. This is one of the settings that make it feel like a "toy" when you
first open it.

How many other high quality databases in the whole world restrict max
connections to 32? The original choice of 32 was set because the original
choice of 64 shared memory blocks as the most we could hope for on common
OS installs. Now that we're looking at cranking that up to 1000,
shouldn't max connections get a look too?

You don't have to be using apache to need more than 32 simo connections.
Heck, how many postgresql databases do you figure are in production with
that setting still in there? My guess is not many.

I'm not saying we should do this to make benchmarks better either, I'm
saying we should do it to improve the user experience. A limit of 32
connects makes things tough for a beginning DBA, not only does he find out
the problem while his database is under load the first time, but then he
can't fix it without shutting down and restarting postgresql. If the max
is set to 200 or 500 and he starts running out of semaphores, that's a
problem he can address while his database is still up and running in most
operating systems, at least in the ones I use.

So, my main point is that any setting that requires you to shut down
postgresql to make the change, we should pick a compromise value that
means you never likely will have to shut down the database once you've
started it up and it's under load. shared buffers, max connects, etc...
should not need tweaking for 95% or more of the users if we can help it.
It would be nice if we could find a set of numbers that reduce the number
of problems users have, so all I'm doing is looking for the sweetspot,
which is NOT 32 max connections.

From:	Peter Eisentraut <peter_e(at)gmx(dot)net>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, <pgsql-advocacy(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-11 21:13:37
Message-ID:	Pine.LNX.4.44.0302112211150.7753-100000@peter.localdomain
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Tom Lane writes:

> We could retarget to try to stay under SHMMAX=4M, which I think is
> the next boundary that's significant in terms of real-world platforms
> (isn't that the default SHMMAX on some BSDen?). That would allow us
> 350 or so shared_buffers, which is better, but still not really a
> serious choice for production work.

What is a serious choice for production work? And what is the ideal
choice? The answer probably involves some variables, but maybe we should
get values for those variables in each case and work from there.

--
Peter Eisentraut peter_e(at)gmx(dot)net

From:	Lamar Owen <lamar(dot)owen(at)wgcr(dot)org>
To:	Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration (was Re: [pgsql-advocacy]
Date:	2003-02-11 21:53:39
Message-ID:	200302111653.39763.lamar.owen@wgcr.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Tuesday 11 February 2003 13:03, Robert Treat wrote:
> On Tue, 2003-02-11 at 12:08, Justin Clift wrote:
> > b) Said benchmarking person knows very little about PostgreSQL, so they
> > install the RPM's, packages, or whatever, and "it works". Then they run
> > whatever benchmark they've downloaded, or designed, or whatever

> Out of curiosity, how feasible is it for the rpm/package/deb/exe
> maintainers to modify their supplied postgresql.conf settings when
> building said distribution? AFAIK the minimum default SHHMAX setting on
> Red Hat 8.0 is 32MB, seems like bumping shared buffers to work with that
> amount would be acceptable inside the 8.0 rpm's.

Yes, this is easy to do. But what is a sane default? I can patch any file
I'd like to, but my preference is to patch as little as possible, as I'm
trying to be generic here. I can't assume Red Hat 8 in the source RPM, and
my binaries are to be preferred only if the distributor doesn't have updated
ones.
--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11

From:	"Rick Gigger" <rick(at)alpinenetworking(dot)com>
To:	"Greg Copeland" <greg(at)CopelandConsulting(dot)Net>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Merlin Moncure" <merlin(dot)moncure(at)rcsonline(dot)com>, "PostgresSQL Hackers Mailing List" <pgsql-hackers(at)postgresql(dot)org>, <pgsql-advocacy(at)postgresql(dot)org>
Subject:	Re: [HACKERS] Changing the default configuration (was Re:
Date:	2003-02-12 00:25:29
Message-ID:	006c01c2d22d$4028c450$0a00000a@grouch
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

> On Tue, 2003-02-11 at 10:20, Tom Lane wrote:
> > "Merlin Moncure" <merlin(dot)moncure(at)rcsonline(dot)com> writes:
> > > May I make a suggestion that maybe it is time to start thinking about
> > > tuning the default config file, IMHO its just a little bit too
> > > conservative,
> >
> > It's a lot too conservative. I've been thinking for awhile that we
> > should adjust the defaults.
> >
> > The original motivation for setting shared_buffers = 64 was so that
> > Postgres would start out-of-the-box on machines where SHMMAX is 1 meg
> > (64 buffers = 1/2 meg, leaving 1/2 meg for our other shared data
> > structures). At one time SHMMAX=1M was a pretty common stock kernel
> > setting. But our other data structures blew past the 1/2 meg mark
> > some time ago; at default settings the shmem request is now close to
> > 1.5 meg. So people with SHMMAX=1M have already got to twiddle their
> > postgresql.conf settings, or preferably learn how to increase SHMMAX.
> > That means there is *no* defensible reason anymore for defaulting to
> > 64 buffers.
> >
> > We could retarget to try to stay under SHMMAX=4M, which I think is
> > the next boundary that's significant in terms of real-world platforms
> > (isn't that the default SHMMAX on some BSDen?). That would allow us
> > 350 or so shared_buffers, which is better, but still not really a
> > serious choice for production work.
> >
> > What I would really like to do is set the default shared_buffers to
> > 1000. That would be 8 meg worth of shared buffer space. Coupled with
> > more-realistic settings for FSM size, we'd probably be talking a shared
> > memory request approaching 16 meg. This is not enough RAM to bother
> > any modern machine from a performance standpoint, but there are probably
> > quite a few platforms out there that would need an increase in their
> > stock SHMMAX kernel setting before they'd take it.
> >
> > So what this comes down to is making it harder for people to get
> > Postgres running for the first time, versus making it more likely that
> > they'll see decent performance when they do get it running.
> >
> > It's worth noting that increasing SHMMAX is not nearly as painful as
> > it was back when these decisions were taken. Most people have moved
> > to platforms where it doesn't even take a kernel rebuild, and we've
> > acquired documentation that tells how to do it on all(?) our supported
> > platforms. So I think it might be okay to expect people to do it.
> >
> > The alternative approach is to leave the settings where they are, and
> > to try to put more emphasis in the documentation on the fact that the
> > factory-default settings produce a toy configuration that you *must*
> > adjust upward for decent performance. But we've not had a lot of
> > success spreading that word, I think. With SHMMMAX too small, you
> > do at least get a pretty specific error message telling you so.
> >
> > Comments?
>
> I'd personally rather have people stumble trying to get PostgreSQL
> running, up front, rather than allowing the lowest common denominator
> more easily run PostgreSQL only to be disappointed with it and move on.
>
> After it's all said and done, I would rather someone simply say, "it's
> beyond my skill set", and attempt to get help or walk away. That seems
> better than them being able to run it and say, "it's a dog", spreading
> word-of-mouth as such after they left PostgreSQL behind. Worse yet,
> those that do walk away and claim it performs horribly are probably
> doing more harm to the PostgreSQL community than expecting someone to be
> able to install software ever can.
>
> Nutshell:
> "Easy to install but is horribly slow."
>
> or
>
> "Took a couple of minutes to configure and it rocks!"
>
>
>
> Seems fairly cut-n-dry to me. ;)

The type of person who can't configure it or doesnt' think to try is
probably not doing a project that requires any serious performance. As long
as you are running it on decent hardware postgres will run fantastic for
anything but a very heavy load. I think there may be many people out there
who have little experience but want an RDBMS to manage their data. Those
people need something very, very easy. Look at the following that mysql
gets despite how poor of a product it is. It's very, very easy. Mysql
works great for many peoples needs but then when they need to do something
real they need to move to a different database entirely. I think there is a
huge advantage to having a product that can be set up very quickly out of
the box. Those who need serious performance, hopefully for ther employers
sake, will be more like to take a few minutes to do some quick performance
tuning.

Rick Gigger

From:	Curt Sampson <cjs(at)cynic(dot)net>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-12 00:41:45
Message-ID:	Pine.NEB.4.51.0302120934450.6267@angelic.cynic.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Tue, 11 Feb 2003, Tom Lane wrote:

> It's a lot too conservative. I've been thinking for awhile that we
> should adjust the defaults.

Some of these issues could be made to Just Go Away with some code
changes. For example, using mmap rather than SysV shared memory
would automatically optimize your memory usage, and get rid of the
double-buffering problem as well. If we could find a way to avoid using
semephores proportional to the number of connections we have, then you
wouldn't have to worry about that configuration parameter, either.

In fact, some of this stuff might well improve our portability, too.
For example, mmap is a POSIX standard, whereas shmget is only an X/Open
standard. That makes me suspect that mmap is more widely available on
non-Unix platforms. (But I could be wrong.)

cjs
--
Curt Sampson <cjs(at)cynic(dot)net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC

From:	"scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To:	Rick Gigger <rick(at)alpinenetworking(dot)com>
Cc:	Greg Copeland <greg(at)CopelandConsulting(dot)Net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, <pgsql-advocacy(at)postgresql(dot)org>
Subject:	Re: [HACKERS] Changing the default configuration (was Re:
Date:	2003-02-12 00:42:06
Message-ID:	Pine.LNX.4.33.0302111734290.1977-100000@css120.ihs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Tue, 11 Feb 2003, Rick Gigger wrote:

> The type of person who can't configure it or doesnt' think to try is
> probably not doing a project that requires any serious performance. As long
> as you are running it on decent hardware postgres will run fantastic for
> anything but a very heavy load. I think there may be many people out there
> who have little experience but want an RDBMS to manage their data. Those
> people need something very, very easy. Look at the following that mysql
> gets despite how poor of a product it is. It's very, very easy. Mysql
> works great for many peoples needs but then when they need to do something
> real they need to move to a different database entirely. I think there is a
> huge advantage to having a product that can be set up very quickly out of
> the box. Those who need serious performance, hopefully for ther employers
> sake, will be more like to take a few minutes to do some quick performance
> tuning.

Very good point. I'm pushing for changes that will NOT negatively impact
joe beginner on the major platforms (Linux, BSD, Windows) in terms of
install. I figure anyone installing on big iron already knows enough
about their OS we don't have to worry about shared buffers being too big
for that machine.

So, a compromise of faster performance out of the box, with little or no
negative user impact seems the sweet spot here.

I'm thinking a good knee setting for each one, where not too much memory /
semaphores / file handles get gobbled up, but the database isn't pokey.

The poor performance of Postgresql in it's current default configuration
HAS cost us users, trust me, I know a few we've almost lost where I work
that I converted after some quick tweaking of their database.

In it's stock form Postgresql is very slow at large simple queries, like
'select * from table1 t1 natural join table2 t2 where t1.field='a'; where
you get back something like 10,000 rows. The real bottleneck here is
sort_mem. A simple bump up to 8192 or so makes the database much more
responsive.

If we're looking at changing default settings for 7.4, then we should look
at changing ALL of them that matter, since we'll have the most time to
shake out problems if we do them early, and we won't have four or five
rounds of setting different defaults over time and finding the limitations
of the HOST OSes one at a time.

From:	"scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To:	Curt Sampson <cjs(at)cynic(dot)net>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, <pgsql-advocacy(at)postgresql(dot)org>
Subject:	Re: [HACKERS] Changing the default configuration (was Re:
Date:	2003-02-12 01:02:09
Message-ID:	Pine.LNX.4.33.0302111800080.1977-100000@css120.ihs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Wed, 12 Feb 2003, Curt Sampson wrote:

> On Tue, 11 Feb 2003, Tom Lane wrote:
>
> > It's a lot too conservative. I've been thinking for awhile that we
> > should adjust the defaults.
>
> Some of these issues could be made to Just Go Away with some code
> changes. For example, using mmap rather than SysV shared memory
> would automatically optimize your memory usage, and get rid of the
> double-buffering problem as well. If we could find a way to avoid using
> semephores proportional to the number of connections we have, then you
> wouldn't have to worry about that configuration parameter, either.
>
> In fact, some of this stuff might well improve our portability, too.
> For example, mmap is a POSIX standard, whereas shmget is only an X/Open
> standard. That makes me suspect that mmap is more widely available on
> non-Unix platforms. (But I could be wrong.)

I'll vote for mmap. I use the mm libs with apache/openldap/authldap and
it is very fast and pretty common nowadays. It seems quite stable as
well.

From:	Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To:	tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc:	justin(at)postgresql(dot)org, josh(at)agliodbs(dot)com, merlin(dot)moncure(at)rcsonline(dot)com, pgsql-hackers(at)postgresql(dot)org, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: [HACKERS] Changing the default configuration
Date:	2003-02-12 01:10:00
Message-ID:	20030212.101000.74752335.t-ishii@sra.co.jp
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

> If I thought that pgbench was representative of anything, or even
> capable of reliably producing repeatable numbers, then I might subscribe
> to results derived this way. But I have little or no confidence in
> pgbench. Certainly I don't see how you'd use it to produce
> recommendations for a range of application scenarios, when it's only
> one very narrow scenario itself.

Sigh. People always complain "pgbench does not reliably producing
repeatable numbers" or something then say "that's because pgbench's
transaction has too much contention on the branches table". So I added
-N option to pgbench which makes pgbench not to do any UPDATE to
the branches table. But still people continue to complian...

There should be many factors that would produce non-repeatable
results exist, for instance kenel buffer, PostgreSQL's buffer manager,
pgbench itself etc. etc... So far it seems no one has ever made clean
explanation why non-repeatable results happen...
--
Tatsuo Ishii

From:	Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To:	scott(dot)marlowe(at)ihs(dot)com
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Changing the default configuration
Date:	2003-02-12 01:10:08
Message-ID:	20030212.101008.41630074.t-ishii@sra.co.jp
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

> My other pet peeve is the default max connections setting. This should be
> higher if possible, but of course, there's always the possibility of
> running out of file descriptors.
>
> Apache has a default max children of 150, and if using PHP or another
> language that runs as an apache module, it is quite possible to use up all
> the pgsql backend slots before using up all the apache child slots.
>
> Is setting the max connections to something like 200 reasonable, or likely
> to cause too many problems?

It likely. First you will ran out kernel file descriptors. This could
be solved by increasing the kernel table or lowering
max_files_per_process, though. Second the total throughput will
rapidly descrease if you don't have enough RAM and many
CPUs. PostgreSQL can not handle many concurrent
connections/transactions effectively. I recommend to employ some kind
of connection pooling software and lower the max connections.
--
Tatsuo Ishii

From:	Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To:	matthew(at)zeut(dot)net
Cc:	pgsql-hackers(at)postgresql(dot)org, tgl(at)sss(dot)pgh(dot)pa(dot)us
Subject:	Re: Changing the default configuration
Date:	2003-02-12 01:10:26
Message-ID:	20030212.101026.71083976.t-ishii@sra.co.jp
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

It's interesting that people focus on shared_buffers. From my
experience the most dominating parameter for performance is
wal_sync_method. It sometimes makes ~20% performance difference. On
the otherhand, shared_buffers does very little for
performance. Moreover too many shared_buffers cause performance
degration! I guess this is due to the poor design of bufmgr. Until it
is fixed, just increasing the number of shared_buffers a bit, say
1024, is enough IMHO.
--
Tatsuo Ishii

From:	"scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To:	Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Cc:	<pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration
Date:	2003-02-12 01:23:28
Message-ID:	Pine.LNX.4.33.0302111815590.1977-100000@css120.ihs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Wed, 12 Feb 2003, Tatsuo Ishii wrote:

> > My other pet peeve is the default max connections setting. This should be
> > higher if possible, but of course, there's always the possibility of
> > running out of file descriptors.
> >
> > Apache has a default max children of 150, and if using PHP or another
> > language that runs as an apache module, it is quite possible to use up all
> > the pgsql backend slots before using up all the apache child slots.
> >
> > Is setting the max connections to something like 200 reasonable, or likely
> > to cause too many problems?
>
> It likely. First you will ran out kernel file descriptors. This could
> be solved by increasing the kernel table or lowering
> max_files_per_process, though. Second the total throughput will
> rapidly descrease if you don't have enough RAM and many
> CPUs. PostgreSQL can not handle many concurrent
> connections/transactions effectively. I recommend to employ some kind
> of connection pooling software and lower the max connections.

Don't know if you saw my other message, but increasing max connects to 200
used about 10% of all my semaphores and about 10% of my file handles.
That was while running pgbench to create 200 simo sessions.

Keep in mind, on my fairly small intranet database server, I routinely
have >32 connections, most coming from outside my webserver. Probably no
more than 4 or 5 connects at a time come from there. These are all things
like Windows boxes with ODBC running access or something similar. Many of
the connections are idle 98% of the time, and use little or no real
resources, even getting swapped out should the server need the spare
memory (it doesn't :-) that machine is set to 120 max simos if I remember
correctly.

while 200 may seem high, 32 definitely seems low. So, what IS a good
compromise? for this and ALL the other settings that should probably be a
bit higher. I'm guessing sort_mem or 4 or 8 meg hits the knee for most
folks, and the max fsm settings tom has suggested make sense.

What wal_sync method should we make default? Or should we pick one based
on the OS the user is running?

From:	Bruno Wolff III <bruno(at)wolff(dot)to>
To:	"scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
Cc:	Rick Gigger <rick(at)alpinenetworking(dot)com>, Greg Copeland <greg(at)CopelandConsulting(dot)Net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: [HACKERS] Changing the default configuration (was Re:
Date:	2003-02-12 01:37:48
Message-ID:	20030212013748.GA5831@wolff.to
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Tue, Feb 11, 2003 at 17:42:06 -0700,
"scott.marlowe" <scott(dot)marlowe(at)ihs(dot)com> wrote:
>
> The poor performance of Postgresql in it's current default configuration
> HAS cost us users, trust me, I know a few we've almost lost where I work
> that I converted after some quick tweaking of their database.

About two years ago I talked some people into trying it at work to
use with IMP/Horde which had been having some corruption problems
while using MySQL (though it wasn't necessarily a problem with MySQL).
I told them to be sure to use 7.1. When they tried it out it couldn't
keep up with the load. I asked the guys what they tried and found out
they couldn't find 7.1 rpms and didn't want to compile from source and
so ended up using 7.0.?. Also as far as I could tell from talking to them,
they didn't do any tuning at all. They weren't interested in taking another
look at it after that. We are still using MySQL with that system today.

One of our DBAs is using it for some trial projects (including one for me)
even though we have a site license for Oracle.

From:	Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To:	scott(dot)marlowe(at)ihs(dot)com
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Changing the default configuration
Date:	2003-02-12 02:00:00
Message-ID:	20030212.110000.97298009.t-ishii@sra.co.jp
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

> > It likely. First you will ran out kernel file descriptors. This could
> > be solved by increasing the kernel table or lowering
> > max_files_per_process, though. Second the total throughput will
> > rapidly descrease if you don't have enough RAM and many
> > CPUs. PostgreSQL can not handle many concurrent
> > connections/transactions effectively. I recommend to employ some kind
> > of connection pooling software and lower the max connections.
>
> Don't know if you saw my other message, but increasing max connects to 200
> used about 10% of all my semaphores and about 10% of my file handles.
> That was while running pgbench to create 200 simo sessions.

I'm not talking about semaphores. You see the low usage of file
descriptors is just because pgbench uses very few tables.

> Keep in mind, on my fairly small intranet database server, I routinely
> have >32 connections, most coming from outside my webserver. Probably no
> more than 4 or 5 connects at a time come from there. These are all things
> like Windows boxes with ODBC running access or something similar. Many of
> the connections are idle 98% of the time, and use little or no real
> resources, even getting swapped out should the server need the spare
> memory (it doesn't :-) that machine is set to 120 max simos if I remember
> correctly.
>
> while 200 may seem high, 32 definitely seems low. So, what IS a good
> compromise? for this and ALL the other settings that should probably be a
> bit higher. I'm guessing sort_mem or 4 or 8 meg hits the knee for most
> folks, and the max fsm settings tom has suggested make sense.

32 is not too low if the kernel file descriptors is not
increased. Beware that running out of the kernel file descriptors is a
serious problem for the entire system, not only for PostgreSQL.

> What wal_sync method should we make default? Or should we pick one based
> on the OS the user is running?

It's really depending on the OS or kernel version. I saw open_sync is
best for certain version of Linux kernel, while fdatasync is good for
another version of kernel. I'm not sure, but it could be possible that
the file system type might affect the wal_sync choice.
--
Tatsuo Ishii

From:	"Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>
To:	"Merlin Moncure" <merlin(dot)moncure(at)rcsonline(dot)com>, "Greg Copeland" <greg(at)CopelandConsulting(dot)Net>
Cc:	"PostgresSQL Hackers Mailing List" <pgsql-hackers(at)postgresql(dot)org>, <pgsql-advocacy(at)postgresql(dot)org>
Subject:	Re: [HACKERS] PostgreSQL Benchmarks
Date:	2003-02-12 02:42:42
Message-ID:	GNELIHDDFBOCMGBFGEFOIEHPCFAA.chriskl@familyhealth.com.au
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Why don't we include a postgresql.conf.recommended along with our
postgresql.conf.sample. That shouldn't be too hard. We can just jack up
the shared buffers and wal buffers and everything - it doesn't matter if
it's not perfect, but it will at least give people an idea of what needs to
be increased, etc to get good results.

I'm currently benchmarking our new DB server before we put it into
production. I plan to publish the results from that shortly.

Regards,

Chris

> -----Original Message-----
> From: pgsql-advocacy-owner(at)postgresql(dot)org
> [mailto:pgsql-advocacy-owner(at)postgresql(dot)org]On Behalf Of Merlin Moncure
> Sent: Tuesday, 11 February 2003 11:44 PM
> To: Greg Copeland
> Cc: PostgresSQL Hackers Mailing List; pgsql-advocacy(at)postgresql(dot)org
> Subject: Re: [pgsql-advocacy] [HACKERS] PostgreSQL Benchmarks
>
>
> I've tested all the win32 versions of postgres I can get my hands on
> (cygwin and not), and my general feeling is that they have problems with
> insert performance with fsync() turned on, probably the fault of the os.
> Select performance is not so much affected.
>
> This is easily solved with transactions and other such things. Also
> Postgres benefits from pl just like oracle.
>
> May I make a suggestion that maybe it is time to start thinking about
> tuning the default config file, IMHO its just a little bit too
> conservative, and its hurting you in benchmarks being run by idiots, but
> its still bad publicity. Any real database admin would know his test
> are synthetic and not meaningful without having to look at the #s.
>
> This is irritating me so much that I am going to put together a
> benchmark of my own, a real world one, on (publicly available) real
> world data. Mysql is a real dog in a lot of situations. The FCC
> publishes a database of wireless transmitters that has tables with 10
> million records in it. I'll pump that into pg, run some benchmarks,
> real world queries, and we'll see who the faster database *really* is.
> This is just a publicity issue, that's all. Its still annoying though.
>
> I'll even run an open challenge to database admin to beat query
> performance of postgres in such datasets, complex multi table joins,
> etc. I'll even throw out the whole table locking issue and analyze
> single user performance.
>
> Merlin
>
>
>
> _____________
> How much of the performance difference is from the RDBMS, from the
> middleware, and from the quality of implementation in the middleware.
>
> While I'm not surprised that the the cygwin version of PostgreSQL is
> slow, those results don't tell me anything about the quality of the
> middleware interface between PHP and PostgreSQL. Does anyone know if we
> can rule out some of the performance loss by pinning it to bad
> middleware implementation for PostgreSQL?
>
>
> Regards,
>
> --
> Greg Copeland <greg(at)copelandconsulting(dot)net>
> Copeland Computer Consulting
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org
>

From:	"Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>
To:	"mlw" <pgsql(at)mohawksoft(dot)com>, "Greg Copeland" <greg(at)CopelandConsulting(dot)Net>
Cc:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Merlin Moncure" <merlin(dot)moncure(at)rcsonline(dot)com>, "PostgresSQL Hackers Mailing List" <pgsql-hackers(at)postgresql(dot)org>, <pgsql-advocacy(at)postgresql(dot)org>
Subject:	Re: [HACKERS] Changing the default configuration (was Re:
Date:	2003-02-12 02:47:12
Message-ID:	GNELIHDDFBOCMGBFGEFOMEHPCFAA.chriskl@familyhealth.com.au
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

> >After it's all said and done, I would rather someone simply say, "it's
> >beyond my skill set", and attempt to get help or walk away. That seems
> >better than them being able to run it and say, "it's a dog", spreading
> >word-of-mouth as such after they left PostgreSQL behind. Worse yet,
> >those that do walk away and claim it performs horribly are probably
> >doing more harm to the PostgreSQL community than expecting someone to be
> >able to install software ever can.
> >
> <RANT>
>
> And that my friends is why PostgreSQL is still relatively obscure.

Dude - I hang out on PHPBuilder's database forums and you wouldn't believe
how often the "oh, don't use Postgres, it has a history of database
corruption problems" thing is mentioned.

Chris

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
Cc:	Greg Copeland <greg(at)CopelandConsulting(dot)Net>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-12 04:24:26
Message-ID:	495.1045023866@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

"scott.marlowe" <scott(dot)marlowe(at)ihs(dot)com> writes:
> ... The original choice of 32 was set because the original
> choice of 64 shared memory blocks as the most we could hope for on common
> OS installs. Now that we're looking at cranking that up to 1000,
> shouldn't max connections get a look too?

Actually I think max-connections at 32 was set because of SEMMAX limits,
and had only the most marginal connection to shared_buffers (anyone care
to troll the archives to check?) But sure, let's take another look at
the realistic limits today.

> ... If he starts running out of semaphores, that's a
> problem he can address while his database is still up and running in most
> operating systems, at least in the ones I use.

Back in the day, this took a kernel rebuild and system reboot to fix.
If this has changed, great ... but on exactly which Unixen can you
alter SEMMAX on the fly?

> So, my main point is that any setting that requires you to shut down
> postgresql to make the change, we should pick a compromise value that
> means you never likely will have to shut down the database once you've
> started it up and it's under load.

When I started using Postgres, it did not allocate the max number of
semas it might need at startup, but was instead prone to fail when you
tried to open the 17th or 33rd or so connection. It was universally
agreed to be an improvement to refuse to start at all if we could not
meet the specified max_connections setting. I don't want to backtrack
from that. If we can up the default max_connections setting, great ...
but let's not increase the odds of failing under load.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc:	Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: [HACKERS] Changing the default configuration (was Re:
Date:	2003-02-12 05:27:31
Message-ID:	932.1045027651@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> Tom Lane writes:
>> We could retarget to try to stay under SHMMAX=4M, which I think is
>> the next boundary that's significant in terms of real-world platforms
>> (isn't that the default SHMMAX on some BSDen?). That would allow us
>> 350 or so shared_buffers, which is better, but still not really a
>> serious choice for production work.

> What is a serious choice for production work?

Well, as I commented later in that mail, I feel that 1000 buffers is
a reasonable choice --- but I have to admit that I have no hard data
to back up that feeling. Perhaps we should take this to the
pgsql-perform list and argue about reasonable choices.

A separate line of investigation is "what is the lowest common
denominator nowadays?" I think we've established that SHMMAX=1M
is obsolete, but what replaces it as the next LCD? 4M seems to be
correct for some BSD flavors, and I can confirm that that's the
current default for Mac OS X --- any other comments?

regards, tom lane

From:	"Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>
To:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>
Cc:	"Merlin Moncure" <merlin(dot)moncure(at)rcsonline(dot)com>, "PostgresSQL Hackers Mailing List" <pgsql-hackers(at)postgresql(dot)org>, <pgsql-advocacy(at)postgresql(dot)org>
Subject:	Re: [HACKERS] Changing the default configuration (was Re:
Date:	2003-02-12 05:32:41
Message-ID:	GNELIHDDFBOCMGBFGEFOCEICCFAA.chriskl@familyhealth.com.au
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

> >> We could retarget to try to stay under SHMMAX=4M, which I think is
> >> the next boundary that's significant in terms of real-world platforms
> >> (isn't that the default SHMMAX on some BSDen?). That would allow us
> >> 350 or so shared_buffers, which is better, but still not really a
> >> serious choice for production work.
>
> > What is a serious choice for production work?
>
> Well, as I commented later in that mail, I feel that 1000 buffers is
> a reasonable choice --- but I have to admit that I have no hard data
> to back up that feeling. Perhaps we should take this to the
> pgsql-perform list and argue about reasonable choices.

Damn. Another list I have to subscribe to!

The results I just posted indicate that 1000 buffers is really quite bad
performance comaped to 4000, perhaps up to 100 TPS for selects and 30 TPS
for TPC-B.

Still, that 1000 is in itself vastly better than 64!!

Chris

From:	"Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>
To:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>
Cc:	"Merlin Moncure" <merlin(dot)moncure(at)rcsonline(dot)com>, "PostgresSQL Hackers Mailing List" <pgsql-hackers(at)postgresql(dot)org>, <pgsql-advocacy(at)postgresql(dot)org>
Subject:	Re: [HACKERS] Changing the default configuration (was Re:
Date:	2003-02-12 05:33:52
Message-ID:	GNELIHDDFBOCMGBFGEFOIEICCFAA.chriskl@familyhealth.com.au
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

> A separate line of investigation is "what is the lowest common
> denominator nowadays?" I think we've established that SHMMAX=1M
> is obsolete, but what replaces it as the next LCD? 4M seems to be
> correct for some BSD flavors, and I can confirm that that's the
> current default for Mac OS X --- any other comments?

It's 1025 * 4k pages on FreeBSD = 4MB

Chris

From:	"Shridhar Daithankar<shridhar_daithankar(at)persistent(dot)co(dot)in>" <shridhar_daithankar(at)persistent(dot)co(dot)in>
To:	PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration (was Re: [pgsql-advocacy]
Date:	2003-02-12 06:21:44
Message-ID:	200302121151.44843.shridhar_daithankar@persistent.co.in
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Tuesday 11 Feb 2003 10:56 pm, you wrote:
> Josh Berkus <josh(at)agliodbs(dot)com> writes:
> > What if we supplied several sample .conf files, and let the user choose
> > which to copy into the database directory? We could have a "high read
> > performance" profile, and a "transaction database" profile, and a
> > "workstation" profile, and a "low impact" profile.
>
> Uh ... do we have a basis for recommending any particular sets of
> parameters for these different scenarios? This could be a good idea
> in the abstract, but I'm not sure I know enough to fill in the details.

Let's take very simple scenario to supply pre-configured postgresql.conf.

Assume that SHMMAX=Total memory/2 and supply different config files for

64MB/128Mb/256MB/512MB and above.

Is it simple enough?

Shridhar

From:	Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
To:	Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Cc:	scott(dot)marlowe(at)ihs(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Changing the default configuration
Date:	2003-02-12 16:36:19
Message-ID:	1045067779.12931.258.camel@camel
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Tue, 2003-02-11 at 21:00, Tatsuo Ishii wrote:
> >
> > while 200 may seem high, 32 definitely seems low. So, what IS a good
> > compromise? for this and ALL the other settings that should probably be a
> > bit higher. I'm guessing sort_mem or 4 or 8 meg hits the knee for most
> > folks, and the max fsm settings tom has suggested make sense.
>
> 32 is not too low if the kernel file descriptors is not
> increased. Beware that running out of the kernel file descriptors is a
> serious problem for the entire system, not only for PostgreSQL.
>

Had this happen at a previous employer, and it definitely is bad. I
believe we had to do a reboot to clear it up. And we saw the problem a
couple of times since the sys admin wasn't able to deduce what had
happened the first time we got it. IIRC the problem hit somewhere around
150 connections, so we ran with 128 max. I think this is a safe number
on most servers these days (running linux as least) though out of the
box I might be more inclined to limit it to 64. If you do hit a file
descriptor problem, *you are hosed*.

Robert Treat

From:	Andrew Sullivan <andrew(at)libertyrms(dot)info>
To:	PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: [HACKERS] Changing the default configuration (was Re:
Date:	2003-02-12 16:39:41
Message-ID:	20030212113940.H13362@mail.libertyrms.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Tue, Feb 11, 2003 at 05:25:29PM -0700, Rick Gigger wrote:

> The type of person who can't configure it or doesnt' think to try is
> probably not doing a project that requires any serious performance.

I have piles of email, have fielded thousands of phone calls, and
have had many conversations which prove that claim false. People
think that computers are magic. That they don't think the machines
require a little bit of attention is nowise an indication that they
don't need the system to come with reasonable defaults.

--
----
Andrew Sullivan 204-4141 Yonge Street
Liberty RMS Toronto, Ontario Canada
<andrew(at)libertyrms(dot)info> M2P 2A8
+1 416 646 3304 x110

From:	Greg Copeland <greg(at)CopelandConsulting(dot)Net>
To:	Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
Cc:	Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, scott(dot)marlowe(at)ihs(dot)com, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration
Date:	2003-02-12 16:43:05
Message-ID:	1045068185.2518.244.camel@mouse.copelandconsulting.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Wed, 2003-02-12 at 10:36, Robert Treat wrote:
> On Tue, 2003-02-11 at 21:00, Tatsuo Ishii wrote:
> > >
> > > while 200 may seem high, 32 definitely seems low. So, what IS a good
> > > compromise? for this and ALL the other settings that should probably be a
> > > bit higher. I'm guessing sort_mem or 4 or 8 meg hits the knee for most
> > > folks, and the max fsm settings tom has suggested make sense.
> >
> > 32 is not too low if the kernel file descriptors is not
> > increased. Beware that running out of the kernel file descriptors is a
> > serious problem for the entire system, not only for PostgreSQL.
> >
>
> Had this happen at a previous employer, and it definitely is bad. I
> believe we had to do a reboot to clear it up. And we saw the problem a
> couple of times since the sys admin wasn't able to deduce what had
> happened the first time we got it. IIRC the problem hit somewhere around
> 150 connections, so we ran with 128 max. I think this is a safe number
> on most servers these days (running linux as least) though out of the
> box I might be more inclined to limit it to 64. If you do hit a file
> descriptor problem, *you are hosed*.
>

That does seem like a more reasonable upper limit. I would rather see
people have to knowingly increase the limit rather than bump into system
upper limits and start scratching their heads trying to figure out what
the heck is going on.

--
Greg Copeland <greg(at)copelandconsulting(dot)net>
Copeland Computer Consulting

From:	Rod Taylor <rbt(at)rbt(dot)ca>
To:	Andrew Sullivan <andrew(at)libertyrms(dot)info>
Cc:	PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, Postgresql Advocacy <pgsql-advocacy(at)postgresql(dot)org>
Subject:	Re: [HACKERS] Changing the default configuration (was Re:
Date:	2003-02-12 16:43:15
Message-ID:	1045068194.80901.22.camel@jester
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Wed, 2003-02-12 at 11:39, Andrew Sullivan wrote:
> On Tue, Feb 11, 2003 at 05:25:29PM -0700, Rick Gigger wrote:
>
> > The type of person who can't configure it or doesnt' think to try is
> > probably not doing a project that requires any serious performance.
>
> I have piles of email, have fielded thousands of phone calls, and
> have had many conversations which prove that claim false. People

But IBM told me computers are self healing, so if there is a performance
problem should it just fix itself?

--
Rod Taylor <rbt(at)rbt(dot)ca>

PGP Key: http://www.rbt.ca/rbtpub.asc

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
Cc:	Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, scott(dot)marlowe(at)ihs(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Changing the default configuration
Date:	2003-02-12 16:48:57
Message-ID:	5061.1045068537@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Robert Treat <xzilla(at)users(dot)sourceforge(dot)net> writes:
> Had this happen at a previous employer, and it definitely is bad. I
> believe we had to do a reboot to clear it up. And we saw the problem a
> couple of times since the sys admin wasn't able to deduce what had
> happened the first time we got it. IIRC the problem hit somewhere around
> 150 connections, so we ran with 128 max. I think this is a safe number
> on most servers these days (running linux as least) though out of the
> box I might be more inclined to limit it to 64. If you do hit a file
> descriptor problem, *you are hosed*.

If you want to run lots of connections, it's a real good idea to set
max_files_per_process to positively ensure Postgres won't overflow
your kernel file table, ie, max_connections * max_files_per_process
should be less than the file table size.

Before about 7.2, we didn't have max_files_per_process, and would
naively believe whatever sysconf() told us was an okay number of files
to open. Unfortunately, way too many kernels promise more than they
can deliver ...

regards, tom lane

From:	"scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Greg Copeland <greg(at)CopelandConsulting(dot)Net>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-12 18:26:49
Message-ID:	Pine.LNX.4.33.0302121015050.5420-100000@css120.ihs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Tue, 11 Feb 2003, Tom Lane wrote:

> "scott.marlowe" <scott(dot)marlowe(at)ihs(dot)com> writes:
> > ... If he starts running out of semaphores, that's a
> > problem he can address while his database is still up and running in most
> > operating systems, at least in the ones I use.
>
> Back in the day, this took a kernel rebuild and system reboot to fix.
> If this has changed, great ... but on exactly which Unixen can you
> alter SEMMAX on the fly?

Tom, now you're making me all misty eyed for 14" platter 10 Meg hard
drives and paper tape readers. :-)

Seriously, I know Linux can change these on the fly, and I'm pretty sure
Solaris can too. I haven't played with BSD for a while so can't speak
about that. Anyone else know?

> > So, my main point is that any setting that requires you to shut down
> > postgresql to make the change, we should pick a compromise value that
> > means you never likely will have to shut down the database once you've
> > started it up and it's under load.
>
> When I started using Postgres, it did not allocate the max number of
> semas it might need at startup, but was instead prone to fail when you
> tried to open the 17th or 33rd or so connection. It was universally
> agreed to be an improvement to refuse to start at all if we could not
> meet the specified max_connections setting. I don't want to backtrack
> from that. If we can up the default max_connections setting, great ...
> but let's not increase the odds of failing under load.

I don't want to backtrack either, and I prefer that we now grab the
semaphores we need at startup.

Note that on a stock RH 72 box, the max number of
backends you can startup before you exhaust semphores is 2047 backends,
more than I'd ever want to try and run on normal PC hardware. So, on a
linux box 150 to 200 max backends comes no where near exhausting
semaphores.

I imagine that any "joe average" who doesn't really understand sysadmin
duties that well and is trying for the first time to install Postgresql
WILL be doing so on one of three general platforms, Linux, BSD, or
Windows. As long as the initial settings use only 10% or so of the file
handle and / or semaphore resources on each of those systems, we're
probably safe.

64 or 128 seems like a nice power of two number that is likely to keep us
safe on inital installs while forestalling problems with too many
connections.

Just for score, here's the default max output of rh72's kernel config:

kernel.sem = 250 32000 32 128
fs.file-max = 8192

Note that while older kernels needed to have max inodes bumped up as well,
nowadays that doesn't seem to be a problem, or they just set it really
high and I can't hit the ceiling on my workstation without swap storms.

the definitions of the kernel.sem line are:

kernel.sem: max_sem_per_id max_sem_total max_ops_sem_call max_sem_ids

I'll try to get FreeBSD running today and see what research I can find on
it, but 5.0 is likely to be a whole new beast for me, so if someone can
tell us what the maxes are by default on different BSDs and what the
settings are in postgresql that can exhaust them that gets us closer.

Like I've said before, anyone running HPUX, Irix, Solaris, or any other
"Industrial Strength Unix" should already know to increase all these
things and likely had to long before Postgresql showed up on their box, so
a setting that keeps pgsql from coming up won't be likely, and if it
happens, they'll most likely know how to handle it.

BSD and Linux users are more likely to contain the group of folks who
don't know all this and don't ever want to (not that all BSD/Linux users
are like that, just that the sub group mostly exists on those platforms,
and windows as well.) So the default settings really probably should be
determined, for the most part, by the needs of those users.

From:	"scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To:	<pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration
Date:	2003-02-12 18:30:13
Message-ID:	Pine.LNX.4.33.0302121127140.5420-100000@css120.ihs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Oh, another setting that should be a "default" for most users is to initdb
automatically with locale of C. If they need a different locale, they
should have to pick it.

The performance of Postgresql with a locale other than C when doing like
and such is a serious ding. I'd much rather have the user experience the
faster searches first, then get to test with other locales and see if
performance is good enough, than to start out slow and wonder why they
need to change their initdb settings to get decent performance on a where
clause with like in it.

From:	Peter Eisentraut <peter_e(at)gmx(dot)net>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, <pgsql-advocacy(at)postgresql(dot)org>
Subject:	Re: [HACKERS] Changing the default configuration (was Re:
Date:	2003-02-12 23:52:25
Message-ID:	Pine.LNX.4.44.0302121912150.2811-100000@peter.localdomain
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Tom Lane writes:

> Well, as I commented later in that mail, I feel that 1000 buffers is
> a reasonable choice --- but I have to admit that I have no hard data
> to back up that feeling.

I know you like it in that range, and 4 or 8 MB of buffers by default
should not be a problem. But personally I think if the optimal buffer
size does not depend on both the physical RAM you want to dedicate to
PostgreSQL and the nature and size of the database, then we have achieved
a medium revolution in computer science. ;-)

--
Peter Eisentraut peter_e(at)gmx(dot)net

From:	"Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>
To:	"Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net>, "Tatsuo Ishii" <t-ishii(at)sra(dot)co(dot)jp>
Cc:	<scott(dot)marlowe(at)ihs(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration
Date:	2003-02-13 01:43:23
Message-ID:	GNELIHDDFBOCMGBFGEFOAEIGCFAA.chriskl@familyhealth.com.au
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

> Had this happen at a previous employer, and it definitely is bad. I
> believe we had to do a reboot to clear it up. And we saw the problem a
> couple of times since the sys admin wasn't able to deduce what had
> happened the first time we got it. IIRC the problem hit somewhere around
> 150 connections, so we ran with 128 max. I think this is a safe number
> on most servers these days (running linux as least) though out of the
> box I might be more inclined to limit it to 64. If you do hit a file
> descriptor problem, *you are hosed*.

Just yesterday I managed to hose my new Postgres installation during a
particular benchmarking run. Postgres did restart itself nicely though. I
have no idea why that particular run caused problems when all other runs
with identical settings didn't. I checked the log and saw file descriptor
probs. I was doing 128 connections with 128 max connetions. This was the
log:

> 2003-02-12 04:16:15 LOG: PGSTAT: cannot open temp stats file
> /usr/local/pgsql/data/global/pgstat.tmp.41388: Too many open files in
> system
> 2003-02-12 04:16:15 LOG: PGSTAT: cannot open temp stats file
> /usr/local/pgsql/data/global/pgstat.tmp.41388: Too many open files in
> system
> 2003-02-12 04:16:39 PANIC: could not open transaction-commit log
> directory
> (/usr/local/pgsql/data/pg_clog): Too many open files in system
> 2003-02-12 04:16:39 LOG: statement: SET autocommit TO 'on';VACUUM
> ANALYZE
> 2003-02-12 04:16:39 LOG: PGSTAT: cannot open temp stats file
> /usr/local/pgsql/data/global/pgstat.tmp.41388: Too many open files in
> system

This was the MIB:

> kern.maxfiles: 1064
> kern.maxfilesperproc: 957

This was the solution:

> sysctl -w kern.maxfiles=65536
> sysctl -w kern.maxfilesperproc=8192
>
> .. and then stick
>
> kern.maxfiles=65536
> kern.maxfilesperproc=8192
>
> in /etc/sysctl.conf so its set during a reboot.

Which just goes to highlight the importance of rigorously testing a
production installation...

Chris

From:	"Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>
To:	"scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Greg Copeland" <greg(at)CopelandConsulting(dot)Net>, "PostgresSQL Hackers Mailing List" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-13 01:47:28
Message-ID:	GNELIHDDFBOCMGBFGEFOIEIGCFAA.chriskl@familyhealth.com.au
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

> Seriously, I know Linux can change these on the fly, and I'm pretty sure
> Solaris can too. I haven't played with BSD for a while so can't speak
> about that. Anyone else know?

You cannot change SHMMAX on the fly on FreeBSD.

Chris

From:	Larry Rosenman <ler(at)lerctr(dot)org>
To:	Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Greg Copeland <greg(at)CopelandConsulting(dot)Net>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-13 01:51:38
Message-ID:	58760000.1045101098@lerlaptop.lerctr.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

--On Thursday, February 13, 2003 09:47:28 +0800 Christopher Kings-Lynne
<chriskl(at)familyhealth(dot)com(dot)au> wrote:

>> Seriously, I know Linux can change these on the fly, and I'm pretty sure
>> Solaris can too. I haven't played with BSD for a while so can't speak
>> about that. Anyone else know?
>
> You cannot change SHMMAX on the fly on FreeBSD.
Yes you can, on recent 4-STABLE:

Password:
lerlaptop# sysctl kern.ipc.shmmax=66000000
kern.ipc.shmmax: 33554432 -> 66000000
lerlaptop#uname -a
FreeBSD lerlaptop.lerctr.org 4.7-STABLE FreeBSD 4.7-STABLE #38: Mon Feb 3
21:51:25 CST 2003
ler(at)lerlaptop(dot)lerctr(dot)org:/usr/obj/usr/src/sys/LERLAPTOP i386
lerlaptop#

>
> Chris
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 972-414-9812 E-Mail: ler(at)lerctr(dot)org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749

From:	Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To:	Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
Cc:	"scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Copeland <greg(at)CopelandConsulting(dot)Net>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-13 02:36:45
Message-ID:	200302130236.h1D2ajh17527@candle.pha.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Christopher Kings-Lynne wrote:
> > Seriously, I know Linux can change these on the fly, and I'm pretty sure
> > Solaris can too. I haven't played with BSD for a while so can't speak
> > about that. Anyone else know?
>
> You cannot change SHMMAX on the fly on FreeBSD.

And part of the reason is because some/most BSD's map the page tables
into physical RAM (kernel space) rather than use some shared page table
mechanism. This is good because it prevents the shared memory from
being swapped out (performance disaster).

It doesn't actually allocate RAM unless someone needs it, but it does
lock the shared memory into a specific fixed location for all processes.

The more flexible approach is to make shared memory act just like the
memory of a user process, and have other user processes share those page
tables, but that adds extra overhead and can cause the memory to behave
just like user memory (swapable).

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc:	Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: [HACKERS] Changing the default configuration (was Re:
Date:	2003-02-13 03:08:26
Message-ID:	9700.1045105706@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> I know you like it in that range, and 4 or 8 MB of buffers by default
> should not be a problem. But personally I think if the optimal buffer
> size does not depend on both the physical RAM you want to dedicate to
> PostgreSQL and the nature and size of the database, then we have achieved
> a medium revolution in computer science. ;-)

But this is not about "optimal" settings. This is about "pretty good"
settings. As long as we can get past the knee of the performance curve,
I think we've done what should be expected of a default parameter set.

I believe that 1000 buffers is enough to get past the knee in most
scenarios. Again, I haven't got hard evidence, but that's my best
guess.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc:	Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>, Greg Copeland <greg(at)CopelandConsulting(dot)Net>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-13 03:18:35
Message-ID:	9792.1045106315@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

>> You cannot change SHMMAX on the fly on FreeBSD.

I think we suffered some topic drift here --- wasn't the last question
about whether SEMMAX can be increased on-the-fly? That wouldn't have
anything to do with memory-mapping strategies...

regards, tom lane

From:	Curt Sampson <cjs(at)cynic(dot)net>
To:	Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc:	Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Copeland <greg(at)CopelandConsulting(dot)Net>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-13 04:32:15
Message-ID:	Pine.NEB.4.51.0302131325430.1487@angelic.cynic.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Wed, 12 Feb 2003, Bruce Momjian wrote:

> Christopher Kings-Lynne wrote:
>
> > You cannot change SHMMAX on the fly on FreeBSD.
>
> And part of the reason is because some/most BSD's map the page tables
> into physical RAM (kernel space) rather than use some shared page table
> mechanism. This is good because it prevents the shared memory from
> being swapped out (performance disaster).

Not at all! In all the BSDs, as far as I'm aware, SysV shared memory is
just normal mmap'd memory.

FreeBSD offers a sysctl that lets you mlock() that memory, and that is
helpful only because postgres insists on taking data blocks that are
already in memory, fully sharable amongst all back ends and ready to be
used, and making a copy of that data to be shared amongst all back ends.

> It doesn't actually allocate RAM unless someone needs it, but it does
> lock the shared memory into a specific fixed location for all processes.

I don't believe that the shared memory is not locked to a specific VM
address for every process. There's certainly no reason it needs to be.

cjs
--
Curt Sampson <cjs(at)cynic(dot)net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
Cc:	Greg Copeland <greg(at)CopelandConsulting(dot)Net>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-13 05:15:02
Message-ID:	10628.1045113302@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

"scott.marlowe" <scott(dot)marlowe(at)ihs(dot)com> writes:
> Tom, now you're making me all misty eyed for 14" platter 10 Meg hard
> drives and paper tape readers. :-)

<python> Och, we used to *dream* of 10 meg drives... </python>

regards, tom lane

From:	Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To:	Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: [HACKERS] Changing the default configuration (was Re:
Date:	2003-02-13 06:12:17
Message-ID:	200302130612.h1D6CHc14823@candle.pha.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Peter Eisentraut wrote:
> Tom Lane writes:
>
> > Well, as I commented later in that mail, I feel that 1000 buffers is
> > a reasonable choice --- but I have to admit that I have no hard data
> > to back up that feeling.
>
> I know you like it in that range, and 4 or 8 MB of buffers by default
> should not be a problem. But personally I think if the optimal buffer
> size does not depend on both the physical RAM you want to dedicate to
> PostgreSQL and the nature and size of the database, then we have achieved
> a medium revolution in computer science. ;-)

I have thought about this and I have an idea. Basically, increasing the
default values may get us closer, but it will discourage some to tweek,
and it will cause problems with some OS's that have small SysV params.

So, my idea is to add a message at the end of initdb that states people
should run the pgtune script before running a production server.

The pgtune script will basically allow us to query the user, test the OS
version and perhaps parameters, and modify postgresql.conf with
reasonable values. I think this is the only way to cleanly get folks
close to where they should be.

For example, we can ask them how many rows and tables they will be
changing, on average, between VACUUM runs. That will allow us set the
FSM params. We can ask them about using 25% of their RAM for shared
buffers. If they have other major apps running on the server or have
small tables, we can make no changes. We can basically ask them
questions and use that info to set values.

We can even ask about sort usage maybe and set sort memory. We can even
control checkpoint_segments this way if they say they will have high
database write activity and don't worry about disk space usage. We may
even be able to compute some random page cost estimate.

Seems a script is going to be the best way to test values and assist
folks in making reasonable decisions about each parameter. Of course,
they can still edit the file, and we can ask them if they want
assistance to set each parameter or leave it alone.

I would restrict the script to only deal with tuning values, and tell
people they still need to review that file for other useful parameters.

Another option would be to make a big checklist or web page that asks
such questions and computes proper values, but it seems a script would
be easiest. We can even support '?' which would explain why the
question is being ask and how it affects the value.

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc:	Peter Eisentraut <peter_e(at)gmx(dot)net>, Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: [HACKERS] Changing the default configuration (was Re:
Date:	2003-02-13 15:06:29
Message-ID:	13183.1045148789@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> So, my idea is to add a message at the end of initdb that states people
> should run the pgtune script before running a production server.

Do people read what initdb has to say?

IIRC, the RPM install scripts hide initdb's output from the user
entirely. I wouldn't put much faith in such a message as having any
real effect on people...

regards, tom lane

From:	Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Peter Eisentraut <peter_e(at)gmx(dot)net>, Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: [HACKERS] Changing the default configuration (was Re:
Date:	2003-02-13 17:10:56
Message-ID:	200302131710.h1DHAui22918@candle.pha.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > So, my idea is to add a message at the end of initdb that states people
> > should run the pgtune script before running a production server.
>
> Do people read what initdb has to say?
>
> IIRC, the RPM install scripts hide initdb's output from the user
> entirely. I wouldn't put much faith in such a message as having any
> real effect on people...

Yes, that is a problem. We could show something in the server logs if
pg_tune hasn't been run. Not sure what else we can do, but it would
give folks a one-stop thing to run to deal with performance
configuration.

We could prevent the postmaster from starting unless they run pg_tune or
if they have modified postgresql.conf from the default. Of course,
that's pretty drastic.

From:	Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To:	Curt Sampson <cjs(at)cynic(dot)net>
Cc:	Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Copeland <greg(at)CopelandConsulting(dot)Net>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-13 18:49:06
Message-ID:	200302131849.h1DIn6r01827@candle.pha.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

I was speaking of the 4.4 BSD. FreeBSD has the merged VM, and I think
NetBSD only recently did that. BSD/OS does do the locking by default
and it maps into the kernel address space. I believe FreeBSD has a
sysctl to control locking of SysV memory.

One advantage of having it all at the same VM address is that they can
use the same page tables for virtual address lookups.

---------------------------------------------------------------------------

Curt Sampson wrote:
> On Wed, 12 Feb 2003, Bruce Momjian wrote:
>
> > Christopher Kings-Lynne wrote:
> >
> > > You cannot change SHMMAX on the fly on FreeBSD.
> >
> > And part of the reason is because some/most BSD's map the page tables
> > into physical RAM (kernel space) rather than use some shared page table
> > mechanism. This is good because it prevents the shared memory from
> > being swapped out (performance disaster).
>
> Not at all! In all the BSDs, as far as I'm aware, SysV shared memory is
> just normal mmap'd memory.
>
> FreeBSD offers a sysctl that lets you mlock() that memory, and that is
> helpful only because postgres insists on taking data blocks that are
> already in memory, fully sharable amongst all back ends and ready to be
> used, and making a copy of that data to be shared amongst all back ends.
>
> > It doesn't actually allocate RAM unless someone needs it, but it does
> > lock the shared memory into a specific fixed location for all processes.
>
> I don't believe that the shared memory is not locked to a specific VM
> address for every process. There's certainly no reason it needs to be.
>
> cjs
> --
> Curt Sampson <cjs(at)cynic(dot)net> +81 90 7737 2974 http://www.netbsd.org
> Don't you know, in this new Dark Age, we're all light. --XTC
>

From:	Greg Stark <gsstark(at)mit(dot)edu>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-13 20:47:08
Message-ID:	87znozkiwj.fsf@stark.dyndns.tv
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

> On Wed, 12 Feb 2003, Bruce Momjian wrote:
> >
> > And part of the reason is because some/most BSD's map the page tables
> > into physical RAM (kernel space) rather than use some shared page table
> > mechanism. This is good because it prevents the shared memory from
> > being swapped out (performance disaster).

Well, it'll only be swapped out if it's not being used...

In any case you can use madvise() to try to avoid that, but it doesn't seem
likely to be a problem since they would probably be the most heavily used
pages in postgres.

--
greg

From:	Kevin Brown <kevin(at)sysexperts(dot)com>
To:	PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: Changing the default configuration (was Re:
Date:	2003-02-14 03:26:05
Message-ID:	20030214032605.GA18932@filer
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Josh Berkus wrote:
> > > Uh ... do we have a basis for recommending any particular sets of
> > > parameters for these different scenarios? This could be a good idea
> > > in the abstract, but I'm not sure I know enough to fill in the details.
>
> Sure.
> Mostly-Read database, few users, good hardware, complex queries:
> = High shared buffers and sort mem, high geqo and join collapse thresholds,
> moderate fsm settings, defaults for WAL.
> Same as above with many users and simple queries (webserver) =
> same as above, except lower sort mem and higher connection limit
> High-Transaction Database =
> Moderate shared buffers and sort mem, high FSM settings, increase WAL files
> and buffers.
> Workstation =
> Moderate to low shared buffers and sort mem, moderate FSM, defaults for WAL,
> etc.
> Low-Impact server = current defaults, more or less.

Okay, but there should probably be one more, called "Benchmark". The
real problem is what values to use for it. :-)

--
Kevin Brown kevin(at)sysexperts(dot)com

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc:	justin(at)postgresql(dot)org, merlin(dot)moncure(at)rcsonline(dot)com, pgsql-hackers(at)postgresql(dot)org, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: [HACKERS] Changing the default configuration
Date:	2003-02-14 04:00:35
Message-ID:	200302132000.35639.josh@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Tatsuo,

> Sigh. People always complain "pgbench does not reliably producing
> repeatable numbers" or something then say "that's because pgbench's
> transaction has too much contention on the branches table". So I added
> -N option to pgbench which makes pgbench not to do any UPDATE to
> the branches table. But still people continue to complian...

Hey, pg_bench is a good start on a Postgres performance tester, and it's much,
much better than what there was before you came along ... which was nothing.
Thank you again for contributing it.

pg_bench is, however, only a start on a performance tester, and we'd need to
build it up before we could use it as the basis of a PG tuner.

--
Josh Berkus
Aglio Database Solutions
San Francisco

From:	Kevin Brown <kevin(at)sysexperts(dot)com>
To:	PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: [HACKERS] Changing the default configuration (was Re:
Date:	2003-02-14 05:06:00
Message-ID:	20030214050600.GP1833@filer
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Bruce Momjian wrote:
> We could prevent the postmaster from starting unless they run pg_tune or
> if they have modified postgresql.conf from the default. Of course,
> that's pretty drastic.

If you're going to do that, then you may as well make the defaults
something that will perform reasonably well under the widest
circumstances possible and let the postmaster fail when it can't
acquire the resources those defaults demand.

What I'd do is go ahead and make the defaults something reasonable,
and if the postmaster can't allocate, say, enough shared memory pages,
then it should issue an error message saying not only that it wasn't
able to allocate enough shared memory, but also which parameter to
change and (if it's not too much trouble to implement) what it can be
changed to in order to get past that part of the initialization (this
means that the postmaster has to figure out how much shared memory it
can actually allocate, via a binary search allocate/free method). It
should also warn that by lowering the value, the resulting performance
may be much less than satisfactory, and that the alternative (to
increase SHMMAX, in this example) should be used if good performance
is desired.

That way, someone whose only concern is to make it work will be able
to do so without having to do a lot of experimentation, and will get
plenty of warning that the result isn't likely to work very well.

And we end up getting better benchmarks in the cases where people
don't have to touch the default config. :-)

--
Kevin Brown kevin(at)sysexperts(dot)com

From:	Kevin Brown <kevin(at)sysexperts(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Tuning scenarios (was Changing the default configuration)
Date:	2003-02-14 05:31:24
Message-ID:	20030214053124.GQ1833@filer
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Tom Lane wrote:
> If I thought that pgbench was representative of anything, or even
> capable of reliably producing repeatable numbers, then I might subscribe
> to results derived this way. But I have little or no confidence in
> pgbench. Certainly I don't see how you'd use it to produce
> recommendations for a range of application scenarios, when it's only
> one very narrow scenario itself.

So let's say you were designing a tool to help someone get reasonable
performance out of a PostgreSQL installation. What scenarios would
you include in such a tool, and what information would you want out of
it?

You don't have any real confidence in pgbench. Fair enough. What
*would* you have confidence in?

--
Kevin Brown kevin(at)sysexperts(dot)com

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Kevin Brown <kevin(at)sysexperts(dot)com>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: Tuning scenarios (was Changing the default configuration)
Date:	2003-02-14 06:04:34
Message-ID:	4590.1045202674@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Kevin Brown <kevin(at)sysexperts(dot)com> writes:
> You don't have any real confidence in pgbench. Fair enough. What
> *would* you have confidence in?

Measurements on your actual application?

In fairness to pgbench, most of its problems come from people running
it at tiny scale factors, where it reduces to an exercise in how many
angels can dance on the same pin (or, how many backends can contend to
update the same row). And in that regime it runs into two or three
different Postgres limitations that might or might not have any
relevance to your real-world application --- dead-index-row references
used to be the worst, but I think probably aren't anymore in 7.3.
But those same limitations cause the results to be unstable from run
to run, which is why I don't have a lot of faith in reports of pgbench
numbers. You need to work quite hard to get reproducible numbers out
of it.

No, I don't have a better benchmark in my pocket :-(

regards, tom lane

From:	"Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>
To:	"PostgresSQL Hackers Mailing List" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Offering tuned config files
Date:	2003-02-14 06:12:50
Message-ID:	GNELIHDDFBOCMGBFGEFOEEJACFAA.chriskl@familyhealth.com.au
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

OK,

Here's a stab at some extra conf files. Feel free to shoot them down.

If we can come up with at least _some_ alternative files that we can put
somewhere for them to see when postgres is installed, then at least people
can see what variables will affect what...

I didn't see the point of a 'workstation' option, the default is fine for
that.

Chris

> -----Original Message-----
> From: pgsql-hackers-owner(at)postgresql(dot)org
> [mailto:pgsql-hackers-owner(at)postgresql(dot)org]On Behalf Of Kevin Brown
> Sent: Friday, 14 February 2003 11:26 AM
> To: PostgresSQL Hackers Mailing List; pgsql-advocacy(at)postgresql(dot)org
> Subject: Re: [HACKERS] Changing the default configuration (was Re:
> [pgsql-advocacy]
>
>
> Josh Berkus wrote:
> > > > Uh ... do we have a basis for recommending any particular sets of
> > > > parameters for these different scenarios? This could be a good idea
> > > > in the abstract, but I'm not sure I know enough to fill in
> the details.
> >
> > Sure.
> > Mostly-Read database, few users, good hardware, complex queries:
> > = High shared buffers and sort mem, high geqo and join
> collapse thresholds,
> > moderate fsm settings, defaults for WAL.
> > Same as above with many users and simple queries (webserver) =
> > same as above, except lower sort mem and higher connection limit
> > High-Transaction Database =
> > Moderate shared buffers and sort mem, high FSM settings,
> increase WAL files
> > and buffers.
> > Workstation =
> > Moderate to low shared buffers and sort mem, moderate FSM,
> defaults for WAL,
> > etc.
> > Low-Impact server = current defaults, more or less.
>
> Okay, but there should probably be one more, called "Benchmark". The
> real problem is what values to use for it. :-)
>
>
>
> --
> Kevin Brown kevin(at)sysexperts(dot)com
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org
>

Attachment	Content-Type	Size
postgresql.conf.sample-olap	application/octet-stream	5.5 KB
postgresql.conf.sample-web	application/octet-stream	5.3 KB
postgresql.conf.sample-writeheavy	application/octet-stream	5.2 KB

From:	Kevin Brown <kevin(at)sysexperts(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: Tuning scenarios (was Changing the default configuration)
Date:	2003-02-14 11:48:43
Message-ID:	20030214114843.GB1847@filer
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Tom Lane wrote:
> Kevin Brown <kevin(at)sysexperts(dot)com> writes:
> > You don't have any real confidence in pgbench. Fair enough. What
> > *would* you have confidence in?
>
> Measurements on your actual application?

That unfortunately doesn't help us a whole lot in figuring out
defaults that will perform reasonably well under broad conditions,
unless there's some way to determine a reasonably consistent pattern
(or set of patterns) amongst a lot of those applications.

> In fairness to pgbench, most of its problems come from people running
> it at tiny scale factors, where it reduces to an exercise in how many
> angels can dance on the same pin (or, how many backends can contend to
> update the same row).

This isn't easy to fix, but I don't think it's impossible either.
It's probably sufficient to make the defaults dependent on information
gathered about the system. I'd think total system memory would be the
primary thing to consider, since most database engines are pretty fast
once all the data and indexes are cached. :-)

> And in that regime it runs into two or three different Postgres
> limitations that might or might not have any relevance to your
> real-world application --- dead-index-row references used to be the
> worst, but I think probably aren't anymore in 7.3. But those same
> limitations cause the results to be unstable from run to run, which
> is why I don't have a lot of faith in reports of pgbench numbers.
> You need to work quite hard to get reproducible numbers out of it.

The interesting question is whether that's more an indictment of how
PG does things or how pg_bench does things. I imagine it's probably
difficult to get truly reproducible numbers out of pretty much any
benchmark coupled with pretty much any database engine. There are
simply far too many parameters to tweak on any but the simplest
database engines, and we haven't even started talking about tuning the
OS around the database...

And benchmarks (as well as real-world applications) will always run
into limitations of the database (locking mechanisms, IPC limits,
etc.). In fact, that's another useful purpose: to see where the
limits of the database are.

Despite the limits, it's probably better to have a benchmark that only
gives you an order of magnitude idea of what to expect than to not
have anything at all. And thus we're more or less right back where we
started: what kinds of benchmarking tests should go into a benchmark
for the purposes of tuning a database system (PG in particular but the
answer might easily apply to others as well) so that it will perform
decently, if not optimally, under the most likely loads?

I think we might be able to come up with some reasonable answers to
that, as long as we don't expect too much out of the resulting
benchmark. The right people to ask are probably the people who are
actually running production databases.

Anyone wanna chime in here with some opinions and perspectives?

--
Kevin Brown kevin(at)sysexperts(dot)com

From:	Manfred Koizar <mkoi-pg(at)aon(dot)at>
To:	"Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>
Cc:	"PostgresSQL Hackers Mailing List" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Offering tuned config files
Date:	2003-02-14 11:58:57
Message-ID:	c9hp4vcn1crqa51jjsroni901umf727ah9@4ax.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Fri, 14 Feb 2003 14:12:50 +0800, "Christopher Kings-Lynne"
<chriskl(at)familyhealth(dot)com(dot)au> wrote:
>Here's a stab at some extra conf files. Feel free to shoot them down.

No intent to shoot anything down, just random thoughts:

effective_cache_size = 20000 (~ 160 MB) should be more adequate for a
256 MB machine than the extremely conservative default of 1000. I
admit that the effect of this change is hard to benchmark. A way too
low (or too high) setting may lead the planner to wrong conclusions.

More parameters affecting the planner:
#cpu_tuple_cost = 0.01
#cpu_index_tuple_cost = 0.001
#cpu_operator_cost = 0.0025

Are these still good defaults? I have no hard facts, but ISTM that
CPU speed is increasing more rapidly than disk access speed.

In postgresql.conf.sample-writeheavy you have:
commit_delay = 10000

Is this still needed with "ganged WAL writes"? Tom?

Servus
Manfred

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Kevin Brown <kevin(at)sysexperts(dot)com>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: Tuning scenarios (was Changing the default configuration)
Date:	2003-02-14 15:05:44
Message-ID:	7027.1045235144@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Kevin Brown <kevin(at)sysexperts(dot)com> writes:
> Tom Lane wrote:
>> ... But those same
>> limitations cause the results to be unstable from run to run, which
>> is why I don't have a lot of faith in reports of pgbench numbers.
>> You need to work quite hard to get reproducible numbers out of it.

> The interesting question is whether that's more an indictment of how
> PG does things or how pg_bench does things.

I didn't draw a conclusion on that ;-). I merely pointed out that the
numbers are unstable, and therefore not to be trusted without a great
deal of context ...

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Manfred Koizar <mkoi-pg(at)aon(dot)at>
Cc:	"Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>, "PostgresSQL Hackers Mailing List" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Offering tuned config files
Date:	2003-02-14 15:07:45
Message-ID:	7050.1045235265@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Manfred Koizar <mkoi-pg(at)aon(dot)at> writes:
> In postgresql.conf.sample-writeheavy you have:
> commit_delay = 10000
> Is this still needed with "ganged WAL writes"? Tom?

I doubt that the current options for grouped commits are worth anything
at the moment. Chris, do you have any evidence backing up using
commit_delay with 7.3?

regards, tom lane

From:	johnnnnnn <john(at)phaedrusdeinus(dot)org>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: Tuning scenarios (was Changing the default configuration)
Date:	2003-02-14 16:33:14
Message-ID:	20030214163314.GK11017@performics.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Fri, Feb 14, 2003 at 03:48:43AM -0800, Kevin Brown wrote:
> That unfortunately doesn't help us a whole lot in figuring out
> defaults that will perform reasonably well under broad conditions,
> unless there's some way to determine a reasonably consistent pattern
> (or set of patterns) amongst a lot of those applications.

When moving to a new DB or DB box, we always run a series of
benchmarks to make sure there aren't any surprises
performance-wise. Our database activity, and thus our benchmarks, are
broken up into roughly three different patterns:

1- Transaction processing: small number of arbitrary small
(single-row) selects intermixed with large numbers of small inserts
and updates.

2- Reporting: large reads joining 6-12 tables, usually involving
calculations and/or aggregation.

3- Application (object retrieval): large numbers of arbitrary,
single-row selects and updates, with smaller numbers of single row
inserts.

We use our own application code to do our benchmarks, so they're not
general enough for your use, but it might be worthwhile to profile
each of those different patterns, or allow DB admins to limit it to a
relevant subset. Other patterns i can think of include logging (large
number of single row inserts, no updates, occasional large, simple
(1-3 table) selects), mining (complicated selects over 10 or more
tables), automated (small inserts/updates, with triggers cascading
everywhere), etc.

The problem becomes dealing with the large amounts of data necessary
to frame all of these patterns. An additional wrinkle is accomodating
both columns with well-distributed data and columns that are top-heavy
or which only have one of a small number of values. Plus indexed vs
unindexed columns.

Or, somewhat orthogonally, you could allow pgbench to take a workload
of different sql statements (with frequencies), and execute those
statements instead of the built-in transaction. Then it would be easy
enough to contribute a library of pattern workloads, or for the DBA to
write one herself.

Just my two cents.

-johnnnnnnnnnn

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	Kevin Brown <kevin(at)sysexperts(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject:	Re: Tuning scenarios (was Changing the default configuration)
Date:	2003-02-14 17:36:02
Message-ID:	200302140936.02011.josh@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Kevin,

> I think we might be able to come up with some reasonable answers to
> that, as long as we don't expect too much out of the resulting
> benchmark. The right people to ask are probably the people who are
> actually running production databases.
>
> Anyone wanna chime in here with some opinions and perspectives?

<grin> I thought you'd *never* ask.

(for background: I'm a consultant, and I administrate 6 postgresql databases
for 5 different clients)

First off, one can't do a useful performance test on the sort of random data
which can be generated by a script. The only really useful tests come from
testing on a copy of the user's own database(s), or on a real database of
some sort.

For new installations, we'd need to make a copy of a public domain or OSS
database as a part of our performance testing tool. This database would need
at least 10 tables, some of them quite large, with FK relationships.

Second, there are five kinds of query tests relevant to performmance:

A) Rapid-fire simple select queries.
B) Large complex select queries, combining at least 2 of: aggregates,
sub-selects, unions, unindexed text searches, and outer joins.
C) Rapid-fire small (<10 rows) update/insert/delete queries.
D) Large update queries (> 10,000 rows, possibly in more than one table)
E) Long-running PL procedures.

Testing on these five types of operations give an all-around test of server
performance. Fortunately, for many installations, not all tests are
relevant; in fact, for many, only 2 of the 5 above are relevant. For
example, for a PHP-Nuke installation, you'd only need to test on A and C. As
another example, an OLAP reporting server would only need to test on B.

Unfortunately, for any real production server, you need to test all the
different operations concurrently at the appropriate multi-user level.
Meaning that for one of my servers (a company-wide calendaring tool) I'd need
to run tests on A, B, C, and E all simultaneously ... for that matter, A and
C by themselves would require multiple connections.

So, once again, if we're talking about a testing database, we would need
twenty examples of A and C, ten of each of B and D, and at least 3 of E that
we could run. For real production databases, the user could supply "pools"
of the 5 types of operations from their real query base.

Thirdly, we're up against the problem that there are several factors which can
have a much more profound effect on database performance than *any* amount of
tuning postgresql.conf, even given a particular hardware platform. In my
experience, these factors include (in no particular order):
1) Location of the pg_xlog for heavy-update databases.
2) Location of other files on multi-disk systems
3) Choice of RAID and controller for RAID systems.
4) Filesystem choice and parameters
5) VACUUM/FULL/ANALYZE/REINDEX frequency and strategy
6) Database schema design
7) Indexing
Thus the user would have to be somehow informed that they need to examine all
of the above, possibly before running the tuning utility.

Therefore, any tuning utility would have to:
1) Inform the user about the other factors affecting performance and notify
them that they have to deal with these.

2) Ask the user for all of the following data:
a) How much RAM does your system have?
b) How many concurrent users, typically?
c) How often do you run VACUUM/FULL/ANALYZE?
d) Which of the Five Basic Operations does your database perform frequently?
(this question could be reduced to "what kind of database do you have?"
web site database = A and C
reporting database = A and B
transaction processing = A, C, D and possibly E etc.)
e) For each of the 5 operations, how many times per minute is it run?
f) Do you care about crash recovery? (if not, we can turn off fsync)
g) (for users testing on their own database) Please make a copy of your
database, and provide 5 pools of operation examples.

3) The the script would need to draw random operations from the pool, with
operation type randomly drawn weighted by relative frequency for that type of
operation. Each operation would be timed and scores kept per type of
operation.

4) Based on the above scores, the tuning tool could adjust the following
parameters:
For A) shared_buffers
For B) shared_buffers and sort_mem (and Tom's new JOIN COLLAPSE settings)
For C) and D) wal settings and FSM settings
For E) shared_buffers, wal, and FSM

5) Then run 3) again.

The problem is that the above process becomes insurmountably complex when we
are testing for several types of operations simultaneously. For example, if
operation D is slow, we might dramatically increase FSM, but that could take
away memory needed for op. B, making op. B run slower. So if we're running
concurrently, we could could find the adjustments made for each type of
operation contradictory, and the script would be more likely to end up in an
endless loop than at a balance. If we don't run the different types of
operations simultaneously, then it's not a good test; the optimal settings
for op. B, for example, may make ops. A and C slow down and vice-versa.

So we'd actually need to run an optimization for each type of desired
operation seperately, and then compare settings, adjust to a balance
(weighted according to the expected relative frequency), and re-test
concurrently. Aieee!

Personally, I think this is a project in and of itself. GBorg, anyone?

--
Josh Berkus
Aglio Database Solutions
San Francisco

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: Tuning scenarios (was Changing the default configuration)
Date:	2003-02-14 18:10:00
Message-ID:	200302141010.00403.josh@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Folks,

I forgot one question:

> 2) Ask the user for all of the following data:
> a) How much RAM does your system have?
> b) How many concurrent users, typically?
> c) How often do you run VACUUM/FULL/ANALYZE?
> d) Which of the Five Basic Operations does your database perform
> frequently? (this question could be reduced to "what kind of database do
> you have?" web site database = A and C
> reporting database = A and B
> transaction processing = A, C, D and possibly E etc.)
> e) For each of the 5 operations, how many times per minute is it run?
> f) Do you care about crash recovery? (if not, we can turn off fsync)
> g) (for users testing on their own database) Please make a copy of your
> database, and provide 5 pools of operation examples.
h) (for users using the test database) How large do you expect your main
tables to be in your database? (then the test database would need to have
its tables trimmed to match this estimate)

--
Josh Berkus
Aglio Database Solutions
San Francisco

From:	Tilo Schwarz <mail(at)tilo-schwarz(dot)de>
To:	Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc:	<pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration (was Re: [pgsql-advocacy]
Date:	2003-02-14 21:55:51
Message-ID:	200302142255.51683.mail@tilo-schwarz.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Bruce Momjian writes:
> Tom Lane wrote:
> > Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > > So, my idea is to add a message at the end of initdb that states people
> > > should run the pgtune script before running a production server.
> >
> > Do people read what initdb has to say?
> >
> > IIRC, the RPM install scripts hide initdb's output from the user
> > entirely. I wouldn't put much faith in such a message as having any
> > real effect on people...
>
> Yes, that is a problem. We could show something in the server logs if
> pg_tune hasn't been run. Not sure what else we can do, but it would
> give folks a one-stop thing to run to deal with performance
> configuration.
>
> We could prevent the postmaster from starting unless they run pg_tune or
> if they have modified postgresql.conf from the default. Of course,
> that's pretty drastic.

I don't think, that's drastic, if it's done in a user friendy way ;-):

I work with Postgresql for half a year now (and like it very much), but I must
admit, that it takes time to understand the various tuning parameters (what
is not surprising, because you need understand to a certain degree, what's
going on under the hood). Now think of the following reasoning:

- If the resouces of a system (like shared mem, max open files etc.) are not
known, it's pretty difficult to set good default values. That's why it is so
difficult to ship Postgresql with a postgresql.conf file which works nicely
on all systems on this planet.
- On the other hand, if the resouces of a system _are_ known, I bet the people
on this list can set much better default values than any newbie or a static
out-of-the-box postgresql.conf.

Thus the know how which is somehow in the heads of the gurus should be
condensed into a tune program which can be run on a system to detect the
system resources and which dumps a reasonable postgresql.conf. Those defaults
won't be perfect (because the application is not known yet) but much better
than the newbie or out-of-the-box settings.

If the tune program detects, that the system resouces are so limited, that it
makes basically no sense to run Postgresql there, it tells the user what the
options are: Increase the system resources (and how to do it if possible) or
"downtune" the "least reasonable" postgresql.conf file by hand. Given the
resources of average systems today, the chances are much higher, that users
leave Postgresql because "it's slower than other databases" than that they
get upset, because it doesn't start right away the first time.

Now how to make sure, the tune program gets run before postmaster starts the
first time? Prevent postmaster from starting, unless the tune program was run
and fail with a clear error message. The message should state, that the tune
program needs to be run first, why it needs to be run first and the command
line showing how to do that.

If I think back, I would have been happy to see such a message, you just copy
and paste the command to your shell, run the command and a few seconds later
you can restart postmaster with resonable settings. And the big distributors
have their own scipts anyway, so they can put it just before initdb.

Regards,

Tilo

From:	Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
Cc:	Manfred Koizar <mkoi-pg(at)aon(dot)at>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Offering tuned config files
Date:	2003-02-15 14:15:01
Message-ID:	20030215221429.J63048-100000@houston.familyhealth.com.au
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

No, not really - I can do some more testing with pgbench to see what
happens though...I'll do it on monday

Chris

On Fri, 14 Feb 2003, Tom Lane wrote:

> Manfred Koizar <mkoi-pg(at)aon(dot)at> writes:
> > In postgresql.conf.sample-writeheavy you have:
> > commit_delay = 10000
> > Is this still needed with "ganged WAL writes"? Tom?
>
> I doubt that the current options for grouped commits are worth anything
> at the moment. Chris, do you have any evidence backing up using
> commit_delay with 7.3?
>
> regards, tom lane
>

From:	Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To:	Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc:	Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, PostgresSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: [HACKERS] Changing the default configuration (was Re:
Date:	2003-02-18 02:49:31
Message-ID:	200302180249.h1I2nVW06457@candle.pha.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

People seemed to like the idea:

Add a script to ask system configuration questions and tune
postgresql.conf.

---------------------------------------------------------------------------

Bruce Momjian wrote:
> Peter Eisentraut wrote:
> > Tom Lane writes:
> >
> > > Well, as I commented later in that mail, I feel that 1000 buffers is
> > > a reasonable choice --- but I have to admit that I have no hard data
> > > to back up that feeling.
> >
> > I know you like it in that range, and 4 or 8 MB of buffers by default
> > should not be a problem. But personally I think if the optimal buffer
> > size does not depend on both the physical RAM you want to dedicate to
> > PostgreSQL and the nature and size of the database, then we have achieved
> > a medium revolution in computer science. ;-)
>
> I have thought about this and I have an idea. Basically, increasing the
> default values may get us closer, but it will discourage some to tweek,
> and it will cause problems with some OS's that have small SysV params.
>
> So, my idea is to add a message at the end of initdb that states people
> should run the pgtune script before running a production server.
>
> The pgtune script will basically allow us to query the user, test the OS
> version and perhaps parameters, and modify postgresql.conf with
> reasonable values. I think this is the only way to cleanly get folks
> close to where they should be.
>
> For example, we can ask them how many rows and tables they will be
> changing, on average, between VACUUM runs. That will allow us set the
> FSM params. We can ask them about using 25% of their RAM for shared
> buffers. If they have other major apps running on the server or have
> small tables, we can make no changes. We can basically ask them
> questions and use that info to set values.
>
> We can even ask about sort usage maybe and set sort memory. We can even
> control checkpoint_segments this way if they say they will have high
> database write activity and don't worry about disk space usage. We may
> even be able to compute some random page cost estimate.
>
> Seems a script is going to be the best way to test values and assist
> folks in making reasonable decisions about each parameter. Of course,
> they can still edit the file, and we can ask them if they want
> assistance to set each parameter or leave it alone.
>
> I would restrict the script to only deal with tuning values, and tell
> people they still need to review that file for other useful parameters.
>
> Another option would be to make a big checklist or web page that asks
> such questions and computes proper values, but it seems a script would
> be easiest. We can even support '?' which would explain why the
> question is being ask and how it affects the value.
>
> --
> Bruce Momjian | http://candle.pha.pa.us
> pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
> + If your life is a hard drive, | 13 Roberts Road
> + Christ can be your backup. | Newtown Square, Pennsylvania 19073
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org
>

From:	Tilo Schwarz <mail(at)tilo-schwarz(dot)de>
To:	Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc:	<pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Changing the default configuration (was Re: [pgsql-advocacy]
Date:	2003-02-19 22:22:44
Message-ID:	200302192322.44593.mail@tilo-schwarz.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Bruce Momjian writes:
> People seemed to like the idea:
>
> Add a script to ask system configuration questions and tune
> postgresql.conf.
>

Definitely! But it should have some sort of "This is my first database
installation"-mode, which means, that either only some very basic questions
(or none at all) are asked, or each question is followed by a reasonable
default value and a "if unsure, press <ENTER>" message. Otherwise the first
time user might get scared of all those questions...

Tilo

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: Tuning scenarios (was Changing the default configuration)
Date:	2003-02-20 22:33:02
Message-ID:	200302201433.02654.josh@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

Robert,

> > 1) Location of the pg_xlog for heavy-update databases.
>
> I see you put this up pretty high on the list. Do you feel this is the
> most important thing you can do? For example, if you had a two drive
> installation, would you load the OS and main database files on 1 disk
> and put the pg_xlog on the second disk above all other configurations?

Yes, actually. On machines with 2 IDE disks, I've found that this can make
as much as 30% difference in speed of serial/large UPDATE statements.

> Ideally I recommend 3 disks, one for os, one for data, one for xlog; but
> if you only had 2 would the added speed benefits be worth the additional
> recovery complexity (if you data/xlog are on the same disk, you have 1
> point of failure, one disk for backing up)

On the other hand, with the xlog on a seperate disk, the xlog and the database
disks are unlikely to fail at the same time. So I don't personally see it as
a recovery problem, but a benefit.

--
-Josh Berkus
Aglio Database Solutions
San Francisco

From:	Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
To:	Josh Berkus <josh(at)agliodbs(dot)com>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: Tuning scenarios (was Changing the default
Date:	2003-02-20 23:35:44
Message-ID:	1045784144.15881.92.camel@camel
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Thu, 2003-02-20 at 17:33, Josh Berkus wrote:
>
> Robert,
>
> > > 1) Location of the pg_xlog for heavy-update databases.
> >
> > I see you put this up pretty high on the list. Do you feel this is the
> > most important thing you can do? For example, if you had a two drive
> > installation, would you load the OS and main database files on 1 disk
> > and put the pg_xlog on the second disk above all other configurations?
>
> Yes, actually. On machines with 2 IDE disks, I've found that this can make
> as much as 30% difference in speed of serial/large UPDATE statements.

Do you know how well those numbers hold up under scsi and/ or raid based
system? (I'd assume anyone doing serious work would run scsi)

>
> > Ideally I recommend 3 disks, one for os, one for data, one for xlog; but
> > if you only had 2 would the added speed benefits be worth the additional
> > recovery complexity (if you data/xlog are on the same disk, you have 1
> > point of failure, one disk for backing up)
>
> On the other hand, with the xlog on a seperate disk, the xlog and the database
> disks are unlikely to fail at the same time. So I don't personally see it as
> a recovery problem, but a benefit.
>

ok (playing a bit of devil's advocate here), but you have two possible
points of failure, the data disk and the xlog disk. If either one goes,
your in trouble. OTOH if you put the OS disk on one drive and it goes,
your database and xlog are still safe on the other drive.

Robert Treat

From:	Andrew Sullivan <andrew(at)libertyrms(dot)info>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: Tuning scenarios (was Changing the default
Date:	2003-02-20 23:49:28
Message-ID:	20030220184928.D30080@mail.libertyrms.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Thu, Feb 20, 2003 at 06:35:44PM -0500, Robert Treat wrote:
> Do you know how well those numbers hold up under scsi and/ or raid based
> system? (I'd assume anyone doing serious work would run scsi)

On some Sun E450s we have used, the machines are unusable for any
load with xlog on the same disk (in the case I'm remembering, these
are older 5400 RPM drives). Moving the xlog changed us for
<hazymemory>something like 10tps to something like 30tps</hazymemory>
in one seat-of-the-pants case. Sorry I can't be more specific.

> ok (playing a bit of devil's advocate here), but you have two possible
> points of failure, the data disk and the xlog disk. If either one goes,
> your in trouble. OTOH if you put the OS disk on one drive and it goes,
> your database and xlog are still safe on the other drive.

If you're really worried about that, use RAID 1+0.

--
----
Andrew Sullivan 204-4141 Yonge Street
Liberty RMS Toronto, Ontario Canada
<andrew(at)libertyrms(dot)info> M2P 2A8
+1 416 646 3304 x110

From:	"Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>
To:	Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Cc:	tgl(at)sss(dot)pgh(dot)pa(dot)us, justin(at)postgresql(dot)org, josh(at)agliodbs(dot)com, merlin(dot)moncure(at)rcsonline(dot)com, pgsql-hackers(at)postgresql(dot)org, pgsql-advocacy(at)postgresql(dot)org
Subject:	Re: [HACKERS] Changing the default configuration
Date:	2003-04-29 04:41:38
Message-ID:	1051591298.7047.11.camel@zeutrh9
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-advocacy pgsql-hackers pgsql-performance

On Tue, 2003-02-11 at 20:10, Tatsuo Ishii wrote:
> Sigh. People always complain "pgbench does not reliably producing
> repeatable numbers" or something then say "that's because pgbench's
> transaction has too much contention on the branches table". So I added
> -N option to pgbench which makes pgbench not to do any UPDATE to
> the branches table. But still people continue to complian...

What exactly does the -N option do? I see no mention of it in the
README.pgbench, which might be part of reason people "continue to
complain".