Re: Overhauling GUCS

Lists: pgsql-hackers
From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Overhauling GUCS
Date: 2008-05-31 00:34:38
Message-ID: 48409D1E.3070208@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Magnus and I had the fortune to share a plane flight when leaving pgCon.
This gave us a chance to discuss my ideas around reforming PostgreSQL
configuration at some length. Here's what we came up with (Magnus will
presumably correct the parts I got wrong):

Currently, PostgreSQL,conf and our set of GUC configurations suffer from
4 large problems:

1. Most people have no idea how to set these.
2. The current postgresql.conf file is a huge mess of 194 options, the
vast majority of which most users will never touch.
3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf,
and the settings.sgml), which are only synched with each other manually.
4. We don't seem to be getting any closer to autotuning.

We think all of these are solvable for 8.4. We also think that it's
vitally important than any overhaul of the GUCS to be completed in one
version to minimize the pain involved.

Here's a list of the things we want to change. It's all a package and
should make sense if you take all the changes as a whole.

1) Add several pieces of extra information to guc.c in the form of extra
"gettext" commands: default value, subcategory, long description,
recommendations, enum lists.
2) Incorporate this data into pg_settings
3) Delete postgresql.conf.sample
4) Add a script called pg_generate_conf to generate a postgresql.conf
based on guc.c and command-line switches (rather than
postgresql.conf.sample), which would be called automatically by initdb.

For (4), pg_generate_conf would take the following switches:
-b , --basic = short conf file, listing only the 15-18 most commonly
changed options
-a , --advanced = conf file listing all 196+ options
-t, --terse = conf file lists only category headings and actual
settings, no comments
-n, --normal = conf file has category and subcategory settings, with
short desc comments
-v, --verbose = conf file lists full descriptions and recommendations in
comments with each option
-c "option = value" set specific option to specific value in the file
-f "filename" = take options and values from file "filename"

The default would be "-b, -n" with specific settings for shared_buffers
and wal_sync_method. The current postgresql.conf is a lot more like an
"-a, -v" file.

This would help us in the following ways:

A. there would now only be 2 places to maintain GUCS lists, the docs and
guc.c.
B. by having a generated postgresql.conf and an easy way to generate it,
writing autoconfiguration scripts (as well as shortcuts like SET
PERSISTENT) become vastly easier.
C. Most users would deal only with a limited set of 15-20 configuration
variables.

There's obviously some refinements needed, but what do people think of
the above general idea?


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-05-31 04:11:47
Message-ID: Pine.GSO.4.64.0805302257380.20830@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 30 May 2008, Josh Berkus wrote:

> 1) Add several pieces of extra information to guc.c in the form of extra
> "gettext" commands: default value, subcategory, long description,
> recommendations, enum lists.
> 2) Incorporate this data into pg_settings

When you last brought this up in February (I started on a long reply to
http://archives.postgresql.org/pgsql-hackers/2008-02/msg00759.php that I
never quite finished) the thing I got stuck on was how to deal with the
way people tend to comment in these files as they change things.

One problem I didn't really see addressed by the improvements you're
suggesting is how to handle migrating customized settings to a new version
(I'm talking about 8.4->9.0 after this is in place, 8.3->8.4 is a whole
different problem). It would be nice to preserve "history" of what people
did like in your examples (which look exactly like what I find myself
running into in the field). Now, that will get a lot easier just by
virtue of having a smaller config file, but I think that adding something
into pg_settings that allows saving user-added commentary would be a nice
step toward some useful standardization on that side of things. It would
make future automated tools aimed at parsing and generating new files, as
part of things like version upgrades, a lot easier if there was a standard
way such comments were handled in addition to the raw data itself.

The other thing I'd like to see make its way into pg_settings, so that
tools can operate on it just by querying the database, is noting what file
the setting came from so that you can track things like include file
usage. I think with those two additions (comments and source file
tracking) it would even be concievable to clone a working facsimile of
even a complicated postgresql.conf file set remotely just by reading
pg_settings.

While a bit outside of the part you're specifically aiming to improve
here, if you could slip these two additions in I think it would be a boon
to future writers of multi-server management tools as well.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-05-31 07:34:34
Message-ID: 1212219274.4120.191.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Fri, 2008-05-30 at 17:34 -0700, Josh Berkus wrote:

> There's obviously some refinements needed, but what do people think of
> the above general idea?

My initial reaction is that this is too much change, though I agree with
many of the points and understand it is well intended.

We have many supporters, though 90% of them seldom touch the database
from one release to the next. Many are dismayed that every time they do
we've fiddled with the knobs so some don't work anymore, some are
missing or renamed and there's a few new ones. So if they don't know
what they're doing it is frequently because we keep moving the
goalposts.

We should also consider that most people with multiple databases are
running multiple versions of PostgreSQL. The main reason for that is
that we keep changing the behaviour of SQL between releases, so even if
you had a magic superfast upgrade utility, in perhaps 50% of cases they
won't use it because they have to do a full application re-test, which
costs time and money.

Trying to be a Postgres DBA is hard when each new release is configured
differently to the last one and we have a major release about 3-5 more
frequently than Oracle/SQLServer. That is probably largest source of
questions and annoyance from the students on the courses I run.

So my viewpoint is that we should be aggressively adding new features,
yet be conservative in changing existing behaviour: provide options for
behaves-like-previous-release and keep the administrative interfaces as
similar as possible between releases.

If you wish to make changes, I would suggest that you simply reorder
some of the parameters in the .conf file, so that the ones you think are
important are grouped at the top.

Another suggestion would be to allow a #include style interface within
postgresql.conf. We can then do this:

#include standard_postgresql.conf

# now we put Josh's favourite GUCs as overrides on the above
...

That keeps things simple because the standard_postgresql.conf looks
almost identical to what people are used to. It also provides a new
feature, allowing people to include site-wide defaults for various
settings to allow easy implementation of local policy. It also
demonstrates a best practice approach to managing GUCs.

Some other problems I see with GUCs

* It's not possible to set one parameter depending upon the setting of
another.

* It's always unclear which GUCs can be changed, and when. That is much
more infrequently understood than the meaning of them.

* We should rename effective_cache_size to something that doesn't sound
like it does what shared_buffers does

* There is no config verification utility, so if you make a change and
then try to restart and it won't, you are in trouble.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-05-31 15:14:58
Message-ID: 48416B72.3060203@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs wrote:
> On Fri, 2008-05-30 at 17:34 -0700, Josh Berkus wrote:

> We have many supporters, though 90% of them seldom touch the database
> from one release to the next. Many are dismayed that every time they do
> we've fiddled with the knobs so some don't work anymore, some are
> missing or renamed and there's a few new ones.

I certainly agree we don't want to make it more difficult.

>
> So if they don't know
> what they're doing it is frequently because we keep moving the
> goalposts.

No, its because they don't read the docs. We are not talking about
dumping SQL here :) I don't feel it is correct to not perform such a
needed cleanup for fear that some user can't be bothered to read the
documentation.

> We should also consider that most people with multiple databases are
> running multiple versions of PostgreSQL. The main reason for that is
> that we keep changing the behaviour of SQL between releases, so even if
> you had a magic superfast upgrade utility, in perhaps 50% of cases they
> won't use it because they have to do a full application re-test, which
> costs time and money.

This could certainly be a problem.

>
> Trying to be a Postgres DBA is hard when each new release is configured
> differently to the last one and we have a major release about 3-5 more
> frequently than Oracle/SQLServer. That is probably largest source of
> questions and annoyance from the students on the courses I run.

That is the nature of the beast though. We are still learning,
improving, engineering. It's part of progress of the database. I would
suggest that your students (which is what I tell mine) not upgrade every
release but instead try hanging out for 3 years per release.

>
> So my viewpoint is that we should be aggressively adding new features,
> yet be conservative in changing existing behaviour: provide options for
> behaves-like-previous-release and keep the administrative interfaces as
> similar as possible between releases.
>

I agree with this in principle but not on this argument.

> If you wish to make changes, I would suggest that you simply reorder
> some of the parameters in the .conf file, so that the ones you think are
> important are grouped at the top.

IMO, the only settings that should be in the postgresql.conf are the
ones that require a restart. The rest should be gathered from pg_settings.

>
> Another suggestion would be to allow a #include style interface within
> postgresql.conf. We can then do this:
>
> #include standard_postgresql.conf
>
> # now we put Josh's favourite GUCs as overrides on the above
> ...

I think that could get very messy but Apache allows this. It isn't
exactly a new idea and I wouldn't fight against it.

> Some other problems I see with GUCs
>
> * It's not possible to set one parameter depending upon the setting of
> another.

That is getting a bit more complicated than I think we need. Unless I am
misunderstanding what you are saying.

>
> * It's always unclear which GUCs can be changed, and when. That is much
> more infrequently understood than the meaning of them.
>

This is a definite problem. Even the docs are a bit difficult on this
one. It should be a nice simple grid :) or like I said above, only goes
in the conf if requires a restart.

Hmm,

SET effective_cache_size = '5000MB';
WARNING: You must issue a reload to the database for this to take effect.

> * We should rename effective_cache_size to something that doesn't sound
> like it does what shared_buffers does

Hmmm, I wonder if it is not the other way around. Although I know that
one thing I run into is that a lot of people think effective_cache is an
allocation.

shared_buffer_alloc?
shared_mem?

>
> * There is no config verification utility, so if you make a change and
> then try to restart and it won't, you are in trouble.
>

+1 +1

Sincerely,

Joshua D. Drake


From: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-05-31 15:16:09
Message-ID: fcf48561907ae85f377b6390133d6e18@biglumber.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

> Many are dismayed that every time they do we've fiddled with the
> knobs so some don't work anymore, some are missing or renamed and
> there's a few new ones.
...
> * We should rename effective_cache_size

*raises eyebrow*

- --
Greg Sabino Mullane greg(at)turnstep(dot)com
PGP Key: 0x14964AC8 200805311114
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAkhBa3UACgkQvJuQZxSWSsg6pACgu2EJOyzj+ik79QwzdTTfi8Qj
aW0AoNayz4sniFVFxnQFgnxh7dPB6QOD
=Uyap
-----END PGP SIGNATURE-----


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-05-31 16:23:35
Message-ID: 16866.1212251015@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> Currently, PostgreSQL,conf and our set of GUC configurations suffer from
> 4 large problems:

> 1. Most people have no idea how to set these.
> 2. The current postgresql.conf file is a huge mess of 194 options, the
> vast majority of which most users will never touch.
> 3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf,
> and the settings.sgml), which are only synched with each other manually.
> 4. We don't seem to be getting any closer to autotuning.

The proposal doesn't actually solve any of those problems.

> Here's a list of the things we want to change. It's all a package and
> should make sense if you take all the changes as a whole.

> 1) Add several pieces of extra information to guc.c in the form of extra
> "gettext" commands: default value, subcategory, long description,
> recommendations, enum lists.
> 2) Incorporate this data into pg_settings
> 3) Delete postgresql.conf.sample
> 4) Add a script called pg_generate_conf to generate a postgresql.conf
> based on guc.c and command-line switches (rather than
> postgresql.conf.sample), which would be called automatically by initdb.

I disagree with doing any of this. It doesn't result in any useful
reduction in maintenance effort, and what it does do is make it
impossible to keep control over the detailed layout, formatting,
commenting etc in a sample postgresql.conf. Nor do I think that
"generate a whole config file from scratch" is going to be a useful
behavior for tuning problems --- how will you merge it with what
you had before?

regards, tom lane


From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-05-31 17:51:41
Message-ID: DFB26BD0-AD37-44F7-9255-BA78EC7E7FF8@kineticode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On May 31, 2008, at 09:23, Tom Lane wrote:

>> 1. Most people have no idea how to set these.
>> 2. The current postgresql.conf file is a huge mess of 194 options,
>> the
>> vast majority of which most users will never touch.
>> 3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf,
>> and the settings.sgml), which are only synched with each other
>> manually.
>> 4. We don't seem to be getting any closer to autotuning.
>
> The proposal doesn't actually solve any of those problems.

It solves #2 at least.

> I disagree with doing any of this. It doesn't result in any useful
> reduction in maintenance effort, and what it does do is make it
> impossible to keep control over the detailed layout, formatting,
> commenting etc in a sample postgresql.conf. Nor do I think that
> "generate a whole config file from scratch" is going to be a useful
> behavior for tuning problems --- how will you merge it with what
> you had before?

I'd love to see these issues resolved. The current postgresql.conf is
way over the top. Might you have a better idea?

Thanks,

David


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-05-31 18:44:37
Message-ID: 200805311144.38056.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon, Tom, Greg,

> Now, that will get a lot easier just by
> virtue of having a smaller config file, but I think that adding something
> into pg_settings that allows saving user-added commentary would be a nice
> step toward some useful standardization on that side of things. It would
> make future automated tools aimed at parsing and generating new files, as
> part of things like version upgrades, a lot easier if there was a standard
> way such comments were handled in addition to the raw data itself.

Hmmm. What about a COMMENT ON SETTING? That seems like it would serve the
purpose ... and make preserving user comments easier than the current
file-conversion approach.

> The other thing I'd like to see make its way into pg_settings, so that
> tools can operate on it just by querying the database, is noting what file
> the setting came from so that you can track things like include file
> usage. I think with those two additions (comments and source file
> tracking) it would even be concievable to clone a working facsimile of
> even a complicated postgresql.conf file set remotely just by reading
> pg_settings.

Hmmm. Any idea how to do this? It sounds like a good idea to me.

> So my viewpoint is that we should be aggressively adding new features,
> yet be conservative in changing existing behaviour: provide options for
> behaves-like-previous-release and keep the administrative interfaces as
> similar as possible between releases.

It's my viewpoint based on a lot of user feedback that the current
postgresql.conf is fundamentally broken and a major roadblock to PostgreSQL
adoption. This was a point with which there was pretty much universal
agreement when I talked with people at pgCon. That is, I beleive that you're
arguing for keeping .conf stable for the 5% of users who understand it and
ignoring the 95% of users who are baffled by it.

At this point, I think we are the only major SQL database without some form of
basic autotuning for the most significant settings (certainly MySQL, Oracle,
and SQL Server have it); we've been able to coast through that because
autotuning features are new in other DBs, but it's going to start to hurt us
pretty soon.

Now, I do believe that we should plan any GUCS overhaul to happen in one
postgresql version rather than phasing it in over multiple versions so that
administrators only need to get used to a new way once.

> > Currently, PostgreSQL,conf and our set of GUC configurations suffer from
> > 4 large problems:
> >
> > 1. Most people have no idea how to set these.
> > 2. The current postgresql.conf file is a huge mess of 194 options, the
> > vast majority of which most users will never touch.
> > 3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf,
> > and the settings.sgml), which are only synched with each other manually.
> > 4. We don't seem to be getting any closer to autotuning.
>
> The proposal doesn't actually solve any of those problems.

OK, let me draw some lines:
1 & 2) by not having the settings be defined in a 500-line file, new users
would no longer be baffled by scores of settings which probably don't concern
them, trying to find the handful of settings which do.

3) We'd consolidate the GUC lists down from 3 places to 2, which is one less
area to synchronize. Magnus and I looked to see if it might be possible to
generate the docs from the same list, but it's not practical.

4) By shifting from a model where postgresql.conf is document-formatted and
hand-edited to one where it's machine generated, it becomes vastly easier to
write simple utilities to manage these settings. Right now, the big
"obstacle" to things like SET PERSISTENT is "how to we preseve the
hand-edited comments in the file" -- and the answer is we *don't.*

> I disagree with doing any of this. It doesn't result in any useful
> reduction in maintenance effort, and what it does do is make it
> impossible to keep control over the detailed layout, formatting,
> commenting etc in a sample postgresql.conf.

Have you *looked* at postgresql.conf.sample lately, Tom? It's a disaster.
Maintenance is already difficult, and becoming more so as we add settings.

Further, you and Simon seem to think that the current "narrative docs inside
the conf file" format has some kind of value which makes things easier for
DBAs. I don't believe it does, and I have yet to meet a *single* new
PostgreSQL user who wasn't confused and intimidated by the file.

> Nor do I think that
> "generate a whole config file from scratch" is going to be a useful
> behavior for tuning problems --- how will you merge it with what
> you had before?

Who's merging? I don't think you get the proposal. The whole "hand-edited"
approach to postgresql.conf should go away. It's not useful, it's not
educational, and it doesn't make PostgreSQL easy to manage.

Further, the lack of an easy way to manage settings via port access to
PostgreSQL is a significant inhibitor to adopting PostgreSQL in environments
with large numbers of servers. See prior arguments by the CPANEL folks about
why they don't support PostgreSQL, which is in turn a major reason why
PostgreSQL web hosting is hard to find.

I agree that editing the data about settings in the guc.c file is not ideal;
Magnus and I discussed that mainly because we wanted to preserve the
translation framework with gettext strings. If someone can think of a better
way to do this part, I'm all ears.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Josh Berkus" <josh(at)agliodbs(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-05-31 19:01:23
Message-ID: 87zlq6jo1o.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"David E. Wheeler" <david(at)kineticode(dot)com> writes:

> I'd love to see these issues resolved. The current postgresql.conf is way over
> the top. Might you have a better idea?

I don't think fiddling with surface issues like the formatting of the
postgresql.conf is productive. Hiding parameters because you don't think
beginners need them is only going to frustrate those people who do need to
adjust them.

What might be productive is picking up a group of parameters and thinking hard
about what they mean in terms of user-visible real-world effects. If they can
be recast in terms of behaviour the user wants instead of internal
implementation details then that would translate into a massive simplification
as well as being easier to explain to users.

I think we do a pretty good job of this already. Witness things like
effective_cache_size -- imagine if this were "nested_loop_cache_hit_rate" for
example, good luck figuring out what to set it to.

The vacuum cost delay factors are probably ripe for such a recast though. I
think we need just one parameter "vacuum_io_bandwidth" or something like that.
The bgwriter parameters might also be a candidate but I'm less certain.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Josh Berkus" <josh(at)agliodbs(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-05-31 19:36:54
Message-ID: 87tzgejmeh.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Josh Berkus" <josh(at)agliodbs(dot)com> writes:

> It's my viewpoint based on a lot of user feedback that the current
> postgresql.conf is fundamentally broken and a major roadblock to PostgreSQL
> adoption. This was a point with which there was pretty much universal
> agreement when I talked with people at pgCon.

Actually as a new DBA when I was first starting out with Postgres I found it
very convenient to have all the common parameters in one place where I could
just uncomment and adjust them. Instead of having to search through
documentation and find the default value from which

> 1 & 2) by not having the settings be defined in a 500-line file, new users
> would no longer be baffled by scores of settings which probably don't concern
> them, trying to find the handful of settings which do.

I'm not sure how an empty file is any less "baffling" than one listing the
default value for parameters they don't need yet.

> 3) We'd consolidate the GUC lists down from 3 places to 2, which is one less
> area to synchronize. Magnus and I looked to see if it might be possible to
> generate the docs from the same list, but it's not practical.

This seems like a trivial gain and one which is unlikely to outweigh the pain
of having to massage the info into C data structures.

> 4) By shifting from a model where postgresql.conf is document-formatted and
> hand-edited to one where it's machine generated, it becomes vastly easier to
> write simple utilities to manage these settings. Right now, the big
> "obstacle" to things like SET PERSISTENT is "how to we preseve the
> hand-edited comments in the file" -- and the answer is we *don't.*

What this sounds like is a sly way to try to get rid of postgresql.conf
entirely and replace it with parameters stored in the database so admins would
adjust the parameters using an SQL syntax rather than a text file.

There are pros and cons of such a system but I think for newbie admins that
would be a thousand times *more* baffling. You would have to learn new
commands and have no holistic view of what parameters had been set, what
related parameters might exist. You also have no way to keep the file in a
version control system or sync across servers etc.

> Have you *looked* at postgresql.conf.sample lately, Tom? It's a disaster.
> Maintenance is already difficult, and becoming more so as we add settings.

What difficulties?

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!


From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: "Josh Berkus" <josh(at)agliodbs(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-05-31 21:34:27
Message-ID: 0315F14C-DDE3-47AA-A76E-B04B0430464D@kineticode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On May 31, 2008, at 12:36, Gregory Stark wrote:

> What this sounds like is a sly way to try to get rid of
> postgresql.conf
> entirely and replace it with parameters stored in the database so
> admins would
> adjust the parameters using an SQL syntax rather than a text file.
>
> There are pros and cons of such a system but I think for newbie
> admins that
> would be a thousand times *more* baffling. You would have to learn new
> commands and have no holistic view of what parameters had been set,
> what
> related parameters might exist. You also have no way to keep the
> file in a
> version control system or sync across servers etc.

FWIW, this has not been a barrier to MySQL adoption.

Best,

David


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-05-31 22:32:51
Message-ID: 200806010032.53337.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus wrote:
> Currently, PostgreSQL,conf and our set of GUC configurations suffer from
> 4 large problems:

As we have talked about it before, you know that I agree that the GUC system
could use some improvements. But I'm a bit surprised about some of your
assessments.

> 1. Most people have no idea how to set these.

Could you clarify this? I can't really believe that people are incapable of
editing a configuration file.

> 2. The current postgresql.conf file is a huge mess of 194 options, the
> vast majority of which most users will never touch.

My opinion has always been that we should provide a default file with only the
essential options instead of all of them. I see this as a the major problem,
because people are overwhelmed and consequently don't set anything.

> 3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf,
> and the settings.sgml), which are only synched with each other manually.

While this is not ideal, I can't really see who this is a major problem, at
least from the perspective of the user.

> 4. We don't seem to be getting any closer to autotuning.

True. But how does your proposal address this?


From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-05-31 22:38:40
Message-ID: 39CAD718-B770-4F80-92AC-647B0778DB1B@kineticode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On May 31, 2008, at 15:32, Peter Eisentraut wrote:

>> 1. Most people have no idea how to set these.
>
> Could you clarify this? I can't really believe that people are
> incapable of
> editing a configuration file.

I've been using PostgreSQL on and off, mostly on, for almost 10 years.
I still have no idea what 75% of those settings in postgresql.conf
mean or are for. There are an overwhelming number of them. I know that
5-8 of them I always touch, thanks largely to assistance now and then
from Josh Berkus, but the rest are just complexity to me. I don't
doubt that the vast majority of them are useful in one situation or
another, but unless I'm in one of those situations, I really don't
need to see them there and be confused by them.

Does that help?

>> 2. The current postgresql.conf file is a huge mess of 194 options,
>> the
>> vast majority of which most users will never touch.
>
> My opinion has always been that we should provide a default file
> with only the
> essential options instead of all of them. I see this as a the major
> problem,
> because people are overwhelmed and consequently don't set anything.

+1

Best,

David


From: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-05-31 23:32:48
Message-ID: 16ab21af4a6ad3385411e4178dcc1416@biglumber.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

In the interest of constructive criticism, here's some
ways I think postgresql.conf could be improved.

* A weak +1 to the #include idea. I'm much more inclined to simply
add a new section to the bottom of the file and use version
control, but I could see ways in which include would be useful.

* Rearrange the settings to put the more "common" ones at
the top, even if it means messing up the categories a bit.
Having 'bonjour_name' at the top, and 'default_statistics_target'
buried way down below is crazy. The sections should be
looked at from a clean perspective: do we need them at all? Are their
better ways to arrange things? What order should they be in?

* Much more verbose comments. The abovementioned default_statistics_target
is a very important settings, but there is zero explanation in the file
of what it is. The only thing we're told is that it ranges from 10 - 1000.
We can do better than that. Users would absolutely love it if each item
had a clear explanation, and it would be well worth a slightly increased
file size. See the postfix main.cf file for a good example of such.

* Remove the confusing "commented out is default" bit entirely. Simply
set each value explicitly. Why should new users have to confront this?:

# The commented-out settings shown in this file represent the default values.
# Re-commenting a setting is NOT sufficient to revert it to the default value;
# you need to reload the server.

* Lose the tabbed indenting, which looks bad, especially for multi-line
comments. Just use spaces.

* Create a tool, or at least a best practices, for controlling and tracking
changes to the file.

* Put some doc URLs in the file, at the minimum per major section. At the
very bare minimum, a real URL at the top.

* Indicate which values can be changed per-session or per-role.

* Fix the disparity between the syntax in the file and the SET interface.
For example, work_mem = 16MB works in the conf file, but you have to write
SET work_mem = '16MB'. Easiest is probably just to quote everything in the conf.

* Lose the post-value, end-of-line comments, they just get in the way when making
changes and make the file harder to read by contributing to the wrap problem.

* I'm tempted by the argument of creating a separate file for the obscure
settings, but I think it would be too much pain, and nobody would ever agree on
which settings are 'important' and which are 'obscure'.

* It might be nice to mention other ways to reload the file, such as
'service postgresql reload', or whatever Windows uses.

* The word 'paramters' is still misspelled. :)

* That whole sentence about changing the parameters as command-line
options needs to go away.

* Since the executable is now named "postgres" (thank goodness we got
rid of "postmaster"), the file should be named 'postgres.conf'. This would
also be a way to quickly distinguish 'old' vs 'new' style conf files if
we end up making major changes to it.

- --
Greg Sabino Mullane greg(at)turnstep(dot)com
PGP Key: 0x14964AC8 200805311911
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAkhB39sACgkQvJuQZxSWSshahQCg4V5QsO34HOhUDoPzT7STcR45
V5UAoPQxkmuk/oCYirTKxMAhV+Kh8Ytz
=7Lgk
-----END PGP SIGNATURE-----


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Greg Sabino Mullane <greg(at)turnstep(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-01 01:41:55
Message-ID: 4841FE63.1020708@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Sabino Mullane wrote:
>
> * A weak +1 to the #include idea. I'm much more inclined to simply
> add a new section to the bottom of the file and use version
> control, but I could see ways in which include would be useful.
>
>
>
>

We already have include directives, and have had since 8.2.

Thus spake the docs:

"In addition to parameter settings, the postgresql.conf file can contain
/include directives/, which specify another file to read and process as
if it were inserted into the configuration file at this point. Include
directives simply look like:

include 'filename'

If the file name is not an absolute path, it is taken as relative to the
directory containing the referencing configuration file. Inclusions can
be nested."

cheers

andrew


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Greg Sabino Mullane <greg(at)turnstep(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-01 02:24:02
Message-ID: 48420842.6020802@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Sabino Mullane wrote:

> * Much more verbose comments. The abovementioned default_statistics_target
> is a very important settings, but there is zero explanation in the file
> of what it is. The only thing we're told is that it ranges from 10 - 1000.
> We can do better than that. Users would absolutely love it if each item
> had a clear explanation, and it would be well worth a slightly increased
> file size. See the postfix main.cf file for a good example of such.

I kind of agree with this but actually think we should have the bare
minimum comments in the file. Why? Because our documentation links are
static. Each setting should have the URL to the full documentation on a
particular setting.

>
> * Create a tool, or at least a best practices, for controlling and tracking
> changes to the file.
>

This I disagree with. There are plenty of tools to handle this should
someone really want to. SVN, CVS, parrot, etc... Let systems management
be the domain of systems management.

>
> * Put some doc URLs in the file, at the minimum per major section. At the
> very bare minimum, a real URL at the top.
>

Hah! See above :)

>
> * Indicate which values can be changed per-session or per-role.
>

Agreed. Along with this although offtopic for this post is a grid in the
docs that are explicit about this.

>
> * Fix the disparity between the syntax in the file and the SET interface.
> For example, work_mem = 16MB works in the conf file, but you have to write
> SET work_mem = '16MB'. Easiest is probably just to quote everything in the conf.

Agreed.

>
> * I'm tempted by the argument of creating a separate file for the obscure
> settings, but I think it would be too much pain, and nobody would ever agree on
> which settings are 'important' and which are 'obscure'.
>

Actually I could buy into this. There really are only about a dozen must
change settings (if that). I could see something like:

Memory settings:

network etc/network.conf
include etc/memory.conf
logging etc/logging.conf

etc...

>
> * It might be nice to mention other ways to reload the file, such as
> 'service postgresql reload', or whatever Windows uses.
>

I think a url to the docs is a better idea here.

>
> * The word 'paramters' is still misspelled. :)
>

Heh.

> * Since the executable is now named "postgres" (thank goodness we got
> rid of "postmaster"), the file should be named 'postgres.conf'. This would
> also be a way to quickly distinguish 'old' vs 'new' style conf files if
> we end up making major changes to it.

It was never postmaster.conf (that I can recall). I don't see the issue
here. Consider apache... It isn't apache.conf.

I think postgresql.conf (as that is the name of the software) makes sense.

Sincerely,

Joshua D. Drake


From: Alexander Vysokovskih <av(at)ural(dot)org>
To: Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-01 05:35:53
Message-ID: 48423539.1020308@ural.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Can I express one's user view:

Greg Sabino Mullane writes:
> * Much more verbose comments. The abovementioned default_statistics_target
> is a very important settings, but there is zero explanation in the file
> of what it is. The only thing we're told is that it ranges from 10 - 1000.
> We can do better than that. Users would absolutely love it if each item
> had a clear explanation, and it would be well worth a slightly increased
> file size. See the postfix main.cf file for a good example of such.
>
Absolutely agreed :-). Total list of GUC and fully explanation each
of them is sufficient.
For "easy start", it might be a several postgresql.conf with
distribution, like mysql doing it,
or generate postgresql.conf by wizard-dialog before initdb starting, for
example.
But I think that all GUCs must present in configuration file.

Thanks.


From: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: "David E(dot) Wheeler" <david(at)kineticode(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-01 16:11:27
Message-ID: 4842CA2F.2040107@cheapcomplexdevices.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gregory Stark wrote:
>
> I think we do a pretty good job of this already. Witness things like
> effective_cache_size -- imagine if this were "nested_loop_cache_hit_rate" for
> example, good luck figuring out what to set it to.

I think either of these are fine if we describe how to measure
them. Ideally if we had a GUC that said "log_nested_loop_cache_hit_rate"
that enabled some timing code (understandably with lots of overhead) that
made an attempt to measure the hit rate, it'd be easier to figure out
than the effective cache size, no?

> The vacuum cost delay factors are probably ripe for such a recast though. I
> think we need just one parameter "vacuum_io_bandwidth" or something like that.

+1; though perhaps the inverse of that is more useful. When my
machines are idle I'd be happy if they vacuum more. Wouldn't
we be better served specifying the I/O bandwidth of each device/tablespace
and letting vacuum use whatever portion would be otherwise idle?

> The bgwriter parameters might also be a candidate but I'm less certain.


From: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-01 21:13:05
Message-ID: 63029b9d3ba7ab56c1fc11e42619eaf1@biglumber.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

Andrew Dunstan wrote:

> We already have include directives, and have had since 8.2.

Heh, thanks - which proves how useless they are to me. :)

Joshua Drake wrote:

> I kind of agree with this but actually think we should have the bare
> minimum comments in the file. Why? Because our documentation links are
> static. Each setting should have the URL to the full documentation on a
> particular setting.

Ugh, why so much context switching? Put the docs next to the setting. URLs
are nice but not necessary. If you are arguing for minimum comments in
conf files, please make a patch for pg_hba.conf ;)

>> * Create a tool, or at least a best practices, for controlling and tracking
>> changes to the file.

> This I disagree with. There are plenty of tools to handle this should
> someone really want to. SVN, CVS, parrot, etc... Let systems management
> be the domain of systems management.

Well, perhaps just a note in the docs at least that one might want to put
postgresql.conf in version control. I've seen people not doing so more often
than you would think. Perhaps because they are DBAs and not sysadmins? I also
meant a tool to do things like verify that the changes are valid, as someone
else mentioned elsewhere in this thread.

>> * It might be nice to mention other ways to reload the file, such as
>> 'service postgresql reload', or whatever Windows uses.

> I think a url to the docs is a better idea here.

Good point. Maybe a sort of "DBA basics" page in the docs is warranted for
things like this.

>> * Since the executable is now named "postgres" (thank goodness we got
>> rid of "postmaster"), the file should be named 'postgres.conf'. This would
>> also be a way to quickly distinguish 'old' vs 'new' style conf files if
>> we end up making major changes to it.

> It was never postmaster.conf (that I can recall). I don't see the issue
> here. Consider apache... It isn't apache.conf.

Not saying it ever was postmaster.conf: just that I'm glad we finally
changed the name. As for the Apache project, the httpd executable reads the
httpd.conf file. Hence, one might expect the postgres executable to read a
postgres.conf file.

- --
Greg Sabino Mullane greg(at)turnstep(dot)com
End Point Corporation
PGP Key: 0x14964AC8 200806011656
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAkhDEJMACgkQvJuQZxSWSsgeogCfT0g69NDoxyWGiWmDcB3PxH8h
wJ8AnjzssA7aIk0rBdJzL+bB5vSQSeBV
=lgZG
-----END PGP SIGNATURE-----


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Greg Sabino Mullane <greg(at)turnstep(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-02 04:46:20
Message-ID: 48437B1C.5010708@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Sabino Mullane wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: RIPEMD160

> Joshua Drake wrote:
>
>> I kind of agree with this but actually think we should have the bare
>> minimum comments in the file. Why? Because our documentation links are
>> static. Each setting should have the URL to the full documentation on a
>> particular setting.
>
> Ugh, why so much context switching? Put the docs next to the setting. URLs
> are nice but not necessary. If you are arguing for minimum comments in
> conf files, please make a patch for pg_hba.conf ;)

Hah! Well I don't know that a minimum of comments is what I am arguing
as much as not too much comments. The comments in general in the
postgresql.conf are useless unless you have previous knowledge. I really
think that if we take advantage of the fact that we have static URLs
that life would be easier overall.

>
>>> * Create a tool, or at least a best practices, for controlling and tracking
>>> changes to the file.
>
>> This I disagree with. There are plenty of tools to handle this should
>> someone really want to. SVN, CVS, parrot, etc... Let systems management
>> be the domain of systems management.
>
> Well, perhaps just a note in the docs at least that one might want to put
> postgresql.conf in version control.

I could certainly buy into this. No reason we can't help people better
administrators. I would suggest a link to a static wiki page (on
wiki.pg) that would link to each option?

> I've seen people not doing so more often
> than you would think. Perhaps because they are DBAs and not sysadmins? I also
> meant a tool to do things like verify that the changes are valid, as someone
> else mentioned elsewhere in this thread.

pg_ctl -D data check?

I would +1 that. Including (in later releases):

WARNING: You specify 66536 for shared buffers but you only have 131072
of memory. Consider decreasing the parameter.

Obviously we would need more non math friendly wording.

>
>>> * It might be nice to mention other ways to reload the file, such as
>>> 'service postgresql reload', or whatever Windows uses.
>
>> I think a url to the docs is a better idea here.
>
> Good point. Maybe a sort of "DBA basics" page in the docs is warranted for
> things like this.
>

Yeah I could buy into this.

>>> * Since the executable is now named "postgres" (thank goodness we got
>>> rid of "postmaster"), the file should be named 'postgres.conf'. This would
>>> also be a way to quickly distinguish 'old' vs 'new' style conf files if
>>> we end up making major changes to it.
>
>> It was never postmaster.conf (that I can recall). I don't see the issue
>> here. Consider apache... It isn't apache.conf.
>
> Not saying it ever was postmaster.conf: just that I'm glad we finally
> changed the name. As for the Apache project, the httpd executable reads the
> httpd.conf file. Hence, one might expect the postgres executable to read a
> postgres.conf file.

Maybe, but I think I would need more convincing.

Sincerely,

Joshua D. Drake


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-02 05:27:23
Message-ID: 3431.1212384443@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Joshua D. Drake" <jd(at)commandprompt(dot)com> writes:
> Greg Sabino Mullane wrote:
>> Ugh, why so much context switching? Put the docs next to the setting. URLs
>> are nice but not necessary. If you are arguing for minimum comments in
>> conf files, please make a patch for pg_hba.conf ;)

> Hah! Well I don't know that a minimum of comments is what I am arguing
> as much as not too much comments. The comments in general in the
> postgresql.conf are useless unless you have previous knowledge. I really
> think that if we take advantage of the fact that we have static URLs
> that life would be easier overall.

Yeah ... this thread started out with the idea of reducing duplication,
so how did we arrive at wanting to duplicate the SGML documentation into
postgresql.conf.sample? I think we need to integrate, not duplicate.

(pg_hba.conf.sample is a pretty good example of what not to do, I think
... and it used to be even worse ...)

regards, tom lane


From: "Dave Page" <dpage(at)pgadmin(dot)org>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-02 07:37:36
Message-ID: 937d27e10806020037q719f438cl9dfaf58ef6a273de@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jun 2, 2008 at 5:46 AM, Joshua D. Drake <jd(at)commandprompt(dot)com> wrote:

>> I've seen people not doing so more often
>> than you would think. Perhaps because they are DBAs and not sysadmins? I
>> also
>> meant a tool to do things like verify that the changes are valid, as
>> someone
>> else mentioned elsewhere in this thread.
>
> pg_ctl -D data check?
>
> I would +1 that.

I would also really like to see that - though I'd also like to see an
SQL interface so we can check a config before saving when editing via
pgAdmin or similar.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-02 10:21:05
Message-ID: Pine.GSO.4.64.0806020549580.26912@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, 1 Jun 2008, Joshua D. Drake wrote:

> Well I don't know that a minimum of comments is what I am arguing as
> much as not too much comments.

Josh's proposal included making three levels of documentation-level
comments available: terse, normal, and verbose. The verbose comment
level probably should include a web link to full documentation. The way
the comments litter the existing file, the status quo that's called normal
mode in this proposal, is IMHO a complete mess. Most use cases I can
think of want either no comments or really verbose ones, the limited
commentary in the current sample postgresql.conf seems a middle ground
that's not right for anybody.

The key thing thing here in my mind is that it should be possible to
transform between those three different verbosity levels without losing
any settings or user-added comments. They're really just different views
on the same data, and which view you're seeing should be easy to change
without touching the data.

I just extracted the original design proposal and some of the relevent
follow-up in this thread, made some additional suggestions, and put the
result at http://wiki.postgresql.org/wiki/GUCS_Overhaul I think reading
that version makes it a bit clearer what the proposed overhaul is aiming
for.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-02 10:24:26
Message-ID: Pine.GSO.4.64.0806020452220.26912@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, 1 Jun 2008, Peter Eisentraut wrote:

> Josh Berkus wrote:
>> 1. Most people have no idea how to set these.
> Could you clarify this? I can't really believe that people are incapable of
> editing a configuration file.

The big problem isn't the editing, it's knowing what to set the
configuration values to.

This is not to say that editing a configuration file should be considered
reasonable. Any GUCS overhaul should include a design goal of being able
to completely manage the configuration system using, say, pgadmin (the
"manage settings via port access" part that Josh already mentioned).
This is why I was suggesting additions aimed at assimilating all the
things that are in the postgresql.conf file.

Joshua has been banging a drum for a while now that all this data needs to
get pushing into the database itself. The GUCS data is clearly structured
like a database table. Josh's suggested changes are basically adding all
the columns needed to it in order to handle everything you'd want to do to
the table. If you think of it in those terms and make it possible to
manipulate that data using the tools already available for updating
tables, you'll open up the potential to add a whole new class of
user-friendly applications for making configuration easier to manage.

However, I don't fully agree with taking that idea as far as Joshua has
suggested (only having the config data in the database), because having
everything in a simple text file that can be managed with SCM etc. has
significant value. It's nice to allow admins to be able to make simple
changes with just a file edit. It's nice that you can look at all the
parameters in one place and browse them. However, I do think that the
internal database representation must be capable of holding everything in
the original postgresql.conf file and producing an updated version of the
file, either locally or remotely, as needed.

>> 4. We don't seem to be getting any closer to autotuning.
> True. But how does your proposal address this?

The idea that Josh's suggestions are working toward is simplying the
construction of tools that operate on the server configuration file, so
that it's easier to write an autotuning tool. Right now, writing such a
tool in a generic way gets so bogged down just in parsing/manipulating the
postgresql.conf file that it's hard to focus on actually doing the tuning
part. If we go back to his original suggestion:
http://wiki.postgresql.org/wiki/GUCS_Overhaul
>> Add a script called pg_generate_conf to generate a postgresql.conf
>> based on guc.c and command-line switches (rather than
>> postgresql.conf.sample)

It's an easy jump from there to imagine a pg_generate_conf that provide a
"wizard" interface to update a configuration file. I forsee a little GUI
or web app that connects to a server on port 5432, finds out some basic
information about the server, and gives something like this:

Parameter Current Recommended Change?
shared_buffers 32MB 1024MB [X]
effective_cache_size 128MB 3000MB [ ]
work_mem 1MB 16MB [ ]

Josh has the actual brains behind such an app all planned out if you look
at his presentations, but without the larger overhaul it's just not
possible to make the implementation elegant.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: "Stephen R(dot) van den Berg" <srb(at)cuci(dot)nl>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-02 12:35:48
Message-ID: 20080602123548.GA1590@cuci.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith wrote:
>Joshua has been banging a drum for a while now that all this data needs to
>get pushing into the database itself. The GUCS data is clearly structured

>tables, you'll open up the potential to add a whole new class of
>user-friendly applications for making configuration easier to manage.

>However, I don't fully agree with taking that idea as far as Joshua has
>suggested (only having the config data in the database), because having
>everything in a simple text file that can be managed with SCM etc. has
>significant value. It's nice to allow admins to be able to make simple
>changes with just a file edit. It's nice that you can look at all the
>parameters in one place and browse them. However, I do think that the
>internal database representation must be capable of holding everything in
>the original postgresql.conf file and producing an updated version of the
>file, either locally or remotely, as needed.

Depending on the complexity you want to have inside the generator, one
could imagine a middle ground solution like:

include "database-generated.conf"
include "local-overrides.conf"

Where the "database-generated.conf" does not necessarily needs a lot of
comments.

>Josh has the actual brains behind such an app all planned out if you look
>at his presentations, but without the larger overhaul it's just not
>possible to make the implementation elegant.

IMHO Greg's response is the most comprehensive and well-thought-through
contribution in the whole GUC thread.
--
Sincerely,
Stephen R. van den Berg.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-02 14:12:06
Message-ID: 17932.1212415926@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith <gsmith(at)gregsmith(dot)com> writes:
> Joshua has been banging a drum for a while now that all this data needs to
> get pushing into the database itself.

This is, very simply, not going to happen. Shall we go over the reasons
why not, one more time?

1. Quite a few of the GUC parameters are needed *before* one can ever
read the database; in particular the ones about file locations and
shared memory sizing.

2. Suppose you change a parameter in a way that breaks the DB (eg,
set shared_buffers to a value larger than your kernel allows) and
try to restart. Database doesn't start. If the parameter can only
be changed back within an operating database, you're hosed.

I have no objection to providing alternative ways to edit the
configuration data, but the primary source of the settings is
going to continue to be an editable text file. Any proposals for
alternatives-to-a-text-editor have to work within that reality.

regards, tom lane


From: "Stephen R(dot) van den Berg" <srb(at)cuci(dot)nl>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-02 15:07:15
Message-ID: 20080602150715.GA4217@cuci.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
>Greg Smith <gsmith(at)gregsmith(dot)com> writes:
>> Joshua has been banging a drum for a while now that all this data needs to
>> get pushing into the database itself.

>This is, very simply, not going to happen. Shall we go over the reasons
>why not, one more time?

>1. Quite a few of the GUC parameters are needed *before* one can ever
>read the database; in particular the ones about file locations and
>shared memory sizing.

Obviously.

>2. Suppose you change a parameter in a way that breaks the DB (eg,
>set shared_buffers to a value larger than your kernel allows) and
>try to restart. Database doesn't start. If the parameter can only
>be changed back within an operating database, you're hosed.

That's why those parameters will always have to be read from a textfile.
Which doesn't prohibit that textfile to be generated from within the
database (once up and running). And yes, if someone changes the
parameter through the DB, then makes the DB write out the configfile,
then restarts the DB remotely, and the change didn't work out (i.e. DB
doesn't start, or crashes/hangs), he is hosed and needs to log in and change
the textfile with a texteditor, no way around it.

>I have no objection to providing alternative ways to edit the
>configuration data, but the primary source of the settings is
>going to continue to be an editable text file. Any proposals for
>alternatives-to-a-text-editor have to work within that reality.

I think everyone is aware of that.
--
Sincerely,
Stephen R. van den Berg.


From: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-02 15:59:21
Message-ID: 484418D9.5040608@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs wrote:
>
> Some other problems I see with GUCs
>
> * It's not possible to set one parameter depending upon the setting of
> another.
>

To me this is more critical.. Most people I have seen will increase one
or few but not all parameters related to memory which can result in loss
of performance and productivity in figuring out.

What happened to AvailRAM setting and base all memory gucs on that.
Ideally PostgreSQL should only create one big memory pool and allow all
other variables to change runtime via dba or some tuner process or
customized application as long as total is less than the allocated
shared_memory and local_memory settings. (This will also reduce the need
of restarting Postgres if a value needs to be changed)

-Jignesh

> * It's always unclear which GUCs can be changed, and when. That is much
> more infrequently understood than the meaning of them.
>
> * We should rename effective_cache_size to something that doesn't sound
> like it does what shared_buffers does
>
> * There is no config verification utility, so if you make a change and
> then try to restart and it won't, you are in trouble.
>
>


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-02 16:22:21
Message-ID: Pine.GSO.4.64.0806021206500.4018@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, 2 Jun 2008, Tom Lane wrote:

> Greg Smith <gsmith(at)gregsmith(dot)com> writes:
>> Joshua has been banging a drum for a while now that all this data needs to
>> get pushing into the database itself.
>
> This is, very simply, not going to happen.

Right, there are also technical challenges in the way of that ideal. I
was only mentioning the reasons why it might not be the best idea even if
it were feasible. However, I do not see why the limitations you bring up
must get in the way of thinking about how to interact and manage the
configuration data in a database context, even though it ultimately must
be imported and exported to a flat file.

The concerns you bring up again about leaving the database in an
unstartable state are a particularly real danger in the "only has access
to 5432" hosted provider case that this redesign is trying to satisfy. I
added a "Gotchas" section to the wiki page so that this issue doesn't get
forgotten about. The standard way to handle this situation is to have a
known good backup configuration floating around. Adding something in that
area may end up being a hard requirement before remote editing makes
sense.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-02 16:29:17
Message-ID: 1212424157.4120.371.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Mon, 2008-06-02 at 11:59 -0400, Jignesh K. Shah wrote:
>
> Simon Riggs wrote:
> >
> > Some other problems I see with GUCs
> >
> > * It's not possible to set one parameter depending upon the setting of
> > another.
> >
>
> To me this is more critical.. Most people I have seen will increase one
> or few but not all parameters related to memory which can result in loss
> of performance and productivity in figuring out.
>
> What happened to AvailRAM setting and base all memory gucs on that.
> Ideally PostgreSQL should only create one big memory pool and allow all
> other variables to change runtime via dba or some tuner process or
> customized application as long as total is less than the allocated
> shared_memory and local_memory settings. (This will also reduce the need
> of restarting Postgres if a value needs to be changed)

Agreed.

Right now, we can't even do that in code, let alone in config file.

If we had a smart_memory_config = on then we'd be able to say in the
backend:
if (smart_memory_config)
{
other_thing = 0.1 * Nbuffers;
}

but the GUCs are evaluated in alphabetical order, without any way of
putting dependencies between them. So they are notionally orthogonal.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-02 16:47:42
Message-ID: Pine.GSO.4.64.0806021227100.4018@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, 2 Jun 2008, Jignesh K. Shah wrote:

> Most people I have seen will increase one or few but not all parameters
> related to memory which can result in loss of performance and
> productivity in figuring out.

If it becomes easier to build a simple tool available to help people tune
their configurations, that should help here without having to do anything
more complicated than that.

> What happened to AvailRAM setting and base all memory gucs on that.

Like some of the other GUC simplification ideas that show up sometimes
(unifying all I/O and limiting background processes based on that total is
another), this is hard to do internally. Josh's proposal has a fair
amount of work involved, but the code itself doesn't need to be clever or
too intrusive. Unifying all the memory settings would require being both
clever and intrusive, and I doubt you'll find anybody who could pull it
off who isn't already overtasked with more important improvements for the
8.4 timeframe.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)sun(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-02 18:06:04
Message-ID: 200806021106.05833.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg,

> Like some of the other GUC simplification ideas that show up sometimes
> (unifying all I/O and limiting background processes based on that total
> is another), this is hard to do internally.  Josh's proposal has a fair
> amount of work involved, but the code itself doesn't need to be clever
> or too intrusive.  Unifying all the memory settings would require being
> both clever and intrusive, and I doubt you'll find anybody who could
> pull it off who isn't already overtasked with more important
> improvements for the 8.4 timeframe.

Plus, I'm a big fan of "enable an API" rather than "write a feature". I
think that there are people & companies out there who can write better
autotuning tools than I can, and I'd rather give them a place to plug
those tools in than trying to write autotuning into the postmaster.

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco


From: Paul van den Bogaard <Paul(dot)Vandenbogaard(at)Sun(dot)COM>
To: josh(at)agliodbs(dot)com
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-03 12:19:38
Message-ID: 0B854D92-C5C2-4138-A0FE-F81892A33869@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

to add some complexity to this topic :-) Please note I admit upfront
I am not familiar with every parameter out there, but during my quest
in finding bottleneck while stressing the back-end I find many GUC
parameters with names that show they should be interesting.
I read the comments, the docs (that I know of), go into the source to
learn the parameters and their units. And wondering if altering a
setting would do miracles in my case.

Indeed how can I measure the effect of a new setting? Enhanced
throughput? I learned in the past that fiddling with parameter A in
a database that was not properly setup/used --and there were many of
these--, had side effects in other parts of the engine. Yes, increased
throughput was observed. A new hype created. In the end it turned out
the parameter A was not set correctly at all. That parameter B, once
set to a sane value, cured the wrong behavior, and parameter A was not
optimal at all after the cure. We were just side tracked because we
did not know. Incorrect "knowledge" was borne (parameter A
setting). Throughout the years this database product has matured,
many more parameter realized and much, much more instrumentation been
implemented. It still is quite a challenge to understand what is
happening. But proper analysis is possible indeed. The black box is
much more open now.

One current example: wal_writer_delay. In my team there is an advise
to set this parameter to 100. However, after implementing a counter
(home grown instrumentation) I now know that the background log flush
routine is never called when stressing the database. Therefore I now
think the best setting is 10000 (its maximum) since it does not do
useful work (in my context) and therefore should wake up as little
times as possible. Without this instrumentation I can only guess about
the usability of this parameter and spend many tests in order to get
an impression of its validity to me.

So overhauling the GUC parameters is one step, but adding proper
instrumentation in order to really measure the impact of the new
setting is necessary too. Especially when looking in the direction of
auto tuning. Proper measurement is crucial to enable correct analysis.

Of course I am in favor of doing this with DTrace, however not all
platforms can benefit in that case :-)

--Paul

On 2 jun 2008, at 20:06, Josh Berkus wr
> Greg,
>
>> Like some of the other GUC simplification ideas that show up
>> sometimes
>> (unifying all I/O and limiting background processes based on that
>> total
>> is another), this is hard to do internally. Josh's proposal has a
>> fair
>> amount of work involved, but the code itself doesn't need to be
>> clever
>> or too intrusive. Unifying all the memory settings would require
>> being
>> both clever and intrusive, and I doubt you'll find anybody who could
>> pull it off who isn't already overtasked with more important
>> improvements for the 8.4 timeframe.
>
> Plus, I'm a big fan of "enable an API" rather than "write a
> feature". I
> think that there are people & companies out there who can write better
> autotuning tools than I can, and I'd rather give them a place to plug
> those tools in than trying to write autotuning into the postmaster.
>
> --
> --Josh
>
> Josh Berkus
> PostgreSQL @ Sun
> San Francisco
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

Regards,

Paul van den Bogaard

---------------------------------------------------------------------------------------------
Paul van den Bogaard Paul(dot)vandenBogaard(at)sun(dot)com
ISV-E -- ISV Engineering, Opensource Engineering group

Sun Microsystems, Inc phone: +31
334 515 918
Saturnus 1 extentsion:
x (70)15918
3824 ME Amersfoort mobile: +31
651 913 354
The Netherlands fax:
+31 334 515 001


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Paul van den Bogaard <Paul(dot)Vandenbogaard(at)Sun(dot)COM>
Cc: josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-04 03:48:24
Message-ID: Pine.GSO.4.64.0806032324550.10282@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 3 Jun 2008, Paul van den Bogaard wrote:

> So overhauling the GUC parameters is one step, but adding proper
> instrumentation in order to really measure the impact of the new setting
> is necessary too.

Correct, but completely off-topic regardless. One problem to be solved
here is to take PostgreSQL tuning from zero to, say, 50% automatic.
Wander the user lists for a few months; the number of completely
misconfigured systems out there is considerable, partly because the
default values for many parameters are completely unreasonable for modern
hardware and there's no easy way to improve on that without someone
educating themselves. Getting distracted by the requirements of the
high-end systems will give you a problem you have no hope of executing in
a reasonable time period.

By all means bring that up as a separate (and much, much larger) project:
"Database Benchmarking and Sensitivity Analysis of Performance Tuning
Parameters" would make a nice PhD project for somebody, and there's
probably a good patent in there somewhere. Even if you had such a tool,
it wouldn't be usable by non-experts unless the mundate GUC generation
issues are dealt with first, and that's where this is at right now.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Decibel! <decibel(at)decibel(dot)org>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-04 13:00:49
Message-ID: 20080604130049.GG65238@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, May 31, 2008 at 08:36:54PM +0100, Gregory Stark wrote:
> > 4) By shifting from a model where postgresql.conf is document-formatted and
> > hand-edited to one where it's machine generated, it becomes vastly easier to
> > write simple utilities to manage these settings. Right now, the big
> > "obstacle" to things like SET PERSISTENT is "how to we preseve the
> > hand-edited comments in the file" -- and the answer is we *don't.*
>
> What this sounds like is a sly way to try to get rid of postgresql.conf
> entirely and replace it with parameters stored in the database so admins would
> adjust the parameters using an SQL syntax rather than a text file.
>
> There are pros and cons of such a system but I think for newbie admins that
> would be a thousand times *more* baffling. You would have to learn new
> commands and have no holistic view of what parameters had been set, what
> related parameters might exist. You also have no way to keep the file in a
> version control system or sync across servers etc.

It doesn't have to be in the database. I kinda like Oracle's method
here; they default to storing config options in a binary file that
(iirc) you can change via commands, but they also provide a way to turn
that binary file into text (and back). That would allow for a lot of
options when it comes to configuring stuff.
--
Decibel!, aka Jim C. Nasby, Database Architect decibel(at)decibel(dot)org
Give your computer some brain candy! www.distributed.net Team #1828


From: Decibel! <decibel(at)decibel(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-04 13:18:11
Message-ID: 20080604131810.GH65238@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jun 02, 2008 at 10:12:06AM -0400, Tom Lane wrote:
> Greg Smith <gsmith(at)gregsmith(dot)com> writes:
> > Joshua has been banging a drum for a while now that all this data needs to
> > get pushing into the database itself.
>
> This is, very simply, not going to happen. Shall we go over the reasons
> why not, one more time?
>
<snip>
>
> I have no objection to providing alternative ways to edit the
> configuration data, but the primary source of the settings is
> going to continue to be an editable text file. Any proposals for
> alternatives-to-a-text-editor have to work within that reality.

There's no reason that the server has to deal with a text file. I
completely agree that there must be a method to change settings even if
the database isn't running, but that method does not necessarily need to
be a text file. If we can come up with a standard API for reading and
writing config changes, we (or anyone else) can write any number of
tools to deal with the settings. And once we have an API, we can provide
a SQL interface on top of it.

Instead of focusing on exactly what the 'new postgresql.conf' is going
to look like we should focus on creating a clean configuration API. Once
we have that we can figure out what (if anything) we're doing with the
existing .conf, and what a new one (if it exists) might look like.
--
Decibel!, aka Jim C. Nasby, Database Architect decibel(at)decibel(dot)org
Give your computer some brain candy! www.distributed.net Team #1828


From: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
To: "Decibel!" <decibel(at)decibel(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <gsmith(at)gregsmith(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-04 14:19:47
Message-ID: 4846A483.8000001@pse-consulting.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Decibel! wrote:
>>
>> There's no reason that the server has to deal with a text file. I
>> completely agree that there must be a method to change settings even if
>> the database isn't running, but that method does not necessarily need to
>> be a text file. If we can come up with a standard API for reading and
>> writing config changes, we (or anyone else) can write any number of
>> tools to deal with the settings. And once we have an API, we can provide
>> a SQL interface on top of it.
Once in a lifetime, a man should plant a tree, father a child, and write
an editor... :-)
Hiding the storage of config parameters opaquely behind an API is
something I've been hating for a long time on win32.

When reading this thread, I'm wondering if anybody ever saw a config
file for a complex software product that was easily editable and
understandable. I don't know one. If there was one, it'd be nice to know
it so we can learn from it.

IMHO the best compromise in machine and human readability is an XML
format. It's easily decorateable with comments, easily interpreted and a
pg_settings view could enhance it with even more comments, so an editor
using pgsql functions (to read and write postresql.conf.xml) could be
enabled to supply comprehensive help.

Regards,
Andreas


From: Aidan Van Dyk <aidan(at)highrise(dot)ca>
To: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
Cc: Decibel! <decibel(at)decibel(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <gsmith(at)gregsmith(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-04 14:35:49
Message-ID: 20080604143548.GE14498@yugib.highrise.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Andreas Pflug <pgadmin(at)pse-consulting(dot)de> [080604 10:20]:

> Hiding the storage of config parameters opaquely behind an API is
> something I've been hating for a long time on win32.

;-)

> When reading this thread, I'm wondering if anybody ever saw a config
> file for a complex software product that was easily editable and
> understandable. I don't know one. If there was one, it'd be nice to know
> it so we can learn from it.

PostreSQL, Apache, X.org

They are all easily editable, and "understandable", in the sense that I
understand that I'm supposed to edit the line, changing the value
(following the comments list of accepted values)

They are "less understandable" if you mean that I know the implications
of any change I make. But guess what, having those values inputed
through some other mechanism (like a GUI config file editor, a SQL statement,
or a nice pgadmin-SQL-hiding-interface isn't going to change that part
of "understandable". That part of understandable only comes through
good documentation and reference material, which is universally
applicable to any config method.

> IMHO the best compromise in machine and human readability is an XML
> format. It's easily decorateable with comments, easily interpreted and a
> pg_settings view could enhance it with even more comments, so an editor
> using pgsql functions (to read and write postresql.conf.xml) could be
> enabled to supply comprehensive help.

Well, In my past, I've generally not got around to installing and using
software that reqired me to edit some jumble of XML. Ya, maybe I'm
lucky. And since I've got a lot invested in PG, I'ld be forced to of PG
moved to an XML config, but I'ld be forced to kicking and screaming...

I just *know* that I'ld reload/restart postmaster some time, and the
config file wouldn't be quite correct, and I'ld search for 10 minutes
trying to find the extra (or lack) ", or missing closing /... But maybe
most people are better at parsing XML than me. And that also may be
because I've actively avoided it for so long ;-)

I don't know *how* I'ld decorate my XML config file with comments and
history the way I do my text-based #-commented config files.

a.

--
Aidan Van Dyk Create like a god,
aidan(at)highrise(dot)ca command like a king,
http://www.highrise.ca/ work like a slave.


From: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
To: Aidan Van Dyk <aidan(at)highrise(dot)ca>
Cc: "Decibel!" <decibel(at)decibel(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <gsmith(at)gregsmith(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-04 15:10:37
Message-ID: 4846B06D.8030101@pse-consulting.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Aidan Van Dyk wrote:
> * Andreas Pflug <pgadmin(at)pse-consulting(dot)de> [080604 10:20]:
>
>
>> Hiding the storage of config parameters opaquely behind an API is
>> something I've been hating for a long time on win32.
>>
>
> ;-)
>
>
>> When reading this thread, I'm wondering if anybody ever saw a config
>> file for a complex software product that was easily editable and
>> understandable. I don't know one. If there was one, it'd be nice to know
>> it so we can learn from it.
>>
>
> PostreSQL, Apache, X.org
>
> They are all easily editable, and "understandable", in the sense that I
> understand that I'm supposed to edit the line, changing the value
> (following the comments list of accepted values)
>
> They are "less understandable" if you mean that I know the implications
> of any change I make. But guess what, having those values inputed
> through some other mechanism (like a GUI config file editor, a SQL statement,
> or a nice pgadmin-SQL-hiding-interface isn't going to change that part
> of "understandable". That part of understandable only comes through
> good documentation and reference material, which is universally
> applicable to any config method.
>

Right. On the editing side, a column "link" in pg_settings that can be
used to construct an URL to postgresql.org/docs/xxx#yyy could help
creating editors that support the user. Whatever a text config file will
look like, you need to know exactly which parameter to use and where to
locate it; even structuring parameters won't help too much for the
typical starter task "I installed pgsql, what to do next".
>
>> IMHO the best compromise in machine and human readability is an XML
>> format. It's easily decorateable with comments, easily interpreted and a
>> pg_settings view could enhance it with even more comments, so an editor
>> using pgsql functions (to read and write postresql.conf.xml) could be
>> enabled to supply comprehensive help.
>>
>
> Well, In my past, I've generally not got around to installing and using
> software that reqired me to edit some jumble of XML. Ya, maybe I'm
> lucky. And since I've got a lot invested in PG, I'ld be forced to of PG
> moved to an XML config, but I'ld be forced to kicking and screaming...
>
> I just *know* that I'ld reload/restart postmaster some time, and the
> config file wouldn't be quite correct, and I'ld search for 10 minutes
> trying to find the extra (or lack) ", or missing closing /... But maybe
> most people are better at parsing XML than me. And that also may be
> because I've actively avoided it for so long ;-)
>
Well I'm an XML evangelist either. But the usual "commenting out a
parameter will reset it to default on reload, no?" caveat isn't funny
either, or duplicate parameter settings scattered throughout your file.
This may be avoided by *preferably* editing the parameters through pgsql
itself; the current postgresql.conf file format isn't too machine write
friendly (as I know since I wrote the pgadmin config file editor). But
having a config file that can't be used with simple editors at all is a
nightmare.

Regards,
Andreas


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
Cc: Aidan Van Dyk <aidan(at)highrise(dot)ca>, "Decibel!" <decibel(at)decibel(dot)org>, Greg Smith <gsmith(at)gregsmith(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-04 15:40:50
Message-ID: 25261.1212594050@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andreas Pflug <pgadmin(at)pse-consulting(dot)de> writes:
> Well I'm an XML evangelist either. But the usual "commenting out a
> parameter will reset it to default on reload, no?" caveat isn't funny
> either, or duplicate parameter settings scattered throughout your file.

Surely everyone who is opining on this thread knows that we fixed that
in 8.3.

regards, tom lane


From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-04 17:00:13
Message-ID: 72DA5875-8E89-4A6F-8D0B-7F9A869D79B9@kineticode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jun 3, 2008, at 20:48, Greg Smith wrote:

> Correct, but completely off-topic regardless. One problem to be
> solved here is to take PostgreSQL tuning from zero to, say, 50%
> automatic. Wander the user lists for a few months; the number of
> completely misconfigured systems out there is considerable, partly
> because the default values for many parameters are completely
> unreasonable for modern hardware and there's no easy way to improve
> on that without someone educating themselves. Getting distracted by
> the requirements of the high-end systems will give you a problem you
> have no hope of executing in a reasonable time period.

Exactly. The issue is that application developers, who are not DBAs,
have no idea how to tune PostgreSQL, and postgresql.conf is daunting
and confusing. So they use a different database that's "faster".

I think that right now postgresql.conf is designed for full-time DBAs,
rather than folks who might want to target PostgreSQL for an
application they're developing. We want to attract the latter
(without, of course, any expense with the former). Changing how
configuration works so that it's easier to understand and, if
possible, at least partly automatically tunable would go a long way
towards making PostgreSQL friendlier for developers, IMHO.

Best,

David


From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-04 17:02:01
Message-ID: 9140940C-6744-4040-B91F-E8017C4D73EC@kineticode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jun 4, 2008, at 07:19, Andreas Pflug wrote:

> IMHO the best compromise in machine and human readability is an XML
> format. It's easily decorateable with comments, easily interpreted
> and a pg_settings view could enhance it with even more comments, so
> an editor using pgsql functions (to read and write
> postresql.conf.xml) could be enabled to supply comprehensive help.

I hope that was a joke…

Best,

David


From: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-04 17:09:54
Message-ID: 2f7bced0cfe9a1bcc946c7e607528fda@biglumber.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

> Exactly. The issue is that application developers, who are not DBAs,
> have no idea how to tune PostgreSQL, and postgresql.conf is daunting
> and confusing. So they use a different database that's "faster".

Changing some of our defaults would go a long way as well.

-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAkhGzCYACgkQvJuQZxSWSsi2AQCfZFa8PBZ16ljeFNmhY/YlkxF9
8KcAnjciYml1icuuxSPiKIyeE8YrVxOA
=q4Cf
-----END PGP SIGNATURE-----


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-04 18:22:38
Message-ID: 27574.1212603758@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"David E. Wheeler" <david(at)kineticode(dot)com> writes:
> Exactly. The issue is that application developers, who are not DBAs,
> have no idea how to tune PostgreSQL, and postgresql.conf is daunting
> and confusing. So they use a different database that's "faster".

I take it you haven't looked at mysql's configuration file lately.

They aren't actually in any better shape than we are, except that
they supply several "preconfigured" sample files for people to choose
from.

regards, tom lane


From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-04 18:55:53
Message-ID: ED198E85-983C-4B54-8F1A-592A2528FAA5@kineticode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jun 4, 2008, at 11:22, Tom Lane wrote:

> "David E. Wheeler" <david(at)kineticode(dot)com> writes:
>> Exactly. The issue is that application developers, who are not DBAs,
>> have no idea how to tune PostgreSQL, and postgresql.conf is daunting
>> and confusing. So they use a different database that's "faster".
>
> I take it you haven't looked at mysql's configuration file lately.

I'm not much into MySQL, but in the work I've done with it, I've had
to create /etc/my.cnf myself. There *is* no configuration file
configuring MySQL until that file is created, is there? So there is no
configuration to learn at first. I'm not saying that this is
necessarily admirable -- it's kind of the opposite end of the spectrum
(PostgreSQL: "Here is every configuration tweak you could ever
possibly want, have fun!" vs MySQL: "There is no configuration until
you need one, then you have to find the docs for it."

> They aren't actually in any better shape than we are, except that
> they supply several "preconfigured" sample files for people to choose
> from.

Which would be a good start, if nothing else…

Best,

David


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <gsmith(at)gregsmith(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-04 19:48:47
Message-ID: 4846F19F.7060408@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

David E. Wheeler wrote:
> On Jun 4, 2008, at 11:22, Tom Lane wrote:
>
>> "David E. Wheeler" <david(at)kineticode(dot)com> writes:
>>> Exactly. The issue is that application developers, who are not DBAs,
>>> have no idea how to tune PostgreSQL, and postgresql.conf is daunting
>>> and confusing. So they use a different database that's "faster".
>>
>> I take it you haven't looked at mysql's configuration file lately.
>
> I'm not much into MySQL, but in the work I've done with it, I've had
> to create /etc/my.cnf myself. There *is* no configuration file
> configuring MySQL until that file is created, is there? So there is no
> configuration to learn at first. I'm not saying that this is
> necessarily admirable -- it's kind of the opposite end of the spectrum
> (PostgreSQL: "Here is every configuration tweak you could ever
> possibly want, have fun!" vs MySQL: "There is no configuration until
> you need one, then you have to find the docs for it."

Tell me how that's better.

If that's what you want, simply remove all the comment lines from your
config file. Problem solved.

>
>> They aren't actually in any better shape than we are, except that
>> they supply several "preconfigured" sample files for people to choose
>> from.
>
> Which would be a good start, if nothing else…

It's been suggested in the past. It is highly debatable that it would
actually be an advance.

cheers

andrew


From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <gsmith(at)gregsmith(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-04 20:00:02
Message-ID: 3A015E00-47CB-4A84-ACE4-0B498E0F9003@kineticode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jun 4, 2008, at 12:48, Andrew Dunstan wrote:

>> I'm not much into MySQL, but in the work I've done with it, I've
>> had to create /etc/my.cnf myself. There *is* no configuration file
>> configuring MySQL until that file is created, is there? So there is
>> no configuration to learn at first. I'm not saying that this is
>> necessarily admirable -- it's kind of the opposite end of the
>> spectrum (PostgreSQL: "Here is every configuration tweak you could
>> ever possibly want, have fun!" vs MySQL: "There is no configuration
>> until you need one, then you have to find the docs for it."
>
> Tell me how that's better.
>
> If that's what you want, simply remove all the comment lines from
> your config file. Problem solved.

I didn't say it was better. The point is that it seems to be less
confusing to non-DBAs.

>> Which would be a good start, if nothing else…
>
> It's been suggested in the past. It is highly debatable that it
> would actually be an advance.

Agreed; we've bandied around some better ideas here. Well, I haven't,
I've just kibbitzed. But we can surely do better.

Best,

David


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-04 20:12:05
Message-ID: 28946.1212610325@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"David E. Wheeler" <david(at)kineticode(dot)com> writes:
> I'm not much into MySQL, but in the work I've done with it, I've had
> to create /etc/my.cnf myself. There *is* no configuration file
> configuring MySQL until that file is created, is there? So there is no
> configuration to learn at first.

Postgres will start happily with an empty configuration file, too.
What's your point?

regards, tom lane


From: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
To: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
Cc: "Greg Smith" <gsmith(at)gregsmith(dot)com>, "Josh Berkus" <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-04 20:31:27
Message-ID: 162867790806041331rce6eb94pb5d31b26892329ea@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2008/6/4 David E. Wheeler <david(at)kineticode(dot)com>:
> On Jun 3, 2008, at 20:48, Greg Smith wrote:
>
>> Correct, but completely off-topic regardless. One problem to be solved
>> here is to take PostgreSQL tuning from zero to, say, 50% automatic. Wander
>> the user lists for a few months; the number of completely misconfigured
>> systems out there is considerable, partly because the default values for
>> many parameters are completely unreasonable for modern hardware and there's
>> no easy way to improve on that without someone educating themselves.
>> Getting distracted by the requirements of the high-end systems will give
>> you a problem you have no hope of executing in a reasonable time period.
>
> Exactly. The issue is that application developers, who are not DBAs, have no
> idea how to tune PostgreSQL, and postgresql.conf is daunting and confusing.
> So they use a different database that's "faster".
>

do you thing, so any better config can help? It's not possible. And
you can't tune database without well knowledge of applications that
use database. Any automatic tools are joy for child. But some default
PostgreSQL parameters are not optimal.

> I think that right now postgresql.conf is designed for full-time DBAs,
> rather than folks who might want to target PostgreSQL for an application
> they're developing. We want to attract the latter (without, of course, any
> expense with the former). Changing how configuration works so that it's
> easier to understand and, if possible, at least partly automatically tunable
> would go a long way towards making PostgreSQL friendlier for developers,
> IMHO.
>
> Best,
>
> David
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
Cc: Decibel! <decibel(at)decibel(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-04 20:39:37
Message-ID: Pine.GSO.4.64.0806041045520.2258@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 4 Jun 2008, Andreas Pflug wrote:

> When reading this thread, I'm wondering if anybody ever saw a config file for
> a complex software product that was easily editable and understandable.

I would recommend Apache's httpd.conf as an example of something that's
easy to edit and follow. Like any complex product, the comments in the
configuration file itself can't possibly be sufficient by themselves.
But in general I've found Apache's config file to have enough comments to
jog my memory when I'm editing it while not being overwhelming. They
provide enough detail that when I run into a setting I don't understand
there's enough context provided that it's easy to search for more
information.

Poking around with Google for a bit, here's a reasonable sample:
http://webdav.org/goliath/dav_on_x/apache.conf

> IMHO the best compromise in machine and human readability is an XML format.

If the primary PostgreSQL configuration file becomes XML I will quit
working with the project. I'm not kidding. If you think XML is easy to
generate, edit by hand, and use revision control on, we are at such an
fundamental disagreement that I wouldn't even try and directly argue with
you. Instead I'll quote Eric Raymond:

"The most serious problem with XML is that it doesn't play well with
traditional Unix tools. Software that wants to read an XML format needs an
XML parser; this means bulky, complicated programs."
http://www.catb.org/esr/writings/taoup/html/ch05s02.html#id2907018

Let me suggest the following requirement instead which naturally rules it
out: it should be possible for a DBA-level coder to write a simple shell
script that does something useful with the configuration file in order for
having a text-based configuration to be useful in this context. To give a
simple example, I can write a single line [sed|awk|perl] command that will
let me update the value for one parameter in the current postgresql.conf
file. When you can give me a one-liner that does that on an XML file in
any shell language in that class, then we might have something to talk
about.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
Cc: "Greg Smith" <gsmith(at)gregsmith(dot)com>, "Josh Berkus" <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-04 20:40:49
Message-ID: 3D9BF1C4-5FE6-4EDC-A5CF-3C5348BAC7E3@kineticode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jun 4, 2008, at 13:31, Pavel Stehule wrote:

> do you thing, so any better config can help? It's not possible. And
> you can't tune database without well knowledge of applications that
> use database. Any automatic tools are joy for child. But some default
> PostgreSQL parameters are not optimal.

Agreed. But, speaking as an app developer, I'm child-like in my DBA
abilities. I could use a toy to help me with it. :-)

Best,

David


From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-04 20:43:15
Message-ID: 3A37768E-5F03-4B76-9A34-EAB11A45B603@kineticode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jun 4, 2008, at 13:12, Tom Lane wrote:

> "David E. Wheeler" <david(at)kineticode(dot)com> writes:
>> I'm not much into MySQL, but in the work I've done with it, I've had
>> to create /etc/my.cnf myself. There *is* no configuration file
>> configuring MySQL until that file is created, is there? So there is
>> no
>> configuration to learn at first.
>
> Postgres will start happily with an empty configuration file, too.
> What's your point?

That it's less daunting for inexperienced users to start with that.
I'm not talking about how things work, I'm talking about what
configurations are present to start with. That's all.

Best,

David


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, Decibel! <decibel(at)decibel(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-04 20:57:32
Message-ID: 29525.1212613052@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith <gsmith(at)gregsmith(dot)com> writes:
> On Wed, 4 Jun 2008, Andreas Pflug wrote:
>> IMHO the best compromise in machine and human readability is an XML format.

> If the primary PostgreSQL configuration file becomes XML I will quit
> working with the project. I'm not kidding.

I have no particular use for XML in this scenario either, but really
this thread seems to be arguing about mostly-irrelevant details. There
is not anything fundamentally broken about keeping configuration in a
text file, as is proven by the fact that all those other packages do it.

The real problem we need to solve is how to allow newbies to have the
system auto-configured to something that more or less solves their
problems. Putting the config settings in XML does not accomplish that,
and neither does putting them inside the database. It might knock a
day or two off the time needed to develop a tool that actually does
solve the newbie's problem ... but it's unlikely that the effort of
changing Postgres to use some other configuration representation would
get repaid through easier tool development.

So I think we should stop worrying about the file format and think about
these two problems:

* Can we present the config options in a more helpful way (this is 99%
a documentation problem, not a code problem)?

* Can we build a "configuration wizard" to tell newbies what settings
they need to tweak?

regards, tom lane


From: "BRUSSER Michael" <Michael(dot)BRUSSER(at)3ds(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Greg Smith" <gsmith(at)gregsmith(dot)com>
Cc: "Andreas Pflug" <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>, <pgsql-hackers(at)postgresql(dot)org>, "Josh Berkus" <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-04 21:10:02
Message-ID: 64A7F4285BB8D8449430A4A467A363335AE7C3@CORP-CLT-EXB04.ds
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

If I can add my 2 cents as the end user, IMHO having the configuration
file
in the xml format is unnecessary and only makes it less user-friendly.

Thanks,
Michael.

-----Original Message-----
From: pgsql-hackers-owner(at)postgresql(dot)org
[mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Tom Lane
Sent: Wednesday, June 04, 2008 4:58 PM
To: Greg Smith
Cc: Andreas Pflug; Decibel!; Peter Eisentraut;
pgsql-hackers(at)postgresql(dot)org; Josh Berkus
Subject: Re: [HACKERS] Overhauling GUCS

Greg Smith <gsmith(at)gregsmith(dot)com> writes:
> On Wed, 4 Jun 2008, Andreas Pflug wrote:
>> IMHO the best compromise in machine and human readability is an XML
format.

> If the primary PostgreSQL configuration file becomes XML I will quit
> working with the project. I'm not kidding.

I have no particular use for XML in this scenario either, but really
this thread seems to be arguing about mostly-irrelevant details. There
is not anything fundamentally broken about keeping configuration in a
text file, as is proven by the fact that all those other packages do it.

The real problem we need to solve is how to allow newbies to have the
system auto-configured to something that more or less solves their
problems. Putting the config settings in XML does not accomplish that,
and neither does putting them inside the database. It might knock a
day or two off the time needed to develop a tool that actually does
solve the newbie's problem ... but it's unlikely that the effort of
changing Postgres to use some other configuration representation would
get repaid through easier tool development.

So I think we should stop worrying about the file format and think about
these two problems:

* Can we present the config options in a more helpful way (this is 99%
a documentation problem, not a code problem)?

* Can we build a "configuration wizard" to tell newbies what settings
they need to tweak?

regards, tom lane


From: Steve Atkins <steve(at)blighty(dot)com>
To: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-04 21:10:26
Message-ID: 7BB6E05F-6AC3-46B9-A0C6-1157D139E9F6@blighty.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Jun 4, 2008, at 1:57 PM, Tom Lane wrote:

> * Can we build a "configuration wizard" to tell newbies what settings
> they need to tweak?

Probably. Given the demographics of a lot of the newbies is
Windows this likely needs to be a pointy-clicky sort of thing
if it's going to be widely useful.

Doing it in a smart way is likely hard, but even a very simple,
crude one would likely be helpful. Doing it as a standalone
GUI editor / Wizard wouldn't be too hard, but it sounds more
like something that should be integrated in pgadmin, perhaps?

Cheers,
Steve


From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-04 21:57:28
Message-ID: 653E38C1-A6CD-46CA-A5A1-8FF8433C1376@kineticode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jun 4, 2008, at 13:57, Tom Lane wrote:

> So I think we should stop worrying about the file format and think
> about
> these two problems:
>
> * Can we present the config options in a more helpful way (this is 99%
> a documentation problem, not a code problem)?
>
> * Can we build a "configuration wizard" to tell newbies what settings
> they need to tweak?

Amen, Tom. Nicely put.

Thanks,

David


From: Rainer Bauer <usenet(at)munnin(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-04 23:23:53
Message-ID: u38e44hd43noiihgcdarflv9f5320i2kdi@4ax.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Pavel Stehule" wrote:

>2008/6/4 David E. Wheeler <david(at)kineticode(dot)com>:
>>
>> Exactly. The issue is that application developers, who are not DBAs, have no
>> idea how to tune PostgreSQL, and postgresql.conf is daunting and confusing.
>> So they use a different database that's "faster".
>
>do you thing, so any better config can help? It's not possible. And
>you can't tune database without well knowledge of applications that
>use database. Any automatic tools are joy for child. But some default
>PostgreSQL parameters are not optimal.

I think it would be an enourmous help for beginners if they had a "simple"
tuning tool which would tell them which values where altered (and possibly
why) from Postgres' default settings based on some basic information.
Like:
- machine hardware (disk layout, OS, installed memory, etc.)
- application usage (no. of clients, read/write activity, etc)

I don't think that such a tool could tune the database perfectly (or even very
good), but at least people new to Postgres would know where to start looking
for tuning it to their needs.

And I am not speaking about end users running an application that uses
Postgres. I talk about application developers like me that port their
application to use Postgres.

Rainer


From: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-05 00:02:46
Message-ID: 48472D26.8040408@pse-consulting.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:

>
> * Can we present the config options in a more helpful way (this is 99%
> a documentation problem, not a code problem)?
>
> * Can we build a "configuration wizard" to tell newbies what settings
> they need to tweak?

It's certainly one thing to create an initial postgresql.conf from
scratch after some inquiry, but a different level of problems to deal
with when offering to change the settings. IMHO initial creation isn't
enough, users will feel even more left alone if there are no tools
helping them further. I guess most users will start tweaking after the
server is already running for a while, with some config already in place.
That's when file format and/or APIs come into play. Preserving comments
and/or using them in a wizard isn't too easy with the current format.

Regards,
Andreas


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-05 00:07:35
Message-ID: 48472E47.7030701@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
>
> * Can we build a "configuration wizard" to tell newbies what settings
> they need to tweak?
>
>
>

That would trump all the other suggestions conclusively. Anyone good at
expert systems?

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-05 00:15:12
Message-ID: 7395.1212624912@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andreas Pflug <pgadmin(at)pse-consulting(dot)de> writes:
> Tom Lane wrote:
>> * Can we build a "configuration wizard" to tell newbies what settings
>> they need to tweak?

> It's certainly one thing to create an initial postgresql.conf from
> scratch after some inquiry, but a different level of problems to deal
> with when offering to change the settings. IMHO initial creation isn't
> enough, users will feel even more left alone if there are no tools
> helping them further. I guess most users will start tweaking after the
> server is already running for a while, with some config already in place.

Indeed, the wizard should be designed to assist with tweaking an
existing installation. I see no value at all to trying to run it
before/during initdb.

> That's when file format and/or APIs come into play. Preserving comments
> and/or using them in a wizard isn't too easy with the current format.

[ shrug... ] We have already wasted more effort arguing about this
issue than it would take to do something good enough for a wizard
tool. The great thing about simple text formats is that they're pretty
robust. I think all we need to do is comment out any existing
setting(s) of the target variable and add the new setting before the
first such, perhaps with a comment indicating what added it and when.
If the user finds this too ugly, he can tweak it with a text editor.

This is even assuming that the tool needs to edit the file itself,
rather than just give advice. The advice is the hard part, folks;
could we stop obsessing about trivia?

regards, tom lane


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-05 00:16:55
Message-ID: Pine.GSO.4.64.0806041720520.26357@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 4 Jun 2008, Tom Lane wrote:

> The real problem we need to solve is how to allow newbies to have the
> system auto-configured to something that more or less solves their
> problems. Putting the config settings in XML does not accomplish that,
> and neither does putting them inside the database.

The subtle issue here is that what makes sense for the database
configuration changes over time; there's not just one initial generation
and you're done. postgresql.conf files can end up moving from one machine
to another for example. I think something that doesn't recognize that
reality and move toward a "tune-up" capability as well as initial
generation wouldn't be as useful, and that's where putting the settings
inside the database helps so much.

Also, there's a certain elegance to having a optimization tool that works
again either a new installation or an existing one. I personally have
zero interest in a one-shot config generator. It just doesn't solve the
problems I see in the field. Performance starts out just fine even with
the default settings when people first start, and then goes to hell after
the system has been running for a while (and possibly moved to another
machine). By that point nobody wants to mess with their configuration
file unless it's one simple change at a time.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-05 00:23:17
Message-ID: 7473.1212625397@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Tom Lane wrote:
>> * Can we build a "configuration wizard" to tell newbies what settings
>> they need to tweak?

> That would trump all the other suggestions conclusively. Anyone good at
> expert systems?

How far could we get with the answers to just three questions:

* How many concurrent queries do you expect to have?

* How much RAM space are you willing to let Postgres use?

* How much "overhead" disk space are you willing to let Postgres use?

concurrent queries drives max_connections, obviously, and RAM space
would drive shared_buffers and effective_cache_size, and both of them
would be needed to size work_mem. The third one is a bit weird but
I don't see any other good way to set the checkpoint parameters.

If those aren't enough questions, what else must we ask? Or maybe they
aren't the right questions at all --- maybe we should ask "is this a
dedicated machine or not" and try to extrapolate everything else from
what we (hopefully) can find out about the hardware.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-05 00:26:14
Message-ID: 7509.1212625574@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith <gsmith(at)gregsmith(dot)com> writes:
> On Wed, 4 Jun 2008, Tom Lane wrote:
>> The real problem we need to solve is how to allow newbies to have the
>> system auto-configured to something that more or less solves their
>> problems. Putting the config settings in XML does not accomplish that,
>> and neither does putting them inside the database.

> The subtle issue here is that what makes sense for the database
> configuration changes over time; there's not just one initial generation
> and you're done. postgresql.conf files can end up moving from one machine
> to another for example. I think something that doesn't recognize that
> reality and move toward a "tune-up" capability as well as initial
> generation wouldn't be as useful,

As I just mentioned to someone else, I don't see any point in framing it
as an "initial generation" problem at all. initdb will already give you
settings that work, for some value of "work". The config wizard really
only needs to deal with the case of tuning an existing installation.

> and that's where putting the settings
> inside the database helps so much.

How does it help, pray tell? If you mean being able to see what the
existing settings are, pg_settings already does that.

regards, tom lane


From: Aidan Van Dyk <aidan(at)highrise(dot)ca>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, Decibel! <decibel(at)decibel(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-05 01:14:55
Message-ID: 20080605011455.GF14498@yugib.highrise.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> [080604 20:46]:

> If those aren't enough questions, what else must we ask? Or maybe they
> aren't the right questions at all --- maybe we should ask "is this a
> dedicated machine or not" and try to extrapolate everything else from
> what we (hopefully) can find out about the hardware.

For these three things:
1) max connections/workmem
2) shared buffers/effective cache
3) bgwriter/checkpoint parameters

What are the stats or measures that PostgreSQL produces (or that you
want it too currently doesn't, but you would like it to produce) that
the "masters" (i.e. people who tune PostgreSQL effectively, like you,
Greg, Simon, Robert, Peter, Josh, Jim, etc - sorry if I missed others)
actually use to decide whether to increase or decrease a value?

I tune my postgresql.conf mainly on folklore, and "assimilated
understanding" from reading the lists and blogs... But I haven't come
across (or rather, haven't come across and remembered/bookmarked)
anything that helps someone sample/read any stats or counts to find
bottleneck points which to start tuning.

Stuff like (remembering that I've never had to really work at tuning
because PG has always been "fast enough" for my needs, so take this with
a grain of salt)
* Are backends always writing out dirty buffers because there are no free
ones? This might mean tweaking settings affecting bgwriter.
* Are the evicted buffers ones with really high usage counts? This
might mean an increase shared buffers would help?
* Are we always spilling small amounts of data to disk for sorting? A
a small work_mem increase might help...
* Are all our reads from disk really quick? This probably means OS
pagecache has our whole DB, and means random_page_cost could be
tweaked?

If we could get a definitive list of things like this, or maybe just
comprehensive, or even at least agreed-to-not-be-wrong things to look
at, that would go a long way to documentation *how* to tune PG
effectively, and could lead to any projects that want to tackle
examining a running cluster and suggesting some config changes...

a.
--
Aidan Van Dyk Create like a god,
aidan(at)highrise(dot)ca command like a king,
http://www.highrise.ca/ work like a slave.


From: Steve Atkins <steve(at)blighty(dot)com>
To: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-05 01:20:49
Message-ID: 448BD875-FE20-4985-92CD-DFF6FC0641D3@blighty.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Jun 4, 2008, at 5:23 PM, Tom Lane wrote:

> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> Tom Lane wrote:
>>> * Can we build a "configuration wizard" to tell newbies what
>>> settings
>>> they need to tweak?
>
>> That would trump all the other suggestions conclusively. Anyone
>> good at
>> expert systems?
>
> How far could we get with the answers to just three questions:
>
> * How many concurrent queries do you expect to have?
>
> * How much RAM space are you willing to let Postgres use?
>
> * How much "overhead" disk space are you willing to let Postgres use?
>
> concurrent queries drives max_connections, obviously, and RAM space
> would drive shared_buffers and effective_cache_size, and both of them
> would be needed to size work_mem. The third one is a bit weird but
> I don't see any other good way to set the checkpoint parameters.
>
> If those aren't enough questions, what else must we ask? Or maybe
> they
> aren't the right questions at all --- maybe we should ask "is this a
> dedicated machine or not" and try to extrapolate everything else from
> what we (hopefully) can find out about the hardware.

I'd be interested in putting together a framework+GUI client to do this
cleanly in a cross-platform (Windows, Linux, Solaris, OS X as a bare
minimum) sort of way, if no-one else already has such a thing.

A framework doesn't get you all the way there, but it makes it a
whole lot easier to work on what base data and information you
need, and how easy it is to map pgsql-performance and #postgresql
gut feel onto something more algorithmic.

Cheers,
Steve


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-05 01:28:13
Message-ID: Pine.GSO.4.64.0806042033120.3874@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 4 Jun 2008, Andrew Dunstan wrote:

> Tom Lane wrote:
>> * Can we build a "configuration wizard" to tell newbies what settings
>> they need to tweak?
>
> That would trump all the other suggestions conclusively. Anyone good at
> expert systems?

Sigh. I guess we need to start over again.

Last year around this time, there was one of the recurring retreads of
this topic named "PostgreSQL Configuration Tool for Dummies":
http://archives.postgresql.org/pgsql-performance/2007-06/msg00386.php

Josh Berkus pointed out that he already had the "expert system" part of
this problem solved pretty well with a spreadsheet:

http://pgfoundry.org/docman/view.php/1000106/84/calcfactors.sxc (that's in
the OpenOffice Calc format if you don't know the extension)

That particular spreadsheet has more useful tuning suggestions in this
area than 99.9% of PostgreSQL users have or will ever know. You can
nitpick the exact recommendations, but the actual logic and thinking
involved is pretty well solved. It could use a touch of tweaking and
modernization but it's not too far off from being as good as you're likely
to get at making guesses without asking the user too many questions.
There is one ugly technical issue, that you can't increase shared_buffers
usefully in many situations because of SHMMAX restrictions, and that issue
will haunt any attempt to be completely automatic.

Where Josh got hung up, where I got hung up, where Lance Campbell stopped
at with his Dummies tool, and what some unknown number of other people
have been twarted by, is that taking that knowledge and turning it into a
tool useful to users is surprisingly difficult. The reason for that is
the current postgresql.conf file and how it maps internally to GUC
information isn't particularly well suited to automated generation,
analysis, or updates. I think Josh got lost somewhere in the parsing the
file stage. The parts I personally got stuck on were distinguishing
user-added comments from ones the system put in, plus being completely
dissatisfied with how lossy the internal GUC process was (I would like a
lot more information out of pg_settings than are currently there).
Lance's helper tool was hobbled by the limitations of being a simple web
application.

That's the background to Josh's proposal. It has about an 80% overlap
with what I was working on suggesting, which is why I jumped on his
bandwagon so fast. The outline at
http://wiki.postgresql.org/wiki/GUCS_Overhaul includes the superset of our
respective thinking on the first step here toward straightening out this
mess, further expanded with observations made in this thread.

I would respectively point out that comments about the actual tuning
itself have no bearing whatsoever on this proposal. This is trying to
nail down all the features needed to support both doing an initial
generation and subsequent incremental improvements to the postgresql.conf
file, while also reducing some redundancy in the code itself. Reducing
the scope to only handling initial generation would make this a smaller
task. But it happens to fall out that the work required to cut down on
the redundancy and that required to better support incremental updates as
well happen to be almost the same. Josh's stated agenda is to get this
right in one swoop, with only one version worth of disruption to the
format, and that goal is served better IMHO as well by addressing all
these changes as one batch.

I will attempt to resist further outbursts about non-productive comments
here, and each time I am tempted instead work on prototyping the necessary
code I think this really needs instead.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Steve Atkins <steve(at)blighty(dot)com>
To: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-05 01:33:51
Message-ID: 3B6261EE-A131-4303-AF92-488754E9D731@blighty.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Jun 4, 2008, at 6:28 PM, Greg Smith wrote:
>
>
> Josh Berkus pointed out that he already had the "expert system" part
> of this problem solved pretty well with a spreadsheet:
>
> http://pgfoundry.org/docman/view.php/1000106/84/calcfactors.sxc
> (that's in the OpenOffice Calc format if you don't know the extension)

On Jun 4, 2008, at 6:20 PM, Steve Atkins wrote:
> I'd be interested in putting together a framework+GUI client to do
> this
> cleanly in a cross-platform (Windows, Linux, Solaris, OS X as a bare
> minimum) sort of way, if no-one else already has such a thing.

/me makes go together motions, if nobody has any objection

Cheers,
Steve


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Aidan Van Dyk <aidan(at)highrise(dot)ca>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-05 02:04:54
Message-ID: Pine.GSO.4.64.0806042134090.3874@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 4 Jun 2008, Aidan Van Dyk wrote:

> * Are backends always writing out dirty buffers because there are no free
> ones? This might mean tweaking settings affecting bgwriter.

What you mean on the first one is "are backends always writing out dirty
buffers becuase there are no *clean* ones"; the server operates with no
*free* buffers as standard operations. Figuring that out is now easy in
8.3 with the pg_stat_bgwriter view.

> * Are the evicted buffers ones with really high usage counts? This
> might mean an increase shared buffers would help?

Evicted buffers must have a 0 usage count. The correct question to ask is
"are buffers never getting high usage counts because they keep getting
evicted too fast?". You can look at that in 8.3 using pg_buffercache,
I've got suggested queries as part of my buffer cache presentation at
http://www.westnet.com/~gsmith/content/postgresql/

> * Are we always spilling small amounts of data to disk for sorting? A
> a small work_mem increase might help...

I was just talking to someone today about building a monitoring tool for
this. Not having a clear way to recommend people monitor use of work_mem
and its brother spilled to disk sorts is an issue right now, I'll whack
that one myself if someone doesn't beat me to it before I get time.

> * Are all our reads from disk really quick? This probably means OS
> pagecache has our whole DB, and means random_page_cost could be
> tweaked?

This is hard to do with low overhead in an OS-independant way. The best
solution available now would use dtrace to try and nail it down. There's
movement in this area (systemtap for Linux, recent discussion at the PGCon
Developer Meeting of possibly needing more platform-specific code) but
it's not quite there yet.

So everything you mentioned is either recently added/documented or being
actively worked on somewhere, and the first two were things I worked on
myself after noticing they were missing. Believe me, I feel the items
that still aren't there, but they're moving along at their own pace.
There's already more tuning knowledge available than tools to help apply
that knowledge to other people's systems, which is why I think a diversion
to focus just on that part is so necessary.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Aidan Van Dyk <aidan(at)highrise(dot)ca>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-05 02:37:20
Message-ID: 20080605023720.GG14498@yugib.highrise.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Greg Smith <gsmith(at)gregsmith(dot)com> [080604 22:14]:

> So everything you mentioned is either recently added/documented or being
> actively worked on somewhere, and the first two were things I worked on
> myself after noticing they were missing. Believe me, I feel the items
> that still aren't there, but they're moving along at their own pace.
> There's already more tuning knowledge available than tools to help apply
> that knowledge to other people's systems, which is why I think a diversion
> to focus just on that part is so necessary.

But as an administrator/developer, I don't understand the focus on a new
API for "writing my config" for me... I'ld love a tool that helped me
"analyze" my current running PG database (and yes, that includes getting
*current* settings), and "suggest" config changes, ideally in order that
the tool thinks will make a difference... I can make that change, and
distribute it. That's the easy part. If I really trust the tool then
I'll just blindly run it (depending on it's output format):
$TOOL > $PGDATA/postgresql.conf
in which case, I don't care if it groked any of my previous comments and
cruft. Otherwise, I'll look at it, and integrate some (or all) of the
changes into postgresql.conf using my preferred method of
commenting/SCM/quirks.

a.

--
Aidan Van Dyk Create like a god,
aidan(at)highrise(dot)ca command like a king,
http://www.highrise.ca/ work like a slave.


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Aidan Van Dyk <aidan(at)highrise(dot)ca>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-05 03:43:25
Message-ID: Pine.GSO.4.64.0806042241070.23557@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 4 Jun 2008, Aidan Van Dyk wrote:

> I'd love a tool that helped me "analyze" my current running PG database
> (and yes, that includes getting *current* settings), and "suggest"
> config changes

Fine. To demonstrate why the overhaul is needed, let's start designing a
simple tool whose sole purpose in life is to suggest new settings for
shared_buffers and work_mem. Say we want to show people their current
setting and what we'd recommend instead.

I just created a new cluster on my laptop. It has the following in the
postgresql.conf:

shared_buffers = 32MB # min 128kB or max_connections*16kB
# (change requires restart)
#work_mem = 1MB # min 64kB

Say I first start designing such a tool by trying to read the
postgresql.conf file to figure out what these values are set to. In order
to accomplish that, I need to parse the whole file correctly, doing things
like turning "32MB" into the actual numeric value so my program can make
decisions based on its value[1]. This basically requires someone writing
a tuning tool replicate the GUC parsing code, which is crazy; at this
point you've already lost most potential tool authors.

But you're smarter than that; instead you use pg_settings:

psql=# select name,setting,unit from pg_settings where name='shared_buffers' or name='work_mem';
name | setting | unit
----------------+---------+------
shared_buffers | 4096 | 8kB
work_mem | 1024 | kB

Now: what do you tell the user their current value is? The way the
postgresql.conf is parsed into memory is lossy; at this point you don't
know anymore what units where specified in the original file. If someone
sees their current setting shown as "4096" but they know they set it to
"32MB", they'll end up confused unless they understand the whole page size
concept--and if they understood that, they'd automatically be disqualified
from being the type of user our theoretical tool is targeted toward. If
you try and make things more user-friendly by always showing the most
human readable version, what about the person who ended up setting this
parameter because they copied an old config file that recommended setting
it to 4096. You show it to them as "32MB"; they'll also be confused and
blame the tool for being bad.

And even if you work through all that, to give good advice here you have
to know things like that shared_buffers is a server parameter requiring
restart, while work_mem is a per-session parameter. Right now, the only
way to know all that is for tool authors to assemble their own database
and keep it up to date with each release. And you just lost another set
of potential authors with that realization.

Next up, we manage to work through all those issues, and someone happily
follows our advice and gets their file updated with a much larger value
for work_mem. Yeah, we are heroes! Or not. You see, in the config file
we just helpful updated for them was this comment just above that setting:

# OMG don't set this too high or the Linux OOM killer will
# take down the server!

(This is not a theoretical example; I have found variations on that text
in two postgresql.conf files and I vaguely recall Josh mentioned running
into it as well).

And now you just *crashed their server* by giving bad advice that was
clearly against the invaluable comment history in the file already. Oh,
but where are those comments located at? Before the setting? After the
setting? In a group at the top? Who can say? Since there's no strong
standard, people put them all over the place.

I don't know about you, but I'm too busy to spend a bunch of time writing
a tool to work around all these issues knowing it is always going to be
fragile, incomplete, and need updating with every GUC change no matter
what. Instead, reconsider http://wiki.postgresql.org/wiki/GUCS_Overhaul ,
having gotten a taste of the motivation behind those changes, and consider
how this would play out after those improvements.

The approach where you connect to the database and query is the easy path.
There is never a reason to consider parsing postgresql.conf. Anybody who
knows how to write a simple script that connects to a database and reads a
table (basically anyone who's written the database client equivilent of
"hello, world") can feel like a potential tool author.

All the information about the postgresql.conf side of every setting is
retained in case you want to show where they came from, or to generate a
new file that's as similar as possible to the original.

Any recommendations you suggest can be trivially annotated with whether
you need to consider max_connections because it's per-session, and whether
people need to restart the server or can just send it a signal, and that
will continue to be the case in the future with minimal work on the tool
author's part.

The defaults are now available, so that it's easy to figure out what
people changed. That is sometimes handy to include as part of this sort
of analysis, and it's necessary to provide improvements like a "strip the
unnecessary junk out of this file" that many people would like from this
sort of tool.

When you show people that you recommend increasing a value to something
larger, any comments about that setting will be shown and they'll know not
to follow the tool's advice if there's a history there.

This seems like such a better place to be that I'd rather drive toward the
server-side changes necessary to support it rather than fight the
difficult tool creation problems. That's why the "focus on a new API for
'writing my config' for me"; that particular goal is just one part of a
set of revisions that streamline the tool creation process in a not
necessarily obvious way. Unless, of course, you've tried to write a
full-circle config tuning tool, in which case most of the proposed changes
in this overhaul jump right out at you.

[1] In the shared_buffers case, it may be possible to just recommend a
value without caring one bit what the current one is. But for work_mem,
you really need to actually understand the value if you want any real
intelligence that combines that information with the maximum connections,
so that you can compute how much memory is left over for things like
effective_cache_size.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Greg Smith" <gsmith(at)gregsmith(dot)com>, "Andreas Pflug" <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>, <pgsql-hackers(at)postgresql(dot)org>, "Josh Berkus" <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-05 06:38:21
Message-ID: 484789DD.2050800@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> * How much "overhead" disk space are you willing to let Postgres use?
>
> ... The third one is a bit weird but
> I don't see any other good way to set the checkpoint parameters.

The way I think about the checkpoint settings is:

1. Set checkpoint_timeout to the max. time you're willing to spend in
recovery in case of crash or power loss.

2. Set checkpoint_segments to a high value. "High" meaning high enough
that you'll never reach it in practice. The purpose is just to keep you
from running out of disk space if something weird happens.

The amount of downtime one is willing to accept in case of power loss is
a good question to ask because it doesn't require any knowledge of how
PostgreSQL works; it can be answered directly from the application
requirements. And if the DBA/developer don't know the answer, he needs
to figure it out, because it's a very important question not only for
the database but in general.

I believe checkpoint_timeout correlates quite well with the max. time
required in recovery. If it took 10 minutes to generate X amount of WAL,
replaying that WAL will need to do at most the same amount of I/O, which
should take roughly the same amount of time, regardless of whether the
I/O was sequential or random. If the system wasn't busy doing updates
during between the checkpoints, it will of course take less.

As with all settings, the tool will need to explain the tradeoff.
Smaller checkpoint_timeout means more checkpointing which means more I/O
which means less average TPS and more WAL generated (because of
full_page_writes).

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-05 06:54:07
Message-ID: 48478D8F.1030006@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Guys,

A configuration wizard would be nice, but it would be a good start to
add a section to the manual on how to do the basic tuning. AFAICS we
don't have one. Clear instructions on how to set the few most important
settings like shared_buffers and checkpoint_timeout/segments would
probably be enough, with a link to the main configuration section that
explains the rest of the settings.

If people don't read the manual, we can add a link to it from
postgresql.conf.sample, add a screen to the Windows installer suggesting
to read it, or even open postgresql.conf in Notepad.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-05 08:15:49
Message-ID: Pine.GSO.4.64.0806050352530.10308@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 5 Jun 2008, Heikki Linnakangas wrote:

> A configuration wizard would be nice, but it would be a good start to add a
> section to the manual on how to do the basic tuning. AFAICS we don't have
> one. Clear instructions on how to set the few most important settings like
> shared_buffers and checkpoint_timeout/segments would probably be enough, with
> a link to the main configuration section that explains the rest of the
> settings.

It hasn't gelled yet but I'm working on that. Most of the text needed is
now linked to at http://wiki.postgresql.org/wiki/Performance_Optimization

I already talked with Chris Browne about merging his document I put first
in that list with useful pieces from some of mine into one more
comprehensive document on the Wiki, covering everything you mention here.
If we took a snapshot of that when it's done and dumped that into the
manual, I don't think that would be a problem to wrap up before 8.4 is
done. I'd like to include a link to the above performance page in that
section of the manual as well, both so that people are more likely to find
fresh content as well as to give them pointers toward more resources than
the manual can possibly cover.

> If people don't read the manual, we can add a link to it from
> postgresql.conf.sample, add a screen to the Windows installer suggesting
> to read it, or even open postgresql.conf in Notepad.

They don't. Putting pointers toward a relatively simple performance
tuning document a bit more in people's faces might help lower some of the
criticism the project takes over providing low defaults for so many
things.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Rainer Bauer <usenet(at)munnin(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-05 15:18:02
Message-ID: 20080605151802.GC16690@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jun 05, 2008 at 01:23:53AM +0200, Rainer Bauer wrote:
> I think it would be an enourmous help for beginners if they had a "simple"
> tuning tool which would tell them which values where altered (and possibly
> why) from Postgres' default settings based on some basic information.
> Like:
> - machine hardware (disk layout, OS, installed memory, etc.)
> - application usage (no. of clients, read/write activity, etc)

It would be possible to make a program that worked like:
# pg_autotune
Detected 4GB memory, 2.3GB free
Measured random_page_cost=2.3
Select expected usage: (d)edicated, (n)ormal, (t)iny
> t
Shared_buffers 64MB
Configuration stored to /etc/postgresql/8.3/postgresql.conf
#

This would probably solve most issues.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-05 15:49:18
Message-ID: 15892.1212680958@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
> A configuration wizard would be nice, but it would be a good start to
> add a section to the manual on how to do the basic tuning.

+1. If we can't write an explanation of what to do, we certainly aren't
going to be able to implement it in a wizard. Agreeing on what to put
in the manual would also go a long way towards providing an
implementation spec for the wizard ...

regards, tom lane


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-05 16:24:00
Message-ID: 20080605162400.GF16502@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith wrote:

> Where Josh got hung up, where I got hung up, where Lance Campbell stopped
> at with his Dummies tool, and what some unknown number of other people
> have been twarted by, is that taking that knowledge and turning it into a
> tool useful to users is surprisingly difficult. The reason for that is
> the current postgresql.conf file and how it maps internally to GUC
> information isn't particularly well suited to automated generation,
> analysis, or updates.

What I think this says is that we should be pushing Magnus more to
continue work on the configuration API thing he was designing.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>
Cc: "Greg Smith" <gsmith(at)gregsmith(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-05 17:05:03
Message-ID: 48481CBF.6040602@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera wrote:
> Greg Smith wrote:
>
>> Where Josh got hung up, where I got hung up, where Lance Campbell stopped
>> at with his Dummies tool, and what some unknown number of other people
>> have been twarted by, is that taking that knowledge and turning it into a
>> tool useful to users is surprisingly difficult. The reason for that is
>> the current postgresql.conf file and how it maps internally to GUC
>> information isn't particularly well suited to automated generation,
>> analysis, or updates.
>
> What I think this says is that we should be pushing Magnus more to
> continue work on the configuration API thing he was designing.

There's some magic in pgAdmin to parse and write the file. You could
take a look a that.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-05 17:10:58
Message-ID: 20080605171058.GI16502@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Heikki Linnakangas wrote:
> Alvaro Herrera wrote:

>> What I think this says is that we should be pushing Magnus more to
>> continue work on the configuration API thing he was designing.
>
> There's some magic in pgAdmin to parse and write the file. You could
> take a look a that.

That's what they want to get rid of. It's a problem that every GUI
writer needs to reimplement that functionality; ISTM this is something
worth having in the backend.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-05 17:11:22
Message-ID: 48481E3A.4010306@hagander.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Heikki Linnakangas wrote:
> Alvaro Herrera wrote:
>> Greg Smith wrote:
>>
>>> Where Josh got hung up, where I got hung up, where Lance Campbell
>>> stopped at with his Dummies tool, and what some unknown number of
>>> other people have been twarted by, is that taking that knowledge and
>>> turning it into a tool useful to users is surprisingly difficult.
>>> The reason for that is the current postgresql.conf file and how it
>>> maps internally to GUC information isn't particularly well suited to
>>> automated generation, analysis, or updates.
>>
>> What I think this says is that we should be pushing Magnus more to
>> continue work on the configuration API thing he was designing.
>
> There's some magic in pgAdmin to parse and write the file. You could
> take a look a that.

Not really. You don't want to go there. We really need a "proper API"
for it, and the stuff in pgAdmin isn't even enough to base one on.

//Magnus


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-05 17:24:24
Message-ID: Pine.GSO.4.64.0806051321360.27711@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 5 Jun 2008, Magnus Hagander wrote:

> We really need a "proper API" for it, and the stuff in pgAdmin isn't
> even enough to base one on.

I would be curious to hear your opinion on whether the GUC overhaul
discussed in this thread is a useful precursor to building such a proper
API.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-05 17:35:02
Message-ID: 484823C6.6030207@pse-consulting.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith wrote:
> On Thu, 5 Jun 2008, Magnus Hagander wrote:
>
>> We really need a "proper API" for it, and the stuff in pgAdmin isn't
>> even enough to base one on.
>
> I would be curious to hear your opinion on whether the GUC overhaul
> discussed in this thread is a useful precursor to building such a
> proper API.

Since I'm the guy who initially wrote that config file editing stuff, I
feel somehow addressed.
The answer is a clear ABSOLUTELY.

- The current implementation is able to edit the file directly or
through pgsql functions; any format change will affect that function
immediately.
- If documentation is enhanced by adding more comments in the
postgresql.conf file, this won't help the editor because it can't rely
on it to present help and hints to the user. It needs the comments/help
in pg_settings or alike.

Regards,
Andreas


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-05 17:42:09
Message-ID: 20080605174209.GL16502@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith wrote:
> On Thu, 5 Jun 2008, Magnus Hagander wrote:
>
>> We really need a "proper API" for it, and the stuff in pgAdmin isn't
>> even enough to base one on.
>
> I would be curious to hear your opinion on whether the GUC overhaul
> discussed in this thread is a useful precursor to building such a proper
> API.

I must say that I am confused by this thread. What's the discussed GUC
overhaul? Things that I vaguely recall being proposed are (in the order
they came to mind):

(1) Add a lot more comments to each setting
(2) Add documentation links to each setting
(3) Move more frequently used settings to the top of the file
(4) Ship different sample config files
(5) Create an expert system to suggest tuning
(6) Other random ideas (XML, settings in database, others?)

To me, there are two ideas that are doable right now, which are (2) and
(4). (1) seems to be a step backwards in pg_hba.conf experience, and we
would have to maintain duplicate documentation. (3) seems messy. (5)
is a lot of work; do we have volunteers? As for (6), the two examples I
give can be easily dismissed.

(2) and (4) do not seem necessary to get the config API built.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Robert Lor <Robert(dot)Lor(at)Sun(dot)COM>
To: Steve Atkins <steve(at)blighty(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-05 17:42:47
Message-ID: 48482597.1080006@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Steve Atkins wrote:
>
> I'd be interested in putting together a framework+GUI client to do this
> cleanly in a cross-platform (Windows, Linux, Solaris, OS X as a bare
> minimum) sort of way, if no-one else already has such a thing.
>
This is a great idea, and I was thinking along the same line. The
framework can provide generic interfaces for the GUI/Web client, and
leave it up to the OS to provide the needed data.

-Robert


From: Robert Lor <Robert(dot)Lor(at)Sun(dot)COM>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-05 18:43:08
Message-ID: 484833BC.40607@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> This is even assuming that the tool needs to edit the file itself,
> rather than just give advice. The advice is the hard part, folks;
> could we stop obsessing about trivia?
+1. IMHO, the tool doesn't need to worry about generating a prettied
version of postgresql.conf. It should just inform the user about the
appropriate settings or create a postgresql.conf.recommend and have the
user update postgresql.conf himself.

-Robert


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-05 19:15:14
Message-ID: Pine.GSO.4.64.0806051452480.27254@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 5 Jun 2008, Alvaro Herrera wrote:

> I must say that I am confused by this thread. What's the discussed GUC
> overhaul?

http://wiki.postgresql.org/wiki/GUCS_Overhaul

I drop that URL in every other message in hopes that people might start
commenting on it directly if they see it enough; the fact that you're
confused says I may need to keep that up :(

> (1) Add a lot more comments to each setting
> (2) Add documentation links to each setting
> (3) Move more frequently used settings to the top of the file
> (4) Ship different sample config files
> (5) Create an expert system to suggest tuning
> (6) Other random ideas (XML, settings in database, others?)
>
> To me, there are two ideas that are doable right now, which are (2) and
> (4). (1) seems to be a step backwards in pg_hba.conf experience, and we
> would have to maintain duplicate documentation. (3) seems messy. (5)
> is a lot of work; do we have volunteers? As for (6), the two examples I
> give can be easily dismissed.
> (2) and (4) do not seem necessary to get the config API built.

(1) is in that proposal but is strictly optional as something to put in
the configuration file itself. The idea behind (2) is to enable tool
authors to have an easier way to suggest where to head for more
information. I'd like for it to be trivial for a tool to say "Suggested
value for <x> is <y>; see
http://www.postgresql.org/docs/8.3/interactive/runtime-config-resource.html
for more information". I know what most of the settings I tinker with do,
but even I'd like it to be easier to find the right spot in the manual;
for newbies it's vital. You are correct that (2) isn't strictly necessary
here, but it's valuable and will be easier to wrap into this than to bolt
on later.

(3) (4) (5) and (6) were off-topic diversions.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Robert Lor <Robert(dot)Lor(at)Sun(dot)COM>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, Decibel! <decibel(at)decibel(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-05 19:47:16
Message-ID: 484842C4.4070801@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> If those aren't enough questions, what else must we ask? Or maybe they
> aren't the right questions at all --- maybe we should ask "is this a
> dedicated machine or not" and try to extrapolate everything else from
> what we (hopefully) can find out about the hardware.
>
I think probably a combination of high and low-level questions and make
the low-level (more specific) questions optional since some users may
not be able to provide low-level info. Here's a rough idea of how I
envision this tool should work.

$ pg_config_wizard

Is this machine dedicated to Postgres? (y/n) n (now tool
auto-discovers available HW resources)
Your system has 32GB memory. What percentage do you want to allocate to
Postgres? (1=50%, 2=33%, 3=25%, etc) 1
What type of workload? (OLTP, DSS, etc)
...

At the end, the tool runs for a while checking to see if certain
thresholds are reached to determine which parameters need to be
increased. The tool would change the parameters causing the
bottlenecks, rerun Postgres/workload, and iterate a few times until the
results are satfisfactory. Write the recommended settings to
postgresql.conf.recommend and let the user update postgresql.conf
himself or whatever.

I update my postgresql.conf, rerun the app and see 100% increased in
throughput :-)

-Robert


From: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-05 19:50:24
Message-ID: 48484380.1010504@cheapcomplexdevices.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> How far could we get with the answers to just three questions:
>
> * How many concurrent queries do you expect to have?
> * How much RAM space are you willing to let Postgres use?
> * How much "overhead" disk space are you willing to let Postgres use?

+1 to this approach - these are the kinds of questions that
make sense to me when first setting up a new installation.
They sound useful for both large servers and tiny (salesguy
laptop for demos) installations.

> If those aren't enough questions, what else must we ask?

* Perhaps something to guess FSM settings? I think FSM is
tunable I most often get wrong with more painful
consequences (bloat) than other tunables.
My approach is to have cron run database-wide vacuums
even on systems with autovacuum just to see the log
messages about FSM.

* Something to tune vacuum delay? Perhaps:
How much I/O bandwidth can be dedicated to Postgres
background activities?


From: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To:
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-05 19:53:55
Message-ID: 48484453.7030105@cheapcomplexdevices.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Steve Atkins wrote:
> ... cross-platform (Windows, Linux, Solaris, OS X as a bare
> minimum)

I wonder how cross-platform the tuning algorithm itself is.

I could also imagine that decisions like "do I let the OS page
cache, or postgres's buffer cache get most of the memory" are
extremely OS dependent.

Also, the configuration tool would be even more useful if it could
offer extra platform-specific advice like "hey, I see you're using
this filesystem. You should be using this journaling option for WAL
and this other one for data", or "on your system you should be using
this fsync method", or "use relatime or noatime when mounting your
disk".


From: Aidan Van Dyk <aidan(at)highrise(dot)ca>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-05 21:21:29
Message-ID: 20080605212129.GI14498@yugib.highrise.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Greg Smith <gsmith(at)gregsmith(dot)com> [080605 15:17]:
> On Thu, 5 Jun 2008, Alvaro Herrera wrote:
>
> >I must say that I am confused by this thread. What's the discussed GUC
> >overhaul?
>
> http://wiki.postgresql.org/wiki/GUCS_Overhaul
>
> I drop that URL in every other message in hopes that people might start
> commenting on it directly if they see it enough; the fact that you're
> confused says I may need to keep that up :(

I've read it. A couple times now. And I like lots of it.

> >(1) Add a lot more comments to each setting
> >(2) Add documentation links to each setting
> >(3) Move more frequently used settings to the top of the file
> >(4) Ship different sample config files
> >(5) Create an expert system to suggest tuning
> >(6) Other random ideas (XML, settings in database, others?)

> (3) (4) (5) and (6) were off-topic diversions.

But, right from the above mentioned page:

*Goals*

By shifting from a model where postgresql.conf is document-formatted
and hand-edited to one where it's machine generated, it becomes vastly
easier to write simple utilities to manage these settings. Right now,
the big "obstacle" to things like SET PERSISTENT is "how to we
preserve the hand-edited comments in the file" -- and the answer is we
don't.

This little goal leads to:

* By having a generated postgresql.conf and an easy way to generate
it, writing autoconfiguration scripts (as well as shortcuts like SET
PERSISTENT) become vastly easier.

And later:

There needs to be a way to modify the underlying settings and save
that into a new machine-generated postgresql.conf file. Is
implementing SET PERSISTENT sufficient for that?

I think that these parts, seemingly "snuck into" the GUC overhaul
proposal is what people like me a wary of. People like me don't want
to have postgresql.conf be *only* a machine-generated file, which I am
not allowed to edit anymore because next DBA doing a "SET PERSISTANT"
type of command is going to cause postgres to write out something else,
over-writing my carefully documented reason for some particular setting.

But the big issue I have (not that it really matters, because I'm not
one of the ones working on it, so I please don't take this as me telling
anyone what they can or can't do) is that that goal doesn't solve any of
the listed problems stated in the proposal:

1. Most people have no idea how to set these.

2. The current postgresql.conf file is a huge mess of 194 options,
the vast majority of which most users will never touch.

3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf,
and settings.sgml), which are only synched with each other manually.

4. We don't seem to be getting any closer to autotuning.

--
Aidan Van Dyk Create like a god,
aidan(at)highrise(dot)ca command like a king,
http://www.highrise.ca/ work like a slave.


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Aidan Van Dyk <aidan(at)highrise(dot)ca>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-05 21:47:55
Message-ID: Pine.GSO.4.64.0806051725550.10475@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 5 Jun 2008, Aidan Van Dyk wrote:

> People like me don't want to have postgresql.conf be *only* a
> machine-generated file, which I am not allowed to edit anymore because
> next DBA doing a "SET PERSISTANT" type of command is going to cause
> postgres to write out something else, over-writing my carefully
> documented reason for some particular setting.

This is why there's the emphasis on preserving comments as they pass into
the GUC structure and back to an output file. This is one of the
implementation details I haven't fully made up my mind on: how to clearly
label user comments in the postgresql.conf to distinguish them from
verbose ones added to the file. I have no intention of letting manual
user edits go away; what I'm trying to do here (and this part is much more
me than Josh) is make them more uniform such that they can co-exist with
machine edits without either stomping on the other. Right now doing that
is difficult, because it's impossible to tell the default comments from
the ones the users added and the current comment structure bleeds onto the
same lines as the settings.

> But the big issue I have (not that it really matters, because I'm not
> one of the ones working on it, so I please don't take this as me telling
> anyone what they can or can't do) is that that goal doesn't solve any of
> the listed problems stated in the proposal
> 1. Most people have no idea how to set these.

Making it much easier to build recommendation tools is how this helps
here.

> 2. The current postgresql.conf file is a huge mess of 194 options,
> the vast majority of which most users will never touch.

The proposed pg_generate_conf tool includes options to spit out a basic
configuration file instead of the complete one.

> 3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf,
> and settings.sgml), which are only synched with each other manually.

The proposal throws away having a separate postgresql.conf file, so that
reduces it from 3 places to 2. That's moving in the right direction

> 4. We don't seem to be getting any closer to autotuning.

If you try to build a tuning tool, these areas end up being the
unnecessarily hard parts.

Thanks for the comments on the proposal. I'm only bothering to respond to
messages like yours now, am deleting all of the continuing attemps to
divert the discussion over to parameter tuning details or expanding the
scope here.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Aidan Van Dyk <aidan(at)highrise(dot)ca>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-06 00:13:22
Message-ID: 3F927D52-A0FD-4575-8D24-F0F70C459235@kineticode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jun 5, 2008, at 14:47, Greg Smith wrote:

> This is why there's the emphasis on preserving comments as they pass
> into the GUC structure and back to an output file. This is one of
> the implementation details I haven't fully made up my mind on: how
> to clearly label user comments in the postgresql.conf to distinguish
> them from verbose ones added to the file. I have no intention of
> letting manual user edits go away; what I'm trying to do here (and
> this part is much more me than Josh) is make them more uniform such
> that they can co-exist with machine edits without either stomping on
> the other. Right now doing that is difficult, because it's
> impossible to tell the default comments from the ones the users
> added and the current comment structure bleeds onto the same lines
> as the settings.

How about a simple rule, such as that machine-generated comments start
with "##", while user comments start with just "#"? I think that I've
seen such a rule used before. At any rate, I think that, unless you
have some sort of line marker for machine-generated comments, there
will be no way to tell them apart from user comments.

Other possibilities for machine-comments:

## Machine comment
### Machine comment
#! Machine comment
#@ Machine comment
#$ Machine comment
#^ Machine comment
# Machine comment

I actually kinda like "#!". It's distinctive and unlikely to appear in
a user comment. Anyway, just food for thought.

Best,

David


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, Aidan Van Dyk <aidan(at)highrise(dot)ca>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-06 00:24:52
Message-ID: 20080606002452.GY16502@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

David E. Wheeler wrote:

> How about a simple rule, such as that machine-generated comments start
> with "##", while user comments start with just "#"? I think that I've
> seen such a rule used before. At any rate, I think that, unless you have
> some sort of line marker for machine-generated comments, there will be no
> way to tell them apart from user comments.

FWIW smb.conf uses ; for one purpose and # for the other.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: "David E(dot) Wheeler" <david(at)kineticode(dot)com>, Aidan Van Dyk <aidan(at)highrise(dot)ca>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-06 00:53:16
Message-ID: Pine.GSO.4.64.0806052029290.9064@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 5 Jun 2008, Alvaro Herrera wrote:

> FWIW smb.conf uses ; for one purpose and # for the other.

They're actually combining the way UNIX files use # with how Windows INI
files use ; in a config file context, which I personally find a little
weird.

I was already considering keeping user comments as # while making all
system-inserted ones #! ; many people are already used to #! having a
special system-related meaning from its use in UNIX shell scripting which
makes it easier to remember.

I think the next step to this whole plan is to generate a next-gen
postgresql.conf mock-up showing what each of the outputs from the
pg_generate_conf tool might look like to get feedback on that; it will
make what is planned here a bit easier to understand as well.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Aidan Van Dyk <aidan(at)highrise(dot)ca>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-06 01:07:00
Message-ID: 6FD0F350-6B8C-459A-B51E-6C6C6A4B4BE5@kineticode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jun 5, 2008, at 17:53, Greg Smith wrote:

> I was already considering keeping user comments as # while making
> all system-inserted ones #! ; many people are already used to #!
> having a special system-related meaning from its use in UNIX shell
> scripting which makes it easier to remember.

Oooh, yeah. I hadn't even thought of that! I was just looking at
characters on my keyboard and typing them in to see which ones I
thought were most distinctive. This may be part of the reason I
thought that #! was distinctive. :-)

Best,

David


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
Cc: "Greg Smith" <gsmith(at)gregsmith(dot)com>, "Aidan Van Dyk" <aidan(at)highrise(dot)ca>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 06:08:07
Message-ID: 4848D447.10005@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

David E. Wheeler wrote:
> On Jun 5, 2008, at 14:47, Greg Smith wrote:
>
>> This is why there's the emphasis on preserving comments as they pass
>> into the GUC structure and back to an output file. This is one of the
>> implementation details I haven't fully made up my mind on: how to
>> clearly label user comments in the postgresql.conf to distinguish them
>> from verbose ones added to the file. I have no intention of letting
>> manual user edits go away; what I'm trying to do here (and this part
>> is much more me than Josh) is make them more uniform such that they
>> can co-exist with machine edits without either stomping on the other.
>> Right now doing that is difficult, because it's impossible to tell the
>> default comments from the ones the users added and the current comment
>> structure bleeds onto the same lines as the settings.
>
> How about a simple rule, such as that machine-generated comments start
> with "##", while user comments start with just "#"? I think that I've
> seen such a rule used before. At any rate, I think that, unless you have
> some sort of line marker for machine-generated comments, there will be
> no way to tell them apart from user comments.

What comments do we consider machine-generated? Just the ones used to
comment out settings, like

#shared_buffers = 32MB

or something else?

If the automatic tool lets alone all other kind of comments, I think
we're fine. In fact, it wouldn't necessarily need to modify those
comments either, it could simply add a new setting line below that:

#shared_buffers = 32MB
shared_buffers = 1024MB

For extra safety, it could comment out old settings, perhaps with
something like this:

#shared_buffers = 32MB
#shared_buffers = 1024MB # commented out by wizard on 2008-06-05
shared_buffers = 2048MB

This would preserve a full change history in the file. It would become
quite messy after a lo of changes, of course, but a user can trim the
history by hand if he wants to.

Or perhaps we should explicitly mark the settings the tool has
generated, and comment out:

#shared_buffers = 32MB # commented out by wizard on 2008-06-05
shared_buffers = 1024MB # automatically set by wizard on 2008-06-05

That way the tool could safely replace automatically set settings,
without replacing manually set ones without leaving a clear trace of
what happened.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
To:
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-06 08:51:50
Message-ID: 4848FAA6.8000903@pse-consulting.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

David E. Wheeler wrote:
>
> How about a simple rule, such as that machine-generated comments start
> with "##", while user comments start with just "#"? I think that I've
> seen such a rule used before. At any rate, I think that, unless you
> have some sort of line marker for machine-generated comments, there
> will be no way to tell them apart from user comments.

Two heretical questions:
Do we need user generated comments at all?
I can't remember ever having used any comment in postgresql.conf.

Why do so many people here insist on editing postgresql.conf as primary
means of changing config params?
Isn't a psql -c "SET foo=bar; MAKE PERSISTENT" just as good as sed'ing
postgresql.conf or doing it manually?

Looking around for different approaches, network appliances come to my
mind, e.g. Cisco routers and PIX. You have 3 ways to configure a pix:
- use a command line (using ssh or telnet, eqivalent to psql); WRITE
MEMORY to make the changes survive a reboot.
- use a web interface (or similar tool)
- use tftp to up/download the complete config in and out, editing the
file. User comments will be lost, with the exception of those that have
been applied with special comment commands (equivalent to "comment on").

Regards,
Andreas


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Andreas Pflug" <pgadmin(at)pse-consulting(dot)de>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 10:13:42
Message-ID: 87ej7aeuqx.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Andreas Pflug" <pgadmin(at)pse-consulting(dot)de> writes:

> Why do so many people here insist on editing postgresql.conf as primary means
> of changing config params?
> Isn't a psql -c "SET foo=bar; MAKE PERSISTENT" just as good as sed'ing
> postgresql.conf or doing it manually?

no, it's awful.

> Looking around for different approaches, network appliances come to my mind,
> e.g. Cisco routers and PIX. You have 3 ways to configure a pix:
> - use a command line (using ssh or telnet, eqivalent to psql); WRITE MEMORY to
> make the changes survive a reboot.
> - use a web interface (or similar tool)
> - use tftp to up/download the complete config in and out, editing the file.
> User comments will be lost, with the exception of those that have been applied
> with special comment commands (equivalent to "comment on").

And in every major installation I've seen people use the last option. They
treat the original text file which is kept elsewhere -- normally checked into
some revision control system, tracked and managed like source code -- as the
canonical and authoritative version.

The fact that you have to go through an especially complicated and awkward
process to load it onto the switches is just a technical detail to be worked
around. And the fact that somebody could have altered the configuration
directly on the switch and forgotten to write it to the configuration file is
a danger to be worked around with policy, not a benefit.

Essentially what good cisco network admins are doing here is working around
the awkward interface using policy and replacing it with a static text
configuration file kept elsewhere.

Experience with other attempts at automatic tools to edit configuration files
like webmin etc are that editing text files which are also under user control
is just a terrible idea. You can make it almost work but it's always fragile
and can't keep up with the flexibility of free-form text and the creativity of
users.

What I would suggest is going back to the plan when we added include files.
Have an automatically generated config file which is freshly re-generated each
time and doesn't try to preserve any user-formatting. Users then include that
file at the top of postgresql.conf and override any of the settings by setting
them in postgresql.conf after the include.

The important thing here is to separate files which are "under postgres
control" and those "under user control". That doesn't necessarily mean we have
to ban users from touching the automatically generated config file or switch
formats, but it relieves us of any responsibility for maintaining free-form
text.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!


From: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-06 10:37:48
Message-ID: 4849137C.3040008@pse-consulting.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gregory Stark wrote:
> "Andreas Pflug" <pgadmin(at)pse-consulting(dot)de> writes:
>
>
>> Why do so many people here insist on editing postgresql.conf as primary means
>> of changing config params?
>> Isn't a psql -c "SET foo=bar; MAKE PERSISTENT" just as good as sed'ing
>> postgresql.conf or doing it manually?
>>
>
> no, it's awful.
>
>

So I wonder why you accept it when configuring schemas. What's the big
difference between setting a config param, and creating a table?

And ultimately, the config param file format may well look like an SQL
command file, restricted to SET only.

> And in every major installation I've seen people use the last option. They
> treat the original text file which is kept elsewhere -- normally checked into
> some revision control system, tracked and managed like source code -- as the
> canonical and authoritative version.
>

That's how you'd have to manage the schema sources too, no? Your
comments are lost as well after schema creation scripts are executed,
and manual changes may interfere with that.

Regards,
Andreas


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Andreas Pflug" <pgadmin(at)pse-consulting(dot)de>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 11:19:33
Message-ID: 877id2erp6.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


"Andreas Pflug" <pgadmin(at)pse-consulting(dot)de> writes:

> Gregory Stark wrote:
>> "Andreas Pflug" <pgadmin(at)pse-consulting(dot)de> writes:
>>
>>> Why do so many people here insist on editing postgresql.conf as primary means
>>> of changing config params?
>>> Isn't a psql -c "SET foo=bar; MAKE PERSISTENT" just as good as sed'ing
>>> postgresql.conf or doing it manually?
>>
>> no, it's awful.
>
> So I wonder why you accept it when configuring schemas. What's the big
> difference between setting a config param, and creating a table?

Yes, this is a big impedence mismatch between the old-school DBMS (with
emphasis on the MS part of the acronym) architecture and more modern
approaches. You don't configure your web site with "ALTER PAGE home.html SET
FORM contactus INPUT TYPE SUBMIT" either...

This is partly out of necessity though as those database objects contain
*data*. So merely replacing them with new data objects doesn't give the system
enough information to understand what to do with that data. The DDL commands
which modify the schema give that kind of data mutating instruction.

And I'll note that DBAs go to *great* lengths to do exactly the same kind of
thing I described the CISCO DBAs doing. Usually that means doing an ALTER
command and separately editing a creation script and trying to keep the two in
sync. It's a huge burden.

> And ultimately, the config param file format may well look like an SQL command
> file, restricted to SET only.

So all you have is our existing file except with an additional layer of
quoting to deal with, a useless SET keyword to annoy users, and a file that
you need a bison parser to deal instead of a simple keyword-value syntax that
sed can manipulate.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!


From: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-06 11:45:26
Message-ID: 48492356.4040506@pse-consulting.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gregory Stark wrote:
> So all you have is our existing file except with an additional layer of
> quoting to deal with, a useless SET keyword to annoy users, and a file that
> you need a bison parser
Don't you think that's a little over the top, throwing bison at the
simple task to extend postgresql.conf scanning so it accepts --, /**/
and SET?

> to deal instead of a simple keyword-value syntax that
> sed can manipulate.
>
>
sed seems to be THE killer application... Apart from the fact that sed
easily could identify SET, my answer would be to use psql to modify the
config, not sed.

Text config files are NOT friendly for beginner and mediocre users. IMHO
the current restriction on GUC changes is a major obstacle towards pgsql
tuning tools, e.g. written as a Google SoC project. Graphic tools aren't
too popular at pgsql-hackers, but please contemplate a little how much
pgadmin may have contributed to the pgsql usage boost, esp. on windows.

Regards,
Andreas


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Aidan Van Dyk <aidan(at)highrise(dot)ca>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, Decibel! <decibel(at)decibel(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <gsmith(at)gregsmith(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 12:24:25
Message-ID: 200806061424.26830.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Am Mittwoch, 4. Juni 2008 schrieb Aidan Van Dyk:
> > When reading this thread, I'm wondering if anybody ever saw a config
> > file for a complex software product that was easily editable and
> > understandable. I don't know one. If there was one, it'd be nice to know
> > it so we can learn from it.
>
> PostreSQL, Apache, X.org

And note that one of the major advances in X.org over XFree86 was that all the
useless garbage was removed from the configuration file, so that the final
and usable configuration fits on one screen, and you can even write it from
memory if you dare to.


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, Decibel! <decibel(at)decibel(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 12:35:00
Message-ID: 200806061435.01196.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Am Mittwoch, 4. Juni 2008 schrieb Tom Lane:
> * Can we present the config options in a more helpful way (this is 99%
> a documentation problem, not a code problem)?

ack

> * Can we build a "configuration wizard" to tell newbies what settings
> they need to tweak?

Some questions to clarify this:

- What settings do "newbies" (or anyone else) typically need to change?
Please post a list.

- What values would you set those settings to? Please provide a description
for arriving at a value, which can later be transformed into code. Note that
in some cases, not even the documentation provides more than handwaving help.

- If we know better values, why don't we set them by default?

Another orthogonal stumbling block on the way to making all of this automatic
is that the surely criticial shared_buffers setting will in any useful
configuration require messing around with kernel settings that no PostgreSQL
tool can really help with.


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 12:39:31
Message-ID: 200806061439.32335.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Am Donnerstag, 5. Juni 2008 schrieb Tom Lane:
> How far could we get with the answers to just three questions:
>
> * How many concurrent queries do you expect to have?
>
> * How much RAM space are you willing to let Postgres use?
>
> * How much "overhead" disk space are you willing to let Postgres use?

This is surely a good start. We could optimize this even more by saying, disk
space is cheap, so let's just use a much higher default setting for
checkpoint_segments. (If PostgreSQL is installed but not actually used, not
all the space is actually going to be used anyway.) Then, increase
max_connections a bit; that should be OK for most users. Then you are left
with the memory settings, and those need kernel tuning in most cases, so any
automation tool loses. Hmm.


From: Aidan Van Dyk <aidan(at)highrise(dot)ca>
To: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
Cc: "David E(dot) Wheeler" <david(at)kineticode(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-06 12:40:40
Message-ID: 20080606124040.GK14498@yugib.highrise.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Andreas Pflug <pgadmin(at)pse-consulting(dot)de> [080606 04:50]:
> David E. Wheeler wrote:
> >
> >How about a simple rule, such as that machine-generated comments start
> >with "##", while user comments start with just "#"? I think that I've
> >seen such a rule used before. At any rate, I think that, unless you
> >have some sort of line marker for machine-generated comments, there
> >will be no way to tell them apart from user comments.
>
> Two heretical questions:
> Do we need user generated comments at all?
> I can't remember ever having used any comment in postgresql.conf.

Well, I have, mainly to leave traces as to what and why I've changed
something from a default/previous value, based on "chagne, hope, and
test" style tuning.

And the one that Greg brought up earlier:
## Don't make this too high, or linux OOM will kill ther server!!!

I'm guessing that comment was put in for a reason too.

> Why do so many people here insist on editing postgresql.conf as primary
> means of changing config params?
> Isn't a psql -c "SET foo=bar; MAKE PERSISTENT" just as good as sed'ing
> postgresql.conf or doing it manually?

I would guess main for 3 reasons:

1) It's always been that way, it's the traditional "unix" way, and many
admins are comfortable being able to comment/make quick test
changes with simple text files.
2) Their postgresql.conf are distrubuted/synced/generated from central
provisioning/SCM system
3) PostgreSQL the server isn't even running

> Looking around for different approaches, network appliances come to my
> mind, e.g. Cisco routers and PIX. You have 3 ways to configure a pix:
> - use a command line (using ssh or telnet, eqivalent to psql); WRITE
> MEMORY to make the changes survive a reboot.
> - use a web interface (or similar tool)
> - use tftp to up/download the complete config in and out, editing the
> file. User comments will be lost, with the exception of those that have
> been applied with special comment commands (equivalent to "comment on").

And, of course, other server software comes to mind too:
apache, bind, postfix, sendmail, dhcpd, sshd, cron, xinetd... ;-)

a.
--
Aidan Van Dyk Create like a god,
aidan(at)highrise(dot)ca command like a king,
http://www.highrise.ca/ work like a slave.


From: Aidan Van Dyk <aidan(at)highrise(dot)ca>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, Decibel! <decibel(at)decibel(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <gsmith(at)gregsmith(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 13:16:49
Message-ID: 20080606131649.GL14498@yugib.highrise.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Peter Eisentraut <peter_e(at)gmx(dot)net> [080606 08:25]:
> Am Mittwoch, 4. Juni 2008 schrieb Aidan Van Dyk:
> > > When reading this thread, I'm wondering if anybody ever saw a config
> > > file for a complex software product that was easily editable and
> > > understandable. I don't know one. If there was one, it'd be nice to know
> > > it so we can learn from it.
> >
> > PostreSQL, Apache, X.org
>
> And note that one of the major advances in X.org over XFree86 was that all the
> useless garbage was removed from the configuration file, so that the final
> and usable configuration fits on one screen, and you can even write it from
> memory if you dare to.

Ya, and unfortunately, I haven't got my dual-head to work with recent
distro X.org either, probably because I haven't found how how/where/what
to set in Xorg.conf for new version (and my old one doesn't "just work"
on new X.org either). So I've got an old x.org version pinned, even
though I'm pretty sure that a newer X.org is probably better, dispite my
lack of ability to configure it.

Go figure ;-)

--
Aidan Van Dyk Create like a god,
aidan(at)highrise(dot)ca command like a king,
http://www.highrise.ca/ work like a slave.


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Andreas Pflug" <pgadmin(at)pse-consulting(dot)de>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 14:29:14
Message-ID: 873anqeix0.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Andreas Pflug" <pgadmin(at)pse-consulting(dot)de> writes:

> Gregory Stark wrote:
>> So all you have is our existing file except with an additional layer of
>> quoting to deal with, a useless SET keyword to annoy users, and a file that
>> you need a bison parser
>
> Don't you think that's a little over the top, throwing bison at the simple task
> to extend postgresql.conf scanning so it accepts --, /**/ and SET?

Well you're the one who proposed generalising the simple text file format to
SET commands.

The whole point of a machine-editable format would be to make it more
restricted and predictable, not less so. We had a proposal to do a
postgresql.conf.auto which was automatically generated included from a
free-form postgresql.conf. That file could generated in a simple format
without worrying about formatting, comments, ordering, etc.

> Text config files are NOT friendly for beginner and mediocre users. IMHO the
> current restriction on GUC changes is a major obstacle towards pgsql tuning
> tools, e.g. written as a Google SoC project. Graphic tools aren't too popular
> at pgsql-hackers, but please contemplate a little how much pgadmin may have
> contributed to the pgsql usage boost, esp. on windows.

Like it or not computers actually have to store state when you're done
entering it via the GUI. If you design around the GUI you end up with system
that can *only* be used via a GUI and spend years trying to work around that
(witness Windows which is only now with a lot of effort recovering from that
mistake).

People got all focused on comments and sed, but in fact there are a host of
reasons to want to be able to directly edit the state stored in a simple text
file. You might like to group together related options, for example, perhaps
grouping together the options that your site have to adjust between dev and
production -- and probably have instructions on how to make the transition in
a commented section.

I happen to think being presented with 200 options is less confusing than
being presented with an empty file and being told to go read the documentation
to find out whether I might want to put anything in it. The documentation is a
good reference but isn't good as a "here's what you'll want to check before
you start" guide.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Peter Eisentraut" <peter_e(at)gmx(dot)net>
Cc: <pgsql-hackers(at)postgresql(dot)org>, "Aidan Van Dyk" <aidan(at)highrise(dot)ca>, "Andreas Pflug" <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Greg Smith" <gsmith(at)gregsmith(dot)com>, "Josh Berkus" <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 14:58:19
Message-ID: 87ve0md304.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


"Peter Eisentraut" <peter_e(at)gmx(dot)net> writes:

> And note that one of the major advances in X.org over XFree86 was that all the
> useless garbage was removed from the configuration file, so that the final
> and usable configuration fits on one screen, and you can even write it from
> memory if you dare to.

This is a good point, though I think the big advance wasn't in actually
removing the garbage so much as making it irrelevant. That is, the advances
were in making the server actually work automatically with less manual
configuration.

That's always a noble goal. I would be interested in what parameters people
thought we could eliminate from the initial config file because the server
normally gets it right anyways and users shouldn't have to adjust it anyways.

(I'm specifically *not* referring to sections like the kerberos parameters
which even if most users can leave them alone those who do use that feature
*do* need to adjust those parameters because the server does *not* get them
right automatically)

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-06 15:12:56
Message-ID: 1618.1212765176@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andreas Pflug <pgadmin(at)pse-consulting(dot)de> writes:
> Text config files are NOT friendly for beginner and mediocre users. IMHO
> the current restriction on GUC changes is a major obstacle towards pgsql
> tuning tools, e.g. written as a Google SoC project. Graphic tools aren't
> too popular at pgsql-hackers, but please contemplate a little how much
> pgadmin may have contributed to the pgsql usage boost, esp. on windows.

I grow weary of this thread. I will say it once more: I do not believe
for one instant that the current formatting of postgresql.conf is the
major impediment, or even a noticeable impediment, to producing a useful
configuration wizard. If you wish to prove otherwise, provide a
complete wizard except for the parts that touch the config file, and
I will promise to finish it.

I will not read or respond to any further discussion of changing the
config file format. It's a waste of bandwidth.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, Decibel! <decibel(at)decibel(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 15:16:50
Message-ID: 1669.1212765410@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> - If we know better values, why don't we set them by default?

The problem is: better for what? In particular, I'm uncomfortable with
any changes in the direction of trying to make Postgres take over the
entire machine by default. I'd want some fairly explicit permission
from the user for that ...

regards, tom lane


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 15:36:47
Message-ID: 4849598F.9080800@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
>> - If we know better values, why don't we set them by default?
>
> The problem is: better for what? In particular, I'm uncomfortable with
> any changes in the direction of trying to make Postgres take over the
> entire machine by default. I'd want some fairly explicit permission
> from the user for that ...

That is where some 80% solution sample config files come in.

Joshua D. Drake

>
> regards, tom lane
>


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, Decibel! <decibel(at)decibel(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 15:50:54
Message-ID: 200806061750.55582.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Am Freitag, 6. Juni 2008 schrieb Tom Lane:
> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> > - If we know better values, why don't we set them by default?
>
> The problem is: better for what? In particular, I'm uncomfortable with
> any changes in the direction of trying to make Postgres take over the
> entire machine by default. I'd want some fairly explicit permission
> from the user for that ...

Yes, those are decisions we are going to have to make, eventually. But recall
the three step process:

1. What values need changing?

2. What to change them to?

3. Could that be the new default value?

I haven't seen a proposal for item 1 yet, so the rest is idle discussion at
this time.


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 15:55:21
Message-ID: 48495DE9.9040409@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter Eisentraut wrote:
> Am Freitag, 6. Juni 2008 schrieb Tom Lane:
>> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
>>> - If we know better values, why don't we set them by default?
>> The problem is: better for what? In particular, I'm uncomfortable with
>> any changes in the direction of trying to make Postgres take over the
>> entire machine by default. I'd want some fairly explicit permission
>> from the user for that ...
>
> Yes, those are decisions we are going to have to make, eventually. But recall
> the three step process:
>
> 1. What values need changing?

shared_buffers
work_mem
maintenance_work_mem
checkpoint_segments
wal_sync_method
effective_cache_size

> I haven't seen a proposal for item 1 yet, so the rest is idle discussion at
> this time.

I think those cover the biggest low hanging fruit, async_commit is arguable.

Sincerely,

Joshua D. Drake


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 15:59:09
Message-ID: 200806061759.10221.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Am Freitag, 6. Juni 2008 schrieb Joshua D. Drake:
> That is where some 80% solution sample config files come in.

Considering that writing a sample configuration file is trivial, yet I haven't
seen a single one posted in the six or more years of GUC, I have no faith in
this plan until I actually see it implemented.


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 16:21:57
Message-ID: 48496424.2050600@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter Eisentraut wrote:
> Am Freitag, 6. Juni 2008 schrieb Joshua D. Drake:
>> That is where some 80% solution sample config files come in.
>
> Considering that writing a sample configuration file is trivial, yet I haven't
> seen a single one posted in the six or more years of GUC, I have no faith in
> this plan until I actually see it implemented.

I fail to see why anyone would bother with the amount of noise on this
thread. If people start leaning toward an actual solution, I am sure the
work will get done.

Joshua D. Drake


From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: "Greg Smith" <gsmith(at)gregsmith(dot)com>, "Aidan Van Dyk" <aidan(at)highrise(dot)ca>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 16:22:39
Message-ID: 3D652D3F-4642-4DC6-B883-E09240F196CE@kineticode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jun 5, 2008, at 23:08, Heikki Linnakangas wrote:

> What comments do we consider machine-generated? Just the ones used
> to comment out settings, like
>
> #shared_buffers = 32MB
>
> or something else?

Those and documentation comments.

> If the automatic tool lets alone all other kind of comments, I think
> we're fine. In fact, it wouldn't necessarily need to modify those
> comments either, it could simply add a new setting line below that:
>
> #shared_buffers = 32MB
> shared_buffers = 1024MB

Well, we've been talking about having varying levels of documentation
in the comments of the file based on the options passed to the
configuration program. I think that these are the primary concern,
though Greg, please do correct me if I'm mistaken.

> For extra safety, it could comment out old settings, perhaps with
> something like this:
>
> #shared_buffers = 32MB
> #shared_buffers = 1024MB # commented out by wizard on 2008-06-05
> shared_buffers = 2048MB
>
> This would preserve a full change history in the file. It would
> become quite messy after a lo of changes, of course, but a user can
> trim the history by hand if he wants to.

I guess that could be a feature. Personally, I use a vcs system for
that.

Best,

David


From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, Aidan Van Dyk <aidan(at)highrise(dot)ca>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-06 16:25:06
Message-ID: 866D369E-782F-4EBD-B4B4-250FF6068D0D@kineticode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jun 6, 2008, at 01:50, Andreas Pflug wrote:

> Two heretical questions:
> Do we need user generated comments at all?
> I can't remember ever having used any comment in postgresql.conf.

That's a valid point. I've used comments to note by whom and when when
a setting was changed.

> Why do so many people here insist on editing postgresql.conf as
> primary means of changing config params?
> Isn't a psql -c "SET foo=bar; MAKE PERSISTENT" just as good as
> sed'ing postgresql.conf or doing it manually?

I think that there has been enough pushback against housing all the
settings in the database, not to mention that it calls for an API,
that just starting with something simpler to parse the file and
rewrite it from the command-line might be a better first step.

> Looking around for different approaches, network appliances come to
> my mind, e.g. Cisco routers and PIX. You have 3 ways to configure a
> pix:
> - use a command line (using ssh or telnet, eqivalent to psql); WRITE
> MEMORY to make the changes survive a reboot.
> - use a web interface (or similar tool)
> - use tftp to up/download the complete config in and out, editing
> the file. User comments will be lost, with the exception of those
> that have been applied with special comment commands (equivalent to
> "comment on").

I think the first option there is the one that's been getting the most
support here.

Best,

David


From: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, "David E(dot) Wheeler" <david(at)kineticode(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <gsmith(at)gregsmith(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 16:44:42
Message-ID: 200806061244.43081.xzilla@users.sourceforge.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wednesday 04 June 2008 15:48:47 Andrew Dunstan wrote:
> simply remove all the comment lines from your
> config file.

+1. That would clear up a lot of confusion on it's own.

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL


From: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: "David E(dot) Wheeler" <david(at)kineticode(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>, "Josh Berkus" <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 17:00:38
Message-ID: 200806061300.38871.xzilla@users.sourceforge.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Saturday 31 May 2008 17:34:27 David E. Wheeler wrote:
> On May 31, 2008, at 12:36, Gregory Stark wrote:
> > What this sounds like is a sly way to try to get rid of
> > postgresql.conf
> > entirely and replace it with parameters stored in the database so
> > admins would
> > adjust the parameters using an SQL syntax rather than a text file.
> >
> > There are pros and cons of such a system but I think for newbie
> > admins that
> > would be a thousand times *more* baffling. You would have to learn new
> > commands and have no holistic view of what parameters had been set,
> > what
> > related parameters might exist. You also have no way to keep the
> > file in a
> > version control system or sync across servers etc.
>
> FWIW, this has not been a barrier to MySQL adoption.
>

agreed. using sql like syntax is often a more familiar operation than editing
a text file for DBAs. Consider the number of DBA's that manage other DB's
strictly through graphical interfaces (something we are rather lacking in
ability to do comparativly). consider also that if this is built atop the
current pg_settings, all the normal sql tools can still help... ie.
pg_dump -t if you want to check something into svn.

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL


From: Aidan Van Dyk <aidan(at)highrise(dot)ca>
To: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-06 17:07:25
Message-ID: 20080606170725.GA21030@yugib.highrise.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* David E. Wheeler <david(at)kineticode(dot)com> [080606 12:22]:

> I guess that could be a feature. Personally, I use a vcs system for
> that.

Bugger... Now we only need to make postgresql check postmaster.conf
into git everytime it makes a change...

;-)

--
Aidan Van Dyk Create like a god,
aidan(at)highrise(dot)ca command like a king,
http://www.highrise.ca/ work like a slave.


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Aidan Van Dyk <aidan(at)highrise(dot)ca>
Cc: "David E(dot) Wheeler" <david(at)kineticode(dot)com>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-06 17:15:57
Message-ID: 1212772557.12052.520.camel@jd-laptop
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 2008-06-06 at 13:07 -0400, Aidan Van Dyk wrote:
> * David E. Wheeler <david(at)kineticode(dot)com> [080606 12:22]:
>
> > I guess that could be a feature. Personally, I use a vcs system for
> > that.
>
> Bugger... Now we only need to make postgresql check postmaster.conf
> into git everytime it makes a change...

Just work with the FAM api to automate it :P

Joshua D. Drake

>
> ;-)
>
>


From: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-06 17:54:06
Message-ID: 6afb6c41b47e5d39dc12091e14639f2f@biglumber.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

> Bugger... Now we only need to make postgresql check postmaster.conf
> into git everytime it makes a change...

Been there, wrote that. (for postgresql.conf anyway).

- --
Greg Sabino Mullane greg(at)turnstep(dot)com
PGP Key: 0x14964AC8 200806061351
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAkhJeXQACgkQvJuQZxSWSshS3QCfQMC092pMGm/FiRJkYmu9j68e
VDYAoM051BtIF8Hsemsrs//jSv3dkGel
=otzg
-----END PGP SIGNATURE-----


From: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-06 18:33:13
Message-ID: 484982E9.5@pse-consulting.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gregory Stark wrote:
>
>> Text config files are NOT friendly for beginner and mediocre users. IMHO the
>> current restriction on GUC changes is a major obstacle towards pgsql tuning
>> tools, e.g. written as a Google SoC project. Graphic tools aren't too popular
>> at pgsql-hackers, but please contemplate a little how much pgadmin may have
>> contributed to the pgsql usage boost, esp. on windows.
>>
>
> Like it or not computers actually have to store state when you're done
> entering it via the GUI. If you design around the GUI you end up with system
> that can *only* be used via a GUI and spend years trying to work around that
> (witness Windows which is only now with a lot of effort recovering from that
> mistake).
>

I never advocated a file format that isn't editable any more; au
contraire. And the statement that a GUI configuration contradicts manual
editing is plain wrong, even with most windows software (if written
nicely to the registry, you could dump the key, edit and reload it, or
use regedit).
OTOH, you can't make most windows users happy with a text file version only.

Regards,
Andreas


From: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-06 18:46:07
Message-ID: 484985EF.8080403@pse-consulting.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> I grow weary of this thread. I will say it once more: I do not believe
> for one instant that the current formatting of postgresql.conf is the
> major impediment, or even a noticeable impediment, to producing a useful
> configuration wizard. If you wish to prove otherwise, provide a
> complete wizard except for the parts that touch the config file, and
> I will promise to finish it.
>
> I will not read or respond to any further discussion of changing the
> config file format. It's a waste of bandwidth.
>

Your statement doesn't really surprise me. Apparently you kind of
celebrate misunderstanding my point, which isn't primarily about the
file format, but about config param accessibility via api/SQL.

I personally wouldn't even think about starting such a wizard, unless I
have an idea how to push the result into the database. No, not a file,
but via SQL! So your statement you won't react unless a wizard is almost
ready is prohibitive, apart from the fact that not only wizards
(featuring AI) are interesting, but simple config tools as well.

Regards,
Andrads


From: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <gsmith(at)gregsmith(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 19:06:11
Message-ID: 200806061506.12093.xzilla@users.sourceforge.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Monday 02 June 2008 10:12:06 Tom Lane wrote:
> I have no objection to providing alternative ways to edit the
> configuration data, but the primary source of the settings is
> going to continue to be an editable text file. Any proposals for
> alternatives-to-a-text-editor have to work within that reality.
>

I think there is some disagreement that using text files needs to be
the "primary" way to edit configurations. Some people are of the opinion
that we should focus on text files as a secondary method, generally to be
used only when a more sql/gui oriented way wont work (ie. your shared buffer
restart scenario). I think most of the user base would like to approach
administration from that point-of-view, and as of yet I haven't seen a
technical reason why that world view is wrong, only philosphical ones.

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 19:22:58
Message-ID: Pine.GSO.4.64.0806061500450.7804@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 6 Jun 2008, Peter Eisentraut wrote:

> - What settings do "newbies" (or anyone else) typically need to change?
> Please post a list.
> - What values would you set those settings to? Please provide a description
> for arriving at a value, which can later be transformed into code. Note that
> in some cases, not even the documentation provides more than handwaving help.

Josh's spreadsheet at
http://pgfoundry.org/docman/view.php/1000106/84/calcfactors.sxc provides
five different models for setting the most critical parameters based on
different types of workloads. Everyone can quibble over the fine tuning,
but having a good starter set of reasonable settings for these parameters
is a solved problem. It's just painful to build a tool to apply the
available expert knowledge that is already around.

> - If we know better values, why don't we set them by default?

Because there's not enough information available; the large differences
between how you tune for different workloads is one example. Another is
that people tune for peak and projected activity rather than just what's
happening right now. Every model suggested for a tuning wizard recognizes
you need to ask some set of questions to nail things down. I continue to
repeat in broken-record style, exactly what a tuning tool will ask about
and what settings it will suggest is not important, and getting into that
is an entirely different discussion (one that gets hashed out every single
day on pgsql-performance). The fact that writing such a tool is harder
than it should be is the issue here.

> Another orthogonal stumbling block on the way to making all of this
> automatic is that the surely criticial shared_buffers setting will in
> any useful configuration require messing around with kernel settings
> that no PostgreSQL tool can really help with.

Yes. So? All you can do is point this out to users.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, Aidan Van Dyk <aidan(at)highrise(dot)ca>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 19:41:36
Message-ID: 200806061541.36984.xzilla@users.sourceforge.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wednesday 04 June 2008 22:04:54 Greg Smith wrote:
> On Wed, 4 Jun 2008, Aidan Van Dyk wrote:
> > * Are we always spilling small amounts of data to disk for sorting? A
> > a small work_mem increase might help...
>
> I was just talking to someone today about building a monitoring tool for
> this. Not having a clear way to recommend people monitor use of work_mem
> and its brother spilled to disk sorts is an issue right now, I'll whack
> that one myself if someone doesn't beat me to it before I get time.
>

I remember *years* ago, someone wrote a perl script to poll pgsql_tmp and
print out anytime something showed up... you could probably find that in the
archives if you look around.

of course to me this sounds like an excellent idea for a dtrace probe ;-)

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL


From: Steve Atkins <steve(at)blighty(dot)com>
To: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 19:49:56
Message-ID: 3F30213C-4366-4938-98F6-E8D0792EAB31@blighty.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Jun 6, 2008, at 12:22 PM, Greg Smith wrote:

> On Fri, 6 Jun 2008, Peter Eisentraut wrote:
>
>> - What settings do "newbies" (or anyone else) typically need to
>> change?
>> Please post a list.
>> - What values would you set those settings to? Please provide a
>> description
>> for arriving at a value, which can later be transformed into code.
>> Note that
>> in some cases, not even the documentation provides more than
>> handwaving help.
>
> Josh's spreadsheet at http://pgfoundry.org/docman/view.php/1000106/84/calcfactors.sxc
> provides five different models for setting the most critical
> parameters based on different types of workloads. Everyone can
> quibble over the fine tuning, but having a good starter set of
> reasonable settings for these parameters is a solved problem.

> It's just painful to build a tool to apply the available expert
> knowledge that is already around.

I hope to have something in a week or so that's a first cut at that
tool. I'm aiming at a GUI tool at first, as I think that's the main
need, though once the basics are done an interactive or non-
interactive CLI version shouldn't be a big deal.

Cheers,
Steve


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Andreas Pflug" <pgadmin(at)pse-consulting(dot)de>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 19:57:14
Message-ID: 87prqucp5x.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Andreas Pflug" <pgadmin(at)pse-consulting(dot)de> writes:

> I personally wouldn't even think about starting such a wizard, unless I have an
> idea how to push the result into the database. No, not a file, but via SQL! So
> your statement you won't react unless a wizard is almost ready is prohibitive,
> apart from the fact that not only wizards (featuring AI) are interesting, but
> simple config tools as well.

Well there's a perfectly good place to start today. Dump out a config file
with the settings your wizard or config tool wants to change from their
current values.

We could come up with a canonical filename for such a file and include an
include directive for it in the standard configuration file or you could just
include instructions saying what include line to add to your config file in
the installation instructions for your tool.

Great, I'm glad we've resolved that issue.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's On-Demand Production Tuning


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-06 19:59:05
Message-ID: Pine.GSO.4.64.0806061543200.7804@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 6 Jun 2008, Heikki Linnakangas wrote:

> Or perhaps we should explicitly mark the settings the tool has generated, and
> comment out:
>
> #shared_buffers = 32MB # commented out by wizard on 2008-06-05
> shared_buffers = 1024MB # automatically set by wizard on 2008-06-05

What I would like to do is make the tool spit out a revision history in
the same way I find all "big IT" shops handling this already: by putting
a revision history style commentary above the current setting. Here's a
sample:

# 2008-03-02 : 32MB : postgres : Database default
# 2008-05-02 : 512MB : pg_autotune : Wizard update
# 2008-05-15 : 1024MB : gsmith : Increased after benchmark tests
shared_buffers = 1024MB

If the first tuning tool that comes into existance used this format, and
the format was reasonable, I think it would be possible to get people
making manual edits to adopt it as well.

The exact details of how this should look are off-topic for the main
discussion here, though, so I'd prefer if this whole line of discussion
died off. Anyone who wants to comment on this whole area, feel free to
contact me off-list or edit the Wiki page (which has a section on this
topic now) to hash out suggestions in this area, I'm trying to keep this
somewhat thread focused now.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 20:02:31
Message-ID: 200806061602.31641.xzilla@users.sourceforge.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thursday 05 June 2008 15:15:14 Greg Smith wrote:
> (1) is in that proposal but is strictly optional as something to put in
> the configuration file itself. The idea behind (2) is to enable tool
> authors to have an easier way to suggest where to head for more
> information. I'd like for it to be trivial for a tool to say "Suggested
> value for <x> is <y>; see
> http://www.postgresql.org/docs/8.3/interactive/runtime-config-resource.html
> for more information". I know what most of the settings I tinker with do,
> but even I'd like it to be easier to find the right spot in the manual;
> for newbies it's vital. You are correct that (2) isn't strictly necessary
> here, but it's valuable and will be easier to wrap into this than to bolt
> on later.
>

One idea I have been kicking around is having every guc have a anchor in the
website (rahter than anchoring on parameter family). It might be enough to
just populate the search bot with every guc anchored to family though...

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL


From: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, Decibel! <decibel(at)decibel(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 20:24:28
Message-ID: 200806061624.29105.xzilla@users.sourceforge.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Friday 06 June 2008 08:35:00 Peter Eisentraut wrote:
> Am Mittwoch, 4. Juni 2008 schrieb Tom Lane:
> > * Can we present the config options in a more helpful way (this is 99%
> > a documentation problem, not a code problem)?
>
> ack
>
> > * Can we build a "configuration wizard" to tell newbies what settings
> > they need to tweak?
>
> Some questions to clarify this:
>
> - What settings do "newbies" (or anyone else) typically need to change?
> Please post a list.
>

i do have a defined set (sort of a "top ten you need to know") that i tell
people about in my postgres 101 tutorials... off the top of my head it looks
like this:

shared_buffers
effective_cache_size
default_stats_target
work_mem
maintainance_work_mem
listen_address
max_connections
the fsm parameters
checkpoint_segements
random_page_cost

i think others close to the list are constraint_exclusion (unless it defaults
to on now), max_prepared_transactions (set it to 0 unless you use it... i may
stop recommending this after noticing its effects on max # system locks
allowed), and then the logging parameters (which is to say, you need to set
up logging that works, however you want to do it).

There are a couple more that might go on the list, like synchronous_commit,
but its on the fence for now.

> - What values would you set those settings to? Please provide a
> description for arriving at a value, which can later be transformed into
> code. Note that in some cases, not even the documentation provides more
> than handwaving help.
>

I can provide this if you want, will need to look over my notes. one issue
we've faced in the past with this is something like shared_buffers, where the
settings is based on 1) dedicated server?, 2) available ram, 3) amount of
red/write traffic, 4) disk subsystem. Those types of input are hard to
quantify in code.

> - If we know better values, why don't we set them by default?
>

There is a saying, something like "The accumulation of annecdotes is not
data". Well, we seem to have a high bar on what proof we need to actually
change a default GUC settings. default_statistics_target is a prime example,
where almost no one i know has ever recommended 10 as a default, or suggests
setting it to 10 as an way to improve performance, but the effort to get it
changed to something more reasonable has been monumental.

> Another orthogonal stumbling block on the way to making all of this
> automatic is that the surely criticial shared_buffers setting will in any
> useful configuration require messing around with kernel settings that no
> PostgreSQL tool can really help with.

yep. seems it might be possible to just compare the shared_buffer setting
with the kernel parameters before making the change though... not sure in
which way you would slant the output though.

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-06 20:30:33
Message-ID: Pine.GSO.4.64.0806061524140.7804@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 6 Jun 2008, Tom Lane wrote:

> I grow weary of this thread.

If we keep it up for, oh, another three years, then maybe you'll be as
weary as I am of struggling with problems in this area. Strinking a
balance between the wants and needs of people who want a fancy GUI tool
for configuring database settings with those who want to edit things
manually is a difficult problem that is not going away. If this didn't
keep coming back to haunt me all the time I'd like to forget about it
myself.

> I will say it once more: I do not believe for one instant that the
> current formatting of postgresql.conf is the major impediment, or even a
> noticeable impediment, to producing a useful configuration wizard.

Arguments about formatting change to postgresql.conf are a tangent to the
central questions here, and having just closed some open comments on that
I am with you on ignoring those as off-topic the same way I keep
minimizing "what are the parameters to tune?" comments.

Here are the relevant questions around since the first message that are
not attracting discussion:

1) Is it worthwhile to expand the information stored in the GUC structure
to make it better capable of supporting machine generation and to provide
more information for tool authors via pg_settings? The exact fields that
should or shouldn't be included remains controversial; consider "default
value", "per-session/runtime/restart", and "enum lists" as the list of
things that are most needed there.

2) Should the sample postgresql.conf file be replaced by a program that
generates it using that beefed up structure instead, therefore removing
one file that has to be manually kept in sync with the rest of the code
base right now?

3) What now makes sense for a way to update database parameters for users
whose primary (or only in some cases) access to the server is over the
database port, given the other changes have improved automatic config file
generation?

> If you wish to prove otherwise, provide a complete wizard except for the
> parts that touch the config file, and I will promise to finish it.

You do realize that if I provided you with such a sample, the not
implemented yet "config API" stubs it needs to work would be exactly what
are suggested to add in the proposal page, right? I (and Josh) didn't
just make them all up out of nowhere you know. I wrote a message here
already about what the seemingly inevitable path the budding "wizard tool
hacker" follows and why that leads into some of the changes suggested.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Greg Smith <gsmith(at)gregsmith(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 20:53:55
Message-ID: 20080606205355.GE16502@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Treat wrote:

> One idea I have been kicking around is having every guc have a anchor in the
> website (rahter than anchoring on parameter family). It might be enough to
> just populate the search bot with every guc anchored to family though...

+1 on the anchor per variable.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 21:06:59
Message-ID: 4849A6F3.8010800@cheapcomplexdevices.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Joshua D. Drake wrote:
> Tom Lane wrote:
>> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
>>> - If we know better values, why don't we set them by default?
>>
>> The problem is: better for what?
>
> That is where some 80% solution sample config files come in.
+1.

At work I use 3 templates.
* One for salespeople's demo laptops.
* One for a developer's desktop.
* One for our bigger production servers.

The old old default postgresql.conf used to be nice for
the first group. The newer set of defaults is nicer
for the second group. Emailing the lists here's the
current best way of tuning for that last case.

I wonder if the fastest way to generate the configurator
would be to simply ask everyone to post their tuned
postgresql.conf files along with a brief description of
the use case for that file. The we could group the
use-cases into various classes; and average the values
of the submitted files. Then the configurator's one
question "choose which use case most closely matches
yours from this list".


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Greg Smith" <gsmith(at)gregsmith(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 21:10:53
Message-ID: 87d4muclr6.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Greg Smith" <gsmith(at)gregsmith(dot)com> writes:

> 1) Is it worthwhile to expand the information stored in the GUC structure to
> make it better capable of supporting machine generation and to provide more
> information for tool authors via pg_settings? The exact fields that should or
> shouldn't be included remains controversial; consider "default value",
> "per-session/runtime/restart", and "enum lists" as the list of things that are
> most needed there.

Isn't that a list of what's *already* there?

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!


From: Andreas 'ads' Scherbaum <adsmail(at)wars-nicht(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-06 21:23:54
Message-ID: 20080606232354.69d70b64@iridium.wars-nicht.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 05 Jun 2008 12:53:55 -0700 Ron Mayer wrote:

> Steve Atkins wrote:
> > ... cross-platform (Windows, Linux, Solaris, OS X as a bare
> > minimum)
>
> I wonder how cross-platform the tuning algorithm itself is.
>
> I could also imagine that decisions like "do I let the OS page
> cache, or postgres's buffer cache get most of the memory" are
> extremely OS dependent.

But you can hide most of the internal stuff from the user, either by
generating the config file for the platform the tool is running on or
by an option like "generate config for platform xyz".

If you have cross-platform already in place, this should not be much
overhead.

Kind regards

--
Andreas 'ads' Scherbaum
German PostgreSQL User Group


From: Robert Lor <Robert(dot)Lor(at)Sun(dot)COM>
To: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Greg Smith <gsmith(at)gregsmith(dot)com>, Aidan Van Dyk <aidan(at)highrise(dot)ca>
Subject: Re: Overhauling GUCS
Date: 2008-06-06 21:29:12
Message-ID: 4849AC28.4060801@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Treat wrote:
> On Wednesday 04 June 2008 22:04:54 Greg Smith wrote:
>
>> I was just talking to someone today about building a monitoring tool for
>> this. Not having a clear way to recommend people monitor use of work_mem
>> and its brother spilled to disk sorts is an issue right now, I'll whack
>> that one myself if someone doesn't beat me to it before I get time.
>>
>
> I remember *years* ago, someone wrote a perl script to poll pgsql_tmp and
> print out anytime something showed up... you could probably find that in the
> archives if you look around.
>
> of course to me this sounds like an excellent idea for a dtrace probe ;-)
>
>

Actually, you can find out from the sort-end probe now whether or not
the sort spilled to disk and number of disk blocks used. This is one of
the probes from Simon.

TRACE_POSTGRESQL_SORT_END(state->tapeset,
(state->tapeset ?
LogicalTapeSetBlocks(state->tapeset) :
(state->allowedMem - state->availMem + 1023) /
1024));

-Robert


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-06 21:38:38
Message-ID: Pine.GSO.4.64.0806061730410.27291@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 6 Jun 2008, Gregory Stark wrote:

> "Greg Smith" <gsmith(at)gregsmith(dot)com> writes:
>
>> 1) Is it worthwhile to expand the information stored in the GUC structure to
>> make it better capable of supporting machine generation and to provide more
>> information for tool authors via pg_settings? The exact fields that should or
>> shouldn't be included remains controversial; consider "default value",
>> "per-session/runtime/restart", and "enum lists" as the list of things that are
>> most needed there.
>
> Isn't that a list of what's *already* there?

I should have been clearer there. Some of the items suggested are already
in the structure, but aren't visible via pg_settings. In those cases it's
just exporting information that's already there. In others (like the
suggestion to add a URL to the documentation) it is actually a new field
being added as well as its corresponding entry in the settings view.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Chris Browne <cbbrowne(at)acm(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-06 21:48:57
Message-ID: 60bq2emdyu.fsf@dba2.int.libertyrms.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

gsmith(at)gregsmith(dot)com (Greg Smith) writes:
> On Fri, 6 Jun 2008, Heikki Linnakangas wrote:
>
>> Or perhaps we should explicitly mark the settings the tool has
>> generated, and comment out:
>>
>> #shared_buffers = 32MB # commented out by wizard on 2008-06-05
>> shared_buffers = 1024MB # automatically set by wizard on 2008-06-05
>
> What I would like to do is make the tool spit out a revision history
> in the same way I find all "big IT" shops handling this already: by
> putting a revision history style commentary above the current setting.
> Here's a sample:
>
> # 2008-03-02 : 32MB : postgres : Database default
> # 2008-05-02 : 512MB : pg_autotune : Wizard update
> # 2008-05-15 : 1024MB : gsmith : Increased after benchmark tests
> shared_buffers = 1024MB
>
> If the first tuning tool that comes into existance used this format,
> and the format was reasonable, I think it would be possible to get
> people making manual edits to adopt it as well.

My first reaction to this is...

"And why is it that you seem to imagine it a good idea to recreate
RCS, badly?"

While that may be an unkind reaction, I don't think it's a
particularly wrong one!

Getting into "comment formatting" certainly *does* seem like an
attempt to do, poorly, what RCS does. And I am pointing at a tool
from 1985 as opposed to "modern" ones to point out that there were
better approaches to managing versioning of simple text documents
*THIRTY YEARS AGO*.

Indeed, CVS, which we are still using doesn't fundamentally do
anything, in this context, that RCS didn't.

I kind of wish that there was some sort of library for "doing RCS";
even in the absence of that, it would be perfectly logical for a tool
that manages this to DEMAND that there be a "diff" utility or some
equivalent available.

In the absence of that, let me suggest that it would be preferable to
follow VMS's lead and, if you "automagically" make any changes, create
a version called "postgresql.conf;[n]" where "[n]" is either:
a) 1, if "postgresql.conf;1" does not exist, or
b) 1 + the largest integer found.

Replace ";" with some other character as need be.

Add a line (or a few lines) at the top of the current version that
indicates some bit of commented-out-metadata that is obviously
formatted and easily recognizable such as:

### [pgauto] postgresql.conf-autotuner - ran 2008-07-02
### [pgauto] little bit of safely-turfable header
### [pgauto] further little bit of safely-turfable header

In effect, what it does is thus:

- Keep the old copy as postgresql.conf;24 (because we have some 24 elder versions)
- Strips off any leading lines that begin with "### [pgauto]"
- Puts any interesting comments in a set of leading lines that begin with "### [pgauto]"

Merits, over some wacky rewrite-comment-formatting:
- This requires only the thinnest bit of "special handling" of comments.
- It doesn't force in a dependancy on [pick-someone's-pet-SCM that you happen to hate]
- Even in the absence of an SCM, it keeps versioning information around
- If you have a "diff," it'll work perfectly well, and be reasonably informative
- It doesn't throw ANY old data away!

Sorry, that's more than enough rant for today! :-)
--
"cbbrowne","@","linuxdatabases.info"
http://cbbrowne.com/info/rdbms.html
Rules of the Evil Overlord #220. "Whatever my one vulnerability is, I
will fake a different one. For example, ordering all mirrors removed
from the palace, screaming and flinching whenever someone accidentally
holds up a mirror, etc. In the climax when the hero whips out a mirror
and thrusts it at my face, my reaction will be ``Hmm...I think I need
a shave.''" <http://www.eviloverlord.com/>


From: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-06 23:30:47
Message-ID: 4849C8A7.5030803@pse-consulting.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gregory Stark wrote:
> "Andreas Pflug" <pgadmin(at)pse-consulting(dot)de> writes:
>
>> I personally wouldn't even think about starting such a wizard, unless I have an
>> idea how to push the result into the database. No, not a file, but via SQL! So
>> your statement you won't react unless a wizard is almost ready is prohibitive,
>> apart from the fact that not only wizards (featuring AI) are interesting, but
>> simple config tools as well.
>
> Well there's a perfectly good place to start today. Dump out a config file

I think I made my point very clear when stating "not a file, but via
SQL". Though I'm not a native English speaker, and I'm sure you
understood. I must assume you're polluting this thread deliberately in
order to sabotage the original intention of this thread. I find this
disgusting.

Ok, trying to contribute gui tools for pgsql is for masochists. We have
vi, sed and grep, no more tools required, right?

Regards,
Andreas


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-06 23:45:31
Message-ID: 1212795931.14299.3.camel@jd-laptop
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, 2008-06-07 at 01:30 +0200, Andreas Pflug wrote:
> Gregory Stark wrote:
> > "Andreas Pflug" <pgadmin(at)pse-consulting(dot)de> writes:

> I think I made my point very clear when stating "not a file, but via
> SQL". Though I'm not a native English speaker, and I'm sure you
> understood. I must assume you're polluting this thread deliberately in
> order to sabotage the original intention of this thread. I find this
> disgusting.
>
> Ok, trying to contribute gui tools for pgsql is for masochists. We have
> vi, sed and grep, no more tools required, right?

Hold on guys, let's not get all out of whack. No one in their right mind
is going to suggest that vi, sed and grep are the only tools we need to
do this.

However a GUI wizard isn't really a way to go either. SQL? Yes it would
be very nice to be able to:

BEGIN;
SET shared_buffers to 65536 COMMENT IS 'Since I have 4G of ram I want
512 megs of shared buffers';
SET effective_cache_size to 262144 COMMENT IS 'Since I have 4G of ram I
want 1 gig of effective_cache';
COMMIT;

When that is done, it would write out the postgresql.conf.

Regardless of all of this, this thread is way out of hand. We went from
overhauling the configuration file to wizards, sample postgresql.conf
files and possibly even half of us migrating to MySQL :P.

Can we refine the topic a bit?

Joshua D. Drake


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, Decibel! <decibel(at)decibel(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-07 00:19:22
Message-ID: 8358.1212797962@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Treat <xzilla(at)users(dot)sourceforge(dot)net> writes:
> There is a saying, something like "The accumulation of annecdotes is not
> data". Well, we seem to have a high bar on what proof we need to actually
> change a default GUC settings. default_statistics_target is a prime example,
> where almost no one i know has ever recommended 10 as a default, or suggests
> setting it to 10 as an way to improve performance, but the effort to get it
> changed to something more reasonable has been monumental.

Actually, the reason it's still 10 is that the effort expended to get it
changed has been *ZERO*. I keep asking for someone to make some
measurements, do some benchmarking, anything to make a plausible case
for a specific higher value as being a reasonable place to set it.

The silence has been deafening.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-07 00:30:41
Message-ID: 8459.1212798641@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith <gsmith(at)gregsmith(dot)com> writes:
> On Fri, 6 Jun 2008, Gregory Stark wrote:
>> "Greg Smith" <gsmith(at)gregsmith(dot)com> writes:
>>> 1) Is it worthwhile to expand the information stored in the GUC structure to
>>> make it better capable of supporting machine generation and to provide more
>>> information for tool authors via pg_settings? The exact fields that should or
>>> shouldn't be included remains controversial; consider "default value",
>>> "per-session/runtime/restart", and "enum lists" as the list of things that are
>>> most needed there.
>>
>> Isn't that a list of what's *already* there?

> I should have been clearer there. Some of the items suggested are already
> in the structure, but aren't visible via pg_settings.

Well, you can't see the default or reset values in pg_settings, only the
current value. However, I fail to see the use of either of those for
a configure wizard. It'll presumably be attached to a fresh connection
so the reset value is not different from the current; and any decent
wizard is going to know perfectly well what the defaults are.

> ... In others (like the
> suggestion to add a URL to the documentation) it is actually a new field
> being added as well as its corresponding entry in the settings view.

Offhand I would argue that we should choose the URLs in such a way that
they can be derived automatically if you know the PG version number and
GUC variable name. (We're almost there already, except that you have
to know which section of the chapter it's in.) So the need to have a
pg_settings column seems questionable --- seems like it would mostly
be clutter. Perhaps instead of a view column, it would be useful to
encapsulate the algorithm as a function? pg_variable_help_url(name)

regards, tom lane


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-07 01:02:25
Message-ID: Pine.GSO.4.64.0806062044110.1985@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 6 Jun 2008, Tom Lane wrote:

> Well, you can't see the default or reset values in pg_settings, only the
> current value. However, I fail to see the use of either of those for
> a configure wizard.

I'm under the impression that the primary reason to put the default in
there is to make it easier for a file generator program to be decoupled a
bit from the internal representation. Regardless, these values should be
exposed for tool writers. If you build a prototype interface for an
interactive settings changing tool, you quickly discover that showing the
default, range, and recommended setting are all valuable things people
would like to see when deciding what the change a setting to. And there's
no reason accumulating all that info should be the responsibility of a
tool writer when it's easy to expose and keep up to date inside the
database itself.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, Decibel! <decibel(at)decibel(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-07 01:08:11
Message-ID: 1212800891.14299.35.camel@jd-laptop
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 2008-06-06 at 20:19 -0400, Tom Lane wrote:
> Robert Treat <xzilla(at)users(dot)sourceforge(dot)net> writes:

> Actually, the reason it's still 10 is that the effort expended to get it
> changed has been *ZERO*. I keep asking for someone to make some
> measurements, do some benchmarking, anything to make a plausible case
> for a specific higher value as being a reasonable place to set it.

> The silence has been deafening.

Not surprising really. It is a simple adjustment to make and it also is
easy to spot when its a problem. However it is not trivial to test for
(in terms of time and effort). I know 10 is wrong and so do you. If you
don't I am curious why I see so many posts from you saying, "Your
estimates are off, what is your default_statistics_target?" with yet
even more responses saying, "Uhh 10."

Sincerely,

Joshua D. Drake


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, Decibel! <decibel(at)decibel(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-07 02:00:41
Message-ID: 9606.1212804041@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Joshua D. Drake" <jd(at)commandprompt(dot)com> writes:
> Not surprising really. It is a simple adjustment to make and it also is
> easy to spot when its a problem. However it is not trivial to test for
> (in terms of time and effort). I know 10 is wrong and so do you.

Sure. But what is right? I'm afraid to just push it to (say) 100
because of the possibility of O(N^2) behavior in eqjoinsel. Somebody
needs to do some measurements on somewhat realistic scenarios.

regards, tom lane


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net>, <pgsql-hackers(at)postgresql(dot)org>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>, "Greg Smith" <gsmith(at)gregsmith(dot)com>, "Andreas Pflug" <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>, "Josh Berkus" <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-08 23:07:21
Message-ID: 87hcc3le52.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Joshua D. Drake" <jd(at)commandprompt(dot)com> writes:

> On Fri, 2008-06-06 at 20:19 -0400, Tom Lane wrote:
>> Robert Treat <xzilla(at)users(dot)sourceforge(dot)net> writes:
>
>> Actually, the reason it's still 10 is that the effort expended to get it
>> changed has been *ZERO*. I keep asking for someone to make some
>> measurements, do some benchmarking, anything to make a plausible case
>> for a specific higher value as being a reasonable place to set it.
>
>> The silence has been deafening.
>
> Not surprising really. It is a simple adjustment to make and it also is
> easy to spot when its a problem. However it is not trivial to test for
> (in terms of time and effort). I know 10 is wrong and so do you. If you
> don't I am curious why I see so many posts from you saying, "Your
> estimates are off, what is your default_statistics_target?" with yet
> even more responses saying, "Uhh 10."

Ah, but we only ever hear about the cases where it's wrong of course. In other
words even if we raised it to some optimal value we would still have precisely
the same experience of seeing only posts on list about it being insufficient.

What's needed is some speed benchmarks for complex queries with varying size
statistics so we can see how badly large statistic tables hurt planning time.

The flip side of seeing how much larger tables help planning accuracy is much
harder to measure. Offhand I don't see any systematic way to go about it.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!


From: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>, "Greg Smith" <gsmith(at)gregsmith(dot)com>, "Andreas Pflug" <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>, "Josh Berkus" <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-08 23:54:45
Message-ID: 200806081954.46114.xzilla@users.sourceforge.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sunday 08 June 2008 19:07:21 Gregory Stark wrote:
> "Joshua D. Drake" <jd(at)commandprompt(dot)com> writes:
> > On Fri, 2008-06-06 at 20:19 -0400, Tom Lane wrote:
> >> Robert Treat <xzilla(at)users(dot)sourceforge(dot)net> writes:
> >>
> >> Actually, the reason it's still 10 is that the effort expended to get it
> >> changed has been *ZERO*. I keep asking for someone to make some
> >> measurements, do some benchmarking, anything to make a plausible case
> >> for a specific higher value as being a reasonable place to set it.
> >>
> >> The silence has been deafening.
> >
> > Not surprising really. It is a simple adjustment to make and it also is
> > easy to spot when its a problem. However it is not trivial to test for
> > (in terms of time and effort). I know 10 is wrong and so do you. If you
> > don't I am curious why I see so many posts from you saying, "Your
> > estimates are off, what is your default_statistics_target?" with yet
> > even more responses saying, "Uhh 10."
>
> Ah, but we only ever hear about the cases where it's wrong of course. In
> other words even if we raised it to some optimal value we would still have
> precisely the same experience of seeing only posts on list about it being
> insufficient.
>

The slipside to this is that we're not trying to find the perfect setting,
we're just trying to determine a number that will cause more benefit than
harm compared to the number we have now. While I am sure there are cases
where 100 is too low as well, I cannot recall ever having seen someone
suggest lowering the default_stats_target to something less than 100. (I
know sit back and wait for someone to comb the archives, just to find that 1
time).

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, Decibel! <decibel(at)decibel(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-09 17:45:00
Message-ID: 484D6C1C.5070004@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom,

>>> Actually, the reason it's still 10 is that the effort expended to get it
>>> changed has been *ZERO*. I keep asking for someone to make some
>>> measurements, do some benchmarking, anything to make a plausible case
>>> for a specific higher value as being a reasonable place to set it.
>>> The silence has been deafening.
>> Not surprising really. It is a simple adjustment to make and it also is
>> easy to spot when its a problem. However it is not trivial to test for
>> (in terms of time and effort). I know 10 is wrong and so do you. If you
>> don't I am curious why I see so many posts from you saying, "Your
>> estimates are off, what is your default_statistics_target?" with yet
>> even more responses saying, "Uhh 10."

I tried (back in 7.4) to do some systematic testing of this. The
problem is that the cases were higher d_s_t are required are
specifically ones with complex, unbalanced data distributions and/or
very large databases. This makes test cases extremely difficult and
time-consuming to generate; further, I found that the test cases I had
from my clients' databases were not portable (in addition to being
confidential).

Also, I'd actually assert that "10" seems to be perfectly adequate for
the majority of users. That is, the number of users where I've
recommended increasing d_s_t for the whole database is smaller than the
number where I don't, and of course we never hear from most users at
all. So I'm pretty happy recommending "Leave the default. If you
encounter problem queries, increase it to 100, and analyse the database.
If you're running a data warehouse, increase it to 1000."

Where analyze does systematically fall down is with databases over 500GB
in size, but that's not a function of d_s_t but rather of our tiny
sample size.

--Josh


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Josh Berkus" <josh(at)agliodbs(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net>, <pgsql-hackers(at)postgresql(dot)org>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>, "Greg Smith" <gsmith(at)gregsmith(dot)com>, "Andreas Pflug" <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-09 20:35:27
Message-ID: 87d4mqwdm8.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Josh Berkus" <josh(at)agliodbs(dot)com> writes:

> Where analyze does systematically fall down is with databases over 500GB in
> size, but that's not a function of d_s_t but rather of our tiny sample size.

Speak to the statisticians. Our sample size is calculated using the same
theory behind polls which sample 600 people to learn what 250 million people
are going to do on election day. You do NOT need (significantly) larger
samples for larger populations.

In fact where those polls have difficulty is the same place we have some
problems. For *smaller* populations like individual congressional races you
need to have nearly the same 600 sample for each of those small races. That
adds up to a lot more than 600 total. In our case it means when queries cover
a range much less than a whole bucket then the confidence interval increases
too.

Also, our estimates for n_distinct are very unreliable. The math behind
sampling for statistics just doesn't work the same way for properties like
n_distinct. For that Josh is right, we *would* need a sample size proportional
to the whole data set which would practically require us to scan the whole
table (and have a technique for summarizing the results in a nearly constant
sized data structure).

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!


From: "Hakan Kocaman" <hkocam(at)googlemail(dot)com>
To: "Gregory Stark" <stark(at)enterprisedb(dot)com>
Cc: "Josh Berkus" <josh(at)agliodbs(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, "Peter Eisentraut" <peter_e(at)gmx(dot)net>, "Greg Smith" <gsmith(at)gregsmith(dot)com>, "Andreas Pflug" <pgadmin(at)pse-consulting(dot)de>, Decibel! <decibel(at)decibel(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-09 21:17:08
Message-ID: 48ca23600806091417u546f3b6dm6314bc55aaa147f6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 6/9/08, Gregory Stark <stark(at)enterprisedb(dot)com> wrote:
>
> "Josh Berkus" <josh(at)agliodbs(dot)com> writes:
>
> > Where analyze does systematically fall down is with databases over 500GB
> in
> > size, but that's not a function of d_s_t but rather of our tiny sample
> size.
>
>
> n_distinct. For that Josh is right, we *would* need a sample size
> proportional
> to the whole data set which would practically require us to scan the whole
> table (and have a technique for summarizing the results in a nearly
> constant
> sized data structure).
>
>
Hi,
is this (summarizing results in a constant sized data structure) something
which could be achived by Bloom-Filters ?
http://archives.postgresql.org/pgsql-general/2008-06/msg00076.php

Kind regards
Hakan Kocaman


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Hakan Kocaman" <hkocam(at)googlemail(dot)com>
Cc: "Josh Berkus" <josh(at)agliodbs(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net>, <pgsql-hackers(at)postgresql(dot)org>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>, "Greg Smith" <gsmith(at)gregsmith(dot)com>, "Andreas Pflug" <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-09 23:01:45
Message-ID: 87ve0ius9y.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


"Hakan Kocaman" <hkocam(at)googlemail(dot)com> writes:

> On 6/9/08, Gregory Stark <stark(at)enterprisedb(dot)com> wrote:
>>
>> n_distinct. For that Josh is right, we *would* need a sample size
>> proportional to the whole data set which would practically require us to
>> scan the whole table (and have a technique for summarizing the results in a
>> nearly constant sized data structure).
>
> is this (summarizing results in a constant sized data structure) something
> which could be achived by Bloom-Filters ?

Uhm, it would be a bit of a strange application of them but actually it seems
to me that would be a possible approach. It would need a formula for
estimating the number of distinct values given the number of bits set in the
bloom filter. That should be a tractable combinatorics problem (in fact it's
pretty similar to the combinatorics I posted a while back about getting all
the drives in a raid array busy). And if you have a dynamic structure where
the filter size grows then it would overestimate because extra copied bits
would be set.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's RemoteDBA services!


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-09 23:49:21
Message-ID: 200806091649.22281.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg,

> Speak to the statisticians. Our sample size is calculated using the same
> theory behind polls which sample 600 people to learn what 250 million
> people are going to do on election day. You do NOT need (significantly)
> larger samples for larger populations.

Your analogy is bad. For elections, the voters have only a few choices.
In a 300 million row table, there could be 300 million different values,
and the histogram becomes less accurate for every order of magnitude
smaller than 300 million it is.

> Also, our estimates for n_distinct are very unreliable. The math behind
> sampling for statistics just doesn't work the same way for properties
> like n_distinct. For that Josh is right, we *would* need a sample size
> proportional to the whole data set which would practically require us to
> scan the whole table (and have a technique for summarizing the results
> in a nearly constant sized data structure).

Actually, a number of papers have shown block-based algorithms which can
arrive a reasonably confident (between 50% and 250% of accurate) estimates
based on scanning only 5% of *blocks*. Simon did some work on this a
couple years ago, but he and I had difficultly convincing -hackers that a
genuine problem existed.

You're correct that we'd need to change pg_statistic, though. For one
thing, we need to separate the sample size from the histogram size.

Also, we seem to be getting pretty far away from the original GUC
discussion.

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: <josh(at)agliodbs(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-10 10:10:18
Message-ID: 87tzg1mwhh.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Josh Berkus" <josh(at)agliodbs(dot)com> writes:

> Greg,
>
>> Speak to the statisticians. Our sample size is calculated using the same
>> theory behind polls which sample 600 people to learn what 250 million
>> people are going to do on election day. You do NOT need (significantly)
>> larger samples for larger populations.
>
> Your analogy is bad. For elections, the voters have only a few choices.
> In a 300 million row table, there could be 300 million different values,
> and the histogram becomes less accurate for every order of magnitude
> smaller than 300 million it is.

I think you're right that you need, for example, 600 people *for each answer*
to give good poll results. So for a two-way election which is about even
that's about 1,200 people. If one of the candidates is much less popular you
might have to sample many more people before you have 600 people in that
bucket.

The analogous case in our situation is not having 300 million distinct values,
since we're not gathering info on specific values, only the buckets. We need,
for example, 600 samples *for each bucket*. Each bucket is chosen to have the
same number of samples in it. So that means that we always need the same
number of samples for a given number of buckets.

>> Also, our estimates for n_distinct are very unreliable. The math behind
>> sampling for statistics just doesn't work the same way for properties
>> like n_distinct. For that Josh is right, we *would* need a sample size
>> proportional to the whole data set which would practically require us to
>> scan the whole table (and have a technique for summarizing the results
>> in a nearly constant sized data structure).
>
> Actually, a number of papers have shown block-based algorithms which can
> arrive a reasonably confident (between 50% and 250% of accurate) estimates
> based on scanning only 5% of *blocks*. Simon did some work on this a
> couple years ago, but he and I had difficultly convincing -hackers that a
> genuine problem existed.

Really? Could you send references? The paper I read surveyed previous work and
found that you needed to scan up to 50% of the table to get good results.
50-250% is considerably looser than what I recall it considering "good"
results so these aren't entirely inconsistent but I thought previous results
were much worse than that.

> You're correct that we'd need to change pg_statistic, though. For one
> thing, we need to separate the sample size from the histogram size.

That amounts to giving users control over the sample size per bucket. Which
allows them to get a more or less accurate estimate for a range covering a
single bucket without changing the size of the bucket.

I'm a bit puzzled which direction you want to go.

> Also, we seem to be getting pretty far away from the original GUC
> discussion.

Thank heavens :)

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_statistics and sample size WAS: Overhauling GUCS
Date: 2008-06-10 15:34:51
Message-ID: 200806100834.51471.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg,

> The analogous case in our situation is not having 300 million distinct
> values, since we're not gathering info on specific values, only the
> buckets. We need, for example, 600 samples *for each bucket*. Each bucket
> is chosen to have the same number of samples in it. So that means that we
> always need the same number of samples for a given number of buckets.

I think that's plausible. The issue is that in advance of the sampling we
don't know how many buckets there *are*. So we first need a proportional
sample to determine the number of buckets, then we need to retain a histogram
sample proportional to the number of buckets. I'd like to see someone with a
PhD in this weighing in, though.

> Really? Could you send references? The paper I read surveyed previous work
> and found that you needed to scan up to 50% of the table to get good
> results. 50-250% is considerably looser than what I recall it considering
> "good" results so these aren't entirely inconsistent but I thought previous
> results were much worse than that.

Actually, based on my several years selling performance tuning, I generally
found that as long as estimates were correct within a factor of 3 (33% to
300%) the correct plan was generally chosen.

There are papers on block-based sampling which were already cited on -hackers;
I'll hunt through the archives later.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-10 16:33:41
Message-ID: 200806100933.42232.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Ron,

> I wonder if the fastest way to generate the configurator
> would be to simply ask everyone to post their tuned
> postgresql.conf files along with a brief description of
> the use case for that file. The we could group the
> use-cases into various classes; and average the values
> of the submitted files. Then the configurator's one
> question "choose which use case most closely matches
> yours from this list".

This has been proposed a number of times, but is still infeasable for the
simple reason that a lot of settings are dependant on how much RAM you have
on the machine. "Sample configs" would just result in people trying to set
their shared_buffers to 150% of system RAM if they have a lightweight system,
or in having 500 different sample configurations, which would produce even
*more* user confusion.

A generator makes more sense.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, Decibel! <decibel(at)decibel(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-10 16:37:37
Message-ID: 200806100937.38124.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert,

> shared_buffers
> effective_cache_size
> default_stats_target
> work_mem
> maintainance_work_mem
> listen_address
> max_connections
> the fsm parameters
> checkpoint_segements
> random_page_cost

My list is very similar, execept that I drop random_page_cost and add
synchronous_commit, autovaccum (on or off) and offer 4 "packages" of log
settings.

So I think that we can get consensus on what are the "newbie" parameters
pretty easily.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, Decibel! <decibel(at)decibel(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-10 16:49:17
Message-ID: 200806100949.17918.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tuesday 10 June 2008 09:37, Josh Berkus wrote:
> Robert,
>
> > shared_buffers
> > effective_cache_size
> > default_stats_target
> > work_mem
> > maintainance_work_mem
> > listen_address
> > max_connections
> > the fsm parameters
> > checkpoint_segements
> > random_page_cost
>
> My list is very similar, execept that I drop random_page_cost and add
> synchronous_commit, autovaccum (on or off) and offer 4 "packages" of log
> settings.

Oh, and wal_buffers, the default for which we should just change if it weren't
for SHMMAX.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, Decibel! <decibel(at)decibel(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-10 18:23:04
Message-ID: 21875.1213122184@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> Oh, and wal_buffers, the default for which we should just change if it
> weren't for SHMMAX.

Uh, why? On a workload of mostly small transactions, what value is
there in lots of wal_buffers?

regards, tom lane


From: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org, "Decibel!" <decibel(at)decibel(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-10 18:39:08
Message-ID: 484ECA4C.20004@cheapcomplexdevices.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gregory Stark wrote:
> "Joshua D. Drake" <jd(at)commandprompt(dot)com> writes:
>> ...default_statistics_target?"..."Uhh 10."
>
> Ah, but we only ever hear about the cases where it's wrong of course. In other
> words even if we raised it to some optimal value we would still have precisely
> the same experience of seeing only posts on list about it being insufficient.

Yet if it was raised above the optimal, wouldn't you expect to hear cases
where it was too high?

It seems a slow painful way of finding a good setting might be to
keep increasing it until we had cases where the lists start telling
people they need to lower the parameter?


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Josh Berkus" <josh(at)agliodbs(dot)com>, <pgsql-hackers(at)postgresql(dot)org>, "Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>, "Greg Smith" <gsmith(at)gregsmith(dot)com>, "Andreas Pflug" <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-11 06:52:03
Message-ID: 484F7613.5070202@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Josh Berkus <josh(at)agliodbs(dot)com> writes:
>> Oh, and wal_buffers, the default for which we should just change if it
>> weren't for SHMMAX.
>
> Uh, why? On a workload of mostly small transactions, what value is
> there in lots of wal_buffers?

None. But there's also little to no harm in having a higher setting; at
worst you waste a few megabytes of memory. Besides, most databases are
initialized from some outside source in the beginning, and data loading
does benefit from a higher wal_buffers setting.

Ideally, of course, there would be no wal_buffers setting, and WAL
buffers would be allocated from shared_buffers pool on demand...

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Greg Sabino Mullane <greg(at)turnstep(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-11 15:45:22
Message-ID: 200806111545.m5BFjMn05925@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Sabino Mullane wrote:
> * The word 'paramters' is still misspelled. :)

Corrected for 8.4.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-11 16:26:23
Message-ID: 1213201583.31182.28.camel@jd-laptop
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2008-06-11 at 11:45 -0400, Bruce Momjian wrote:
> Greg Sabino Mullane wrote:
> > * The word 'paramters' is still misspelled. :)
>
> Corrected for 8.4.

Technically this is a bug fix... why not backpatch it too?

>
> --
> Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
> EnterpriseDB http://enterprisedb.com
>
> + If your life is a hard drive, Christ can be your backup. +
>


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, Decibel! <decibel(at)decibel(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-11 16:36:22
Message-ID: 484FFF06.9010307@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Heikki,

> Ideally, of course, there would be no wal_buffers setting, and WAL
> buffers would be allocated from shared_buffers pool on demand...
>

+1

--Josh


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, Decibel! <decibel(at)decibel(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-11 16:37:22
Message-ID: 484FFF42.7000408@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Josh Berkus <josh(at)agliodbs(dot)com> writes:
>> Oh, and wal_buffers, the default for which we should just change if it
>> weren't for SHMMAX.
>
> Uh, why? On a workload of mostly small transactions, what value is
> there in lots of wal_buffers?

Actually, it's also useful for any workload with many connections. Any
time you have high throughput, really. We've seen this on DBT2,
SpecJAppserver and iGen.

--Josh


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, Decibel! <decibel(at)decibel(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-11 16:38:47
Message-ID: 20080611163847.GN5352@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus wrote:
> Heikki,
>
>> Ideally, of course, there would be no wal_buffers setting, and WAL
>> buffers would be allocated from shared_buffers pool on demand...

Same for pg_subtrans, pg_clog, etc (as previously discussed)

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Robert Lor <Robert(dot)Lor(at)Sun(dot)COM>
Cc: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Greg Smith <gsmith(at)gregsmith(dot)com>, Aidan Van Dyk <aidan(at)highrise(dot)ca>
Subject: Re: Overhauling GUCS
Date: 2008-06-11 17:18:33
Message-ID: 200806111718.m5BHIYe04026@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Lor wrote:
> Robert Treat wrote:
> > On Wednesday 04 June 2008 22:04:54 Greg Smith wrote:
> >
> >> I was just talking to someone today about building a monitoring tool for
> >> this. Not having a clear way to recommend people monitor use of work_mem
> >> and its brother spilled to disk sorts is an issue right now, I'll whack
> >> that one myself if someone doesn't beat me to it before I get time.
> >>
> >
> > I remember *years* ago, someone wrote a perl script to poll pgsql_tmp and
> > print out anytime something showed up... you could probably find that in the
> > archives if you look around.
> >
> > of course to me this sounds like an excellent idea for a dtrace probe ;-)
> >
> >
>
> Actually, you can find out from the sort-end probe now whether or not
> the sort spilled to disk and number of disk blocks used. This is one of
> the probes from Simon.
>
> TRACE_POSTGRESQL_SORT_END(state->tapeset,
> (state->tapeset ?
> LogicalTapeSetBlocks(state->tapeset) :
> (state->allowedMem - state->availMem + 1023) /
> 1024));

8.3 has GUC log_temp_files.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-11 17:25:15
Message-ID: 200806111725.m5BHPFU05044@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Joshua D. Drake wrote:
>
>
> On Wed, 2008-06-11 at 11:45 -0400, Bruce Momjian wrote:
> > Greg Sabino Mullane wrote:
> > > * The word 'paramters' is still misspelled. :)
> >
> > Corrected for 8.4.
>
> Technically this is a bug fix... why not backpatch it too?

That might show up as a diff for people doing upgrades where the minor
version changed the spelling.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-11 17:28:37
Message-ID: 48500B45.8010909@hagander.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian wrote:
> Joshua D. Drake wrote:
>>
>> On Wed, 2008-06-11 at 11:45 -0400, Bruce Momjian wrote:
>>> Greg Sabino Mullane wrote:
>>>> * The word 'paramters' is still misspelled. :)
>>> Corrected for 8.4.
>> Technically this is a bug fix... why not backpatch it too?
>
> That might show up as a diff for people doing upgrades where the minor
> version changed the spelling.

People upgrading won't see it, I think. You only see it when you do a
new initdb..

//Magnus


From: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, Decibel! <decibel(at)decibel(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-11 17:51:46
Message-ID: 485010B2.4060102@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus wrote:
> Heikki,
>
>> Ideally, of course, there would be no wal_buffers setting, and WAL
>> buffers would be allocated from shared_buffers pool on demand...
>>
>
> +1
>
> --Josh
>

+1

-Jignesh


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Dave Page <dpage(at)pgadmin(dot)org>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-11 17:56:19
Message-ID: 200806111756.m5BHuJo22366@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dave Page wrote:
> On Mon, Jun 2, 2008 at 5:46 AM, Joshua D. Drake <jd(at)commandprompt(dot)com> wrote:
>
> >> I've seen people not doing so more often
> >> than you would think. Perhaps because they are DBAs and not sysadmins? I
> >> also
> >> meant a tool to do things like verify that the changes are valid, as
> >> someone
> >> else mentioned elsewhere in this thread.
> >
> > pg_ctl -D data check?
> >
> > I would +1 that.
>
> I would also really like to see that - though I'd also like to see an
> SQL interface so we can check a config before saving when editing via
> pgAdmin or similar.

Should this be a TODO?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: "Dave Page" <dpage(at)pgadmin(dot)org>
To: "Bruce Momjian" <bruce(at)momjian(dot)us>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, "Greg Sabino Mullane" <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-11 18:35:29
Message-ID: 937d27e10806111135h27e8cccw7aea9d85b74d065a@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jun 11, 2008 at 6:56 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> Dave Page wrote:
>> On Mon, Jun 2, 2008 at 5:46 AM, Joshua D. Drake <jd(at)commandprompt(dot)com> wrote:
>>
>> > pg_ctl -D data check?
>> >
>> > I would +1 that.
>>
>> I would also really like to see that - though I'd also like to see an
>> SQL interface so we can check a config before saving when editing via
>> pgAdmin or similar.
>
> Should this be a TODO?

Yes please.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Dave Page <dpage(at)pgadmin(dot)org>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-11 20:54:23
Message-ID: 200806112054.m5BKsNQ23972@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dave Page wrote:
> On Wed, Jun 11, 2008 at 6:56 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > Dave Page wrote:
> >> On Mon, Jun 2, 2008 at 5:46 AM, Joshua D. Drake <jd(at)commandprompt(dot)com> wrote:
> >>
> >> > pg_ctl -D data check?
> >> >
> >> > I would +1 that.
> >>
> >> I would also really like to see that - though I'd also like to see an
> >> SQL interface so we can check a config before saving when editing via
> >> pgAdmin or similar.
> >
> > Should this be a TODO?
>
> Yes please.

Added to TODO:

* Add pg_ctl option to do a syntax check of postgresql.conf

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, Decibel! <decibel(at)decibel(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-11 21:05:15
Message-ID: 11913.1213218315@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> Josh Berkus wrote:
>> Ideally, of course, there would be no wal_buffers setting, and WAL
>> buffers would be allocated from shared_buffers pool on demand...

> Same for pg_subtrans, pg_clog, etc (as previously discussed)

I agree with that for pg_clog and friends, but I'm much more leery of
folding WAL into the same framework. Its access pattern is *totally*
unlike standard caches, so the argument that this would be good for
performance is resting on nothing but imagination. Also I'm concerned
about possible deadlocks, because WAL is customarily accessed while
holding one or more exclusive buffer locks.

regards, tom lane


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-11 23:00:13
Message-ID: 200806112300.m5BN0Dc13952@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> > Tom Lane wrote:
> >> * Can we build a "configuration wizard" to tell newbies what settings
> >> they need to tweak?
>
> > That would trump all the other suggestions conclusively. Anyone good at
> > expert systems?
>
> How far could we get with the answers to just three questions:
>
> * How many concurrent queries do you expect to have?
>
> * How much RAM space are you willing to let Postgres use?
>
> * How much "overhead" disk space are you willing to let Postgres use?
>
> concurrent queries drives max_connections, obviously, and RAM space
> would drive shared_buffers and effective_cache_size, and both of them
> would be needed to size work_mem. The third one is a bit weird but
> I don't see any other good way to set the checkpoint parameters.
>
> If those aren't enough questions, what else must we ask? Or maybe they
> aren't the right questions at all --- maybe we should ask "is this a
> dedicated machine or not" and try to extrapolate everything else from
> what we (hopefully) can find out about the hardware.

Having returned from Japan, I read through this thread. It had lots of
ideas (new format for postgresql.conf, more/less comments in
postgresql.conf) but I didn't see any of the ideas getting a majority.

I think we do a good job of making many settings automatic (meaning no
one even sees them), but we don't to a great job of making the visible
settings easy to set, both in the process (no GUI) and in knowing the
proper value.

There are two ideas I did think had merit. First, using ## for
system-supplied comments, so user comments would be easier to identify.
There might be value in doing that even if it were not helpful for
scripts.

The second idea is the idea of having one parameter depend on another.
Not only could we do that for some of our existing parameters, but we
could have pseudo-parameters like concurrent_queries, memory_usage, and
extra_disk_space that could be at the top of postgresql.conf and then
affect the other settings.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-11 23:35:09
Message-ID: 24777.1213227309@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> The second idea is the idea of having one parameter depend on another.
> Not only could we do that for some of our existing parameters, but we
> could have pseudo-parameters like concurrent_queries, memory_usage, and
> extra_disk_space that could be at the top of postgresql.conf and then
> affect the other settings.

We have tried to do that in the past, and it didn't work well *at all*.
The idea has a fundamental logical flaw, which is that it's not clear
which parameter wins if the user changes both.

regards, tom lane


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-11 23:40:36
Message-ID: 200806112340.m5BNea602628@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > The second idea is the idea of having one parameter depend on another.
> > Not only could we do that for some of our existing parameters, but we
> > could have pseudo-parameters like concurrent_queries, memory_usage, and
> > extra_disk_space that could be at the top of postgresql.conf and then
> > affect the other settings.
>
> We have tried to do that in the past, and it didn't work well *at all*.

We have? When?

> The idea has a fundamental logical flaw, which is that it's not clear
> which parameter wins if the user changes both.

Yes, you could get into problems by having variable dependency loops,
but I see no way to easily improve configuration without it.

My idea was to have:

memory_usage = 128MB ## pseudo-parameter
shared_buffers = $memory_usage * 0.75

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Josh Berkus" <josh(at)agliodbs(dot)com>, "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>, <pgsql-hackers(at)postgresql(dot)org>, "Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>, "Greg Smith" <gsmith(at)gregsmith(dot)com>, "Andreas Pflug" <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-12 00:26:09
Message-ID: 87od674hy6.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
>> Josh Berkus wrote:
>>> Ideally, of course, there would be no wal_buffers setting, and WAL
>>> buffers would be allocated from shared_buffers pool on demand...
>
>> Same for pg_subtrans, pg_clog, etc (as previously discussed)
>
> I agree with that for pg_clog and friends, but I'm much more leery of
> folding WAL into the same framework. Its access pattern is *totally*
> unlike standard caches, so the argument that this would be good for
> performance is resting on nothing but imagination. Also I'm concerned
> about possible deadlocks, because WAL is customarily accessed while
> holding one or more exclusive buffer locks.

Well it may still be worthwhile stealing buffers from shared_buffers even if
we set a special flag marking them as owned by WAL and out of bounds for
the normal buffer manager.

At least that way we could always steal more if we want or return some, as
long as we're careful about when we do it. That would open the door to having
these parameters be dynamically adjustable. That alone would be worthwhile
even if we bypass all bells and whistles of the buffer manager.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's RemoteDBA services!


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-12 00:37:40
Message-ID: 48506FD4.7020804@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg,

> At least that way we could always steal more if we want or return some, as
> long as we're careful about when we do it. That would open the door to having
> these parameters be dynamically adjustable. That alone would be worthwhile
> even if we bypass all bells and whistles of the buffer manager.
>

One hitch, though, is that asynchronous commit could consume big chunks
of shared_buffers. So we might still need a limit for people who are
using async.

--Josh


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-12 00:54:40
Message-ID: 200806120054.m5C0seA23198@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Magnus Hagander wrote:
> Bruce Momjian wrote:
> > Joshua D. Drake wrote:
> >>
> >> On Wed, 2008-06-11 at 11:45 -0400, Bruce Momjian wrote:
> >>> Greg Sabino Mullane wrote:
> >>>> * The word 'paramters' is still misspelled. :)
> >>> Corrected for 8.4.
> >> Technically this is a bug fix... why not backpatch it too?
> >
> > That might show up as a diff for people doing upgrades where the minor
> > version changed the spelling.
>
> People upgrading won't see it, I think. You only see it when you do a
> new initdb..

The problem is that people doing initdb with different minor versions
will have different stock postgresql.conf files. That isn't a huge
problem, but I don't see a need to create the problem just to fix a
spelling mistake that few have observed.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-12 00:56:24
Message-ID: 25534.1213232184@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> Tom Lane wrote:
>> Bruce Momjian <bruce(at)momjian(dot)us> writes:
>>> The second idea is the idea of having one parameter depend on another.

>> We have tried to do that in the past, and it didn't work well *at all*.

> We have? When?

Just a couple months ago we had to give up enforcing an
interrelationship between NBuffers and MaxConnections, because it got
too complicated and un-explainable. I seem to recall some other
interactions in the distant past, but a quick look through the CVS logs
didn't find any smoking guns.

>> The idea has a fundamental logical flaw, which is that it's not clear
>> which parameter wins if the user changes both.

> Yes, you could get into problems by having variable dependency loops,

Who said anything about loops? What I am talking about is what happens
during
set memory_usage = X; // implicitly sets work_mem = X/100, say
set work_mem = Y;
set memory_usage = Z;
What is work_mem now, and what's your excuse for saying so, and how
will you document the behavior so that users can understand it?
(Just to make things interesting, assume that some of the above SETs
happen via changing postgresql.conf rather than directly.)

If the objective is to make configuration easier to understand,
I don't believe that behind-the-scenes changes of configuration values
will advance that goal.

> but I see no way to easily improve configuration without it.

The higher-level concepts should be things that a configuration wizard
works with, and then tells you how to set the postmaster parameters.
They should not end up in the configure file (unless maybe as comments?)

regards, tom lane


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Josh Berkus" <josh(at)agliodbs(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>, <pgsql-hackers(at)postgresql(dot)org>, "Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>, "Greg Smith" <gsmith(at)gregsmith(dot)com>, "Andreas Pflug" <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-12 01:01:49
Message-ID: 87k5gv4gaq.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Josh Berkus" <josh(at)agliodbs(dot)com> writes:

> Greg,
>
>> At least that way we could always steal more if we want or return some, as
>> long as we're careful about when we do it. That would open the door to having
>> these parameters be dynamically adjustable. That alone would be worthwhile
>> even if we bypass all bells and whistles of the buffer manager.
>>
>
> One hitch, though, is that asynchronous commit could consume big chunks of
> shared_buffers. So we might still need a limit for people who are using async.

Well currently we use a fixed number of fixed-sized buffers, no? I doubt we'll
change that even if we take this tact of making wal_buffers resizable by
stealing buffers from the buffer manager for precisely the reasons Tom was
describing.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's RemoteDBA services!


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Josh Berkus" <josh(at)agliodbs(dot)com>, "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org, "Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>, "Greg Smith" <gsmith(at)gregsmith(dot)com>, "Andreas Pflug" <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-12 01:02:20
Message-ID: 25597.1213232540@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>> I agree with that for pg_clog and friends, but I'm much more leery of
>> folding WAL into the same framework.

> Well it may still be worthwhile stealing buffers from shared_buffers even if
> we set a special flag marking them as owned by WAL and out of bounds for
> the normal buffer manager.

> At least that way we could always steal more if we want or return some, as
> long as we're careful about when we do it.

... and as long as you can acquire the WAL per-buffer management space out
of nowhere ...

regards, tom lane


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-12 01:46:33
Message-ID: Pine.GSO.4.64.0806112127170.13925@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 11 Jun 2008, Tom Lane wrote:

> Who said anything about loops? What I am talking about is what happens
> during
> set memory_usage = X; // implicitly sets work_mem = X/100, say
> set work_mem = Y;
> set memory_usage = Z;
> What is work_mem now, and what's your excuse for saying so, and how
> will you document the behavior so that users can understand it?
> (Just to make things interesting, assume that some of the above SETs
> happen via changing postgresql.conf rather than directly.)

People are already exposed to issues in this area via things like the
include file mechanism. You can think of that two ways. You can say,
"there's already problems like this so who cares if there's another one".
Or, you can say "let's not add even more confusion like that".

Having a mini programming language for setting parameters is interesting
and all, and it might be enough to do a good job of handling the basic
newbie setup chores. But I don't think it's a complete solution and
therefore I find moving in that direction a bit of a distraction; your
concerns about ambiguity just amplify that feeling. It's unlikely that
will get powerful enough to enable the "one true config file" that just
works for everybody. There's too many things that depend a bit on both
data access pattern and on overall database size/structure no matter what
you do.

[If only there were some technology that did workload profiling and set
the server parameters based on that. Some sort of dynamic tuning tool;
wouldn't that be great? Oh well, that's just a dream right now I guess.]

I'm not sure if I've stated this explicitly yet, but I personally have no
interest in just solving the newbie problem. I want a tool to help out
tuning medium to large installs, and generating a simple config file is
absolutely something that should come out of that as a bonus. Anything
that just targets the simple installs, though, I'm not very motivated to
chase after.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-12 02:34:25
Message-ID: 200806120234.m5C2YP324269@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > Tom Lane wrote:
> >> The idea has a fundamental logical flaw, which is that it's not clear
> >> which parameter wins if the user changes both.
>
> > Yes, you could get into problems by having variable dependency loops,
>
> Who said anything about loops? What I am talking about is what happens
> during
> set memory_usage = X; // implicitly sets work_mem = X/100, say
> set work_mem = Y;
> set memory_usage = Z;

My initial thought was that this would behave like a shell script
variable, meaning once you set something it would affect all references
to it below in postgresql.conf. The problem with that is that we
comment out all settings, so there isn't a logical order like you would
have in a shell script.

I was not thinking of memory_usage implicity changing anything. I
figured postgresql.conf would have:

memory_usage = 100
work_mem = $memory_usage * 0.75

If you change memory_usage via SET, it will not change work_mem at all
because you are not re-initializing the variables.

I am kind of lost how this would work logically and am willing to think
about it some more, but I do think we aren't going to simplify
postgresql.conf without such a facility.

The big problem I see is that right now everything has a constant
default. If we allowed memory_usage to change some of the defaults, how
would we signal that we want the variables based on it to change their
values? This is your behind-the-scenes problem you mentioned.

> What is work_mem now, and what's your excuse for saying so, and how
> will you document the behavior so that users can understand it?
> (Just to make things interesting, assume that some of the above SETs
> happen via changing postgresql.conf rather than directly.)
>
> If the objective is to make configuration easier to understand,
> I don't believe that behind-the-scenes changes of configuration values
> will advance that goal.
>
> > but I see no way to easily improve configuration without it.
>
> The higher-level concepts should be things that a configuration wizard
> works with, and then tells you how to set the postmaster parameters.
> They should not end up in the configure file (unless maybe as comments?)

I am concerned that each wizzard is going to have to duplicate the same
logic each time, and adjust to release-based changes. I thought having
the configuration file itself help with setting some values would be
helpful.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-12 02:36:11
Message-ID: 200806120236.m5C2aBu24710@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith wrote:
> On Wed, 11 Jun 2008, Tom Lane wrote:
>
> > Who said anything about loops? What I am talking about is what happens
> > during
> > set memory_usage = X; // implicitly sets work_mem = X/100, say
> > set work_mem = Y;
> > set memory_usage = Z;
> > What is work_mem now, and what's your excuse for saying so, and how
> > will you document the behavior so that users can understand it?
> > (Just to make things interesting, assume that some of the above SETs
> > happen via changing postgresql.conf rather than directly.)
>
> People are already exposed to issues in this area via things like the
> include file mechanism. You can think of that two ways. You can say,
> "there's already problems like this so who cares if there's another one".
> Or, you can say "let's not add even more confusion like that".
>
> Having a mini programming language for setting parameters is interesting
> and all, and it might be enough to do a good job of handling the basic
> newbie setup chores. But I don't think it's a complete solution and
> therefore I find moving in that direction a bit of a distraction; your
> concerns about ambiguity just amplify that feeling. It's unlikely that
> will get powerful enough to enable the "one true config file" that just
> works for everybody. There's too many things that depend a bit on both
> data access pattern and on overall database size/structure no matter what
> you do.

You are right that the complete solution is going to have to ask users
questions, and my idea of using variables is not going to get us that
far.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-12 16:20:56
Message-ID: f64390b17984f84f526285c89f126930@biglumber.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

> Also, I'd actually assert that "10" seems to be perfectly adequate for
> the majority of users. That is, the number of users where I've
> recommended increasing d_s_t for the whole database is smaller than the
> number where I don't, and of course we never hear from most users at
> all. So I'm pretty happy recommending "Leave the default. If you
> encounter problem queries, increase it to 100, and analyse the database.

Really? I'm the opposite: I never leave a client's setting at 10, that's
just asking for trouble. Making it 100 *after* you encounter problem
queries is reactive; I prefer being proactive. Nor is a setting of 10
"perfectly adequate": I think you might be the last person on the
lists who thinks so. That train has left the station, we've been trying
to decide what a better default should be other than 10, and, more to the
point, how to quantitatively measure it. The problem is, you really can't.
Sure, you can graph a tiny increase in ANALYZE time and disk space, but there
are no stock queries we can use to measure an increase in planning time.
Frankly, I'd be shocked if there is any significant difference and all
compared to the actual query run time.

The orders of magnitude speed up of certain queries when the d_s_t goes
above 98 is what spawned my original thread proposing a change to 100:

http://markmail.org/message/tun3a3juxlsyjbsw

While it's easy to get bogged down in theory about what things
d_s_t should measure, the optimal size of buckets, etc., it's still
a severe performance regression bug that should be fixed, IMO.

Changing the subject line as well: this is only tangentially related
to overhauling GUCS, although I'll point out again that this particular
config is a good example of one that needs more comments.

- --
Greg Sabino Mullane greg(at)turnstep(dot)com
PGP Key: 0x14964AC8 200806121213
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAkhRTKYACgkQvJuQZxSWSsjGvACeJkXZJ8cP385W9UXKzLHdzhvw
gqQAoJWdrepFbkxR2be7oetK8/o/yd9I
=w469
-----END PGP SIGNATURE-----


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-12 16:52:26
Message-ID: 10801.1213289546@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Greg Sabino Mullane" <greg(at)turnstep(dot)com> writes:
> The orders of magnitude speed up of certain queries when the d_s_t goes
> above 98 is what spawned my original thread proposing a change to 100:
> http://markmail.org/message/tun3a3juxlsyjbsw

That was a pretty special case (LIKE/regex estimation), and we've since
eliminated the threshold change in the LIKE/regex estimates anyway, so
there's no longer any reason to pick 100 as opposed to any other number.
So we're still back at "what's a good value and why?".

> Frankly, I'd be shocked if there is any significant difference and all
> compared to the actual query run time.

I'm still concerned about the fact that eqjoinsel() is O(N^2). Show me
some measurements demonstrating that a deep nest of equijoins doesn't
get noticeably more expensive to plan --- preferably on a datatype with
an expensive equality operator, eg numeric --- and I'm on board.

regards, tom lane


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-12 17:25:50
Message-ID: 87iqwetvj5.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Greg Sabino Mullane" <greg(at)turnstep(dot)com> writes:

>> Also, I'd actually assert that "10" seems to be perfectly adequate for
>> the majority of users. That is, the number of users where I've
>> recommended increasing d_s_t for the whole database is smaller than the
>> number where I don't, and of course we never hear from most users at
>> all. So I'm pretty happy recommending "Leave the default. If you
>> encounter problem queries, increase it to 100, and analyse the database.
>
> Really? I'm the opposite: I never leave a client's setting at 10, that's
> just asking for trouble. Making it 100 *after* you encounter problem
> queries is reactive; I prefer being proactive.

Have you ever measured the system speed before and after?

> Nor is a setting of 10 "perfectly adequate":

What percentage of your plans actually change with the larger statistics? How
many for the better? How many were massively improved?

I suspect you're looking at some single-digit percentage slowdown for planning
across the board. In exchange if you a) have simple queries you probably see
none improving. If you b) have moderately complex queries you probably get
some single-digit percentage of them with improvements. And if you c) have
very complex queries you probably have a handful of them which see massive
improvements. Across the internet there are a whole lot more applications of
type (a) than the others...

> Frankly, I'd be shocked if there is any significant difference and all
> compared to the actual query run time.

Well you might start preparing to be shocked. Note that retrieving the
statistics is a query itself so it's not hard for it to be comparable to a
similarly simple query. It's not hard for a simple query using multiple
columns to be using more records of statistics than it is from the actual
data. And things can look much worse if that data is TOASTed and requires
further lookups and/or decompression...

> The orders of magnitude speed up of certain queries when the d_s_t goes
> above 98 is what spawned my original thread proposing a change to 100:
>
> http://markmail.org/message/tun3a3juxlsyjbsw
>
> While it's easy to get bogged down in theory about what things
> d_s_t should measure, the optimal size of buckets, etc., it's still
> a severe performance regression bug that should be fixed, IMO.

It was, three months ago:

http://archives.postgresql.org/pgsql-committers/2008-03/msg00129.php

100 is no longer a magic threshold for LIKE queries (in CVS HEAD)

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's Slony Replication support!


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-12 17:42:43
Message-ID: 200806121042.44469.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce,

> I am concerned that each wizzard is going to have to duplicate the same
> logic each time, and adjust to release-based changes.

I think that's a feature, not a bug. Right now, I'm not at all convinced that
my algorithms for setting the various major dials are great (I just think
that nobody yet has better). So I think we should *encourage* people to
write their own wizards until we find one that works reasonably well.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-12 18:21:00
Message-ID: 200806121821.m5CIL0V20595@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus wrote:
> Bruce,
>
> > I am concerned that each wizard is going to have to duplicate the same
> > logic each time, and adjust to release-based changes.
>
> I think that's a feature, not a bug. Right now, I'm not at all convinced that
> my algorithms for setting the various major dials are great (I just think
> that nobody yet has better). So I think we should *encourage* people to
> write their own wizards until we find one that works reasonably well.

I am thinking a web-based wizard would make the most sense.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-12 18:24:05
Message-ID: 200806121124.06243.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce,

> I am thinking a web-based wizard would make the most sense.

I'd prefer command-line, so that people could run it on their own servers.
For one thing, we need to generate at least two files on many platforms; a
postgresql.conf and a sysctl.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-12 18:26:14
Message-ID: 200806121826.m5CIQE721561@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus wrote:
> Bruce,
>
> > I am thinking a web-based wizard would make the most sense.
>
> I'd prefer command-line, so that people could run it on their own servers.
> For one thing, we need to generate at least two files on many platforms; a
> postgresql.conf and a sysctl.

They can just download the files the need from the web page, no?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Steve Atkins <steve(at)blighty(dot)com>
To: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-06-12 19:08:25
Message-ID: A63613E3-11D7-4A03-A11A-BD7F9BDF5B6C@blighty.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Jun 12, 2008, at 11:21 AM, Bruce Momjian wrote:

> Josh Berkus wrote:
>> Bruce,
>>
>>> I am concerned that each wizard is going to have to duplicate the
>>> same
>>> logic each time, and adjust to release-based changes.
>>
>> I think that's a feature, not a bug. Right now, I'm not at all
>> convinced that
>> my algorithms for setting the various major dials are great (I just
>> think
>> that nobody yet has better). So I think we should *encourage*
>> people to
>> write their own wizards until we find one that works reasonably well.
>
> I am thinking a web-based wizard would make the most sense.

There's a definite need for an interactive GUI wizard (bundle with the
Windows and OS X installers, at least).

And a commandline wizard would certainly be nice, both interactive and
non-interactive. Mostly for including in install scripts on unix
platforms.

And a web-based wizard would be useful too.

And all of them would benefit from being able to both modify an
existing configuration file, and to generate one from scratch.

It looks like it's going to be reasonably easy to abstract away the
interface to the user such that the first two (and likely the third)
can be built from the same codebase, meaning that the smarts about how
to set the various GUC settings (based on RAM available, estimates of
database size and usage) can be maintained in one place.

Cheers,
Steve


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-12 19:29:41
Message-ID: Pine.GSO.4.64.0806121501110.8462@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 12 Jun 2008, Bruce Momjian wrote:

> I am thinking a web-based wizard would make the most sense.

I have not a single customer I work with who could use an external
web-based wizard. Way too many companies have privacy policy restrictions
that nobody dare cross by giving out any info about their server, or
sometimes that they're even using PostgreSQL inside the firewall. If it's
not a tool that you can run on the same server you're running PostgreSQL
on, I'd consider that another diversion that's not worth pursuing.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Decibel! <decibel(at)decibel(dot)org>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-12 22:18:46
Message-ID: 5B69A1C5-6C65-4FD1-95C4-E73886AFA06D@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jun 11, 2008, at 9:34 PM, Bruce Momjian wrote:
> Tom Lane wrote:
>> Bruce Momjian <bruce(at)momjian(dot)us> writes:
>>> Tom Lane wrote:
>>>> The idea has a fundamental logical flaw, which is that it's not
>>>> clear
>>>> which parameter wins if the user changes both.
>>
>>> Yes, you could get into problems by having variable dependency
>>> loops,
>>
>> Who said anything about loops? What I am talking about is what
>> happens
>> during
>> set memory_usage = X; // implicitly sets work_mem = X/100, say
>> set work_mem = Y;
>> set memory_usage = Z;
>
> My initial thought was that this would behave like a shell script
> variable, meaning once you set something it would affect all
> references
> to it below in postgresql.conf. The problem with that is that we
> comment out all settings, so there isn't a logical order like you
> would
> have in a shell script.
>
> I was not thinking of memory_usage implicity changing anything. I
> figured postgresql.conf would have:
>
> memory_usage = 100
> work_mem = $memory_usage * 0.75
>
> If you change memory_usage via SET, it will not change work_mem at all
> because you are not re-initializing the variables.

Why? That's the exact opposite of what I'd expect. If I want a
setting's value to be $memory_usage * .75, that's what I want it to
be. Not some value based on whatever $memory_usage was set to when
work_mem happened to be changed.

Of course, if you set something to a hard value with no variables,
then that's what it's set to.

> I am kind of lost how this would work logically and am willing to
> think
> about it some more, but I do think we aren't going to simplify
> postgresql.conf without such a facility.

Agreed. And I think it's a lot more elegant for dealing with day-to-
day tuning than some kind of external wizzard.

> The big problem I see is that right now everything has a constant
> default. If we allowed memory_usage to change some of the
> defaults, how
> would we signal that we want the variables based on it to change their
> values? This is your behind-the-scenes problem you mentioned.

I would suggest that we just re-evaluate everything whenever any
setting is changed (this assumes that we store postgresql.conf
internally in some fashion, so that we're not actually hitting the
file all the time and possibly picking up random edits).
--
Decibel!, aka Jim C. Nasby, Database Architect decibel(at)decibel(dot)org
Give your computer some brain candy! www.distributed.net Team #1828


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Decibel! <decibel(at)decibel(dot)org>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-12 23:17:32
Message-ID: 15711.1213312652@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Decibel! <decibel(at)decibel(dot)org> writes:
> On Jun 11, 2008, at 9:34 PM, Bruce Momjian wrote:
>> I am kind of lost how this would work logically and am willing to
>> think
>> about it some more, but I do think we aren't going to simplify
>> postgresql.conf without such a facility.

> Agreed. And I think it's a lot more elegant for dealing with day-to-
> day tuning than some kind of external wizzard.

You guys call this "simplification"? You're out of your minds.
This proposal is ridiculously complicated, and yet it still fails
even to consider adjusting non-numeric parameters. And what about
things that require more than a trivial arithmetic expression to
compute? It's not hard at all to imagine wanting log, sqrt, etc.

We do not need to put computational capability into GUC. Any
computations needed to determine a parameter setting should be done
by a wizard.

regards, tom lane


From: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Better default_statistics_target
Date: 2008-06-13 00:55:46
Message-ID: 8230f4174652fe68d79902f17a271afb@biglumber.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

> That was a pretty special case (LIKE/regex estimation), and we've since
> eliminated the threshold change in the LIKE/regex estimates anyway, so
> there's no longer any reason to pick 100 as opposed to any other number.
> So we're still back at "what's a good value and why?".

Glad to hear that, although I think this is only in HEAD, not backpatched,
right? Well at any rate, I withdraw my strong support for 100 and join in
the quest for a good number. The "anything but 10" campaign.

> I'm still concerned about the fact that eqjoinsel() is O(N^2). Show me
> some measurements demonstrating that a deep nest of equijoins doesn't
> get noticeably more expensive to plan --- preferably on a datatype with
> an expensive equality operator, eg numeric --- and I'm on board.

I hope someone else on the list can do this, because I can't. :)

- --
Greg Sabino Mullane greg(at)turnstep(dot)com
PGP Key: 0x14964AC8 200806122054
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAkhRxToACgkQvJuQZxSWSsj0OwCfel+zN/jQth79RvIHtxpUefQD
APMAmQEKIDS6BzqUjn4eTMzP9NDlxTbE
=JZTe
-----END PGP SIGNATURE-----


From: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-13 01:13:31
Message-ID: 1aaa4d8570787729e054e400e16eaadd@biglumber.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

>> Really? I'm the opposite: I never leave a client's setting at 10, that's
>> just asking for trouble. Making it 100 *after* you encounter problem
>> queries is reactive; I prefer being proactive.

> Have you ever measured the system speed before and after?

Yes. No change (see below on caching) or faster (better plans).

>> Nor is a setting of 10 "perfectly adequate":

> What percentage of your plans actually change with the larger statistics? How
> many for the better? How many were massively improved?

It matters not if there is a slight increase in planning time: the penalty
of choosing a Bad Plan far outweighs any increased analyze or planning
cost, period. Are you arguing that 10 is a good default, or just against
larger values in general?

> I suspect you're looking at some single-digit percentage slowdown for planning
> across the board. In exchange if you a) have simple queries you probably see
> none improving. If you b) have moderately complex queries you probably get
> some single-digit percentage of them with improvements. And if you c) have
> very complex queries you probably have a handful of them which see massive
> improvements. Across the internet there are a whole lot more applications of
> type (a) than the others...

I'm still skeptical that it's the case, but I wouldn't mind seeing some figures
about how slowed down a "simple" database gets going from 10 to 100 (or larger).
Robert, any chance we can use Pagila for some sort of test for that?

>> Frankly, I'd be shocked if there is any significant difference and all
>> compared to the actual query run time.

> Well you might start preparing to be shocked. Note that retrieving the
> statistics is a query itself so it's not hard for it to be comparable to a
> similarly simple query. It's not hard for a simple query using multiple
> columns to be using more records of statistics than it is from the actual
> data. And things can look much worse if that data is TOASTed and requires
> further lookups and/or decompression...

Even if all you say above is true, and I think we'll have to agree to disagree
on that, there's an important point to remember: query plans can be (and very
often are) cached. Queries and query results cannot (although I'm working
on that... :) Plans to queries is a 1-N, where N can be very, very large,
and further boosts the query time vs planning time ratio.

...
> 100 is no longer a magic threshold for LIKE queries (in CVS HEAD)

That's great, but doesn't help existing releases (unless that was backpatched,
but I don't recall it if so) But that's a battle I'm going to stop fighting,
and concentrate on helping to find a replacement for 10 that may or may not
be 100.

- --
Greg Sabino Mullane greg(at)turnstep(dot)com
End Point Corporation
PGP Key: 0x14964AC8 200806122100
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAkhRyXIACgkQvJuQZxSWSshZpwCeOdLZCu0gSQGpOLciQ6H29Tsd
iNgAn3AuoisE8zSbMjLuDL4aWzP6NAth
=ujTa
-----END PGP SIGNATURE-----


From: James William Pye <x(at)jwp(dot)name>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Decibel! <decibel(at)decibel(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-13 01:45:32
Message-ID: 9E60C5E3-E2FB-4EFA-A3A3-F46EFDEB04F5@jwp.name
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> You guys call this "simplification"? You're out of your minds.
> This proposal is ridiculously complicated, and yet it still fails
> even to consider adjusting non-numeric parameters. And what about
> things that require more than a trivial arithmetic expression to
> compute? It's not hard at all to imagine wanting log, sqrt, etc.
>
> We do not need to put computational capability into GUC. Any
> computations needed to determine a parameter setting should be done
> by a wizard.

+1 (save the comment speculating about anyone's location relative to
their mind ;)

Additionally, obvious as it may be, there's nothing stopping anyone
from developing a tool to generate the configuration file from a
more "interesting" source. Whether that's XML or some DSL that
supports computations, doesn't matter. I would think if such a tool
showed dramatic merit it would provoke another discussion about core
integration, but ISTM that leaving it dead simple is best.

[mm, hook into the postgres startup script, shouldn't be that hard to
administer..]


From: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Dave Page <dpage(at)pgadmin(dot)org>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Greg Sabino Mullane <greg(at)turnstep(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-13 15:54:20
Message-ID: 200806131154.21216.xzilla@users.sourceforge.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wednesday 11 June 2008 16:54:23 Bruce Momjian wrote:
> Dave Page wrote:
> > On Wed, Jun 11, 2008 at 6:56 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > > Dave Page wrote:
> > >> On Mon, Jun 2, 2008 at 5:46 AM, Joshua D. Drake <jd(at)commandprompt(dot)com>
wrote:
> > >> > pg_ctl -D data check?
> > >> >
> > >> > I would +1 that.
> > >>
> > >> I would also really like to see that - though I'd also like to see an
> > >> SQL interface so we can check a config before saving when editing via
> > >> pgAdmin or similar.
> > >
> > > Should this be a TODO?
> >
> > Yes please.
>
> Added to TODO:
>
> * Add pg_ctl option to do a syntax check of postgresql.conf
>

ISTM we need something that can run inside the db as well, i'm thinking
something like pg_check_conf() to go with pg_reload_conf(). Also, these
should probably check validity of the pg_hba.conf as well.

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Dave Page <dpage(at)pgadmin(dot)org>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Greg Sabino Mullane <greg(at)turnstep(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-06-13 16:24:48
Message-ID: 200806131624.m5DGOmO21443@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Treat wrote:
> On Wednesday 11 June 2008 16:54:23 Bruce Momjian wrote:
> > Dave Page wrote:
> > > On Wed, Jun 11, 2008 at 6:56 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > > > Dave Page wrote:
> > > >> On Mon, Jun 2, 2008 at 5:46 AM, Joshua D. Drake <jd(at)commandprompt(dot)com>
> wrote:
> > > >> > pg_ctl -D data check?
> > > >> >
> > > >> > I would +1 that.
> > > >>
> > > >> I would also really like to see that - though I'd also like to see an
> > > >> SQL interface so we can check a config before saving when editing via
> > > >> pgAdmin or similar.
> > > >
> > > > Should this be a TODO?
> > >
> > > Yes please.
> >
> > Added to TODO:
> >
> > * Add pg_ctl option to do a syntax check of postgresql.conf
> >
>
> ISTM we need something that can run inside the db as well, i'm thinking
> something like pg_check_conf() to go with pg_reload_conf(). Also, these
> should probably check validity of the pg_hba.conf as well.

Agreed, TODO updated:

o Add functions to syntax check configuration files

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: Greg Sabino Mullane <greg(at)turnstep(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Better default_statistics_target
Date: 2008-06-13 16:51:25
Message-ID: A88E375D-ED35-40DB-A9FB-60F3EE866E8C@kineticode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jun 12, 2008, at 17:55, Greg Sabino Mullane wrote:

> Glad to hear that, although I think this is only in HEAD, not
> backpatched,
> right? Well at any rate, I withdraw my strong support for 100 and
> join in
> the quest for a good number. The "anything but 10" campaign

I vote for 11. That's one louda, in'it?

Best,

David


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-18 10:47:45
Message-ID: 4858E7D1.6080401@hagander.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith wrote:
> On Thu, 5 Jun 2008, Magnus Hagander wrote:
>
>> We really need a "proper API" for it, and the stuff in pgAdmin isn't
>> even enough to base one on.
>
> I would be curious to hear your opinion on whether the GUC overhaul
> discussed in this thread is a useful precursor to building such a proper
> API.

I'm getting back to this one now that the discussion has died down a bit.

As usual, the discussion spread out into these huge changes needed to be
done to satisfy everyone from day one. I don't think that's the way to
go - we have to do it piece by piece if that's ever going to be done.
IMHO the first thing to do is to create a stable API for modifying
config variables remotely. *at this time* it doesn't matter how this API
modifies the configuration, if it's in a file on in the db or whatever.
We can always change that later... Which is why I will be proceeding
with this one - make an API that requires the least possible change for
now, per discussions earlier this year (not in this thread) :-)

Now, this in itself is unrelated to the stuff Josh was talking about
which is collecting the information in one place and being able to
generate config files that don't contain everything, and being able to
auto-tune things. It would be nice for such a tool to be able to get the
full descriptions and such from the pg_settings view or such, which
means it needs to go in the code and not in a standalone file, but it's
not *as necessary*.

I still think the config file we ship today is horrible, I get lots of
questions around it, and I see lots of examples of people who tweak
settings they have no idea what they do, just because it's there. That's
a separate issue that could *either* be solved by shipping more than
one default config file, or it could be solved by the config file
generator Josh proposed.

//Magnus


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andreas Pflug <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-07-16 15:30:23
Message-ID: 200807161530.m6GFUNY28751@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Added to TODO:

o Add external tool to auto-tune some postgresql.conf parameters

http://archives.postgresql.org/pgsql-hackers/2008-06/msg00000.php

---------------------------------------------------------------------------

Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> > Tom Lane wrote:
> >> * Can we build a "configuration wizard" to tell newbies what settings
> >> they need to tweak?
>
> > That would trump all the other suggestions conclusively. Anyone good at
> > expert systems?
>
> How far could we get with the answers to just three questions:
>
> * How many concurrent queries do you expect to have?
>
> * How much RAM space are you willing to let Postgres use?
>
> * How much "overhead" disk space are you willing to let Postgres use?
>
> concurrent queries drives max_connections, obviously, and RAM space
> would drive shared_buffers and effective_cache_size, and both of them
> would be needed to size work_mem. The third one is a bit weird but
> I don't see any other good way to set the checkpoint parameters.
>
> If those aren't enough questions, what else must we ask? Or maybe they
> aren't the right questions at all --- maybe we should ask "is this a
> dedicated machine or not" and try to extrapolate everything else from
> what we (hopefully) can find out about the hardware.
>
> regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: "Michael Nacos" <m(dot)nacos(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-08-13 14:38:05
Message-ID: 407fa4640808130738s3537ca45j22322a5eb70f9173@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi there... Configuration autotuning is something I am really interested in.
I have seen this page: http://wiki.postgresql.org/wiki/GUCS_Overhaul and
a couple of emails mentioning this, so I wanted to ask is someone already
on it? If yes, I'd like to contribute.

Ideally, an external little app should also provide recommendations based
on current database usage statistics -- wouldn't this constitute something
akin to application-specific advice? In this regard, the tool could also
tell
you things like how much RAM you should order if you were to replace your
existing database server :-)

Michael


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Michael Nacos <m(dot)nacos(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-08-17 20:48:20
Message-ID: Pine.GSO.4.64.0808171610220.22642@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 13 Aug 2008, Michael Nacos wrote:

> Hi there... Configuration autotuning is something I am really interested in.
> I have seen this page: http://wiki.postgresql.org/wiki/GUCS_Overhaul and
> a couple of emails mentioning this, so I wanted to ask is someone already
> on it? If yes, I'd like to contribute.

Good time to give a status report on what's been going on with all this.

With some help I just finished off an answer to problem #1 there recently,
"Most people have no idea how to set these". There was some concern here
that work was being done on config tools without a clear vision of what
was going to be tuned. See
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server for an intro
on how to set the 18 most important parameters (+7 logging parameters)
based on the best information I'm aware of.

Circa June, Steve Atkins was looking into writing a C++/Qt GUI tuning
interface application, with the idea that somebody else would figure out
the actual smarts to the tuning effort. Don't know where that's at.

Josh Berkus and I have been exchanging some ideas for the GUC internals
overhaul and had a quick discussion about that in person last month.
We've been gravitating toward putting all the extra information we'd like
to push into there in an extra catalog table (pg_settings_info or
something). The stuff the server needs to start can stay right where it
is right now, all the other decoration can move to the table.

> Ideally, an external little app should also provide recommendations based
> on current database usage statistics -- wouldn't this constitute something
> akin to application-specific advice?

Yes, there's a grand plan for a super-wizard that queries the database for
size, index, and statistics information for figure out what to do; I've
been beating that drum for a while now. Unfortunately, the actual
implementation is blocked behind the dreadfully boring work of sorting out
how to organize and manage the GUC information a bit better, and the
moderately boring work of building a UI for modifying things. If you were
hoping to work on the sexy autotuning parts without doing some of the
grunt work, let me know if you can figure out how so I can follow you.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Michael Nacos <m(dot)nacos(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-08-17 21:18:55
Message-ID: 26969.1219007935@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith <gsmith(at)gregsmith(dot)com> writes:
> Josh Berkus and I have been exchanging some ideas for the GUC internals
> overhaul and had a quick discussion about that in person last month.
> We've been gravitating toward putting all the extra information we'd like
> to push into there in an extra catalog table (pg_settings_info or
> something). The stuff the server needs to start can stay right where it
> is right now, all the other decoration can move to the table.

Somehow, the attraction of that idea escapes me.

What we have now was named Grand Unified Configuration for a reason:
it centralized the handling of what had been a mess of different things
configured in different ways. I'm not eager to go backwards on that.

I'm also interested to know exactly what such a table would provide
that isn't already available in the form of the pg_settings view.

regards, tom lane


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-08-17 22:12:50
Message-ID: Pine.GSO.4.64.0808171740020.22642@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, 17 Aug 2008, Tom Lane wrote:

> What we have now was named Grand Unified Configuration for a reason:
> it centralized the handling of what had been a mess of different things
> configured in different ways. I'm not eager to go backwards on that.

No need to change anything related to how the configuration is done.
There's really only two things wrong with what's there right now IMHO and
they don't require any changes to the internals, just what's shown:

1) The view should show both how the user defined the setting and how it's
represented internally. Basically something that looks like this:

select name,current_setting(name) as input_setting,setting from
pg_settings;

2) Expose the default value.

> I'm also interested to know exactly what such a table would provide
> that isn't already available in the form of the pg_settings view.

Links to the relevant documentation and a place to save both default and
user comments about the setting were two things being considered that
seemed a really bad fit to tack onto the GUC structure. There's some
others. The main point is that that nobody wants to have to tinker with
the core GUC itself just to decorate it with more information, that is
complicated enough as it is.

One might make a case that the stuff the GUC must handle (settings, units,
type, defaults, etc.) could be usefully separated from all the more
documentation-oriented bits stored there right now (category,
descriptions), and that the existing documentation bits could move over to
the table along with the hyperlinks and such. Doing that adds another
place to have to edit, but I think there's an even exchange available
there because it enables easy auto-generation of the postgresql.conf file
at initdb time from that table + pg_settings.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Steve Atkins <steve(at)blighty(dot)com>
To: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-08-17 22:49:24
Message-ID: F907F453-0733-4070-9F4B-211ABF6C8038@blighty.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Aug 17, 2008, at 1:48 PM, Greg Smith wrote:

> On Wed, 13 Aug 2008, Michael Nacos wrote:
>
>> Hi there... Configuration autotuning is something I am really
>> interested in.
>> I have seen this page: http://wiki.postgresql.org/wiki/
>> GUCS_Overhaul and
>> a couple of emails mentioning this, so I wanted to ask is someone
>> already
>> on it? If yes, I'd like to contribute.
>
> Good time to give a status report on what's been going on with all
> this.
>
> With some help I just finished off an answer to problem #1 there
> recently, "Most people have no idea how to set these". There was
> some concern here that work was being done on config tools without a
> clear vision of what was going to be tuned. See http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
> for an intro on how to set the 18 most important parameters (+7
> logging parameters) based on the best information I'm aware of.
>
> Circa June, Steve Atkins was looking into writing a C++/Qt GUI
> tuning interface application, with the idea that somebody else would
> figure out the actual smarts to the tuning effort. Don't know where
> that's at.

First pass is done. Needs a little cleanup before sharing. I spent a
fair while down OS-specific-hardware-queries rathole, but I'm better
now.

Cheers,
Steve


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Steve Atkins <steve(at)blighty(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-08-18 03:23:07
Message-ID: 48A8EB1B.8000702@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Steve,

> First pass is done. Needs a little cleanup before sharing. I spent a
> fair while down OS-specific-hardware-queries rathole, but I'm better now.

Gods, I hope you gave up on that. You want to use SIGAR or something.

--Josh


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Steve Atkins <steve(at)blighty(dot)com>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-08-18 08:05:49
Message-ID: 48A92D5D.9080604@hagander.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus wrote:
> Steve,
>
>> First pass is done. Needs a little cleanup before sharing. I spent a
>> fair while down OS-specific-hardware-queries rathole, but I'm better now.
>
> Gods, I hope you gave up on that. You want to use SIGAR or something.

If it's going to be C++, and reasonably cross platform, and a pg tool,
why not try to build something as a module in pgAdmin? Certainly going
to get you a larger exposure... And I'm sure the pgAdmin team would be
happy to have it!

//Magnus


From: "Michael Nacos" <m(dot)nacos(at)gmail(dot)com>
To: "Greg Smith" <gsmith(at)gregsmith(dot)com>, bruce(at)momjian(dot)us
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-08-18 11:26:26
Message-ID: 407fa4640808180426q26a41cv817deb91dbbcd312@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

What I'm interested in is auto-tuning, not necessarily overhauling GUCS,
which happens to be the subject of this thread :-)
Having done a SELECT * FROM pg_settings, all the information you need seems
to be there...
Maybe I'm being over-simplistic here, but the important bit is knowing how
you should tune stuff - and this is what I'm hoping to learn through this
process.
Now, you could probably sidestep UI and GUCS concerns by moving the
auto-tuning process inside the database. You don't need fancy GUIs for
guessing configuration parameters, and if you can already do that, coming up
with a GUI should be pretty straightforward.
For example, I see no reason why you couldn't capture the logic of tuning in
a couple of PL/Python functions to look up usage stats, size of indices etc.
PL/Python being an "untrusted" language, you could even write a new
postgresql.conf file to disk, with the suggested alterations. Cheap, quick
and cheerful!

Perhaps the auto-tuning conversation should take place in a separate thread,
how do you feel about changing the subject line? The most insteresting bit
is discussing and testing tuning strategies. This, of course, is related to
the [admin] and [perform] side of things, but there is also a development
dimension. As soon as there is a tuning strategy published, a number of
tools will certainly follow.

Michael

Yes, there's a grand plan for a super-wizard that queries the database for
> size, index, and statistics information for figure out what to do; I've been
> beating that drum for a while now. Unfortunately, the actual implementation
> is blocked behind the dreadfully boring work of sorting out how to organize
> and manage the GUC information a bit better, and the moderately boring work
> of building a UI for modifying things. If you were hoping to work on the
> sexy autotuning parts without doing some of the grunt work, let me know if
> you can figure out how so I can follow you.
>
>


From: Steve Atkins <steve(at)blighty(dot)com>
To: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-08-18 15:12:53
Message-ID: EA2C8297-04CD-45A4-BB74-46DE40B2D0EE@blighty.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Aug 18, 2008, at 1:05 AM, Magnus Hagander wrote:

> Josh Berkus wrote:
>> Steve,
>>
>>> First pass is done. Needs a little cleanup before sharing. I spent a
>>> fair while down OS-specific-hardware-queries rathole, but I'm
>>> better now.
>>
>> Gods, I hope you gave up on that. You want to use SIGAR or
>> something.
>
> If it's going to be C++, and reasonably cross platform, and a pg tool,
> why not try to build something as a module in pgAdmin? Certainly going
> to get you a larger exposure... And I'm sure the pgAdmin team would be
> happy to have it!

I'm attempting to build it as something that can be used in several
places. Where there's most need for it is as an install time option
in installers, particularly on Windows.

There's no reason the same underlying code couldn't also go into
pgAdmin, of course. At the moment the code is a bit Qt specific,
reducing that is part of the cleanup.

Cheers,
Steve


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Steve Atkins <steve(at)blighty(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-08-18 15:17:31
Message-ID: 48A9928B.6070501@hagander.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Steve Atkins wrote:
>
> On Aug 18, 2008, at 1:05 AM, Magnus Hagander wrote:
>
>> Josh Berkus wrote:
>>> Steve,
>>>
>>>> First pass is done. Needs a little cleanup before sharing. I spent a
>>>> fair while down OS-specific-hardware-queries rathole, but I'm better
>>>> now.
>>>
>>> Gods, I hope you gave up on that. You want to use SIGAR or something.
>>
>> If it's going to be C++, and reasonably cross platform, and a pg tool,
>> why not try to build something as a module in pgAdmin? Certainly going
>> to get you a larger exposure... And I'm sure the pgAdmin team would be
>> happy to have it!
>
> I'm attempting to build it as something that can be used in several
> places. Where there's most need for it is as an install time option
> in installers, particularly on Windows.

Well, if it was in pgadmin, it would be there more or less by default on
Windows. And very easy to get in on other platforms, since pgadmin is
already packaged there. Plus, all the dependencies are already there on
said platforms.

//Magnus


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Michael Nacos <m(dot)nacos(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-08-18 17:39:45
Message-ID: Pine.GSO.4.64.0808181250190.17947@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, 18 Aug 2008, Michael Nacos wrote:

> Having done a SELECT * FROM pg_settings, all the information you need
> seems to be there...

See http://archives.postgresql.org/pgsql-hackers/2008-06/msg00209.php You
sound like you're at rung 2 on the tool author ladder I describe there,
still thinking about the fun part of tuning but not yet aware of the
annoying postgresql.conf management issues that show up in the field that
motivate many of the GUCS changes suggested. Coping with user and
system-generated comments is one difficult part that people normally don't
consider, dealing with bad settings the server won't start with is
another.

I did make one mistake in that message, which is that the "context" field
of pg_settings already exposes when a setting can be changed. And it is
possible to get the value for a setting as entered by the admin by joining
pg_settings against what current_setting returns, which is one part of
handling the import/change/export cycle while keeping useful units intact.

> Maybe I'm being over-simplistic here, but the important bit is knowing
> how you should tune stuff - and this is what I'm hoping to learn through
> this process.

The tuning references at the bottom of
http://wiki.postgresql.org/wiki/GUCS_Overhaul provide more detail here
than anyone has been able to automate so far. There's certainly room to
improve on the suggestions there with some introspection of the database,
I'm trying to stay focused on something to help newbies whose settings are
completely wrong first.

> As soon as there is a tuning strategy published, a number of tools will
> certainly follow.

Josh Berkus published one in 2005 and zero such tools have been produced
since then, even though it looked to him then (like it does to you now and
like it did to me once) that such a tool would easily follow:
http://pgfoundry.org/docman/?group_id=1000106

The bright side here is that you don't have to waste time tinkering in
this area to find out where the dead ends are like Josh and I
independantly did.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Greg Smith" <gsmith(at)gregsmith(dot)com>
Cc: "Michael Nacos" <m(dot)nacos(at)gmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-08-18 18:57:32
Message-ID: 87y72u178j.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Greg Smith" <gsmith(at)gregsmith(dot)com> writes:

> On Mon, 18 Aug 2008, Michael Nacos wrote:
>
>> Having done a SELECT * FROM pg_settings, all the information you need seems
>> to be there...
>
> See http://archives.postgresql.org/pgsql-hackers/2008-06/msg00209.php You sound
> like you're at rung 2 on the tool author ladder I describe there, still
> thinking about the fun part of tuning but not yet aware of the annoying
> postgresql.conf management issues that show up in the field that motivate many
> of the GUCS changes suggested. Coping with user and system-generated comments
> is one difficult part that people normally don't consider,

Because coping with free-form user-edited text is a losing game. People don't
consider it because it's a dead-end. Instead you have one file for user-edited
configuration and a separate file for computer generated configuration. You
never try to automatically edit a user-edited file -- that way lies madness.

> dealing with bad settings the server won't start with is another.

A tuning interface can't be turing complete and detect all possible
misconfigurations. To do that it would have to be as complex as the server.

In any case worrying about things like this before you have a tuning interface
that can do the basics is putting the cart before the horse.

>> As soon as there is a tuning strategy published, a number of tools will
>> certainly follow.
>
> Josh Berkus published one in 2005 and zero such tools have been produced since
> then, even though it looked to him then (like it does to you now and like it
> did to me once) that such a tool would easily follow:
> http://pgfoundry.org/docman/?group_id=1000106

The entire target market for such a thing is DBAs stuck on hosted databases
which don't have shell access to their machines. Perhaps the overlap between
that and the people who can write a server-side module which dumps out a
config file according to some rules is just too small?

I do think you and others make it less likely every time you throw up big
insoluble problems like above though. As a consequence every proposal has
started with big overly-complex solutions trying to solve all these incidental
issues which never go anywhere instead of simple solutions which directly
tackle the main problem.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's On-Demand Production Tuning


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-08-18 19:35:29
Message-ID: 200808181235.29983.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg,

> The entire target market for such a thing is DBAs stuck on hosted
> databases which don't have shell access to their machines.

That's incorrect. The main reason for having a port-based API (such as the
SQL command line) for managing your server is that it makes it much easier
to manage a large number of servers. Right now, if you want to survey
your databases, tables, approx disk space, query activity, etc., you can
do that all through port 5432. You can't manage most of your server
settings that way, and definitely can't manage the *persistent* settings.
When you're trying to manage 1000 PostgreSQL servers, this is not a minor
issue.

With the growing "cloud" sector, the lack of easy server parameter
management is hurting PostgreSQL's adoption for hosted applications. This
isn't a new complaint, and is a big part of the reason why 90% of web
hosts still don't offer PostgreSQL. I've heard complaints about our
manageability problems from more vendors than I can count.

HOWEVER, it's completely possible to get a 1st-generation config tool out
there without first implementing port-based config access. For one thing,
there's Puppet. So that's what I'm intending to do.

> I do think you and others make it less likely every time you throw up
> big insoluble problems like above though.

It's not an insoluble problem. It's a political problem; several people
don't want to add this functionality to the project.

> As a consequence every
> proposal has started with big overly-complex solutions trying to solve
> all these incidental issues which never go anywhere instead of simple
> solutions which directly tackle the main problem.

What, in your opinion, is "the main problem"? I'm not sure we agree on
that.

--
--Josh

Josh Berkus
PostgreSQL
San Francisco


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: "Greg Smith" <gsmith(at)gregsmith(dot)com>, "Michael Nacos" <m(dot)nacos(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-08-18 19:40:24
Message-ID: 3447.1219088424@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> The entire target market for such a thing is DBAs stuck on hosted databases
> which don't have shell access to their machines. Perhaps the overlap between
> that and the people who can write a server-side module which dumps out a
> config file according to some rules is just too small?

There's a veritable boatload of stuff we do that assumes shell access
(how many times have you seen cron jobs recommended, for instance?).
So I'm unconvinced that "modify the config without shell access"
is really a goal that is worth lots of effort. In any case, there's
already adequate support for sucking postgresql.conf out of the machine
and putting it back: pg_read_file(), pg_file_write(), pg_reload_conf().
So at the end of the day remote access isn't a factor in this at all.

> I do think you and others make it less likely every time you throw up big
> insoluble problems like above though. As a consequence every proposal has
> started with big overly-complex solutions trying to solve all these incidental
> issues which never go anywhere instead of simple solutions which directly
> tackle the main problem.

The impression I get every time this comes up is that various people
have different problems they want to solve that (they think) require
redesign of the way GUC works. Those complicated solutions arise from
attempting to satisfy N different demands simultaneously. The fact that
many of these goals aren't subscribed to by the whole community to begin
with doesn't help to ease resolution of the issues.

regards, tom lane


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: <josh(at)agliodbs(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Overhauling GUCS
Date: 2008-08-18 19:51:28
Message-ID: 87tzdi14qn.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Josh Berkus" <josh(at)agliodbs(dot)com> writes:

> Greg,
>
>> The entire target market for such a thing is DBAs stuck on hosted
>> databases which don't have shell access to their machines.
>
> That's incorrect. The main reason for having a port-based API (such as the
> SQL command line) for managing your server is that it makes it much easier
> to manage a large number of servers. Right now, if you want to survey
> your databases, tables, approx disk space, query activity, etc., you can
> do that all through port 5432. You can't manage most of your server
> settings that way, and definitely can't manage the *persistent* settings.
> When you're trying to manage 1000 PostgreSQL servers, this is not a minor
> issue.

This I don't understand. If you're managing lots of servers running lots of
software the last thing you want to have to do is write a custom method for
updating the configuration of each service. In that use case you would prefer
to just use rsync/svn/git to push the new config file to all the machines
anyways.

> With the growing "cloud" sector, the lack of easy server parameter
> management is hurting PostgreSQL's adoption for hosted applications. This
> isn't a new complaint, and is a big part of the reason why 90% of web
> hosts still don't offer PostgreSQL. I've heard complaints about our
> manageability problems from more vendors than I can count.

These are both use cases which fall in the category I described where you want
to allow users to configure the system through an automated interface.

We can do that today by generating the automatically generated section and
including that in postgresql.conf as an include file.

>> As a consequence every proposal has started with big overly-complex
>> solutions trying to solve all these incidental issues which never go
>> anywhere instead of simple solutions which directly tackle the main
>> problem.
>
> What, in your opinion, is "the main problem"? I'm not sure we agree on
> that.

The main problem that I've seen described is what I mentioned before: allowing
adjusting the postgresql.conf GUC settings by remote users who don't have
shell access.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!


From: "Dave Page" <dpage(at)pgadmin(dot)org>
To: "Gregory Stark" <stark(at)enterprisedb(dot)com>
Cc: josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-08-18 19:55:54
Message-ID: 937d27e10808181255sab0a5ecuf3087521a0eeccd5@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Aug 18, 2008 at 8:51 PM, Gregory Stark <stark(at)enterprisedb(dot)com> wrote:
> The main problem that I've seen described is what I mentioned before: allowing
> adjusting the postgresql.conf GUC settings by remote users who don't have
> shell access.

Which pgAdmin has done perfectly well for years, as long as the config
is all in one file.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-08-18 19:59:24
Message-ID: 200808181259.25320.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg,

> The main problem that I've seen described is what I mentioned before:
> allowing adjusting the postgresql.conf GUC settings by remote users who
> don't have shell access.

Oh, ok. I think we're in agreement, though. I don't think that's the
*1st* problem to be solved, but it's definitely important.

--
--Josh

Josh Berkus
PostgreSQL
San Francisco


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Michael Nacos <m(dot)nacos(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-08-18 20:21:38
Message-ID: Pine.GSO.4.64.0808181543260.867@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, 18 Aug 2008, Gregory Stark wrote:

> Because coping with free-form user-edited text is a losing game. People
> don't consider it because it's a dead-end.

Right, that's impossible technology to build, which is why I had to plan
all these screen shots showing tools that handle that easily for Apache's
very similar configuration file:
http://www.apache-gui.com/apache-windows.html
http://kochizz.sourceforge.net/quelques-captures-decran/

> Instead you have one file for user-edited configuration and a separate
> file for computer generated configuration.

It wouldn't be so difficult if the system generated postgresql.conf didn't
have all this extra junk in it, which is part of what an overhaul plans to
simplify. The way the file gets spit out right now encourages some of the
awful practices people develop in the field.

> A tuning interface can't be turing complete and detect all possible
> misconfigurations. To do that it would have to be as complex as the server.

Thank you for supporting the case for why changes need to be to the server
code itself, to handle things like validating new postgresql.conf files
before they get loaded. I try not to bring that up lest it complicate
things further.

> The entire target market for such a thing is DBAs stuck on hosted
> databases which don't have shell access to their machines.

I've never setup a hosted database on a system I don't have shell access
to, so I have no idea where you get the impression that was a primary goal
of anything I've said. It just so happens that improving what tuning you
can do over port 5432 helps that crowd out too, that's a bonus as I see
it.

> Ask me about EnterpriseDB's On-Demand Production Tuning

...nah, too easy, I'll just let that go.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Dave Page <dpage(at)pgadmin(dot)org>
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>, josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-08-18 20:38:29
Message-ID: 48A9DDC5.2000302@hagander.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dave Page wrote:
> On Mon, Aug 18, 2008 at 8:51 PM, Gregory Stark <stark(at)enterprisedb(dot)com> wrote:
>> The main problem that I've seen described is what I mentioned before: allowing
>> adjusting the postgresql.conf GUC settings by remote users who don't have
>> shell access.
>
> Which pgAdmin has done perfectly well for years, as long as the config
> is all in one file.

I'll argue it's not done it perfectly well (it's not particularly
user-friendly), but it has certainly *done* it...

//Magnus


From: dpage(at)pgadmin(dot)org
To: "Magnus Hagander" <magnus(at)hagander(dot)net>
Cc: "Gregory Stark" <stark(at)enterprisedb(dot)com>, josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-08-18 21:34:49
Message-ID: 937d27e10808181434oe3cc9f5u79301d44e3c6bd92@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 8/18/08, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> Dave Page wrote:
>> On Mon, Aug 18, 2008 at 8:51 PM, Gregory Stark <stark(at)enterprisedb(dot)com>
>> wrote:
>>> The main problem that I've seen described is what I mentioned before:
>>> allowing
>>> adjusting the postgresql.conf GUC settings by remote users who don't have
>>> shell access.
>>
>> Which pgAdmin has done perfectly well for years, as long as the config
>> is all in one file.
>
> I'll argue it's not done it perfectly well (it's not particularly
> user-friendly), but it has certainly *done* it...

I mean it's able to read & write the config file correctly. I agree
the ui is, umm, sub-optimal.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com>, Michael Nacos <m(dot)nacos(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-08-18 21:50:10
Message-ID: 20080818215010.GC4454@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane escribió:
> Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> > The entire target market for such a thing is DBAs stuck on hosted databases
> > which don't have shell access to their machines. Perhaps the overlap between
> > that and the people who can write a server-side module which dumps out a
> > config file according to some rules is just too small?
>
> There's a veritable boatload of stuff we do that assumes shell access
> (how many times have you seen cron jobs recommended, for instance?).
> So I'm unconvinced that "modify the config without shell access"
> is really a goal that is worth lots of effort.

Actually, lots of people are discouraged by suggestions of using cron to
do anything. The only reason cron is suggested is because we don't have
any other answer, and for many people it's a half-solution. An
integrated task scheduler in Pg would be more than welcome.

Also, remember that pgAdmin already comes with a pgAgent thing.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gregory Stark <stark(at)enterprisedb(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com>, Michael Nacos <m(dot)nacos(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-08-19 08:09:41
Message-ID: 48AA7FC5.6070800@hagander.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera wrote:
> Tom Lane escribió:
>> Gregory Stark <stark(at)enterprisedb(dot)com> writes:
>>> The entire target market for such a thing is DBAs stuck on hosted databases
>>> which don't have shell access to their machines. Perhaps the overlap between
>>> that and the people who can write a server-side module which dumps out a
>>> config file according to some rules is just too small?
>> There's a veritable boatload of stuff we do that assumes shell access
>> (how many times have you seen cron jobs recommended, for instance?).
>> So I'm unconvinced that "modify the config without shell access"
>> is really a goal that is worth lots of effort.
>
> Actually, lots of people are discouraged by suggestions of using cron to
> do anything. The only reason cron is suggested is because we don't have
> any other answer, and for many people it's a half-solution. An
> integrated task scheduler in Pg would be more than welcome.

Yes, I hear a lot of people complaining aobut that too. Now that we have
a working autovacuum, some of it goes away though - no need to cron your
VACUUMs in most cases anymore. But there are still backups - but they
are often managed by the scheduler of an enterprise backup software.

> Also, remember that pgAdmin already comes with a pgAgent thing.

Yeah, it's a real life-saver on Windows where the builtin task-scheduler
isn't as readily accessible or easy to use..

//Magnus


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org, josh(at)agliodbs(dot)com
Subject: Re: Overhauling GUCS
Date: 2008-08-19 09:14:47
Message-ID: 200808191214.47708.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Am Monday, 18. August 2008 schrieb Josh Berkus:
> Right now, if you want to survey
> your databases, tables, approx disk space, query activity, etc., you can
> do that all through port 5432.  You can't manage most of your server
> settings that way, and definitely can't manage the *persistent* settings.  
> When you're trying to manage 1000 PostgreSQL servers, this is not a minor
> issue.

Some of that effort could go into making less settings persistent.


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gregory Stark <stark(at)enterprisedb(dot)com>, "Greg Smith" <gsmith(at)gregsmith(dot)com>, "Michael Nacos" <m(dot)nacos(at)gmail(dot)com>
Subject: Re: Overhauling GUCS
Date: 2008-08-19 09:25:21
Message-ID: 200808191225.23039.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Am Monday, 18. August 2008 schrieb Tom Lane:
> The impression I get every time this comes up is that various people
> have different problems they want to solve that (they think) require
> redesign of the way GUC works.  Those complicated solutions arise from
> attempting to satisfy N different demands simultaneously.

Which may be the reason that I have been getting the impression that
the "Problems" and the proposed resolutions on
http://wiki.postgresql.org/wiki/GUCS_Overhaul are not really closely related.
I can agree with the Problems, but then I am lost.