Re: RFC: programmable file format for postgresql.conf

Lists: pgsql-hackers
From: Álvaro Hernández Tortosa <aht(at)nosys(dot)es>
To: pgsql-hackers(at)postgresql(dot)org
Subject: RFC: programmable file format for postgresql.conf
Date: 2013-12-01 19:24:49
Message-ID: 529B8D01.6060301@nosys.es


Hi there!

I've been reading several threads debating about the format of
postgresql.conf and improvements to it (like "Overhauling GUCS" [1] or
"Proposal for Allow postgresql.conf values to be changed via SQL" [2]).
Trying to summarize that in my own opinion, I think that the current
file format has some problems/limitations:

1) It exposes little information to help users configure postgres.

It has comments, but they are not consistent in the information they
give about the parameters. They don't expose as much information as
you may find in pg_settings, nor do they include extra information, such
as the param's doc URL.

2) It discourages the creation of both GUI tools for configuring
postgresql.conf and auto-tuning tools.

Programmatically parsing and writing the current config file may seem
easy at first, but has been reported to be a daunting task. There isn't
a "standard" for the comments, and hence they're really hard to parse
and write. Nor is there a way of classifying (grouping) the parameters
by concepts such as pg_settings.category or other user-oriented
criteria, like (newbie, advanced, expert), for example.

3) There is no support for changing parameters persistently from a
postgresql connection.

This is a feature that some want, but it is hard(er) to implement if
there is no simple way of programmatically editing the config file (as
explained in #2).

4) There is no common code to parse/validate/write postgresql.conf files
that could be reused both for the server and other external tools.
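
To make the difficulty in #2 concrete, here is a minimal Python sketch of a naive reader and writer for the current format. The helper names (parse_conf, write_conf) and the sample text are hypothetical, and the regular expression only handles simple "name = value  # comment" lines; it is an illustration under those assumptions, not how the server parses the file.

```python
import re

# Hypothetical, deliberately naive pattern: name, "=", a quoted or bare
# value, and an optional trailing "# comment". Not the server's grammar.
LINE_RE = re.compile(
    r"^\s*(?P<name>[A-Za-z_][A-Za-z0-9_.]*)\s*=\s*"
    r"(?P<value>'(?:[^']|'')*'|[^#\s]+)\s*"
    r"(?:#\s*(?P<comment>.*))?$"
)


def parse_conf(text):
    """Return {name: (value, trailing_comment_or_None)} for active lines."""
    settings = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            # Blank lines, section headers and commented-out defaults
            # are silently dropped here.
            continue
        match = LINE_RE.match(line)
        if match:
            settings[match.group("name")] = (
                match.group("value"),
                match.group("comment"),
            )
    return settings


def write_conf(settings):
    """Write settings back; the free-form comments cannot be restored."""
    return "".join(
        f"{name} = {value}\n" for name, (value, _) in settings.items()
    )


SAMPLE = """\
# - Memory -
shared_buffers = 128MB          # min 128kB
#work_mem = 4MB                 # min 64kB
listen_addresses = 'localhost'  # what IP address(es) to listen on
"""

parsed = parse_conf(SAMPLE)
print(write_conf(parsed))
```

Round-tripping the sample keeps the two active settings but loses the section header, the commented-out work_mem line and every trailing comment, which is the kind of information a tool would need to preserve.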

IMHO, defining a new syntax for the postgresql.conf file format
that is suitable for writing and parsing, or using an already existing,
well-known, programmatic syntax, could offer a solution for all the
problems/limitations above.

If that were the case, I think we should first debate what data
should be there for every parameter, and what (logical) data
structure should be used to represent it. It would be great if that
information were optionally extensible, to include extra information
--possibly ignored by postgres, but used by the external tools.

I would also suggest requiring the syntax to at least be:

a) Text-based.

b) Human-friendly. Even though it may become more verbose, it should
remain easily editable by humans.

c) Easily parseable by "one-liner" cli commands such as grep, awk,
sed...

d) Optionally, an alternative to the current postgresql.conf.

Instead of replacing postgresql.conf, it may offer an alternative way of
configuring postgres. If both files are present, one of them
(postgresql.conf, for example) would take precedence, with the other
one being completely ignored. Although this would create some code
duplication, it would open a way for phased adoption.

e) Preferably a well-known syntax, or similar to an existing one, so
it doesn't become a new barrier for user adoption.

I know that suggesting to change --or create an alternate--
postgresql.conf file is not an easy topic, but what's your opinion? Any
comments/ideas would be appreciated :)

Regards,

aht

[1] http://www.postgresql.org/message-id/48409D1E.3070208@agliodbs.com
[2] http://www.postgresql.org/message-id/007d01cdb5d9$a55d7ab0$f0187010$@kapila@huawei.com

--
Álvaro Hernández Tortosa

-----------
NOSYS
Networked Open SYStems


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Álvaro Hernández Tortosa <aht(at)nosys(dot)es>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: RFC: programmable file format for postgresql.conf
Date: 2013-12-03 21:06:20
Message-ID: 529E47CC.2060804@gmx.net

On 12/1/13, 2:24 PM, Álvaro Hernández Tortosa wrote:
> IMHO, defining a new syntax for the postgresql.conf file format,
> that is suitable for writing and parsing, or using an already existing,
> well-known, programmatic syntax, could offer a solution for all the
> problems/limitations above.

That's the problem, there isn't one, is there? The closest you'd get is
the INI syntax, but that's like CSV, with many variations. And most
client libraries for this will likely drop all comments when they read
and write a file, so this doesn't address that issue.


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Álvaro Hernández Tortosa <aht(at)nosys(dot)es>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: RFC: programmable file format for postgresql.conf
Date: 2013-12-03 21:27:32
Message-ID: m24n6peibf.fsf@2ndQuadrant.fr

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> On 12/1/13, 2:24 PM, Álvaro Hernández Tortosa wrote:
>> IMHO, defining a new syntax for the postgresql.conf file format,
>> that is suitable for writing and parsing, or using an already existing,
>> well-known, programmatic syntax, could offer a solution for all the
>> problems/limitations above.
>
> That's the problem, there isn't one, is there? The closest you'd get is
> the INI syntax, but that's like CSV, with many variations. And most
> client libraries for this will likely drop all comments when they read
> and write a file, so this doesn't address that issue.

I've been using INI a lot in pgloader previously, and I can't tell you
how happy I am to be away from it now.

I would argue that plenty of programmatic syntaxes and well-known options
do exist for a configuration format, from Emacs Lisp and Guile to
Python, including Lua. You will tell me that it's too programmatic for
what you think is a configuration file; I would argue that it's the best
choice Emacs (and many other pieces of software) made.

Also if the programmatic part of the idea looks fine to someone who
never used the lisp syntax, just realise that there's nothing simpler to
parse nor “better known” (after all, it's been in wild use already for
more than 50 years).

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Álvaro Hernández Tortosa <aht(at)nosys(dot)es>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>, Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: RFC: programmable file format for postgresql.conf
Date: 2013-12-04 06:42:14
Message-ID: 529ECEC6.1050405@nosys.es


Hi Peter, Dimitri, thank you for your comments.

On 03/12/13 22:27, Dimitri Fontaine wrote:
> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
>> On 12/1/13, 2:24 PM, Álvaro Hernández Tortosa wrote:
>>> IMHO, defining a new syntax for the postgresql.conf file format,
>>> that is suitable for writing and parsing, or using an already existing,
>>> well-known, programmatic syntax, could offer a solution for all the
>>> problems/limitations above.
>>
>> That's the problem, there isn't one, is there? The closest you'd get is
>> the INI syntax, but that's like CSV, with many variations. And most
>> client libraries for this will likely drop all comments when they read
>> and write a file, so this doesn't address that issue.

Certainly INI files won't preserve the comments, nor do they help add
extra information to the config file to help users configure postgres
and tools to generate GUIs and/or automatic configuration.

>
> I've been using INI a lot in pgloader previously, and I can't tell you
> how happy I am to be away from it now.
>
> I would argue that plenty of programmatic syntax and well known options
> do exist for a configuration format. From Emacs Lisp and Guile to
> Python, including Lua. You will tell me that it's too programmatic for
> what you think is a configuration file, I would argue that it's the best
> choice Emacs (and many other pieces of software) made.
>
> Also if the programmatic part of the idea looks fine to someone who
> never used the lisp syntax, just realise that there's nothing simpler to
> parse nor “better known” (after all, it's been in wild use already for
> more than 50 years).

I agree that there are many options out there, like the ones you
mentioned. I'm unsure if Lisp would be the best one, especially in terms
of newbie-friendliness and general "convenience" to replace the current
postgresql.conf, but it would definitely meet all the
requirements I was suggesting.

IMHO, the key here would be first defining *what* data this
config file should store. The idea is that everything that has ever been
thought of as a comment would be represented by a proper data structure.

Just brainstorming, I'm thinking of something like: (logical structure,
not syntax)

[category]
    [param_name]
        * param_value
        - unit
        [param_info]
            * url
            * short_description
            - extra_description
            * context
            * vartype
            - minVal
            - maxVal
            - enumvals
            * min_pg_version
            - max_pg_version
            - comments
            - x-tool-field

where "[]" are nested fields, "*" denotes a field that must always be
present, "-" an optional field and "x-tool-field" a mechanism for
extensions: any tool may use that field(s) to add extra information,
that both postgres and other tools should preserve but may obviously
ignore. There are several use cases that come to my mind for these such
as "version" fields where the history of the param values may be stored
or "audit" fields where the user that changes the values is recorded
with some other audit information (time, etc) for auditing purposes.
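
As a rough illustration of the logical structure above (still not a syntax proposal), it could be modelled in Python as shown below. The class names, the snake_case field names, the sample values and the placeholder URL are all assumptions made for this sketch; where "comments" and the extension fields sit in the nesting is also a guess.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ParamInfo:
    # Fields marked "*" in the outline: always present.
    url: str
    short_description: str
    context: str
    vartype: str
    min_pg_version: str
    # Fields marked "-": optional, i.e. may be absent.
    extra_description: Optional[str] = None
    min_val: Optional[str] = None
    max_val: Optional[str] = None
    enumvals: Optional[List[str]] = None
    max_pg_version: Optional[str] = None
    comments: Optional[str] = None
    # "x-tool-field" extensions: tools should preserve them but may
    # ignore their contents.
    x_fields: Dict[str, str] = field(default_factory=dict)


@dataclass
class Param:
    category: str
    name: str
    value: str                  # "*" param_value
    info: ParamInfo
    unit: Optional[str] = None  # "-" unit


# Illustrative values only; the URL is a placeholder, not a real link.
work_mem = Param(
    category="Resource Usage / Memory",
    name="work_mem",
    value="4",
    unit="MB",
    info=ParamInfo(
        url="https://example.invalid/docs/work_mem",
        short_description="Memory for query workspaces.",
        context="user",
        vartype="integer",
        min_pg_version="8.0",
        x_fields={"x-audit-user": "aht"},
    ),
)
```

Leaving out a required field makes construction fail with a TypeError, while optional fields simply default to None, which mirrors the "*" versus "-" distinction.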

IMHO, a data structure like the above would be completely
self-contained and allow any autoconfiguring tool or GUI tool to be
easily created, if the syntax is programmable. It would certainly make
the config file more verbose, but at the same time would help a lot of
users to configure postgres providing much more information.

Makes sense?

Regards,

aht

--
Álvaro Hernández Tortosa

-----------
NOSYS
Networked Open SYStems


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Álvaro Hernández Tortosa <aht(at)nosys(dot)es>, Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: RFC: programmable file format for postgresql.conf
Date: 2013-12-04 15:51:21
Message-ID: 529F4F79.8020207@gmx.net

On 12/4/13, 1:42 AM, Álvaro Hernández Tortosa wrote:
> IMHO, a data structure like the above would be completely
> self-contained and allow any autoconfiguring tool or GUI tool to be
> easily created, if the syntax is programmable. It would certainly make
> the config file more verbose, but at the same time would help a lot of
> users to configure postgres providing much more information.

What you are describing appears to be isomorphic to XML and XML Schema.
Note that you are not required to maintain your configuration data in a
postgresql.conf-formatted file. You can keep it anywhere you like, GUI
around in it, and convert it back to the required format. Most of the
metadata is available through postgres --describe-config, which is the
result of a previous attempt in this area, which never really went anywhere.
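
For reference, a tool could consume that output with something like the following Python sketch. It assumes the output is tab-delimited with the parameter name in the first column; the function names are hypothetical, and the meaning of the remaining columns is deliberately left uninterpreted.

```python
import subprocess


def parse_describe_config(text):
    """Split tab-delimited lines into {first_column: [other columns]}."""
    rows = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        columns = line.split("\t")
        rows[columns[0]] = columns[1:]
    return rows


def describe_config(postgres_bin="postgres"):
    """Run the server binary (assumed to be on PATH) and parse its output."""
    result = subprocess.run(
        [postgres_bin, "--describe-config"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_describe_config(result.stdout)
```

Only parse_describe_config is pure; describe_config needs a local PostgreSQL installation to run.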

It's not like there are a bunch of GUI and autotuning tools that people
are dying to use or developers are dying to create, but couldn't because
editing configuration files programmatically is hard.

Let's also not forget the two main use cases (arguably) of the
configuration files: hand editing, and generation by configuration
management tools. Anything that makes these two harder is not going to
be well-received.


From: Álvaro Hernández Tortosa <aht(at)nosys(dot)es>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>, Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: RFC: programmable file format for postgresql.conf
Date: 2013-12-04 16:22:28
Message-ID: 529F56C4.7040605@nosys.es

On 04/12/13 16:51, Peter Eisentraut wrote:
> On 12/4/13, 1:42 AM, Álvaro Hernández Tortosa wrote:
>> IMHO, a data structure like the above would be completely
>> self-contained and allow any autoconfiguring tool or GUI tool to be
>> easily created, if the syntax is programmable. It would certainly make
>> the config file more verbose, but at the same time would help a lot of
>> users to configure postgres providing much more information.
>
> What you are describing appears to be isomorphic to XML and XML Schema.

I don't think XML would be a good idea. Even if it is both
programmatically and humanly editable (two of the features I was
suggesting for it), it is messy and very verbose for this purpose.

> Note that you are not required to maintain your configuration data in a
> postgresql.conf-formatted file. You can keep it anywhere you like, GUI
> around in it, and convert it back to the required format. Most of the

I think it is not a very good idea to encourage GUI tools or tools to
auto-configure postgres to use a separate configuration file and then
convert it to postgresql.conf. That introduces duplication, with evil
problems if either source of data is modified out-of-the-expected-way.

That's why I'm suggesting a config file that is, at the same time,
usable by both postgres and other external tools. That also enables
other features such as editing the config file persistently through a
SQL session.

> metadata is available through postgres --describe-config, which is the
> result of a previous attempt in this area, which never really went anywhere.
>
> It's not like there are a bunch of GUI and autotuning tools that people
> are dying to use or developers are dying to create, but couldn't because
> editing configuration files programmatically is hard.

It might be a chicken-and-egg problem. Maybe it's hard and futile to
write these config tools since postgresql.conf doesn't support the
required features. I don't know how to measure the "interest of people"
but I have seen many comments on this mailing list about features like
this. IMHO it would be a great addition :)

>
> Let's also not forget the two main use cases (arguably) of the
> configuration files: hand editing, and generation by configuration
> management tools. Anything that makes these two harder is not going to
> be well-received.

100% agreed :) That's why I suggested that the format of the config
file should adhere to the requirements a) to e) mentioned in my original
email (http://www.postgresql.org/message-id/529B8D01.6060301@nosys.es).

Would a new file format be well-received if it keeps things simple for
both hand editing and generation of the configuration, and at the same
time offers the features I have mentioned?

Thanks for your comments,

aht

--
Álvaro Hernández Tortosa

-----------
NOSYS
Networked Open SYStems


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Álvaro Hernández Tortosa <aht(at)nosys(dot)es>, Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: RFC: programmable file format for postgresql.conf
Date: 2013-12-04 18:49:14
Message-ID: 529F792A.6050700@gmx.net

On 12/4/13, 11:22 AM, Álvaro Hernández Tortosa wrote:
> Would a new file format be well-received if it keeps things simple for
> both hand editing and generation of the configuration, and at the same
> time offers the features I have mentioned?

I don't see how that would work exactly: You want to add various kinds
of complex metadata to the configuration file, but make that metadata
optional at the same time. The immediate result will be that almost no
one will supply the optional metadata, and no tools will be able to rely
on their presence.


From: Álvaro Hernández Tortosa <aht(at)nosys(dot)es>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>, Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: RFC: programmable file format for postgresql.conf
Date: 2013-12-04 19:02:23
Message-ID: 529F7C3F.5040900@nosys.es

On 04/12/13 19:49, Peter Eisentraut wrote:
> On 12/4/13, 11:22 AM, Álvaro Hernández Tortosa wrote:
>> Would a new file format be well-received if it keeps things simple for
>> both hand editing and generation of the configuration, and at the same
>> time offers the features I have mentioned?
>
> I don't see how that would work exactly: You want to add various kinds
> of complex metadata to the configuration file, but make that metadata
> optional at the same time. The immediate result will be that almost no
> one will supply the optional metadata, and no tools will be able to rely
> on their presence.
>

I wouldn't say the metadata is "complex". It looks quite similar to that
of pg_settings (besides, it was just brainstorming, not a formal
proposal).

The optional fields are basically NULLABLE attributes in pg_settings.
That is, they only make sense depending on other values (in this case,
the parameter name). All of the attributes that are required for tools
to work are marked as non-optional.

So optional fields are either purely optional (i.e., only for tools
that want to use them; everyone else may ignore, but preserve, them) or
just NULLABLE, depending on the parameter.

In any case, my idea is just to open up the question and search for the
best possible set of data to be represented, and then, the best possible
syntax / file format for it.

aht

--
Álvaro Hernández Tortosa

-----------
NOSYS
Networked Open SYStems


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Álvaro Hernández Tortosa <aht(at)nosys(dot)es>, Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: RFC: programmable file format for postgresql.conf
Date: 2013-12-04 19:44:43
Message-ID: 529F862B.2040706@gmx.net

On 12/4/13, 2:02 PM, Álvaro Hernández Tortosa wrote:
> So optional fields are either purely optional (i.e., only for tools
> that want to use them; everyone else may ignore, but preserve, them) or
> just NULLABLE, depending on the parameter.

But my point stands: If it's optional, you can't rely on it; if it's
required, people will object because they don't want more junk in their
config file.

But I think this is solving the wrong problem. The metadata is already
available via postgres --describe-config.


From: Álvaro Hernández Tortosa <aht(at)nosys(dot)es>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>, Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: RFC: programmable file format for postgresql.conf
Date: 2013-12-04 23:51:51
Message-ID: 529FC017.8070001@nosys.es

On 04/12/13 20:44, Peter Eisentraut wrote:
> On 12/4/13, 2:02 PM, Álvaro Hernández Tortosa wrote:
>> So optional fields are either purely optional (i.e., only for tools
>> that want to use them; everyone else may ignore, but preserve, them) or
>> just NULLABLE, depending on the parameter.
>
> But my point stands: If it's optional, you can't rely on it; if it's
> required, people will object because they don't want more junk in their
> config file.

OK, I get what you say. My bad, I called "optional" what is either
"optional" (reserved for extension fields) or NULLABLE (fields that may
be absent, meaning that they are NULL).

But what matters are the required fields. You say they add "junk" to
the config file. I understand what you say, but is it really junk? Is it
that bad?

In return for this extra information, we:

- Provide users with more help (information) to help them configure
postgres (which is no easy task, especially for newcomers).

- Help and encourage app developers to create both GUI tools for easier
postgresql configuration and automatic or semi-automatic configuration
tools.

- Make it way easier to change postgresql parameters persistently from a
SQL connection.

The tradeoff seems quite positive to me. I see no strong reasons not
to do it... am I missing something?

>
> But I think this is solving the wrong problem. The metadata is already
> available via postgres --describe-config.
>

I think that doesn't deliver any of the above benefits we would get from
a programmable postgresql.conf format such as the one I have described.

Best,

aht

--
Álvaro Hernández Tortosa

-----------
NOSYS
Networked Open SYStems


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Álvaro Hernández Tortosa <aht(at)nosys(dot)es>
Cc: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: RFC: programmable file format for postgresql.conf
Date: 2013-12-06 03:47:29
Message-ID: 1386301649.2743.20.camel@vanquo.pezone.net

On Thu, 2013-12-05 at 00:51 +0100, Álvaro Hernández Tortosa wrote:
> In return for this extra information, we:
>
> - Provide users with more help (information) to help them configure
> postgres (which is no easy task, especially for newcomers).
>
> - Help and encourage app developers to create both GUI tools for easier
> postgresql configuration and automatic or semi-automatic configuration
> tools.
>
> - Make it way easier to change postgresql parameters persistently from a
> SQL connection.
>
> The tradeoff seems quite positive to me. I see no strong reasons not
> to do it... am I missing something?

I don't buy your argument. You say, if we make this change, those
things will happen. I don't believe it. You can *already* do those
things, but no one is doing it.

But if we make this change, existing users will be inconvenienced,
whereas the expected benefits are very much in doubt. Again, this is
postgres --describe-config all over again.


From: Álvaro Hernández Tortosa <aht(at)nosys(dot)es>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>, pgsql-hackers(at)postgresql(dot)org, Greg Smith <gsmith(at)gregsmith(dot)com>
Subject: Re: RFC: programmable file format for postgresql.conf
Date: 2013-12-06 17:29:28
Message-ID: 52A20978.7090302@nosys.es

On 06/12/13 04:47, Peter Eisentraut wrote:
> On Thu, 2013-12-05 at 00:51 +0100, Álvaro Hernández Tortosa wrote:

>> The tradeoff seems quite positive to me. I see no strong reasons not
>> to do it... am I missing something?
>
> I don't buy your argument. You say, if we make this change, those
> things will happen. I don't believe it. You can *already* do those
> things, but no one is doing it.

What I've been trying to do is summarize what has already been
discussed here and propose a solution. You say that "you can already do
those things", but that's not what I have read here. Greg Smith (cc'ed
as I'm quoting you) was explaining this in [1]:

"Right now, writing such a tool in a generic way gets so bogged down
just in parsing/manipulating the postgresql.conf file that it's hard to
focus on actually doing the tuning part."

And I completely agree. The alternative of having two separate sources
of metadata is a very bad solution IMHO, as changes made directly to the
postgresql.conf file would completely break the tool used
otherwise. And parsing the actual postgresql.conf is simply not enough:
first, because it's difficult to parse all the comments correctly; then,
because it lacks a lot of the information required for GUI tools and
auto-tuning tools.

I'm sure you have read the GUCS Overhaul wiki page [2], that already
points out many ideas related to this one.

>
> But if we make this change, existing users will be inconvenienced,

And I somewhat agree. Adding some metainformation to the postgresql.conf
file may be *a little* bit inconvenient for some users. But those users
are probably pgsql-hackers or advanced DBAs. And I'm sure everybody
here knows keyboard shortcuts and how to fiddle with larger, yet
structured, files. We all know how to grep and sed and awk these files,
right?

On the other hand, this metainformation would be extremely useful for
newbies, not-that-inexperienced DBAs and even users who go to other
databases because postgres is hard to configure. Adding it would be
extremely valuable for them because:

- they would have much more inlined information about the parameter, and
- they could use tools to help them with the configuration

So the question is: which group of users are we trying to please? And
even if the answer were pgsql-hackers and not the rest of the
world out there, is what I'm proposing so much of an inconvenience as to
deny the other advantages that it may bring?

Thanks for your comments,

aht

[1] http://www.postgresql.org/message-id/Pine.GSO.4.64.0806020452220.26912@westnet.com
[2] http://wiki.postgresql.org/wiki/GUCS_Overhaul

--
Álvaro Hernández Tortosa

-----------
NOSYS
Networked Open SYStems


From: David Johnston <polobo(at)yahoo(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: RFC: programmable file format for postgresql.conf
Date: 2013-12-06 18:11:21
Message-ID: 1386353481688-5782175.post@n5.nabble.com

Álvaro Hernández Tortosa wrote
>> Note that you are not required to maintain your configuration data in a
>> postgresql.conf-formatted file. You can keep it anywhere you like, GUI
>> around in it, and convert it back to the required format. Most of the
>
> I think it is not a very good idea to encourage GUI tools or tools to
> auto-configure postgres to use a separate configuration file and then
> convert it to postgresql.conf. That introduces duplication, with evil
> problems if either source of data is modified out-of-the-expected-way.
>
> That's why I'm suggesting a config file that is, at the same time,
> usable by both postgres and other external tools. That also enables
> other features such as editing the config file persistently through a
> SQL session.

For my money I'd rather have a single file and/or directory-structure where
raw configuration settings are saved in the current 'key = value' format
with simple comments allowed and ignored by PostgreSQL. And being simple
key-value the risk of "out-of-the-expected-way" changes would be minimal.

If you want to put an example configuration file out there, one that will
not be considered the true configuration, with lots of comments and
meta-data, then great. I'm hoping that someday there is either a
curses-based or even a full-fledged GUI that beginners can use to generate
the desired configuration.

If we want to put a separate "configuration meta-data" file out there to
basically provide a database from which third-party tools can pull out this
information then great. I would not incorporate that same information into
the main PostgreSQL configuration file/directory-structure. The biggest
advantage is that the meta-data database can be readily modified without any
concern regarding such changes impacting running systems upon update. Then,
tools simply need to import "two" files instead of one, link together the
meta-data key with the configuration key, and do whatever they were going to
do anyway.
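
A minimal Python sketch of that "two files" idea follows. It assumes the meta-data is shipped as JSON keyed by parameter name, which is purely an assumption for illustration; the function names and sample contents are hypothetical as well.

```python
import json


def read_settings(conf_text):
    """Read plain 'key = value' lines, ignoring '#' comments.

    Naive on purpose: a '#' inside a quoted value would be cut off.
    """
    settings = {}
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


def link_metadata(settings, metadata):
    """Pair each configured value with its meta-data entry, if any."""
    return {
        name: {"value": value, "meta": metadata.get(name)}
        for name, value in settings.items()
    }


CONF = "work_mem = 8MB  # tuned\nfsync = on\n"
META = json.loads('{"work_mem": {"vartype": "integer", "context": "user"}}')

linked = link_metadata(read_settings(CONF), META)
```

Settings without a meta-data entry (fsync here) still load, with "meta" set to None, so the meta-data file can lag behind or be updated independently of a running system's configuration.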

If indeed the target audience is going to be novices then a static
text-based document is not going to be the most desirable interface to
present. At worst we should simply include a comment-link at the top of the
document to a web-page where an interactive tool for configuration file
creation would exist. That tool, at the end of the process, could provide
the user with text to copy-paste/save into a specified area on the server so
the customizations made would override the installed defaults.

David J.



From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Álvaro Hernández Tortosa <aht(at)nosys(dot)es>
Cc: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>, pgsql-hackers(at)postgresql(dot)org, Greg Smith <gsmith(at)gregsmith(dot)com>
Subject: Re: RFC: programmable file format for postgresql.conf
Date: 2013-12-06 21:59:52
Message-ID: 52A248D8.3000906@gmx.net

On 12/6/13, 12:29 PM, Álvaro Hernández Tortosa wrote:
> What I've been trying to do is summarize what has already been
> discussed here and propose a solution. You say that "you can already do
> those things", but that's not what I have read here. Greg Smith (cc'ed
> as I'm quoting you) was explaining this in [1]:
>
> "Right now, writing such a tool in a generic way gets so bogged down
> just in parsing/manipulating the postgresql.conf file that it's hard to
> focus on actually doing the tuning part."

That was in 2008. I don't think that stance is accurate anymore.


From: Álvaro Hernández Tortosa <aht(at)nosys(dot)es>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>, pgsql-hackers(at)postgresql(dot)org, Greg Smith <gsmith(at)gregsmith(dot)com>
Subject: Re: RFC: programmable file format for postgresql.conf
Date: 2013-12-07 03:28:22
Message-ID: 52A295D6.8080105@nosys.es


On 06/12/2013 22:59, Peter Eisentraut wrote:
> On 12/6/13, 12:29 PM, Álvaro Hernández Tortosa wrote:
>> What I've been trying to do is summarize what has already been
>> discussed here and propose a solution. You say that "you can already do
>> those things", but that's not what I have read here. Greg Smith (cc'ed
>> as I'm quoting you) was explaining this in [1]:
>>
>> "Right now, writing such a tool in a generic way gets so bogged down
>> just in parsing/manipulating the postgresql.conf file that it's hard to
>> focus on actually doing the tuning part."
> That was in 2008. I don't think that stance is accurate anymore.

Just for me to learn about this: why is it not accurate anymore?

Thanks for your patience! :)

aht


From: Álvaro Hernández Tortosa <aht(at)nosys(dot)es>
To: David Johnston <polobo(at)yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: RFC: programmable file format for postgresql.conf
Date: 2013-12-07 03:28:51
Message-ID: 52A295F3.8040109@nosys.es


On 06/12/2013 19:11, David Johnston wrote:
> Álvaro Hernández Tortosa wrote
>>> Note that you are not required to maintain your configuration data in a
>>> postgresql.conf-formatted file. You can keep it anywhere you like, GUI
>>> around in it, and convert it back to the required format. Most of the
>> I think it is not a very good idea to encourage GUI tools or tools to
>> auto-configure postgres to use a separate configuration file and then
>> convert it to postgresql.conf. That introduces duplication, with evil
>> problems if either source of data is modified out-of-the-expected-way.
>>
>> That's why I'm suggesting a config file that is, at the same time,
>> usable by both postgres and other external tools. That also enables
>> other features such as editing the config file persistently through a
>> SQL session.
> For my money I'd rather have a single file and/or directory-structure where
> raw configuration settings are saved in the current 'key = value' format
> with simple comments allowed and ignored by PostgreSQL. And being simple
> key-value the risk of "out-of-the-expected-way" changes would be minimal.

What I meant by "out-of-the-expected-way" is that if you edit
postgresql.conf directly rather than through a tool (assuming you're
regularly using the tool), then those changes may get lost when you use
the tool again. In other words, there's potentially "duplicated
information", and we all know that it's not desirable.

> If we want to put a separate "configuration meta-data" file out there to
> basically provide a database from which third-party tools can pull out this
> information then great. I would not incorporate that same information into
> the main PostgreSQL configuration file/directory-structure. The biggest
> advantage is that the meta-data database can be readily modified without any
> concern regarding such changes impacting running systems upon update. Then,
> tools simply need to import "two" files instead of one, link together the
> meta-data key with the configuration key, and do whatever they were going to
> do anyway.

Although I think it is not ideal to have two separate, both editable,
files for configuring postgresql, if:

- Both would be included in the official distribution, one alongside the
other one
- A tool for converting the new one into the current postgresql.conf is
included also with the distribution, say bin/pgconfiguration or whatever

then I'd agree that it could be a great first step to both adding
support for external tooling for configuring postgres, and providing new
users with a lot more help if they don't use any other tool.

Of course, other tools could be completely external to the
postgresql distribution, but not the "alternate" configuration file and
the pgconfiguration program.

Would this be a good thing to do then?

> If indeed that target audience is going to be novices then a static
> text-based document is not going to be the most desirable interface to
> present. At worse we should simply include a comment-link at the top of the
> document to a web-page where an interactive tool for configuration file
> creation would exist. That tool, at the end of the process, could provide
> the user with text to copy-paste/save into a specified area on the server so
> the customizations made would override the installed defaults.

I think both could be used a lot: directly editing a rich
configuration file, or using a GUI tool. I'm suggesting that we support
both.

Regards,

aht


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Álvaro Hernández Tortosa <aht(at)nosys(dot)es>
Cc: David Johnston <polobo(at)yahoo(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: RFC: programmable file format for postgresql.conf
Date: 2013-12-09 17:00:04
Message-ID: CA+TgmoaKU-mQX+pXFDhH-oyGtb8gtK6t8NvO7hz5qjBLL3V4Hg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Dec 6, 2013 at 10:28 PM, Álvaro Hernández Tortosa <aht(at)nosys(dot)es> wrote:
> I think both could be used a lot, editing directly a rich configuration
> file or using a GUI tool. I'm trying to suggest supporting both.

I don't really understand how changing the file format fixes anything.
You could make the file an INI file or an XML file and it would still
be hard to edit programmatically, not because the current format is
"hard to parse" in any meaningful sense, but because there's no way
for a program to know how to make changes while preserving the
comments. For example, suppose the user tries to set work_mem to 4MB.
If there's an existing line in the config file for work_mem, it's
fairly plausible to think that you might just replace everything on
that line, up to the beginning of any comment, with a new work_mem
setting.

But what if, as in the default configuration file, there isn't any
such setting? A human will go and find the line that says:

#work_mem = 1MB

...and delete the hash mark, and replace 1MB with 4MB. No problem!
But for a computer, editing comments is hard, and kind of iffy. After
all, there might be multiple lines that look like the above, and how
would you know which one to replace? There could even be something
like this in the file:

#In our installation, because we have very little memory, it's
#important not to do anything silly like set
#work_mem = 64MB

A configuration file editor that replaces that line will corrupt the
comment, because no program can be smart enough to recognize the
context the way a human will.
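
To make the ambiguity concrete, here is a minimal Python sketch (not code from any existing tool) that looks for commented-out lines a program might take for the disabled default. The helper name commented_candidates and the sample text are assumptions made for illustration only.

```python
import re

# Matches a commented-out assignment such as "#work_mem = 1MB".
COMMENTED = re.compile(r"^\s*#\s*(\w+)\s*=\s*(\S+)")

def commented_candidates(conf_text, name):
    """Return (line_number, line) pairs that look like a disabled setting."""
    hits = []
    for number, line in enumerate(conf_text.splitlines(), start=1):
        match = COMMENTED.match(line)
        if match and match.group(1) == name:
            hits.append((number, line))
    return hits

sample = """\
#work_mem = 1MB
#In our installation, because we have very little memory, it's
#important not to do anything silly like set
#work_mem = 64MB
"""

# Both line 1 (the shipped default) and line 4 (the tail of a prose
# comment) match, and nothing in the file says which one to edit.
candidates = commented_candidates(sample, "work_mem")
```

With two equally plausible matches, any rule the program applies (first match, last match, all matches) will be wrong for some files.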

Now, we could design something that gets it right, or close enough to
right, 99% of the time. But previous discussions of this issue on
this mailing list have concluded that people are not willing to accept
that kind of solution, which IMHO is understandable.

The only kind of change that I see as possibly helpful is some format
that explicitly marks which comments go with which settings. For
example, suppose we did this:

<setting>
<name>work_mem</>
<!-- <value>1MB</> -->
<comment>min 64kB</>
</setting>

If you want to set the value, you remove the comment tags around it.
And if you want to comment on the value, you can put whatever you like
within the comment tags. Now, you've got a machine-editable format,
assuming that people keep their comments in the <comment/> section and
not inside actual SGML comments.

But that's ugly and overly verbose, so meh.

Generally I don't regard trying to tinker with postgresql.conf as a
useful way to spend time. Many people have strong and sometimes
conflicting feelings about it, making getting any consensus of any
change almost impossible. And while I'm sure some die-hard will
disagree with me on this, the current format, imperfect as it is, is
not really all *that* bad. We all have our bones to pick with it and
I certainly wouldn't have picked this exact approach myself, but we
could have done far worse. If it were clear what the next logical
step to make it better was, or even if it were clear that the current
blew chunks, then I'd be all over putting energy into getting this
fixed. But it isn't, and it doesn't, and the amount of collective
energy that would need to be put into making any change here doesn't
seem likely to be worth what we'd get out of it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Greg Stark <stark(at)mit(dot)edu>
To: Álvaro Hernández Tortosa <aht(at)nosys(dot)es>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <gsmith(at)gregsmith(dot)com>
Subject: Re: RFC: programmable file format for postgresql.conf
Date: 2013-12-09 17:26:08
Message-ID: CAM-w4HNt_4c0SJPDTMhssG-7KOM5T=oo5h8ny9nsr=m2aYE+5w@mail.gmail.com
Lists: pgsql-hackers

On Sat, Dec 7, 2013 at 3:28 AM, Álvaro Hernández Tortosa <aht(at)nosys(dot)es> wrote:
>>> "Right now, writing such a tool in a generic way gets so bogged down
>>> just in parsing/manipulating the postgresql.conf file that it's hard to
>>> focus on actually doing the tuning part."
>>
>> That was in 2008. I don't think that stance is accurate anymore.
>
> Just for me to learn about this: why is it not accurate anymore?

This topic has been under active discussion for the last five years. I
strongly recommend going back and skimming over the past discussions
before trying to pick it up again. In particular go look up the
discussion of SET PERSISTENT.

Since we have include files now you can just generate an
auto-tune.conf and not try to parse or write the main config file.

The reason previous efforts got bogged down in parsing/manipulating
the postgresql.conf file was purely because they were trying to allow
you to edit the file by hand and mix that with auto generated config.
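
As a rough illustration of the include-file approach, the Python sketch below writes tool-owned settings to their own file and appends a single include_if_exists directive to the main file, without parsing or rewriting anything else in it. The function names and the "do not edit" banner are assumptions made for this example; the file name auto-tune.conf is the one mentioned above.

```python
import os

def write_auto_tune(conf_dir, settings, filename="auto-tune.conf"):
    """Write tool-owned settings to their own file and return its path."""
    lines = ["# Generated file: do not edit by hand."]
    for name, value in sorted(settings.items()):
        # Single quotes inside a value are doubled, as postgresql.conf expects.
        escaped = str(value).replace("'", "''")
        lines.append(f"{name} = '{escaped}'")
    path = os.path.join(conf_dir, filename)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
    return path

def ensure_include(main_conf_path, filename="auto-tune.conf"):
    """Append the include directive once; leave every other line untouched."""
    directive = f"include_if_exists '{filename}'"
    with open(main_conf_path, "r", encoding="utf-8") as handle:
        text = handle.read()
    if directive in text.splitlines():
        return False  # already present, nothing to do
    with open(main_conf_path, "a", encoding="utf-8") as handle:
        if text and not text.endswith("\n"):
            handle.write("\n")
        handle.write(directive + "\n")
    return True
```

Because the directive is appended at the end of the main file, values in the generated file override earlier hand-written ones, and the tool can regenerate its own file as often as it likes.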


From: Álvaro Hernández Tortosa <aht(at)nosys(dot)es>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: RFC: programmable file format for postgresql.conf
Date: 2013-12-12 12:20:00
Message-ID: 52A9A9F0.3070009@nosys.es
Lists: pgsql-hackers

On 09/12/13 18:26, Greg Stark wrote:
> On Sat, Dec 7, 2013 at 3:28 AM, Álvaro Hernández Tortosa <aht(at)nosys(dot)es> wrote:
>>>> "Right now, writing such a tool in a generic way gets so bogged down
>>>> just in parsing/manipulating the postgresql.conf file that it's hard to
>>>> focus on actually doing the tuning part."
>>>
>>> That was in 2008. I don't think that stance is accurate anymore.
>>
>> Just for me to learn about this: why is it not accurate anymore?
>
> This topic has been under active discussion for the last five years. I
> strongly recommend going back and skimming over the past discussions
> before trying to pick it up again. In particular go look up the
> discussion of SET PERSISTENT

Thanks, Greg. I've been going through those threads, and they are quite
interesting. I didn't find an answer to my question, though: why
parsing postgresql.conf (and, for instance, preserving the comments
while writing it back) is no longer a problem. I read about ways of
mitigating this (such as the include facility and so on), but I still
find parsing the file as hard as before. Nonetheless, I think this adds
nothing to what we're talking about, so I'll skip this :)

>
> Since we have include files now you can just generate an
> auto-tune.conf and not try to parse or write the main config file.
>
> The reason previous efforts got bogged down in parsing/manipulating
> the postgresql.conf file was purely because they were trying to allow
> you to edit the file by hand and mix that with auto generated config.
>

Just IMO, it would be great if a config file allowed for both use cases,
so that both tools and users could seamlessly edit it at will. But of
course YMMV.

Regards,

aht

--
Álvaro Hernández Tortosa

-----------
NOSYS
Networked Open SYStems


From: Álvaro Hernández Tortosa <aht(at)nosys(dot)es>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: David Johnston <polobo(at)yahoo(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: RFC: programmable file format for postgresql.conf
Date: 2013-12-12 12:31:28
Message-ID: 52A9ACA0.8080201@nosys.es
Lists: pgsql-hackers

On 09/12/13 18:00, Robert Haas wrote:
> On Fri, Dec 6, 2013 at 10:28 PM, Álvaro Hernández Tortosa <aht(at)nosys(dot)es> wrote:
>> I think both could be used a lot, editing directly a rich configuration
>> file or using a GUI tool. I'm trying to suggest supporting both.
>
> I don't really understand how changing the file format fixes anything.
> You could make the file an INI file or an XML file and it would still
> be hard to edit programmatically, not because the current format is
> "hard to parse" in any meaningful sense, but because there's no way
> for a program to know how to make changes while preserving the
> comments. For example, suppose the user tries to set work_mem to 4MB.

Thanks for your detailed explanation, Robert. I think that since the
comments are the problem, they should be part of the data structure that
holds the parameter (setting). That way comments would be easy to
parse, not in an INI file (which doesn't allow for this kind of
data structure) but definitely in XML (note that I'm not suggesting we
use XML).
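
One way to picture "comments as part of the data structure" is the Python sketch below. It is only an illustration under assumed names (the Setting class and its fields are hypothetical, not a proposed format): the comment travels with the setting, so changing the value cannot damage it.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Setting:
    name: str
    default: str
    comment: str = ""
    value: Optional[str] = None  # None means "fall back to the default"

    def render(self):
        """Render one postgresql.conf line, keeping the comment attached."""
        if self.value is None:
            line = f"#{self.name} = {self.default}"
        else:
            line = f"{self.name} = {self.value}"
        return f"{line}  # {self.comment}" if self.comment else line

work_mem = Setting(name="work_mem", default="1MB", comment="min 64kB")
before = work_mem.render()   # disabled default with its comment
work_mem.value = "4MB"       # a tool (or a user) sets the value
after = work_mem.render()    # same comment, new active value
```

The edit is unambiguous because the program never has to guess which free-form comment belongs to which setting.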

>
> The only kind of change that I see as possibly helpful is some format
> that explicitly marks which comments go with which settings. For
> example, suppose we did this:
>
> <setting>
> <name>work_mem</>
> <!-- <value>1MB</> -->
> <comment>min 64kB</>
> </setting>
>
> If you want to set the value, you remove the comment tags around it.
> And if you want to comment on the value, you can put whatever you like
> within the comment tags. Now, you've got a machine-editable format,
> assuming that people keep their comments in the <comment/> section and
> not inside actual SGML comments.
>
> But that's ugly and overly verbose, so meh.

I agree that what you suggested is a machine-editable format, so I
think it's great. I would not care about SGML comments, though. If this
is for programs to use too, I see no problem with the "verbosity" of
having all the parameters uncommented, together with all their
associated metainformation.

However, you think it's ugly and verbose. It definitely is (especially
XML; I'd go a different route), but as I said in a previous email: if it
would help regular postgresql users because (1) it makes it easier to
create config tools and (2) it gives them much more information on how
to configure manually, why not accept that verbosity? Is it that bad?

>
> Generally I don't regard trying to tinker with postgresql.conf as a
> useful way to spend time. Many people have strong and sometimes
> conflicting feelings about it, making getting any consensus of any
> change almost impossible. And while I'm sure some die-hard will

I completely understand that. In order to explore whether the approach
I'm suggesting works or not, I'm going to work on a POC of a sample
configuration file, structured in the way I have been describing, and a
GUI and CLI tool (POC!) to use it. I'll get back to the list with it, to
check whether it may make any sense.

Thanks!

aht

--
Álvaro Hernández Tortosa

-----------
NOSYS
Networked Open SYStems


From: Greg Stark <stark(at)mit(dot)edu>
To: Álvaro Hernández Tortosa <aht(at)nosys(dot)es>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: RFC: programmable file format for postgresql.conf
Date: 2013-12-13 03:11:54
Message-ID: CAM-w4HP-EbzmTvk8PyqNwv4dbkuZ8Nhs6UGO68Cp-O_mMtEM5w@mail.gmail.com
Lists: pgsql-hackers

On 12 Dec 2013 04:20, "Álvaro Hernández Tortosa" <aht(at)nosys(dot)es> wrote:

> Thanks, Greg. I've been going through those threads, they are
> quite interesting. I didn't find an answer, though, about my question: why
> parsing the postgresql.conf (and for instance preserving the comments while
> writing it back) is no longer a problem

Parsing it isn't hard. It's precisely because the file isn't programmable
and is such a simple format that it's easy to parse.

It's making changes and then writing it out again while preserving the
intended format that's hard.
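
A small Python sketch of that asymmetry follows. It is a deliberately naive parser written for illustration (parse_conf and dump_conf are assumed names, and it ignores quoting rules, include directives and the optional equals sign): reading the values takes a few lines, but once the file is reduced to name/value pairs, the comments and layout cannot be written back.

```python
def parse_conf(text):
    """Parse simple 'name = value' lines, ignoring comments and blanks.

    Simplification: a '#' inside a quoted value is not handled.
    """
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        name, _, value = line.partition("=")
        settings[name.strip()] = value.strip().strip("'")
    return settings

def dump_conf(settings):
    """Write settings back out; comments and layout are gone by this point."""
    return "".join(f"{name} = '{value}'\n" for name, value in settings.items())

original = """\
# Memory settings, tuned by hand
shared_buffers = 128MB      # min 128kB
#work_mem = 1MB
work_mem = 4MB
"""

settings = parse_conf(original)   # the easy part
settings["work_mem"] = "8MB"
rewritten = dump_conf(settings)   # the lossy part: no comments survive
```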

So we convinced people to stop trying to do that.

The whole idea of include rules is to separate the portion of the file
that's human edited and the portion that's machine maintained. That's the
only viable strategy.


From: Álvaro Hernández Tortosa <aht(at)nosys(dot)es>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: RFC: programmable file format for postgresql.conf
Date: 2013-12-13 06:32:59
Message-ID: 52AAAA1B.7000200@nosys.es
Lists: pgsql-hackers

On 13/12/13 04:11, Greg Stark wrote:
>
> On 12 Dec 2013 04:20, "Álvaro Hernández Tortosa" <aht(at)nosys(dot)es> wrote:
>
> > Thanks, Greg. I've been going through those threads, they are
> quite interesting. I didn't find an answer, though, about my question:
> why parsing the postgresql.conf (and for instance preserving the
> comments while writing it back) is no longer a problem
>
> Parsing it isn't hard. It's precisely because the file isn't
> programmable and is such a simple format that it's easy to parse.
>
> It's making changes and then writing it out again while preserving the
> intended format that's hard.

Sure, writing it back is the "difficult" part.

>
> So we convinced people to stop trying to do that.
>
> The whole idea of include rules is to separate the portion of the file
> that's human edited and the portion that's machine maintained. That's
> the only viable strategy.

I agree that makes the file "programmable" (in a limited way). You say
you convinced people to stop trying to do that, but that's precisely
what is needed to, for example, create tools to help configure postgres!

Going back to my original email, the main issues I wanted to analyze
were basically:

- Adding metainformation to the config file so that non-expert users
(i.e., the great and vast majority of postgres users) can configure
postgresql more easily (by having extra information in-place, such as
the min val, max val, vartype, comments and URL to the docs).

- Adding metainformation to the config file so that this metainformation
is centrally located and self-contained. This in turn encourages tool
devs to create both GUI tools for configuring postgres and automatic
tools. I consider "critical" the "centrally located" part, as it becomes
a "framework" or "repository" of metainformation, so that tool devs
don't have to write their own for every tool.

I now realize that maybe I should have called my post "Adding
metainformation to the postgresql.conf file" or something like that.

The include mechanism allows some degree of programmability, but it has
to be in a format compatible with the current postgresql.conf file that
doesn't contain this metainformation.

To achieve the goals above, I think the only viable way would be to
ship with the postgresql distribution a file with all the
metainformation, which could:

- either be the postgresql.conf file itself (which would need a
different format, of course)

- or an external file with an included program to convert that file to
the current postgresql.conf

Please let me know if there would be a third or fourth option.
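
To give an idea of what the second option could look like, here is a minimal Python sketch. It is not the proof of concept itself: the JSON layout, the field names, the bounds and the example.org URL are all hypothetical, chosen only to show a metadata file being validated and converted into current postgresql.conf syntax.

```python
import json

# Hypothetical metadata document; bounds and URL are illustrative only.
METADATA_JSON = """
[
  {"name": "max_connections", "vartype": "integer", "value": "200",
   "min_val": "1", "max_val": "1000",
   "doc_url": "https://example.org/docs/max_connections"}
]
"""

def validate(entry):
    """Check an integer setting against its declared bounds."""
    if entry["vartype"] != "integer":
        return  # other types are not checked in this sketch
    number = int(entry["value"])
    if not int(entry["min_val"]) <= number <= int(entry["max_val"]):
        raise ValueError(f"{entry['name']} = {number} is out of range")

def to_postgresql_conf(metadata_json):
    """Convert the metadata document into postgresql.conf text."""
    lines = []
    for entry in json.loads(metadata_json):
        validate(entry)
        lines.append(
            f"# {entry['vartype']}, range {entry['min_val']}..{entry['max_val']}"
        )
        lines.append(f"# see {entry['doc_url']}")
        lines.append(f"{entry['name']} = {entry['value']}")
    return "\n".join(lines) + "\n"

conf_text = to_postgresql_conf(METADATA_JSON)
```

In this arrangement the metadata file is the only editable source, and the generated postgresql.conf is treated as output, which avoids the "duplicated information" problem discussed earlier in the thread.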

I have started a little research on the second approach, and I'll post
back with a file format and the code of a proof-of-concept tool that
converts it to postgresql.conf and helps users configure postgresql both
from a GUI and from a CLI.

Regards,

aht

--
Álvaro Hernández Tortosa

-----------
NOSYS
Networked Open SYStems