PGDATA confusion

Lists: pgsql-docs
From: Thom Brown <thom(at)linux(dot)com>
To: pgsql-docs <pgsql-docs(at)postgresql(dot)org>
Subject: PGDATA confusion
Date: 2011-10-15 01:24:46
Message-ID: CAA-aLv5XPQG7Qn4bDsG1ALv33DrAF4e6rE_QSW9kYuph8_wsOA@mail.gmail.com

Hi,

I notice that in the man page and the page for pg_ctl in the
documentation (http://www.postgresql.org/docs/current/static/app-pg-ctl.html)
it states that the -D parameter should point to the directory which
contains the database files:

"Specifies the file system location of the database files. If this is
omitted, the environment variable PGDATA is used."

This isn't necessarily true. pg_ctl only needs to know where the
postgresql.conf file is. In Debian and Ubuntu, for example, the
"database files" reside in /var/lib/postgresql/x.x/main/, and the
configuration files in /etc/postgresql/x.x/main/, so a user might
assume they need to point it to the former as that's where the actual
database itself can be found, when in fact it needs to be the latter.

This same inaccuracy affects the "postgres" command
(http://www.postgresql.org/docs/current/static/app-postgres.html) to a
lesser extent:

"Specifies the file system location of the data directory or
configuration file(s)."

Which is it? This suggests it can be either, when only the latter
matters. If it also happens to be the data directory, that's
unimportant.

To add further confusion, the page describing PGDATA
(http://www.postgresql.org/docs/current/static/storage-file-layout.html)
refers to it like so:

"All the data needed for a database cluster is stored within the
cluster's data directory, commonly referred to as PGDATA...

*snip*

The PGDATA directory contains several subdirectories and control
files... ...In addition to these required items, the cluster
configuration files postgresql.conf, pg_hba.conf, and pg_ident.conf
are traditionally stored in PGDATA (although in PostgreSQL 8.0 and
later, it is possible to keep them elsewhere)."

Traditionally but not necessarily, and not by default in Debian and
Ubuntu. In fact Gentoo (and therefore probably Sabayon too) has also
elected for this separation of data files and config files as of 9.0.

So if one set PGDATA to somewhere which had no database files at all,
but just postgresql.conf, it could still work (assuming it, in turn,
set data_directory correctly), but not vice versa. It would make more
sense to call it PGCONFIG, although I'm not proposing that, especially
since PGDATA makes sense when it comes to initdb.
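
The lookup described above can be sketched in shell. This is only an illustration, not how the server itself is implemented: resolve_data_dir is a hypothetical helper name, it only understands single-quoted data_directory values, and it ignores include files and relative paths.

```shell
#!/bin/sh
# Hypothetical helper (not part of PostgreSQL): given the directory passed
# as -D or PGDATA, print the directory that would hold the database files.
resolve_data_dir() {
    dir=$1
    conf="$dir/postgresql.conf"
    if [ ! -f "$conf" ]; then
        echo "no postgresql.conf in $dir" >&2
        return 1
    fi
    # Pick the last uncommented, single-quoted data_directory setting.
    # Simplification: unquoted values and include files are not handled.
    value=$(sed -n "s/^[[:space:]]*data_directory[[:space:]]*=\{0,1\}[[:space:]]*'\([^']*\)'.*/\1/p" "$conf" | tail -n 1)
    if [ -n "$value" ]; then
        printf '%s\n' "$value"
    else
        # No data_directory set: the -D directory is itself the data directory.
        printf '%s\n' "$dir"
    fi
}
```

On a Debian-style layout, calling resolve_data_dir /etc/postgresql/x.x/main would be expected to print the /var/lib/postgresql/x.x/main path set in that postgresql.conf, while on a traditional layout it would simply echo the directory it was given.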

There are probably plenty of other places in the docs which also don't
adequately describe PGDATA or -D.

Any disagreements? If not, should I write a patch (since someone will
probably accuse me of volunteering anyway) or would someone like to
commit some adjustments?

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Thom Brown <thom(at)linux(dot)com>
To: pgsql-docs <pgsql-docs(at)postgresql(dot)org>
Subject: Re: PGDATA confusion
Date: 2011-10-31 15:46:55
Message-ID: CAA-aLv6KYbPa3FgH6_shVXrxQto86+QDJHQgj1isbeQngiBBGw@mail.gmail.com

On 15 October 2011 03:24, Thom Brown <thom(at)linux(dot)com> wrote:
> Hi,
>
> I notice that in the man page and the page for pg_ctl in the
> documentation (http://www.postgresql.org/docs/current/static/app-pg-ctl.html)
> it states that the -D parameter should point to the directory which
> contains the database files:
>
> "Specifies the file system location of the database files. If this is
> omitted, the environment variable PGDATA is used."
>
> This isn't necessarily true.  It will only need to know where the
> postgresql.conf file is.  In Debian and Ubuntu, for example, the
> "database files" reside in /var/lib/postgresql/x.x/main/, and the
> configuration files in /etc/postgresql/x.x/main/, so a user might
> assume they need to point it to the former as that's where the actual
> database itself can be found, when in fact it needs to be the latter.
>
> This same inaccuracy affects the "postgres" command
> (http://www.postgresql.org/docs/current/static/app-postgres.html) to a
> lesser extent:
>
> "Specifies the file system location of the data directory or
> configuration file(s)."
>
> Which is it?  This suggests it can be either, when only the latter
> matters.  If it also happens to be the data directory, that's
> unimportant.
>
> To add further confusion, the page describing PGDATA
> (http://www.postgresql.org/docs/current/static/storage-file-layout.html)
> refers to it like so:
>
> "All the data needed for a database cluster is stored within the
> cluster's data directory, commonly referred to as PGDATA...
>
> *snip*
>
> The PGDATA directory contains several subdirectories and control
> files... ...In addition to these required items, the cluster
> configuration files postgresql.conf, pg_hba.conf, and pg_ident.conf
> are traditionally stored in PGDATA (although in PostgreSQL 8.0 and
> later, it is possible to keep them elsewhere)."
>
> Traditionally but not necessarily, and not by default in Debian and
> Ubuntu.  In fact Gentoo (and therefore probably Sabayon too) has also
> elected for this separation of data files and config files as of 9.0.
>
> So if one set PGDATA to somewhere which had no database files at all,
> but just postgresql.conf, it could still work (assuming it, in turn,
> set data_directory correctly), but not vice versa.  It would make more
> sense to call it PGCONFIG, although I'm not proposing that, especially
> since PGDATA makes sense when it comes to initdb.
>
> There are probably plenty of other places in the docs which also don't
> adequately describe PGDATA or -D.
>
> Any disagreements?  If not, should I write a patch (since someone will
> probably accuse me of volunteering anyway) or would someone like to
> commit some adjustments?

No opinions on this?

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Thom Brown <thom(at)linux(dot)com>
Cc: pgsql-docs <pgsql-docs(at)postgresql(dot)org>
Subject: Re: PGDATA confusion
Date: 2011-11-04 16:32:13
Message-ID: 201111041632.pA4GWDH15361@momjian.us

Thom Brown wrote:
> > So if one set PGDATA to somewhere which had no database files at all,
> > but just postgresql.conf, it could still work (assuming it, in turn,
> > set data_directory correctly), but not vice versa.  It would make more
> > sense to call it PGCONFIG, although I'm not proposing that, especially
> > since PGDATA makes sense when it comes to initdb.
> >
> > There are probably plenty of other places in the docs which also don't
> > adequately describe PGDATA or -D.
> >
> > Any disagreements?  If not, should I write a patch (since someone will
> > probably accuse me of volunteering anyway) or would someone like to
> > commit some adjustments?
>
> No opinions on this?

Yes. I had kept it to deal with later. Please work on a doc patch to
try to clean this up. pg_upgrade just went through this confusion and I
also was unhappy at how vague things are in this area.

Things got very confusing with pg_upgrade when PGDATA pointed to the
configuration directory and the data_directory GUC pointed to the data
directory.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Thom Brown <thom(at)linux(dot)com>
Cc: pgsql-docs <pgsql-docs(at)postgresql(dot)org>
Subject: Re: PGDATA confusion
Date: 2012-08-16 03:00:21
Message-ID: 20120816030021.GI8353@momjian.us

On Fri, Nov 4, 2011 at 12:32:13PM -0400, Bruce Momjian wrote:
> Thom Brown wrote:
> > > So if one set PGDATA to somewhere which had no database files at all,
> > > but just postgresql.conf, it could still work (assuming it, in turn,
> > > set data_directory correctly), but not vice versa.  It would make more
> > > sense to call it PGCONFIG, although I'm not proposing that, especially
> > > since PGDATA makes sense when it comes to initdb.
> > >
> > > There are probably plenty of other places in the docs which also don't
> > > adequately describe PGDATA or -D.
> > >
> > > Any disagreements?  If not, should I write a patch (since someone will
> > > probably accuse me of volunteering anyway) or would someone like to
> > > commit some adjustments?
> >
> > No opinions on this?
>
> Yes. I had kept it to deal with later. Please work on a doc patch to
> try to clean this up. pg_upgrade just went through this confusion and I
> also was unhappy at how vague things are in this area.
>
> Things got very confusing with pg_upgrade when PGDATA pointed to the
> configuration directory and the data_directory GUC pointed to the data
> directory.

I have applied the attached doc patch for PG 9.3 to clarify PGDATA.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Attachment: pgdata.diff (text/x-diff, 3.5 KB)

From: Thom Brown <thom(at)linux(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: pgsql-docs <pgsql-docs(at)postgresql(dot)org>
Subject: Re: PGDATA confusion
Date: 2012-08-16 07:30:48
Message-ID: CAA-aLv7t0vUH1NZK9k94eJfn-RnA-P844aWoxkRSQL4H=ExYaw@mail.gmail.com

On 16 August 2012 04:00, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> On Fri, Nov 4, 2011 at 12:32:13PM -0400, Bruce Momjian wrote:
>> Thom Brown wrote:
>> > > So if one set PGDATA to somewhere which had no database files at all,
>> > > but just postgresql.conf, it could still work (assuming it, in turn,
>> > > set data_directory correctly), but not vice versa.  It would make more
>> > > sense to call it PGCONFIG, although I'm not proposing that, especially
>> > > since PGDATA makes sense when it comes to initdb.
>> > >
>> > > There are probably plenty of other places in the docs which also don't
>> > > adequately describe PGDATA or -D.
>> > >
>> > > Any disagreements?  If not, should I write a patch (since someone will
>> > > probably accuse me of volunteering anyway) or would someone like to
>> > > commit some adjustments?
>> >
>> > No opinions on this?
>>
>> Yes. I had kept it to deal with later. Please work on a doc patch to
>> try to clean this up. pg_upgrade just went through this confusion and I
>> also was unhappy at how vague things are in this area.
>>
>> Things got very confusing with pg_upgrade when PGDATA pointed to the
>> configuration directory and the data_directory GUC pointed to the data
>> directory.
>
> I have applied the attached doc patch for PG 9.3 to clarify PGDATA.

Thanks Bruce.

--
Thom


From: Thom Brown <thom(at)linux(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: pgsql-docs <pgsql-docs(at)postgresql(dot)org>
Subject: Re: PGDATA confusion
Date: 2013-04-14 07:56:12
Message-ID: CAA-aLv6Ut2r=m79OPykMuP4-=9BEfOadSAvwaphQgubivOkymQ@mail.gmail.com

On 16 August 2012 08:30, Thom Brown <thom(at)linux(dot)com> wrote:
> On 16 August 2012 04:00, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>> On Fri, Nov 4, 2011 at 12:32:13PM -0400, Bruce Momjian wrote:
>>> Thom Brown wrote:
>>> > > So if one set PGDATA to somewhere which had no database files at all,
>>> > > but just postgresql.conf, it could still work (assuming it, in turn,
>>> > > set data_directory correctly), but not vice versa.  It would make more
>>> > > sense to call it PGCONFIG, although I'm not proposing that, especially
>>> > > since PGDATA makes sense when it comes to initdb.
>>> > >
>>> > > There are probably plenty of other places in the docs which also don't
>>> > > adequately describe PGDATA or -D.
>>> > >
>>> > > Any disagreements?  If not, should I write a patch (since someone will
>>> > > probably accuse me of volunteering anyway) or would someone like to
>>> > > commit some adjustments?
>>> >
>>> > No opinions on this?
>>>
>>> Yes. I had kept it to deal with later. Please work on a doc patch to
>>> try to clean this up. pg_upgrade just went through this confusion and I
>>> also was unhappy at how vague things are in this area.
>>>
>>> Things got very confusing with pg_upgrade when PGDATA pointed to the
>>> configuration directory and the data_directory GUC pointed to the data
>>> directory.
>>
>> I have applied the attached doc patch for PG 9.3 to clarify PGDATA.

I've found another unfortunate inconsistency.

PGDATA is not necessarily the same location in these 2 commands:

pg_ctl start -D DATADIR
pg_ctl stop -D DATADIR

The first one requires that the postgresql.conf file be located in the
specified directory. The second one needs to find the pid file. On
Debian/Ubuntu/Linux Mint/Gentoo (and probably most other Linux
distros), it would mean 2 different locations for each:

pg_ctl start -D /etc/postgresql/9.2/main/
pg_ctl stop -D /var/lib/postgresql/9.2/main/

pg_ctl --help confusingly tells us that DATADIR is the "location of
the database storage area". But this clearly isn't true when starting
a cluster.
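
One way to see which role a given directory can play is a small shell check like the one below. It is a rough sketch under stated assumptions: classify_pg_dir is a hypothetical name, and PG_VERSION is used as the marker for a data directory because the pid file (postmaster.pid) only exists while the server is running.

```shell
#!/bin/sh
# Hypothetical helper (not part of PostgreSQL): report whether a directory
# looks like a configuration directory, a data directory, both, or neither.
classify_pg_dir() {
    dir=$1
    has_conf=no
    has_data=no
    if [ -f "$dir/postgresql.conf" ]; then
        has_conf=yes
    fi
    # Assumption: PG_VERSION marks the data directory; postmaster.pid would
    # also live there, but only while the server is running.
    if [ -f "$dir/PG_VERSION" ]; then
        has_data=yes
    fi
    case "$has_conf/$has_data" in
        yes/yes) echo both ;;
        yes/no)  echo config ;;
        no/yes)  echo data ;;
        *)       echo neither ;;
    esac
}
```

On the split layout described above, /etc/postgresql/9.2/main/ would be expected to report "config" and /var/lib/postgresql/9.2/main/ to report "data", which is why the two pg_ctl invocations end up needing different -D values.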

--
Thom


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thom Brown <thom(at)linux(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-docs <pgsql-docs(at)postgresql(dot)org>
Subject: Re: PGDATA confusion
Date: 2013-04-14 14:50:46
Message-ID: 519.1365951046@sss.pgh.pa.us

Thom Brown <thom(at)linux(dot)com> writes:
> I've found another unfortunate inconsistency.

> PGDATA is not necessarily the same location in these 2 commands:

> pg_ctl start -D DATADIR
> pg_ctl stop -D DATADIR

> The first one requires that the postgresql.conf file be located in the
> specified directory. The second one needs to find the pid file. On
> Debian/Ubuntu/Linux Mint/Gentoo (and probably most other Linux
> distros), it would mean 2 different locations for each:

> pg_ctl start -D /etc/postgresql/9.2/main/
> pg_ctl stop -D /var/lib/postgresql/9.2/main/

This is one of the reasons why an external config file isn't as great
an idea as some people think.

I wonder whether we shouldn't simply remove the ability for
postgresql.conf to exist outside the data directory (which would be
mechanized by removing the ability to set data_directory to something
other than the place where the config file is found). People who prefer
to keep their config somewhere else can reduce the in-the-directory file
to just "include /some/other/file". But otherwise, this would get rid
of a confusing and completely unnecessary inconsistency between
different installations.
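
The include-stub arrangement suggested here could look roughly like the following. It is a sketch only: write_include_stub is a hypothetical helper, the paths are whatever the caller supplies, and the stub uses the single-quoted form of the postgresql.conf include directive.

```shell
#!/bin/sh
# Hypothetical helper (not part of PostgreSQL): create a one-line
# postgresql.conf inside the data directory that pulls in the real
# configuration from somewhere else.
write_include_stub() {
    datadir=$1
    external_conf=$2
    stub="$datadir/postgresql.conf"
    # Never clobber an existing configuration file.
    if [ -e "$stub" ]; then
        echo "refusing to overwrite $stub" >&2
        return 1
    fi
    printf "include '%s'\n" "$external_conf" > "$stub"
}
```

With such a stub in place, both pg_ctl start and pg_ctl stop could be pointed at the same data directory while the settings themselves are still maintained in the external file.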

regards, tom lane