Re: BUG #5118: start-status-insert-fatal

Lists: pgsql-bugs
From: "Gerhard Leykam" <gel123(at)sealsystems(dot)de>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #5118: start-status-insert-fatal
Date: 2009-10-15 13:36:26
Message-ID: 200910151336.n9FDaQL3084899@wwwmaster.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs


The following bug has been logged online:

Bug reference: 5118
Logged by: Gerhard Leykam
Email address: gel123(at)sealsystems(dot)de
PostgreSQL version: 8.4.0
Operating system: linux
Description: start-status-insert-fatal
Details:

Hi!

I am using a start script to set up my PostgreSQL database: it runs initdb,
if not done yet, starts the instance with pg_ctl start and checks everything
is fine by pg_ctl status.

If there is another PostgreSQL database on the same machine listening to the
same port, postmaster comes up, pg_ctl status says everthings fine, but
postmaster falls down with appropriate message in postgres.log.

All SQL commands in my script after status check are running against the
other database!

How do you think about some sort of postmaster.up file next to
postmaster.pid, which is created just as really everything is up and OK?

Regards,
Gerhard


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <pgsql-bugs(at)postgresql(dot)org>, "Gerhard Leykam" <gel123(at)sealsystems(dot)de>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-15 15:29:27
Message-ID: 4AD6F989020000250002B9B1@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

"Gerhard Leykam" <gel123(at)sealsystems(dot)de> wrote:

> I am using a start script to set up my PostgreSQL database: it runs
> initdb, if not done yet, starts the instance with pg_ctl start and
> checks everything is fine by pg_ctl status.
>
> If there is another PostgreSQL database on the same machine
> listening to the same port, postmaster comes up, pg_ctl status says
> everthings fine, but postmaster falls down with appropriate message
> in postgres.log.

This is definitely not a PostgreSQL bug.

Perhaps the best place to start, before suggesting a new PostgreSQL
feature to solve this, would be to post to one of the other lists
(admin, maybe?) and describe what you are trying to accomplish with
your script, along with the problems you've found with your current
version of the script. With a little more information, someone might
be able to suggest a solution. (Since you're running on Linux,
something involving the lockfile utility might suffice.)

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-bugs(at)postgresql(dot)org, "Gerhard Leykam" <gel123(at)sealsystems(dot)de>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-15 16:03:02
Message-ID: 1260.1255622582@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> "Gerhard Leykam" <gel123(at)sealsystems(dot)de> wrote:
>> I am using a start script to set up my PostgreSQL database: it runs
>> initdb, if not done yet, starts the instance with pg_ctl start and
>> checks everything is fine by pg_ctl status.
>>
>> If there is another PostgreSQL database on the same machine
>> listening to the same port, postmaster comes up, pg_ctl status says
>> everthings fine, but postmaster falls down with appropriate message
>> in postgres.log.

> This is definitely not a PostgreSQL bug.

Well, it's arguably a start-script bug, but I think his point is that
it's hard to fix it without any additional support from PG.

While mulling that it occurred to me that some additional output from
the postmaster would help to solve another thing that's an acknowledged
shortcoming of pg_ctl, namely that it can't parse postgresql.conf to
find out where the postmaster's communication socket is; cf
http://archives.postgresql.org/pgsql-bugs/2009-10/msg00024.php
and other older complaints.

We could redefine things so that it doesn't need to do that (and also
doesn't need to try to intuit the postmaster's port number, which it
does do now, but not terribly well). Suppose that after the postmaster
is fully up, it writes a file $PGDATA/postmaster.ports, with contents
along the lines of

5432
/tmp/.s.PGSQL.5432

ie, IP port number and full socket location (the latter missing on
Windows of course). Then pg_ctl only needs to know $PGDATA, and it
can get the socket address without any guessing or hard work.

(Alternatively we could add this info to postmaster.pid, but a separate
file seems like a better idea, if only because the contents of the pid
file are subject to change that pg_ctl shouldn't have to deal with.)

I am not convinced that this solves the OP's complaint 100%. The
failure case that remains is that if the postmaster crashes without
deleting this file, then pg_ctl could still be fooled by a competing
postmaster that's taken the same port number. I don't know that there's
any good way around that one, though. The main thing it does do is
remove the pressure to add a lot of code to pg_ctl to try to read
postgresql.conf.

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-bugs(at)postgresql(dot)org>, "Gerhard Leykam" <gel123(at)sealsystems(dot)de>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-15 16:51:54
Message-ID: 4AD70CDA020000250002B9BF@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Well, it's arguably a start-script bug

OK.

> While mulling that it occurred to me that some additional output
> from the postmaster would help to solve another thing that's an
> acknowledged shortcoming of pg_ctl, namely that it can't parse
> postgresql.conf to find out where the postmaster's communication
> socket is;
> cf http://archives.postgresql.org/pgsql-bugs/2009-10/msg00024.php
> and other older complaints.
>
> We could redefine things so that it doesn't need to do that (and
> also doesn't need to try to intuit the postmaster's port number,
> which it does do now, but not terribly well). Suppose that after
> the postmaster is fully up, it writes a file
> $PGDATA/postmaster.ports, with contents along the lines of
>
> 5432
> /tmp/.s.PGSQL.5432

The listen_addresses setting would need to figure in, too.

http://archives.postgresql.org/pgsql-hackers/2009-10/msg00022.php

Matching that stuff up could start to get a little messy, but it
should be doable somehow.

This seems likely to overlap the review I was soon going to do of the
differences between pg_ctl behavior and what is required for LSB
conformance. I'll make sure to test this behavior along with others.
One of my current complaints is that pg_ctl doesn't wait until it is
actually ready to receive connections before returning an indication
of success. I see that I neglected that point in my recently proposed
LSB conforming script, but I'm guessing that this fits with other
points in the argument that if what I'm doing in the script is
demonstrably better than current pg_ctl behavior, we should change
pg_ctl to support it rather than scripting around it. (Not that it
would be hard to add ten or twenty lines to the script to cover
this....)

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-bugs(at)postgresql(dot)org, "Gerhard Leykam" <gel123(at)sealsystems(dot)de>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-15 17:34:15
Message-ID: 10645.1255628055@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Suppose that after the postmaster is fully up, it writes a file
>> $PGDATA/postmaster.ports, with contents along the lines of
>>
>> 5432
>> /tmp/.s.PGSQL.5432

> The listen_addresses setting would need to figure in, too.

Yeah, I'm not entirely sure how we'd want to deal with IP addresses,
but in principle there could be a line for each postmaster socket not
only the Unix-domain socket.

> This seems likely to overlap the review I was soon going to do of the
> differences between pg_ctl behavior and what is required for LSB
> conformance.

Agreed, it would be good to do a holistic review of what pg_ctl needs.

regards, tom lane


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-bugs(at)postgresql(dot)org, Gerhard Leykam <gel123(at)sealsystems(dot)de>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-15 17:41:01
Message-ID: 20091015174101.GG4788@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Kevin Grittner wrote:

> This seems likely to overlap the review I was soon going to do of the
> differences between pg_ctl behavior and what is required for LSB
> conformance. I'll make sure to test this behavior along with others.
> One of my current complaints is that pg_ctl doesn't wait until it is
> actually ready to receive connections before returning an indication
> of success.

Maybe write the file as postmaster.ports.starting or some such and
rename it to its final name when recovery has finished?

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-bugs(at)postgresql(dot)org>, "Gerhard Leykam" <gel123(at)sealsystems(dot)de>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-15 18:02:13
Message-ID: 4AD71D57020000250002B9E0@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:

> I neglected that point in my recently proposed LSB conforming script

Hmmm... On review, I see that I assumed that the -w switch on pg_ctl
start would cover this. I see that the problem is that this uses psql
to connect to the specified port. Besides the problems Tom mentioned
with its heuristics to find the right port number for this cluster,
there is the OP's point that connections will go to the competing
cluster. One thought that occurs to me is that instead of, or in
addition to, the new file Tom proposes, the "other cluster" issue
could be solved by having a pg_postmaster_pid function in addition to
the pg_backend_pid function. This would allow pg_ctl or a script to
connect to a port and see if it is the expected postmaster process.

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-bugs(at)postgresql(dot)org, "Gerhard Leykam" <gel123(at)sealsystems(dot)de>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-15 18:15:11
Message-ID: 11233.1255630511@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Hmmm... On review, I see that I assumed that the -w switch on pg_ctl
> start would cover this. I see that the problem is that this uses psql
> to connect to the specified port. Besides the problems Tom mentioned
> with its heuristics to find the right port number for this cluster,
> there is the OP's point that connections will go to the competing
> cluster. One thought that occurs to me is that instead of, or in
> addition to, the new file Tom proposes, the "other cluster" issue
> could be solved by having a pg_postmaster_pid function in addition to
> the pg_backend_pid function. This would allow pg_ctl or a script to
> connect to a port and see if it is the expected postmaster process.

I would rather see us implement the hypothetical pg_ping protocol
and remember to include the postmaster's PID in the response. One
of the worst misfeatures of pg_ctl is the need to be able to
authenticate itself to the postmaster, and having it rely on being
able to actually issue a SQL command would set that breakage in stone.

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-bugs(at)postgresql(dot)org>, "Gerhard Leykam" <gel123(at)sealsystems(dot)de>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-15 18:21:12
Message-ID: 4AD721C8020000250002B9EA@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I would rather see us implement the hypothetical pg_ping protocol
> and remember to include the postmaster's PID in the response. One
> of the worst misfeatures of pg_ctl is the need to be able to
> authenticate itself to the postmaster, and having it rely on being
> able to actually issue a SQL command would set that breakage in
> stone.

Sounds good to me, other than it stalls pg_ctl revamp until pg_ping is
done. I don't remember a clear design of what pg_ping should look
like. Does anyone have a clear plan in their head?

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-bugs(at)postgresql(dot)org, "Gerhard Leykam" <gel123(at)sealsystems(dot)de>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-15 18:37:50
Message-ID: 11547.1255631870@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

I wrote:
> "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
>> ... This would allow pg_ctl or a script to
>> connect to a port and see if it is the expected postmaster process.

> I would rather see us implement the hypothetical pg_ping protocol
> and remember to include the postmaster's PID in the response.

Although on second thought, any such test is worth approximately nothing
anyway. You can check that the postmaster answering the doorbell
reports the same PID that you see in $PGDATA/postmaster.pid, but that
still doesn't prove that that postmaster is using that data directory.
It could be a random coincidence of PIDs. And in the case of a start
script, the probability of random PID match to a stale lockfile is many
orders of magnitude higher than you might think; see prior discussions.

This could be addressed by having the postmaster report its $PGDATA
value in the pg_ping response, but I would be against that on security
grounds. We don't let nonprivileged users know where PGDATA is, why
would we make the information available without any authentication at
all?

[ thinks... ] Maybe we could have the postmaster generate a random
number at start and include that in both the postmaster.ports file
and its pg_ping responses. That would have a substantially lower
collision probability than PID, if the number generation process
were well designed; and it wouldn't risk exposing anything sensitive
in the ping response.

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <pgsql-bugs(at)postgresql(dot)org>, "Gerhard Leykam" <gel123(at)sealsystems(dot)de>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-15 18:46:41
Message-ID: 4AD727C1020000250002B9F3@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> [ thinks... ] Maybe we could have the postmaster generate a random
> number at start and include that in both the postmaster.ports file
> and its pg_ping responses. That would have a substantially lower
> collision probability than PID, if the number generation process
> were well designed; and it wouldn't risk exposing anything sensitive
> in the ping response.

Unless two postmasters could open the same server socket within a
microsecond of one another, a timestamp value captured on opening the
server socket seems even better than a random number. Well, I guess
if someone subverted the clock it could mislead, but is that really
more likely to cause a false match than a random number?

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-bugs(at)postgresql(dot)org, "Gerhard Leykam" <gel123(at)sealsystems(dot)de>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-15 18:52:59
Message-ID: 11911.1255632779@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Sounds good to me, other than it stalls pg_ctl revamp until pg_ping is
> done. I don't remember a clear design of what pg_ping should look
> like. Does anyone have a clear plan in their head?

I don't think anyone's written down a full spec, but it seems like a
relatively trivial thing to me.

* Client connects to the usual place and sends a packet that has a
special "protocol number" (similar to the way we handle SSL requests).
AFAICS there wouldn't need to be anything else in the packet.

* Postmaster responds with a suitable message and closes the connection.
The message should at least include the current postmaster
CanAcceptConnections status and the PID/magic number we were just
discussing. I can't think of anything else offhand --- anyone else?

I'm not sure whether we'd want to provide a function within libpq
for this, or just code it in pg_ctl. Within libpq the natural
thing would be to take a conninfo connection string, but I'm not
sure that suits pg_ctl's purposes.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-bugs(at)postgresql(dot)org, "Gerhard Leykam" <gel123(at)sealsystems(dot)de>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-15 18:59:39
Message-ID: 12033.1255633179@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> [ thinks... ] Maybe we could have the postmaster generate a random
>> number at start and include that in both the postmaster.ports file
>> and its pg_ping responses.

> Unless two postmasters could open the same server socket within a
> microsecond of one another, a timestamp value captured on opening the
> server socket seems even better than a random number.

Well, that raises the question of whether postmaster uptime could be
considered security-sensitive info. I'd still rather use a random
number.

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-bugs(at)postgresql(dot)org>, "Gerhard Leykam" <gel123(at)sealsystems(dot)de>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-15 19:11:45
Message-ID: 4AD72DA1020000250002BA01@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I'm not sure whether we'd want to provide a function within libpq
> for this, or just code it in pg_ctl.

I'm inclined to think there would be value to a pg_ping utility to
support automated monitoring by unprivileged users on other boxes.
That both suggests libpq as the location, and one or two additional
pieces of information. An indication of "in archive recovery" versus
production or shutdown, for example, might be useful. I'm not sure
what else might make sense.

> Within libpq the natural thing would be to take a conninfo
> connection string, but I'm not sure that suits pg_ctl's purposes.

I'm a little lost on that. Would it cause any problems for pg_ctl,
or just be more than it would need if it's only implemented there?

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-bugs(at)postgresql(dot)org, "Gerhard Leykam" <gel123(at)sealsystems(dot)de>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-15 19:28:02
Message-ID: 12373.1255634882@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I'm not sure whether we'd want to provide a function within libpq
>> for this, or just code it in pg_ctl.

> I'm inclined to think there would be value to a pg_ping utility to
> support automated monitoring by unprivileged users on other boxes.

True. I had first thought that pg_ctl itself could serve that purpose,
but it's really designed around the assumption that it has direct access
to $PGDATA, so it wouldn't fit well for monitoring from another machine.

> That both suggests libpq as the location, and one or two additional
> pieces of information. An indication of "in archive recovery" versus
> production or shutdown, for example, might be useful. I'm not sure
> what else might make sense.

IIRC, that's already covered by the CanAcceptConnections state.
We need to be pretty conservative about how much information we
expose here, anyhow, since it will be handed out to absolutely
anybody who can reach the postmaster port.

>> Within libpq the natural thing would be to take a conninfo
>> connection string, but I'm not sure that suits pg_ctl's purposes.

> I'm a little lost on that. Would it cause any problems for pg_ctl,
> or just be more than it would need if it's only implemented there?

Well, given what we were saying about a postmaster.ports file, pg_ctl
would typically be working with an absolute path to the socket file.
Which is not what normally goes into a conninfo string. Perhaps that
could be addressed by specifying the file contents differently, but
I'd be wary of assuming that *all* users of the ports file will be
libpq-based --- for instance a Java version of pg_ctl wouldn't be.

regards, tom lane


From: Pedro Gimeno <pgsql-003(at)personal(dot)formauri(dot)es>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-bugs(at)postgresql(dot)org, Gerhard Leykam <gel123(at)sealsystems(dot)de>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-15 22:30:52
Message-ID: 4AD7A29C.3050008@personal.formauri.es
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Tom Lane wrote:

> This could be addressed by having the postmaster report its $PGDATA
> value in the pg_ping response, but I would be against that on security
> grounds. We don't let nonprivileged users know where PGDATA is, why
> would we make the information available without any authentication at
> all?

Maybe a hash of it?


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Pedro Gimeno" <pgsql-003(at)personal(dot)formauri(dot)es>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-bugs(at)postgresql(dot)org>, "Gerhard Leykam" <gel123(at)sealsystems(dot)de>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-16 14:33:31
Message-ID: 4AD83DEB020000250002BA5B@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Pedro Gimeno <pgsql-003(at)personal(dot)formauri(dot)es> wrote:
> Tom Lane wrote:
>
>> This could be addressed by having the postmaster report its $PGDATA
>> value in the pg_ping response, but I would be against that on
>> security grounds. We don't let nonprivileged users know where
>> PGDATA is, why would we make the information available without any
>> authentication at all?
>
> Maybe a hash of it?

I'm not really clear on why it's a security issue for someone to know
the $PGDATA value, but if it is, there are some "typical" locations
for which a hash could be generated and matched against the returned
hash; so a hash of it would only be safe for those who chose
sufficiently "creative" directory paths.

On top of that, I'm not sure it's a very useful way to confirm that
you've connected to the correct instance. We often get requests to
replace the contents of a development or test database with a dump
from a production database. More than once, the DBA doing this has
forgotten to stop PostgreSQL before deleting the $PGDATA directory and
creating it fresh for the restore of the PITR dump. When we attempt to
start the new copy, which has the same $PGDATA, owner, and port number
as the copy still running in the deleted directory, we have similar
issues to those described in the original post. So, personally, I
consider the data directory a less reliable test than the pid. (We
don't have a lot of OS crash & reboot occurrences.)

-Kevin


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Pedro Gimeno <pgsql-003(at)personal(dot)formauri(dot)es>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-bugs(at)postgresql(dot)org, Gerhard Leykam <gel123(at)sealsystems(dot)de>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-16 14:42:05
Message-ID: 603c8f070910160742u34e66a2crdd7d0b3a94579062@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Fri, Oct 16, 2009 at 10:33 AM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> Pedro Gimeno <pgsql-003(at)personal(dot)formauri(dot)es> wrote:
>> Tom Lane wrote:
>>
>>> This could be addressed by having the postmaster report its $PGDATA
>>> value in the pg_ping response, but I would be against that on
>>> security grounds.  We don't let nonprivileged users know where
>>> PGDATA is, why would we make the information available without any
>>> authentication at all?
>>
>> Maybe a hash of it?
>
> I'm not really clear on why it's a security issue for someone to know
> the $PGDATA value, but if it is, there are some "typical" locations
> for which a hash could be generated and matched against the returned
> hash; so a hash of it would only be safe for those who chose
> sufficiently "creative" directory paths.
>
> On top of that, I'm not sure it's a very useful way to confirm that
> you've connected to the correct instance.  We often get requests to
> replace the contents of a development or test database with a dump
> from a production database.  More than once, the DBA doing this has
> forgotten to stop PostgreSQL before deleting the $PGDATA directory and
> creating it fresh for the restore of the PITR dump. When we attempt to
> start the new copy, which has the same $PGDATA, owner, and port number
> as the copy still running in the deleted directory, we have similar
> issues to those described in the original post.  So, personally, I
> consider the data directory a less reliable test than the pid.  (We
> don't have a lot of OS crash & reboot occurrences.)

Well, then Tom's idea of using a random number seems pretty solid no
matter how you slice it. Maybe a UUID.

...Robert


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc: "Pedro Gimeno" <pgsql-003(at)personal(dot)formauri(dot)es>, <pgsql-bugs(at)postgresql(dot)org>, "Gerhard Leykam" <gel123(at)sealsystems(dot)de>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-16 15:08:12
Message-ID: 4AD8460C020000250002BA62@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> Well, then Tom's idea of using a random number seems pretty solid no
> matter how you slice it. Maybe a UUID.

A random number is looking like the best option. I'm not sure why I'd
want to generate a perfectly good 128 bit random number and then throw
away six of the bits to dress it up as a UUID, though. Do the
libraries for that do enough to introduce entropy to compensate for
the lost bits? Any other benefit I'm missing?

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Pedro Gimeno" <pgsql-003(at)personal(dot)formauri(dot)es>, pgsql-bugs(at)postgresql(dot)org, "Gerhard Leykam" <gel123(at)sealsystems(dot)de>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-16 15:24:58
Message-ID: 6811.1255706698@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> Well, then Tom's idea of using a random number seems pretty solid no
>> matter how you slice it. Maybe a UUID.

> A random number is looking like the best option. I'm not sure why I'd
> want to generate a perfectly good 128 bit random number and then throw
> away six of the bits to dress it up as a UUID, though. Do the
> libraries for that do enough to introduce entropy to compensate for
> the lost bits? Any other benefit I'm missing?

I was envisioning just using PostmasterRandom() (after initializing
the seed from time(NULL) as we do now). I don't think we need a
super-wide random number.

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Pedro Gimeno" <pgsql-003(at)personal(dot)formauri(dot)es>, <pgsql-bugs(at)postgresql(dot)org>, "Gerhard Leykam" <gel123(at)sealsystems(dot)de>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-16 15:32:41
Message-ID: 4AD84BC9020000250002BA8D@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I was envisioning just using PostmasterRandom() (after initializing
> the seed from time(NULL) as we do now). I don't think we need a
> super-wide random number.

Fine with me. Just that and CanAcceptConnections in the response?

It seems like pg_ping (client utility and related postmaster support)
should be a discrete patch. Improvements to pg_ctl and init scripts
would come later, as separate patches?

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Pedro Gimeno" <pgsql-003(at)personal(dot)formauri(dot)es>, pgsql-bugs(at)postgresql(dot)org, "Gerhard Leykam" <gel123(at)sealsystems(dot)de>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-16 15:48:05
Message-ID: 7164.1255708085@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> It seems like pg_ping (client utility and related postmaster support)
> should be a discrete patch. Improvements to pg_ctl and init scripts
> would come later, as separate patches?

Sounds sane to me. Alternatively, do the postmaster support and make
the presumably-minor pg_ctl mods to use it, and then a standalone
pg_ping utility could come later. I'm not sure how big the utility
would be, but surely bigger than the delta in pg_ctl.

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Pedro Gimeno" <pgsql-003(at)personal(dot)formauri(dot)es>, <pgsql-bugs(at)postgresql(dot)org>, "Gerhard Leykam" <gel123(at)sealsystems(dot)de>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-16 15:58:14
Message-ID: 4AD851C6020000250002BA9D@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Alternatively, do the postmaster support and make the
> presumably-minor pg_ctl mods to use it, and then a standalone
> pg_ping utility could come later. I'm not sure how big the utility
> would be, but surely bigger than the delta in pg_ctl.

Bigger than the delta for *just this change* to pg_ctl. I was
thinking of addressing all pg_ctl issues at once, but perhaps this
one makes sense on its own. If so, your alternative does sound
better.

-Kevin


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Pedro Gimeno <pgsql-003(at)personal(dot)formauri(dot)es>, pgsql-bugs(at)postgresql(dot)org, Gerhard Leykam <gel123(at)sealsystems(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-16 17:53:34
Message-ID: 603c8f070910161053y53168705qcbc28764722e8be0@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Fri, Oct 16, 2009 at 11:08 AM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>> Well, then Tom's idea of using a random number seems pretty solid no
>> matter how you slice it.  Maybe a UUID.
>
> A random number is looking like the best option.  I'm not sure why I'd
> want to generate a perfectly good 128 bit random number and then throw
> away six of the bits to dress it up as a UUID, though.  Do the
> libraries for that do enough to introduce entropy to compensate for
> the lost bits?  Any other benefit I'm missing?

I'm confused. UUIDs throw away 6 bits?

Anyway, some smaller random number might be fine, too - not trying to
throw a monkey wrench into the process.

...Robert


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc: "Pedro Gimeno" <pgsql-003(at)personal(dot)formauri(dot)es>, <pgsql-bugs(at)postgresql(dot)org>, "Gerhard Leykam" <gel123(at)sealsystems(dot)de>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-16 18:04:15
Message-ID: 4AD86F4F020000250002BABD@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> UUIDs throw away 6 bits?

http://en.wikipedia.org/wiki/Universally_Unique_Identifier#Version_4_.28random.29

-Kevin


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Pedro Gimeno <pgsql-003(at)personal(dot)formauri(dot)es>, pgsql-bugs(at)postgresql(dot)org, Gerhard Leykam <gel123(at)sealsystems(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2009-10-16 18:32:00
Message-ID: 603c8f070910161132i2da52574obcf6d26ceefe5a42@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Fri, Oct 16, 2009 at 2:04 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>> UUIDs throw away 6 bits?
>
> http://en.wikipedia.org/wiki/Universally_Unique_Identifier#Version_4_.28random.29

How about that. You learn something new every day.

...Robert


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-bugs(at)postgresql(dot)org, Gerhard Leykam <gel123(at)sealsystems(dot)de>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2010-02-22 17:31:46
Message-ID: 201002221731.o1MHVkH19768@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs


Was this ever addressed?

---------------------------------------------------------------------------

Tom Lane wrote:
> "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> > Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> I'm not sure whether we'd want to provide a function within libpq
> >> for this, or just code it in pg_ctl.
>
> > I'm inclined to think there would be value to a pg_ping utility to
> > support automated monitoring by unprivileged users on other boxes.
>
> True. I had first thought that pg_ctl itself could serve that purpose,
> but it's really designed around the assumption that it has direct access
> to $PGDATA, so it wouldn't fit well for monitoring from another machine.
>
> > That both suggests libpq as the location, and one or two additional
> > pieces of information. An indication of "in archive recovery" versus
> > production or shutdown, for example, might be useful. I'm not sure
> > what else might make sense.
>
> IIRC, that's already covered by the CanAcceptConnections state.
> We need to be pretty conservative about how much information we
> expose here, anyhow, since it will be handed out to absolutely
> anybody who can reach the postmaster port.
>
> >> Within libpq the natural thing would be to take a conninfo
> >> connection string, but I'm not sure that suits pg_ctl's purposes.
>
> > I'm a little lost on that. Would it cause any problems for pg_ctl,
> > or just be more than it would need if it's only implemented there?
>
> Well, given what we were saying about a postmaster.ports file, pg_ctl
> would typically be working with an absolute path to the socket file.
> Which is not what normally goes into a conninfo string. Perhaps that
> could be addressed by specifying the file contents differently, but
> I'd be wary of assuming that *all* users of the ports file will be
> libpq-based --- for instance a Java version of pg_ctl wouldn't be.
>
> regards, tom lane
>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com
PG East: http://www.enterprisedb.com/community/nav-pg-east-2010.do
+ If your life is a hard drive, Christ can be your backup. +


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Bruce Momjian" <bruce(at)momjian(dot)us>,"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-bugs(at)postgresql(dot)org>, "Gerhard Leykam" <gel123(at)sealsystems(dot)de>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2010-02-22 17:55:17
Message-ID: 4B8270A5020000250002F485@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Bruce Momjian <bruce(at)momjian(dot)us> wrote:

> Was this ever addressed?

It should probably be on the TODO list. I was going to try to do
this along with other items which came out of generating an LSB
conforming init script, but have been pulled in different directions
for now. When I get the time I've been intending to get back to
this, if nobody beats me to it. Do we want one entry with all the
miscellaneous pg_ctl issues I've got, or would it be better to keep
the separate?

-Kevin


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-bugs(at)postgresql(dot)org, Gerhard Leykam <gel123(at)sealsystems(dot)de>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2010-02-22 17:58:21
Message-ID: 201002221758.o1MHwL304873@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Kevin Grittner wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>
> > Was this ever addressed?
>
> It should probably be on the TODO list. I was going to try to do
> this along with other items which came out of generating an LSB
> conforming init script, but have been pulled in different directions
> for now. When I get the time I've been intending to get back to
> this, if nobody beats me to it. Do we want one entry with all the
> miscellaneous pg_ctl issues I've got, or would it be better to keep
> the separate?

I think you should just edit the TODO wiki and list all the things we
agree need fixing:

http://wiki.postgresql.org/wiki/Todo

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com
PG East: http://www.enterprisedb.com/community/nav-pg-east-2010.do
+ If your life is a hard drive, Christ can be your backup. +


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Bruce Momjian" <bruce(at)momjian(dot)us>
Cc: <pgsql-bugs(at)postgresql(dot)org>, "Gerhard Leykam" <gel123(at)sealsystems(dot)de>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2010-02-22 18:25:55
Message-ID: 4B8277D3020000250002F493@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Bruce Momjian <bruce(at)momjian(dot)us> wrote:

> I think you should just edit the TODO wiki and list all the things
> we agree need fixing

Will do. It may take me a while to chase down all the issues from
the LSB script work.

If I recall correctly, some of what I was looking at seems necessary
for a well-behaved LSB conforming script, and there was support for
the ideas from some quarters, but it fell short of what I'd consider
a consensus (for example, escalating the stop request level over
time). Do we want to get into a discussion of such issues at this
point, add them to the list with a note that discussion is needed
before undertaking them, or should I keep them just in my own list?

A couple comments which I think were the last on the escalation of
stop mode:

http://archives.postgresql.org/pgsql-hackers/2009-09/msg01145.php

http://archives.postgresql.org/pgsql-hackers/2009-09/msg01158.php

Before continuing that discussion, I think we need to decide if *now
is the time* to continue that discussion (and other related ones);
and I would argue such discussion should wait. These links are just
provided to help answer the question of what I should do with such
disputed issues when I update the TODO list. I don't want to slip
something in without consensus, but I don't want the issue to be
lost, or missed by anyone who takes on pg_ctl work.

-Kevin


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Bruce Momjian" <bruce(at)momjian(dot)us>
Cc: <pgsql-bugs(at)postgresql(dot)org>, "Gerhard Leykam" <gel123(at)sealsystems(dot)de>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: BUG #5118: start-status-insert-fatal
Date: 2010-02-23 17:00:39
Message-ID: 4B83B557020000250002F51B@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Bruce Momjian <bruce(at)momjian(dot)us> wrote:

> I think you should just edit the TODO wiki and list all the things
> we agree need fixing:

Done, although with the wealth of opinions and dearth of agreement I
referenced much material and said that more discussion was needed
before starting development. If anyone figures I missed anything or
got overly expansive, have it it. ;-)

-Kevin