Re: BUG #6347: Reopening bug #6085

Lists: pgsql-bugs
From: alexander(dot)fortin(at)gmail(dot)com
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #6347: Reopening bug #6085
Date: 2011-12-19 15:06:31
Message-ID: E1RcenP-00009B-FY@wrigleys.postgresql.org

The following bug has been logged on the website:

Bug reference: 6347
Logged by: Alexander Fortin
Email address: alexander(dot)fortin(at)gmail(dot)com
PostgreSQL version: 9.1.2
Operating system: Ubuntu 10.04.3
Description:

Hi folks. I'm testing 9.1.2 (source compiled) pg_upgrade (upgrading from
8.4.9) and it seems that the problem exposed in bug #6085 is still there. In
my case, the only way to make pg_upgrade work is to actually force
unix_socket_directory = '/tmp/' for the 8.4.9 cluster.

Running in verbose mode
Performing Consistency Checks on Old Live Server
------------------------------------------------
Checking current, bin, and data directories ok
Checking cluster versions ok
connection to database failed: could not connect to server: No such file or
directory
Is the server running locally and accepting
connections on Unix domain socket "/tmp/.s.PGSQL.5432"?
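
For readers following the error above: a client finds the server's Unix-domain socket by joining the socket directory with a file named `.s.PGSQL.<port>`. A minimal Python sketch of that lookup is below. The helper names are hypothetical, and the second directory in the usage note is only an assumed example of a non-default location, not something taken from this report.

```python
import os


def unix_socket_path(socket_dir, port):
    """Return the socket file a client would look for in socket_dir."""
    return os.path.join(socket_dir, ".s.PGSQL.%d" % port)


def socket_present(socket_dir, port):
    """True if a file exists at the expected socket path.

    This only checks the filesystem; it does not prove a server is
    actually accepting connections on that socket.
    """
    return os.path.exists(unix_socket_path(socket_dir, port))
```

Calling `socket_present("/tmp", 5432)` and then `socket_present("/var/run/postgresql", 5432)` (the latter path being an assumption) would show which directory the old cluster actually placed its socket in.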


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: alexander(dot)fortin(at)gmail(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #6347: Reopening bug #6085
Date: 2012-02-03 14:59:07
Message-ID: 20120203145907.GB11939@momjian.us

On Mon, Dec 19, 2011 at 03:06:31PM +0000, alexander(dot)fortin(at)gmail(dot)com wrote:
> The following bug has been logged on the website:
>
> Bug reference: 6347
> Logged by: Alexander Fortin
> Email address: alexander(dot)fortin(at)gmail(dot)com
> PostgreSQL version: 9.1.2
> Operating system: Ubuntu 10.04.3
> Description:
>
> Hi folks. I'm testing 9.1.2 (source compiled) pg_upgrade (upgrading from
> 8.4.9) and it seems that the problem exposed in bug #6085 is still there. In
> my case, the only way to make pg_upgrade work is to actually force
> unix_socket_directory = '/tmp/' for the 8.4.9 cluster.
>
> Running in verbose mode
> Performing Consistency Checks on Old Live Server
> ------------------------------------------------
> Checking current, bin, and data directories ok
> Checking cluster versions ok
> connection to database failed: could not connect to server: No such file or
> directory
> Is the server running locally and accepting
> connections on Unix domain socket "/tmp/.s.PGSQL.5432"?

Yes. I wasn't clear in my email reply:

http://archives.postgresql.org/pgsql-bugs/2011-07/msg00092.php

When I said this will be fixed in 9.1, I meant pg_ctl will work in 9.1
for non-default socket directories, but when the 9.1 pg_upgrade accesses
the 8.4 server, it has to use the 8.4 pg_ctl to do it, and that can't be
fixed in a back-branch.

I think we can only call this fixed when both the old and new servers are
>= PG 9.1. Yeah, this isn't good, but it is the best we can do.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: alexander(dot)fortin(at)gmail(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #6347: Reopening bug #6085
Date: 2012-02-03 18:52:29
Message-ID: 20120203185229.GC11939@momjian.us

On Fri, Feb 03, 2012 at 09:59:07AM -0500, Bruce Momjian wrote:
> On Mon, Dec 19, 2011 at 03:06:31PM +0000, alexander(dot)fortin(at)gmail(dot)com wrote:
> > The following bug has been logged on the website:
> >
> > Bug reference: 6347
> > Logged by: Alexander Fortin
> > Email address: alexander(dot)fortin(at)gmail(dot)com
> > PostgreSQL version: 9.1.2
> > Operating system: Ubuntu 10.04.3
> > Description:
> >
> > Hi folks. I'm testing 9.1.2 (source compiled) pg_upgrade (upgrading from
> > 8.4.9) and it seems that the problem exposed in bug #6085 is still there. In
> > my case, the only way to make pg_upgrade work is to actually force
> > unix_socket_directory = '/tmp/' for the 8.4.9 cluster.
> >
> > Running in verbose mode
> > Performing Consistency Checks on Old Live Server
> > ------------------------------------------------
> > Checking current, bin, and data directories ok
> > Checking cluster versions ok
> > connection to database failed: could not connect to server: No such file or
> > directory
> > Is the server running locally and accepting
> > connections on Unix domain socket "/tmp/.s.PGSQL.5432"?
>
> Yes. I wasn't clear in my email reply:
>
> http://archives.postgresql.org/pgsql-bugs/2011-07/msg00092.php
>
> When I said this will be fixed in 9.1, I meant pg_ctl will work in 9.1
> for non-default socket directories, but when the 9.1 pg_upgrade accesses
> the 8.4 server, it has to use the 8.4 pg_ctl to do it, and that can't be
> fixed in a back-branch.
>
> I think we can only call this fixed when both the old and new servers are
> >= PG 9.1. Yeah, this isn't good, but it is the best we can do.

Actually, thinking more about this, the old pg_upgrade didn't use pg_ctl
wait/-w mode, but rather kept trying to connect until the server was up.
Once pg_ctl -w worked in more cases in PG 9.1, the new pg_upgrade
started using pg_ctl -w, but I didn't consider that we were unable to
fix pg_ctl -w for non-standard settings in back branches.
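
The "keep trying to connect" approach described above can be sketched roughly as follows. This is a hedged Python illustration, not pg_upgrade's actual code: the function name, timeout, and interval are assumptions, and it only tests that something accepts a connection on the socket, not that PostgreSQL has finished starting up.

```python
import socket
import time


def wait_for_unix_socket(path, timeout=60.0, interval=1.0):
    """Retry connecting to a Unix-domain socket until it succeeds.

    Returns True once a connection is accepted, or False if the
    deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            sock.connect(path)
            return True
        except OSError:
            # Socket file missing or nobody listening yet: retry
            # until the deadline, then give up.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
        finally:
            sock.close()
```

A loop like this only works if the caller knows the right socket path, which is exactly the piece of information at issue in this report.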

This can be seen as a regression in pg_upgrade functionality. Not sure
what we can do about this, but perhaps there should be a mention in the
pg_upgrade docs. I am going to wait to see if anyone else reports this
problem --- the last report was against Postgres 9.0 in July, 2011.

FYI, here is the 9.1 release note mention of the fix:

Improve <application>pg_ctl</> start's "wait"
(-w) option (Bruce Momjian, Tom Lane)

The wait mode is now significantly more robust. It will not get
confused by non-default postmaster port numbers, non-default
Unix-domain socket locations, permission problems, or stale
postmaster lock files.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: alexander(dot)fortin <alexander(dot)fortin(at)gmail(dot)com>, Pg Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #6347: Reopening bug #6085
Date: 2012-02-03 19:17:50
Message-ID: 1328296627-sup-566@alvh.no-ip.org


Excerpts from Bruce Momjian's message of vie feb 03 15:52:29 -0300 2012:

> Actually, thinking more about this, the old pg_upgrade didn't use pg_ctl
> wait/-w mode, but rather kept trying to connect until the server was up.
> Once pg_ctl -w worked in more cases in PG 9.1, the new pg_upgrade
> started using pg_ctl -w, but I didn't consider that we were unable to
> fix pg_ctl -w for non-standard settings in back branches.

Hm, so what was wrong with just continuing to try to connect? Surely it's
not optimal, but if it's more robust than the alternative, maybe it's
preferable.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: "alexander(dot)fortin" <alexander(dot)fortin(at)gmail(dot)com>, Pg Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #6347: Reopening bug #6085
Date: 2012-02-03 19:32:31
Message-ID: 20120203193231.GD11939@momjian.us

On Fri, Feb 03, 2012 at 04:17:50PM -0300, Alvaro Herrera wrote:
>
> Excerpts from Bruce Momjian's message of vie feb 03 15:52:29 -0300 2012:
>
> > Actually, thinking more about this, the old pg_upgrade didn't use pg_ctl
> > wait/-w mode, but rather kept trying to connect until the server was up.
> > Once pg_ctl -w worked in more cases in PG 9.1, the new pg_upgrade
> > started using pg_ctl -w, but I didn't consider that we were unable to
> > fix pg_ctl -w for non-standard settings in back branches.
>
> Hm, so what was wrong with just continuing to try to connect? Surely it's
> not optimal, but if it's more robust than the alternative, maybe it's
> preferable.

Well, it didn't always work. What we used to do, and still do, is to
pass the port number in via -o '-p 4444', but that didn't handle the
socket location, which is the case for the bug reporter.
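
As a rough illustration of the invocation described above, here is a Python sketch that builds a pg_ctl command line passing the port through `-o`. The helper name and its parameters are hypothetical, and the real pg_upgrade command line may contain more options than shown. Note that nothing here tells pg_ctl where the socket lives.

```python
import os


def pg_ctl_start_argv(bindir, datadir, port, wait=True):
    """Build an argv list for starting a cluster with pg_ctl.

    The port reaches the server via -o '-p PORT'; the socket
    directory is not passed at all in this sketch.
    """
    argv = [os.path.join(bindir, "pg_ctl"), "start"]
    if wait:
        argv.append("-w")  # wait for the server to come up
    argv += ["-D", datadir, "-o", "-p %d" % port]
    return argv
```

The resulting list could be handed to `subprocess.run(argv)`; building it as a list avoids shell-quoting problems with the embedded `-p 4444` string.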

Now that I think of it, we might not have a regression from 9.0 --- my
big point is that the socket-location handling, while fixed in 9.1, was
not fixed in back branches, and therefore pg_upgrade doesn't handle
non-default socket locations for old pre-9.1 clusters.

I was unclear why the original pg_upgrade code used a separate
connection loop instead of pg_ctl -w, but when I found how broken pg_ctl
-w was, I fixed pg_ctl so at least going forward, it works for all
use-cases.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +