Re: HS/SR on AIX

From: Steve Singer <ssinger(at)ca(dot)afilias(dot)info>
To: PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: HS/SR on AIX
Date: 2010-08-24 21:39:57
Message-ID: 4C743C2D.1060209@ca.afilias.info
Lists: pgsql-admin pgsql-hackers

I think I've been able to reproduce the issue floating around with
streaming replication on AIX.

LOG: could not bind IPv6 socket: The socket name is already in use.
HINT: Is another postmaster already running on port 5433? If not, wait
a few seconds and retry.
LOG: database system was shut down in recovery at 2010-08-24 21:08:37 UTC
LOG: entering standby mode
cp: cannot stat `/opt/rg/data_tb3/steve/wals/000000010000000000000001':
A file or directory in the path name does not exist.
LOG: redo starts at 0/1000020
LOG: record with zero length at 0/1012280
cp: cannot stat `/opt/rg/data_tb3/steve/wals/000000010000000000000001':
A file or directory in the path name does not exist.
FATAL: could not load library
"/opt/dbs/pgsql9-beta2/lib/libpqwalreceiver.so": 0509-022 Cannot
load module /opt/dbs/pgsql9-beta2/lib/libpqwalreceiver.so.
0509-150 Dependent module libpq.a(libpq.so.5) could
not be loaded.
0509-022 Cannot load module libpq.a(libpq.so.5).
0509-026 System error: A file or directory in the path
name does not exist.
0509-022 Cannot load module
/opt/dbs/pgsql9-beta2/lib/libpqwalreceiver.so.
0509-150 Dependent module
/opt/dbs/pgsql9-beta2/lib/libpqwalreceiver.so could not be loaded.

This worked fine with beta2 but now seems to be an issue on beta4.

If I do
export LIBPATH=/opt/dbs/pgsql9-beta2/lib/
before starting the standby postmaster then it seems to work.

I haven't yet tried running truss to see where it is looking for libpq.
liblibpqwalreceiver is being linked with this command:

gcc -maix64 -O2 -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing
-fwrapv -g -o libpqwalreceiver.so liblibpqwalreceiver.a
-Wl,-bE:liblibpqwalreceiver.exp -L../../../../src/port
-Wl,-bmaxdata:0x80000000,-bbigtoc -L/opt/freeware/lib
-Wl,-blibpath:/opt/dbs/pgsql9-beta2/lib:/opt/freeware/lib:/usr/lib:/lib
-Wl,-bnoentry -Wl,-H512 -Wl,-bM:SRE -L../../../../src/interfaces/libpq
-lpq -Wl,-bI:../../../../src/backend/postgres.imp

I'll try to look into this a bit more tomorrow or Thursday.

--
Steve Singer
Afilias Canada
Data Services Developer
416-673-1142


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Steve Singer <ssinger(at)ca(dot)afilias(dot)info>
Cc: PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: HS/SR on AIX
Date: 2010-08-24 23:56:31
Message-ID: 8446.1282694191@sss.pgh.pa.us
Lists: pgsql-admin pgsql-hackers

Steve Singer <ssinger(at)ca(dot)afilias(dot)info> writes:
> I think I've been able to reproduce the issue floating around with
> streaming replication on AIX.

Excellent, because we weren't getting much from the original reporter.

> This worked fine with beta2 but now seems to be an issue on beta4.

> If I do
> export LIBPATH=/opt/dbs/pgsql9-beta2/lib/
> before starting the standby postmaster then it seems to work.

Fascinating. That seems to prove that it's an rpath problem. My
first guess is that the relevant change between beta2 and beta4 is
my LDFLAGS changes. See
http://archives.postgresql.org/pgsql-committers/2010-07/msg00060.php
and following commits.

regards, tom lane


From: Steve Singer <ssinger(at)ca(dot)afilias(dot)info>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: HS/SR on AIX
Date: 2010-08-25 15:05:57
Message-ID: 4C753155.3070708@ca.afilias.info
Lists: pgsql-admin pgsql-hackers

Tom Lane wrote:
> Steve Singer <ssinger(at)ca(dot)afilias(dot)info> writes:
>> I think I've been able to reproduce the issue floating around with
>> streaming replication on AIX.
>
> Excellent, because we weren't getting much from the original reporter.

I'm withdrawing my comment: today, on a clean install of the binaries, I
am not able to reproduce any of this.

Today with beta4 I can stream replication to the standby and bring the
standby up to read-write without issues.

Yesterday I had put beta4 on top of beta2 without explicitly
deleting all of the beta2 binaries/libraries first. I'm thinking maybe
some portions of beta2 were still lying around.

It is also possible that an old version of a shared library was still
sitting in memory and was being picked up by the newer postgresql (see
man slibclean on AIX).

I will do another clean build from the beta4 source tarball to confirm that
I'm not still having the issue, but I'm thinking the original reporter
might have done something similar and had some old artifacts lying around.
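
One way to check an install directory for leftovers of that kind is sketched below. It assumes a marker file is touched immediately before "make install" runs, so that anything the install writes ends up newer than the marker; the function name stale_files and the marker convention are illustrative, not something used in this thread.

```shell
# List regular files under a directory that are NOT newer than a marker
# file. If the marker was touched just before installing, anything listed
# was not rewritten by that install and may belong to an older version.
stale_files() {
    # $1: directory to scan (e.g. the lib directory of the install)
    # $2: marker file created just before the install (kept outside $1)
    find "$1" -type f ! -newer "$2"
}
```

A possible sequence would be to touch the marker, run the install, and then call stale_files on the lib directory; an empty result suggests nothing from an earlier version survived there.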

>
>> This worked fine with beta2 but now seems to be an issue on beta4.
>
>> If I do
>> export LIBPATH=/opt/dbs/pgsql9-beta2/lib/
>> before starting the standby postmaster then it seems to work.
>
> Fascinating. That seems to prove that it's an rpath problem. My
> first guess is that the relevant change between beta2 and beta4 is
> my LDFLAGS changes. See
> http://archives.postgresql.org/pgsql-committers/2010-07/msg00060.php
> and following commits.
>
> regards, tom lane
>

--
Steve Singer
Afilias Canada
Data Services Developer
416-673-1142


From: Steve Singer <ssinger(at)ca(dot)afilias(dot)info>
To: Steve Singer <ssinger(at)ca(dot)afilias(dot)info>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: HS/SR on AIX
Date: 2010-08-25 15:45:34
Message-ID: 4C753A9E.6020503@ca.afilias.info
Lists: pgsql-admin pgsql-hackers

Steve Singer wrote:
> Tom Lane wrote:
>> Steve Singer <ssinger(at)ca(dot)afilias(dot)info> writes:
>>> I think I've been able to reproduce the issue floating around with
>>> streaming replication on AIX.
>>
>
> I will do another clean build from the beta4 source tarball to confirm that
> I'm not still having the issue, but I'm thinking the original reporter
> might have done something similar and had some old artifacts lying around.
>

A clean build from the beta4 source tarball, where I'm careful to install
into a clean directory (i.e. no old beta2 artifacts lying around waiting
to be overwritten), isn't reproducing the issue.

I'm happy to try other things if people suggest them (or if the original
reporter is still getting this after making sure he cleans up old files
first) but I'm thinking that was the issue.

--
Steve Singer
Afilias Canada
Data Services Developer
416-673-1142


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Steve Singer <ssinger(at)ca(dot)afilias(dot)info>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>, alanolya(at)invera(dot)com
Subject: Re: HS/SR on AIX
Date: 2010-08-26 05:44:27
Message-ID: AANLkTingGp2fH6M11ubgom-WcN6FKF9eO6Ecop8ig_4W@mail.gmail.com
Lists: pgsql-admin pgsql-hackers

On Thu, Aug 26, 2010 at 12:45 AM, Steve Singer <ssinger(at)ca(dot)afilias(dot)info> wrote:
> A clean build from the beta4 source tarball, where I'm careful to install
> into a clean directory (i.e. no old beta2 artifacts lying around waiting
> to be overwritten), isn't reproducing the issue.
>
> I'm happy to try other things if people suggest them (or if the original
> reporter is still getting this after making sure he cleans up old files
> first) but I'm thinking that was the issue.

Thanks for the report!

Alanoly, could you do a clean install and try the test again?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Alanoly Andrews <alanolya(at)invera(dot)com>
To: 'Fujii Masao' <masao(dot)fujii(at)gmail(dot)com>, Steve Singer <ssinger(at)ca(dot)afilias(dot)info>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>, "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: [HACKERS] HS/SR on AIX
Date: 2010-08-27 19:31:37
Message-ID: 09B23E7BF70425478C1330D893A722C602FEC019F7@MailSVR.invera.com
Lists: pgsql-admin pgsql-hackers

Fujii,

All my tests so far were done on "clean" installs. Every version I tested, beta2 through beta4, was compiled and installed in its own separate directory.

Regards.

Alanoly.

****************************************************
This e-mail may be privileged and/or confidential, and the sender does not waive any related rights and obligations. Any distribution, use or copying of this e-mail or the information it contains by other than an intended recipient is unauthorized. If you received this e-mail in error, please advise me (by return e-mail or otherwise) immediately.

Ce courriel est confidentiel et protégé. L'expéditeur ne renonce pas aux droits et obligations qui s'y rapportent. Toute diffusion, utilisation ou copie de ce message ou des renseignements qu'il contient par une personne autre que le (les) destinataire(s) désigné(s) est interdite. Si vous recevez ce courriel par erreur, veuillez m'en aviser immédiatement, par retour de courriel ou par un autre moyen.
****************************************************


From: Steve Singer <ssinger(at)ca(dot)afilias(dot)info>
To: Alanoly Andrews <alanolya(at)invera(dot)com>
Cc: 'Fujii Masao' <masao(dot)fujii(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>, "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: [HACKERS] HS/SR on AIX
Date: 2010-08-27 19:45:26
Message-ID: 4C7815D6.7070306@ca.afilias.info
Lists: pgsql-admin pgsql-hackers

Alanoly Andrews wrote:
> Fujii,
>
> All my tests so far were done on "clean" installs. Every version I tested, beta2 through beta4, was compiled and installed in its own separate directory.
>
> Regards.

Alanoly,

If you do an
export LIBPATH=/apps/pg_9.0_b4/lib
before starting postgres on the replica does it make a difference?

How about with LIBPATH=/apps/pg_9.0_b4/lib/postgresql?

(I'm not exactly sure where libpq.a is on your install)
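
To see which directories actually hold a copy of libpq.a, a small helper can walk a colon-separated directory list in order. This is only a sketch: find_along_path is a made-up name, and it assumes the loader takes the first match in list order, so the first line printed would be the copy picked up and any later lines would be shadowed duplicates.

```shell
# Print every copy of a file found along a colon-separated search path,
# in search order. Hypothetical helper for illustration only.
find_along_path() {
    # $1: colon-separated directory list (LIBPATH-style)
    # $2: file name to look for, e.g. libpq.a
    old_ifs=$IFS
    IFS=:
    for dir in $1; do
        if [ -f "$dir/$2" ]; then
            printf '%s\n' "$dir/$2"
        fi
    done
    IFS=$old_ifs
}
```

Calling it with the two candidate directories joined by a colon, and libpq.a as the file name, would show whether one, both, or neither of them contains the archive.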

>
> Alanoly.

--
Steve Singer
Afilias Canada
Data Services Developer
416-673-1142


From: Alanoly Andrews <alanolya(at)invera(dot)com>
To: 'Steve Singer' <ssinger(at)ca(dot)afilias(dot)info>
Cc: 'Fujii Masao' <masao(dot)fujii(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>, "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: [HACKERS] HS/SR on AIX
Date: 2010-08-27 19:57:40
Message-ID: 09B23E7BF70425478C1330D893A722C602FEC019F8@MailSVR.invera.com
Lists: pgsql-admin pgsql-hackers

Steve,

I have tried all the LIBPATH's that you suggested. Besides, I don't think the problem is that postgres cannot find the "libpqwalreceiver" library. It does find it, but crashes on loading it. See below a repeat of the copy and paste from my first post, showing the sequence just before the crash:

(dbx) where
_alloc_initial_pthread(??) at 0x90000000049567c
__pth_init(??) at 0x900000000493ba4
uload(??, ??, ??, ??, ??, ??, ??, ??) at 0x9fffffff0001954
load_64.load(??, ??, ??) at 0x90000000004686c
loadAndInit() at 0x90000000047bd7c
dlopen(??, ??) at 0x90000000011cc4c
internal_load_library(libname = "/apps/pg_9.0_b4/lib/postgresql/libpqwalreceiver.so"), line 234 in "dfmgr.c"
load_file(filename = "libpqwalreceiver", restricted = '\0'), line 156 in "dfmgr.c"
WalReceiverMain(), line 248 in "walreceiver.c"
AuxiliaryProcessMain(argc = 2, argv = 0x0fffffffffffa8b8), line 428 in "bootstrap.c"
StartChildProcess(type = WalReceiverProcess), line 4405 in "postmaster.c"
sigusr1_handler(postgres_signal_arg = 30), line 4227 in "postmaster.c"
__fd_select(??, ??, ??, ??, ??) at 0x90000000011805c
postmaster.select(__fds = 5, __readlist = 0x0fffffffffffd0a8, __writelist = (nil), __exceptlist = (nil), __timeout = 0x0ffffffffffff0c0), line 229 in "time.h"
unnamed block in ServerLoop(), line 1391 in "postmaster.c"
unnamed block in ServerLoop(), line 1391 in "postmaster.c"
ServerLoop(), line 1391 in "postmaster.c"
PostmasterMain(argc = 1, argv = 0x00000001102aa4b0), line 1092 in "postmaster.c"
main(argc = 1, argv = 0x00000001102aa4b0), line 188 in "main.c"

Alanoly.



From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alanoly Andrews <alanolya(at)invera(dot)com>
Cc: "'Steve Singer'" <ssinger(at)ca(dot)afilias(dot)info>, "'Fujii Masao'" <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>, "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: [HACKERS] HS/SR on AIX
Date: 2010-08-27 20:21:08
Message-ID: 5269.1282940468@sss.pgh.pa.us
Lists: pgsql-admin pgsql-hackers

Alanoly Andrews <alanolya(at)invera(dot)com> writes:
> I have tried all the LIBPATH's that you suggested. Besides, I don't
> think the problem is that postgres cannot find the "libpqwalreceiver"
> library. It does find it, but crashes on loading it.

I thought the point of Steve's remarks was that it might be loading
the wrong version, either of libpqwalreceiver.so itself or libpq.so.

regards, tom lane