Archive recovery crashes on win32 in HEAD - hot standby related?

Lists: pgsql-hackers
From: Magnus Hagander <magnus(at)hagander(dot)net>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Archive recovery crashes on win32 in HEAD - hot standby related?
Date: 2010-01-16 13:19:28
Message-ID: 9837222c1001160519s2c947867p8e7c4f50dd99a99c@mail.gmail.com

I was going to test the walreceiver stuff, but it turns out that basic
archive recovery appears to be broken in HEAD. From what I can tell,
it's related to Hot Standby code.
I get this (this is all on win32 - I got the same on win64, but moved
back to win32 to make sure it's not an issue with the win64 code):

LOG: restored log file "000000010000000000000001" from archive
LOG: automatic recovery in progress
LOG: initializing recovery connections
LOG: redo starts at 0/1000020
LOG: consistent recovery state reached at 0/1000050
LOG: startup process (PID 1348) was terminated by exception 0xC0000005
HINT: See C include file "ntstatus.h" for a description of the hexadecimal value.
LOG: terminating any other active server processes

Stacktrace:
postgres!hash_seq_init(struct HASH_SEQ_STATUS * status = 0x00002d66, struct HTAB * hashp = 0x00000001)+0x13
postgres!KnownAssignedXidsRemoveMany(unsigned int xid = 0, char keepPreparedXacts = 1 '')+0x73
postgres!ProcArrayApplyRecoveryInfo(struct RunningTransactionsData * running = 0x00002d66)+0x1a
postgres!standby_redo(struct XLogRecPtr lsn = struct XLogRecPtr, struct XLogRecord * record = 0x00000000)+0x80
postgres!StartupXLOG(void)+0xcda
postgres!StartupProcessMain(void)+0x91
postgres!AuxiliaryProcessMain(int argc = <Memory access error>, char ** argv = <Memory access error>)+0x435
postgres!SubPostmasterMain(int argc = 3614962, char ** argv = 0x003728fd)+0x2b2
postgres!main(int argc = <Memory access error>, char ** argv = <Memory access error>)+0x168
postgres!__tmainCRTStartup(void)+0x10f
WARNING: Stack unwind information not available. Following frames may be wrong.
KERNEL32!BaseProcessInitPostImport+0x8d

This is a very trivial install - one master with zero activity, archiving
with plain "copy" commands...

Not knowing that code very well at this time, but is this perhaps a
structure not being properly initialized in EXEC_BACKEND case?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Archive recovery crashes on win32 in HEAD - hot standby related?
Date: 2010-01-16 13:28:48
Message-ID: 603c8f071001160528pbc3ab56je1b5d03a2d3677b@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jan 16, 2010 at 8:19 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> Not knowing that code very well at this time, but is this perhaps a
> structure not being properly initialized in EXEC_BACKEND case?

It looks like KnownAssignedXidsHash is not initialized. That's
supposed to happen when CreateSharedProcArray calls
KnownAssignedXidsInit, but that only happens for the first process to
call that function... but without EXEC_BACKEND it'll just work anyway.
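
[Illustration, not part of the original message] The failure mode described
above can be sketched in isolation. This is not the PostgreSQL source and not
the fix that was later committed; every name here (SharedArea, find_or_create,
attach_buggy, attach_fixed, simulate_exec, read_slot) is hypothetical. The
assumption is that a process-local pointer is assigned only in the branch taken
by the first process to create the shared structure. A forked child inherits
that pointer, so the omission goes unnoticed; an exec'd child starts with it
unset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a structure living in shared memory (hypothetical). */
typedef struct SharedArea
{
    bool initialized;
    int  table[8];
} SharedArea;

/* Pretend this lives in a shared segment that outlives any one process. */
static SharedArea shared_segment;

/* Process-local handle; after an exec it starts out NULL again. */
static SharedArea *local_handle = NULL;

/* Find-or-create: reports through *found whether the area already existed. */
SharedArea *find_or_create(bool *found)
{
    *found = shared_segment.initialized;
    shared_segment.initialized = true;
    return &shared_segment;
}

/* Buggy variant: the local handle is set only by the creating process. */
void attach_buggy(void)
{
    bool        found;
    SharedArea *area = find_or_create(&found);

    if (!found)
    {
        memset(area->table, 0, sizeof(area->table));
        local_handle = area;    /* later processes never reach this line */
    }
}

/* Corrected variant: shared contents are initialized once, but every
 * process records its own handle, whether or not it was first. */
void attach_fixed(void)
{
    bool        found;
    SharedArea *area = find_or_create(&found);

    if (!found)
        memset(area->table, 0, sizeof(area->table));
    local_handle = area;
}

/* Models exec: process-local state is lost, shared state persists. */
void simulate_exec(void)
{
    local_handle = NULL;
}

/* Any use of the handle requires that this process attached to the area. */
int read_slot(int i)
{
    assert(local_handle != NULL);
    return local_handle->table[i];
}
```

With attach_buggy, a second call after simulate_exec leaves local_handle NULL,
so the next read_slot would trip the assertion, which is the analogue of the
access violation in the trace. With attach_fixed the handle is valid in every
process.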

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Archive recovery crashes on win32 in HEAD - hot standby related?
Date: 2010-01-16 16:10:00
Message-ID: 7286.1263658200@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Sat, Jan 16, 2010 at 8:19 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>> Not knowing that code very well at this time, but is this perhaps a
>> structure not being properly initialized in EXEC_BACKEND case?

> It looks like KnownAssignedXidsHash is not initialized. That's
> supposed to happen when CreateSharedProcArray calls
> KnownAssignedXidsInit, but that only happens for the first process to
> call that function... but without EXEC_BACKEND it'll just work anyway.

That code is completely broken as far as the division of labor between
"first" and not "first" is concerned ...

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Archive recovery crashes on win32 in HEAD - hot standby related?
Date: 2010-01-16 17:18:54
Message-ID: 23678.1263662334@sss.pgh.pa.us
Lists: pgsql-hackers

Magnus Hagander <magnus(at)hagander(dot)net> writes:
> I was going to test the walreceiver stuff, but it turns out that basic
> archive recovery appears to be broken in HEAD. From what I can tell,
> it's related to Hot Standby code.

I've committed a fix that makes it work in EXEC_BACKEND case on Unix.
Can't tell if there are any Windows-specific issues.

regards, tom lane


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Archive recovery crashes on win32 in HEAD - hot standby related?
Date: 2010-01-17 12:56:25
Message-ID: 9837222c1001170456m5b0bee59l2109e639223fb5a0@mail.gmail.com
Lists: pgsql-hackers

2010/1/16 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Magnus Hagander <magnus(at)hagander(dot)net> writes:
>> I was going to test the walreceiver stuff, but it turns out that basic
>> archive recovery appears to be broken in HEAD. From what I can tell,
>> it's related to Hot Standby code.
>
> I've committed a fix that makes it work in EXEC_BACKEND case on Unix.
> Can't tell if there are any Windows-specific issues.

Seems to have worked - I can confirm I can now do archive recovery again.

Seems streaming replication is broken though, rebuilding a debug build
to see if I can figure out why.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/