Re: recovery_target_xid & crashes on the master

Lists: pgsql-hackers
From: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
To: Postgresql-Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: recovery_target_xid & crashes on the master
Date: 2007-06-04 16:26:13
Message-ID: 46643D25.7050109@phlo.org

Hi

I'm currently working on splitting StartupXLog into smaller
parts, because I need to reuse some of the parts for concurrent
WAL recovery (for my GSoC project).

The function recoveryStopsHere in xlog.c checks if we should
stop recovery due to the values of recovery_target_xid and
recovery_target_time. For recovery_target_xid, we stop if
we see a commit or abort record for the given xid.

Now I wonder what happens if an (admittedly rather confused) DBA
uses the xid of a transaction that was aborted because of a
crash of the master as recovery_target_xid. The way I read the
code, postgres will just recover until it reaches the end of
the xlog in that case, because neither a COMMIT nor an ABORT
for that xid exists in the WAL.
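
To make the corner case concrete, here is a minimal sketch of the xid check as described above. It is a simplified model, not the code in xlog.c: the record layout and the names WalRecord, stops_here and replay_until_xid are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified record kinds; not PostgreSQL's real WAL format. */
typedef enum { REC_OTHER, REC_COMMIT, REC_ABORT } RecKind;

typedef struct {
    RecKind  kind;
    uint32_t xid;
} WalRecord;

/* Models the xid branch of the check: only a COMMIT or ABORT record
 * for the target xid makes recovery stop. */
bool stops_here(const WalRecord *rec, uint32_t target_xid)
{
    if (rec->kind != REC_COMMIT && rec->kind != REC_ABORT)
        return false;
    return rec->xid == target_xid;
}

/* Replays the records in order. Returns the index of the record at which
 * recovery stopped, or n if it ran to the end of the log. */
size_t replay_until_xid(const WalRecord *log, size_t n, uint32_t target_xid)
{
    assert(log != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (stops_here(&log[i], target_xid))
            return i;
    }
    return n;
}
```

If the target xid was in flight when the master crashed, the log contains other records for it but no COMMIT or ABORT, so replay_until_xid returns n: recovery runs to the end of the log.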

I'm not sure if this is worth fixing - it seems like a rather
contrived corner case - but I thought I'd bring it up...

greetings, Florian Pflug


From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
Cc: "Postgresql-Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: recovery_target_xid & crashes on the master
Date: 2007-06-04 18:12:54
Message-ID: 1180980774.2870.56.camel@silverbirch.site

On Mon, 2007-06-04 at 18:26 +0200, Florian G. Pflug wrote:

> The function recoveryStopsHere in xlog.c checks if we should
> stop recovery due to the values of recovery_target_xid and
> recovery_target_time. For recovery_target_xid, we stop if
> we see a commit or abort record for the given xid.
>
> Now I wonder what happens if an (admittedly rather confused) DBA
> uses the xid of a transaction that was aborted because of a
> crash of the master as recovery_target_xid. The way I read the
> code, postgres will just recover until it reaches the end of
> the xlog in that case, because neither a COMMIT nor an ABORT
> for that xid exists in the WAL.
>
> I'm not sure if this is worth fixing - it seems like a rather
> contrived corner case - but I thought I'd bring it up...

Currently, use of recovery_target_xid overrides recovery_target_time
because the first one is exact.

It would be possible to have *both*, so you could set one as a backstop
for the other. But you wouldn't do that unless you thought the xid might
be wrong, in which case why are you using it?

There's nothing to stop you specifying a stop time after the crash time
either, in which case we just go to end of logs.
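
As a rough illustration of both points, here is a sketch of what a combined check could look like. This is not how xlog.c behaves today (the xid currently overrides the time). The names EndRecord, Target and replay_until are hypothetical, and the sketch assumes the time target stops at the first commit or abort record stamped later than the target time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified record; timestamps are in arbitrary units. */
typedef struct {
    bool     is_commit_or_abort;
    uint32_t xid;
    int64_t  time;   /* only meaningful for commit/abort records */
} EndRecord;

/* Hypothetical target: either criterion may be enabled, or both. */
typedef struct {
    bool     use_xid;
    uint32_t xid;
    bool     use_time;
    int64_t  time;
} Target;

/* Returns the index of the record at which recovery stops, or n if no
 * criterion ever matches and recovery runs to the end of the log. */
size_t replay_until(const EndRecord *log, size_t n, Target t)
{
    assert(log != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!log[i].is_commit_or_abort)
            continue;
        if (t.use_xid && log[i].xid == t.xid)
            return i;               /* exact target reached */
        if (t.use_time && log[i].time > t.time)
            return i;               /* backstop: passed the target time */
    }
    return n;
}
```

With only a stop time later than the last record, or only an xid that never committed or aborted, this returns n, i.e. end of logs. A time backstop changes the result only when it is set earlier than the crash.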

So I'd say no change required, this time.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com