Change "database system identifier" in pg_control?

Lists: pgsql-admin
From: Joshua Colson <joshua(dot)colson(at)ination(dot)com>
To: pgsql-admin(at)postgresql(dot)org
Subject: Change "database system identifier" in pg_control?
Date: 2006-12-14 23:40:40
Message-ID: 1166139640.7183.27.camel@s60r.ination.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Hello everyone,

I'm trying to change the SYSID in pg_control (at least I think that is
where it is) due to a catastrophic backup system failure, combined with
a timely administrative "oops". Does anyone know how I might go about
doing that?

The back story, for those interested:

We lost *all* of our recent backups due to massive hardware failure. A
new database was added within the last couple of weeks and yesterday
morning one of our admins toasted it. We've got a pg_dumpall from after
the database was created and all of the archive logs since. I was trying
to recreate the whole system on another machine to attempt restoration
of the data. The first problem that I ran into was the checkpoint was
incorrectly set (due to the system being a new installation). I was able
to fix that (I think) with the pg_resetxlog command but a subsequent
recovery attempt yields the following errors in the log:

2006-12-14 22:42:09 MST LOG: restored log file
"000000010000002300000079" from archive
2006-12-14 22:42:09 MST LOG: WAL file is from different system
2006-12-14 22:42:09 MST DETAIL: WAL file SYSID is
4965793265546637307, pg_control SYSID is 5008561987249897439
2006-12-14 22:42:09 MST LOG: invalid primary checkpoint record
2006-12-14 22:42:09 MST LOG: invalid secondary checkpoint link in
control file
2006-12-14 22:42:09 MST PANIC: could not locate a valid checkpoint
record
2006-12-14 22:42:09 MST LOG: startup process (PID 15145) was
terminated by signal 6
2006-12-14 22:42:09 MST LOG: aborting startup due to startup process
failure
2006-12-14 22:42:09 MST LOG: logger shutting down

So I thought I would try to change the SYSID to 4965793265546637307 and
have another shot at it.

I'm not a PG guru so please let me know, is there any chance that what
I'm doing will be successful, or am I just wasting my time?

Thank you very much for any help you can provide!

--
Joshua Colson <joshua(dot)colson(at)ination(dot)com>
iNation, LLC


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joshua Colson <joshua(dot)colson(at)ination(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Change "database system identifier" in pg_control?
Date: 2006-12-15 00:10:19
Message-ID: 29504.1166141419@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Joshua Colson <joshua(dot)colson(at)ination(dot)com> writes:
> So I thought I would try to change the SYSID to 4965793265546637307 and
> have another shot at it.
> I'm not a PG guru so please let me know, is there any chance that what
> I'm doing will be successful, or am I just wasting my time?

You'd have to modify pg_resetxlog to force that value into the sysid
rather than whatever is there. Hand-editing wouldn't work very well
because of (a) the CRC on pg_control and (b) the need to copy the
value into extant WAL segments.

regards, tom lane


From: Joshua Colson <joshua(dot)colson(at)ination(dot)com>
To: pgsql-admin(at)postgresql(dot)org
Subject: Re: Change "shutdown time" was: Change "database system
Date: 2006-12-15 15:23:16
Message-ID: 1166196196.6761.9.camel@s60r.ination.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

On Thu, 2006-12-14 at 19:10 -0500, Tom Lane wrote:
> You'd have to modify pg_resetxlog to force that value into the sysid
> rather than whatever is there. Hand-editing wouldn't work very well
> because of (a) the CRC on pg_control and (b) the need to copy the
> value into extant WAL segments.

Well, I ended up just using the pg_control file from the running
installation. The seemed to work in combination with pg_resetxlog but
now the restore is not working because the postmaster has a shutdown
timestamp of today's date and the recovery is thinking that it has
already recovered past the proper point in time. :(

Does anyone know how I can convince the postmaster to believe it was
shutdown at approximately the time the pg_dumpall happened?

Thank you all again for any assistance!

--
Joshua Colson <joshua(dot)colson(at)ination(dot)com>
iNation, LLC


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joshua Colson <joshua(dot)colson(at)ination(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Change "shutdown time" was: Change "database system
Date: 2006-12-15 17:57:04
Message-ID: 12366.1166205424@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Joshua Colson <joshua(dot)colson(at)ination(dot)com> writes:
> Well, I ended up just using the pg_control file from the running
> installation.

The problem with that is that it'll have a fairly current
last-checkpoint pointer. You really need a last-checkpoint pointer
pointing at where the rollforward needs to start.

regards, tom lane


From: Joshua Colson <joshua(dot)colson(at)ination(dot)com>
To: pgsql-admin(at)postgresql(dot)org
Subject: Re: Change "shutdown time" was: Change "database system
Date: 2006-12-15 18:07:46
Message-ID: 1166206066.6761.14.camel@s60r.ination.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

On Fri, 2006-12-15 at 12:57 -0500, Tom Lane wrote:
> The problem with that is that it'll have a fairly current
> last-checkpoint pointer. You really need a last-checkpoint pointer
> pointing at where the rollforward needs to start.

Can't I just reset the last-checkpoint pointer using pg_resetxlog, or
will it not do that?

Thanks.

--
Joshua Colson <joshua(dot)colson(at)ination(dot)com>
iNation, LLC