Re: Odd corruption issue reported on dba.stackexchange.com, need advice

Lists: pgsql-general
From: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Odd corruption issue reported on dba.stackexchange.com, need advice
Date: 2012-07-19 03:35:05
Message-ID: 50078069.5050501@ringerc.id.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Hi all

I've been following a strange issue on dba.stackexchange.com and I could
use opinions from those more clueful than I on the log files posted there.

http://dba.stackexchange.com/questions/20959/recover-postgresql-database-from-wal-errors-on-startup/20961#comment34356_20961

The short version is that the person did a bulk-load of some PostGIS
data using the osm2pgsql data-loader tool (
http://wiki.openstreetmap.org/wiki/Osm2pgsql) to populate a Pg 9.1
database. This appeared to succeed, but when the server was restarted it
failed to come up, complaining that "WAL contains references to invalid
pages" and "page 1493172 of relation base/16385/477861 was
uninitialized" (for many different pages).

The logs of the shutdown suggest that a backend probably crashed, but
that shouldn't cause the WAL and heap corruption observed by the OP.

It's interesting to observe the presence of both a fast and an immediate
shutdown request in the log.

A trimmed log follows, original linked to in the article above:

LOG: received fast shutdown request
LOG: aborting any active transactions
LOG: received immediate shutdown request
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
LOG: could not send data to client: Broken pipe
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
LOG: could not send data to client: Broken pipe

[Note: The posted log of the shutdown ends here. The poster asserts that
this is the complete log and that the following statement is the next
one in their log. Very strange.]

LOG: database system was interrupted; last known up at 2012-07-13
00:15:20 UTC
LOG: database system was not properly shut down; automatic recovery in
progress
LOG: redo starts at 4D/A25FE548
LOG: unexpected pageaddr 4D/6F5C6000 in log file 77, segment 196,
offset 6053888
LOG: redo done at 4D/C45C5278
WARNING: page 1493172 of relation base/16385/477861 was uninitialized
WARNING: page 2247965 of relation base/16385/477861 was uninitialized
WARNING: page 1493172 of relation base/16385/477861 was uninitialized
PANIC: WAL contains references to invalid pages
LOG: startup process (PID 21574) was terminated by signal 6: Aborted
LOG: aborting startup due to startup process failure

--
Craig Ringer


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Odd corruption issue reported on dba.stackexchange.com, need advice
Date: 2012-07-24 00:48:51
Message-ID: 1343090931.18608.95.camel@sussancws0025
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Thu, 2012-07-19 at 11:35 +0800, Craig Ringer wrote:
> The short version is that the person did a bulk-load of some PostGIS
> data using the osm2pgsql data-loader tool
> ( http://wiki.openstreetmap.org/wiki/Osm2pgsql) to populate a Pg 9.1
> database. This appeared to succeed, but when the server was restarted
> it failed to come up, complaining that "WAL contains references to
> invalid pages" and "page 1493172 of relation base/16385/477861 was
> uninitialized" (for many different pages).
>
> The logs of the shutdown suggest that a backend probably crashed, but
> that shouldn't cause the WAL and heap corruption observed by the OP.

Is it possible that the machine has write cache enabled?

Regards,
Jeff Davis


From: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Odd corruption issue reported on dba.stackexchange.com, need advice
Date: 2012-07-24 01:16:17
Message-ID: 500DF761.4090309@ringerc.id.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On 07/24/2012 08:48 AM, Jeff Davis wrote:
> On Thu, 2012-07-19 at 11:35 +0800, Craig Ringer wrote:
>> The short version is that the person did a bulk-load of some PostGIS
>> data using the osm2pgsql data-loader tool
>> ( http://wiki.openstreetmap.org/wiki/Osm2pgsql) to populate a Pg 9.1
>> database. This appeared to succeed, but when the server was restarted
>> it failed to come up, complaining that "WAL contains references to
>> invalid pages" and "page 1493172 of relation base/16385/477861 was
>> uninitialized" (for many different pages).
>>
>> The logs of the shutdown suggest that a backend probably crashed, but
>> that shouldn't cause the WAL and heap corruption observed by the OP.
> Is it possible that the machine has write cache enabled?
It shouldn't matter - only PostgreSQL was restarted, not the whole
machine - and cleanly at that. Very strange.

--
Craig Ringer


From: Marcin Mańk <marcin(dot)mank(at)gmail(dot)com>
To: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Odd corruption issue reported on dba.stackexchange.com, need advice
Date: 2012-07-24 09:31:07
Message-ID: CAK61fk4UR1JtPsS+6SgL6kn+u3VNNqVVumYfqYKx4_kAUZJdDw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Tue, Jul 24, 2012 at 3:16 AM, Craig Ringer <ringerc(at)ringerc(dot)id(dot)au> wrote:

> It shouldn't matter - only PostgreSQL was restarted, not the whole machine
>> - and cleanly at that. Very strange.
>
>
> look at pg_ctlcluster, (which does the restart):

if (!fork()) {
close STDOUT;
exec $pg_ctl, '-D', $info{'pgdata'}, '-s', '-w', '-m', 'fast',
@pg_ctl_opts_from_cli, 'stop';
} else {
wait;
$result = $? >> 8;
}

# try harder if "fast" mode does not work
if (-f $info{'pgdata'}.'/postmaster.pid') {
print "(does not shutdown gracefully, now stopping
immediately)\n";
$result = system $pg_ctl, '-D', $info{'pgdata'}, '-s', '-w',
'-m', 'immediate', @pg_ctl_opts_from_cli, 'stop';
}

# if that still not helps, use the big hammer
if (-f $info{'pgdata'}.'/postmaster.pid') {
print "(does not shutdown, killing the process)";
$pid = get_running_pid $info{'pgdata'}.'/postmaster.pid';
kill (9, $pid) if $pid;
unlink $info{'pgdata'}.'/postmaster.pid';
$result = 0;
}

Could the "big hammer mode" be what's killed the database?

Greetings
Marcin


From: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
To: Marcin Mańk <marcin(dot)mank(at)gmail(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Odd corruption issue reported on dba.stackexchange.com, need advice
Date: 2012-07-24 13:48:36
Message-ID: 500EA7B4.3000608@ringerc.id.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On 07/24/2012 05:31 PM, Marcin Mańk wrote:
> On Tue, Jul 24, 2012 at 3:16 AM, Craig Ringer <ringerc(at)ringerc(dot)id(dot)au
> <mailto:ringerc(at)ringerc(dot)id(dot)au>> wrote:
>
> It shouldn't matter - only PostgreSQL was restarted, not the
> whole machine - and cleanly at that. Very strange.
>
> # if that still not helps, use the big hammer
> if (-f $info{'pgdata'}.'/postmaster.pid') {
> print "(does not shutdown, killing the process)";
> $pid = get_running_pid $info{'pgdata'}.'/postmaster.pid';
> kill (9, $pid) if $pid;
> unlink $info{'pgdata'}.'/postmaster.pid';
> $result = 0;
> }
>
> Could the "big hammer mode" be what's killed the database?

Gah, that's not very wise. While the DB is probably busy doing an
immediate shutdown with I/O in flight and backends chatting, let's just
KILL IT with no chance of recovery!

I don't know whether PostgreSQL is supposed to cope in this situation or
if it counts as abuse beyond what's reasonable. Either way it this is
the cause then either Pg or the shutdown script have a potential data
corruption bug.

Tom?

--
Craig Ringer


From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
Cc: Marcin Mańk <marcin(dot)mank(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Odd corruption issue reported on dba.stackexchange.com, need advice
Date: 2012-07-24 14:20:12
Message-ID: CAOR=d=2S0twr5joUQTsuHJ-7mgKUdK2a-X2XVLjA7FGVB-QRcg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Tue, Jul 24, 2012 at 7:48 AM, Craig Ringer <ringerc(at)ringerc(dot)id(dot)au> wrote:
> On 07/24/2012 05:31 PM, Marcin Mańk wrote:
>
> On Tue, Jul 24, 2012 at 3:16 AM, Craig Ringer <ringerc(at)ringerc(dot)id(dot)au> wrote:
>>>
>>> It shouldn't matter - only PostgreSQL was restarted, not the whole
>>> machine - and cleanly at that. Very strange.
>
> # if that still not helps, use the big hammer
> if (-f $info{'pgdata'}.'/postmaster.pid') {
> print "(does not shutdown, killing the process)";
> $pid = get_running_pid $info{'pgdata'}.'/postmaster.pid';
> kill (9, $pid) if $pid;
> unlink $info{'pgdata'}.'/postmaster.pid';
> $result = 0;
> }
>
> Could the "big hammer mode" be what's killed the database?
>
>
> Gah, that's not very wise. While the DB is probably busy doing an immediate
> shutdown with I/O in flight and backends chatting, let's just KILL IT with
> no chance of recovery!
>
> I don't know whether PostgreSQL is supposed to cope in this situation or if
> it counts as abuse beyond what's reasonable. Either way it this is the cause
> then either Pg or the shutdown script have a potential data corruption bug.

Yes PG should theoretically survive be able to survive anything as
long as fsync is being properly honored.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
Cc: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>, Marcin Mańk <marcin(dot)mank(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Odd corruption issue reported on dba.stackexchange.com, need advice
Date: 2012-07-24 14:45:16
Message-ID: 21697.1343141116@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com> writes:
> On Tue, Jul 24, 2012 at 7:48 AM, Craig Ringer <ringerc(at)ringerc(dot)id(dot)au> wrote:
>>> # if that still not helps, use the big hammer
>>> if (-f $info{'pgdata'}.'/postmaster.pid') {
>>> print "(does not shutdown, killing the process)";
>>> $pid = get_running_pid $info{'pgdata'}.'/postmaster.pid';
>>> kill (9, $pid) if $pid;
>>> unlink $info{'pgdata'}.'/postmaster.pid';
>>> $result = 0;
>>> }

>> Could the "big hammer mode" be what's killed the database?

> Yes PG should theoretically survive be able to survive anything as
> long as fsync is being properly honored.

I will tell you what is horridly, horridly dangerous and stupid about
that script, and it's not the kill -9 on the postmaster. It's the
forced unlink on the postmaster.pid file, which (a) is entirely
unnecessary, and (b) defeats the safety interlock against starting
a new postmaster before all the old backends have flushed out.

Postgres will survive a postmaster kill just fine; that scenario
gets exercised fairly regularly, because of the Linux OOM killer :-(.
It will not survive having two independent sets of backends scribbling
on the same database, but that's what this script opens you up to.
If you ever used the "big hammer" and then started a new postmaster
before being entirely sure all the old postmaster's child processes
were gone, then that's why you have a corrupt database.

regards, tom lane


From: Marcin Mańk <marcin(dot)mank(at)gmail(dot)com>
To: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>, mpitt(at)debian(dot)org
Subject: Re: Odd corruption issue reported on dba.stackexchange.com, need advice
Date: 2012-07-25 07:41:58
Message-ID: CAK61fk49SZk3Cua8Rt3R+wE7hw2YmqXw2EDTJS33eB=7WTeLiQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

I privately pointed Martin Pitt (Debian maintainer) to this
discussion, his response below.

Martin, I believe what happened is:
the original complainer did /etc/init.d/postgresql restart
this called pg_ctlcluster --force restart
this effectively did: pg_ctlcluster --force stop (which removed the
pidfile); pg_ctlcluster start;
boom.

On Wed, Jul 25, 2012 at 6:38 AM, Martin Pitt <mpitt(at)debian(dot)org> wrote:
> Thanks for pointing this out!

> This mode (--force) is not on by default for command line operation,
> i. e. if you use pg_ctlcluster stop manually it will use the default
> "smart" mode and nothing else. --force is just used in the init
> script's "stop" mode, i. e. when the machine is shutting down. I tried
> to come up with something that guarantees that psql is down after that
> (we are restarting the machine -- it WILL be killed, the question is
> just how to make that as graceful as possible).

> I am happy about any improvements there. I saw Tom's mail about
> removing the .pid file; as the "start" mode checks for a stale pid
> file and cleans it up if the pid does not exist any more, or bails out
> if it still does, it is redundant. With the pointed out race condition
> I do agree that this is a dangerous thing to do, so I removed it in
> the development branch:

> http://anonscm.debian.org/loggerhead/pkg-postgresql/postgresql-common/trunk/revision/1181

> I'll see to getting that into Debian Wheezy as well (which is
> currently frozen).

> Please feel free to forward this to the thread, as I don't have the
> original mail to reply to.


From: Marcin Mańk <marcin(dot)mank(at)gmail(dot)com>
To: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>, mpitt(at)debian(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Odd corruption issue reported on dba.stackexchange.com, need advice
Date: 2012-07-25 07:47:49
Message-ID: CAK61fk7D4x0AFd8DQcRtt8+MriYn3pnmBcX=PAdCe20ehoX2pQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

> On Wed, Jul 25, 2012 at 6:38 AM, Martin Pitt <mpitt(at)debian(dot)org> wrote:
>> http://anonscm.debian.org/loggerhead/pkg-postgresql/postgresql-common/trunk/revision/1181
>

http://anonscm.debian.org/loggerhead/pkg-postgresql/postgresql-common/trunk/annotate/1181/pg_ctlcluster

I still see some pid file unlinking there. Is this actually needed /
not dangerous?

Greetings
Marcin Mańk