Re: Unlogged tables can vanish after a crash

Lists: pgsql-hackers
From: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Unlogged tables can vanish after a crash
Date: 2014-11-19 11:26:56
Message-ID: A737B7A37273E048B164557ADEF4A58B17D9FC1B@ntex2010a.host.magwien.gv.at

I observed an interesting (and I think buggy) behaviour today after one of
our clusters crashed due to an "out of space" condition in the data directory.

Five databases in that cluster each have one unlogged table.

The log reads as follows:

PANIC could not write to file "pg_xlog/xlogtemp.1820": No space left on device
...
LOG terminating any other active server processes
...
LOG all server processes terminated; reinitializing
LOG database system was interrupted; last known up at 2014-11-18 18:04:28 CET
LOG database system was not properly shut down; automatic recovery in progress
LOG redo starts at C9/50403B20
LOG redo done at C9/5AFFFF98
LOG checkpoint starting: end-of-recovery immediate
LOG checkpoint complete: ...
LOG autovacuum launcher started
LOG database system is ready to accept connections
...
PANIC could not write to file "pg_xlog/xlogtemp.4417": No space left on device
...
LOG terminating any other active server processes
...
LOG all server processes terminated; reinitializing
LOG database system was interrupted; last known up at 2014-11-18 18:04:38 CET
LOG database system was not properly shut down; automatic recovery in progress
LOG redo starts at C9/5B000070
LOG redo done at C9/5FFFE4E0
LOG checkpoint starting: end-of-recovery immediate
LOG checkpoint complete: ...
FATAL could not write to file "pg_xlog/xlogtemp.4442": No space left on device
LOG startup process (PID 4442) exited with exit code 1
LOG aborting startup due to startup process failure

After the out-of-space condition was resolved, the cluster was restarted.
The log reads as follows:

LOG ending log output to stderr Future log output will go to log destination "csvlog".
LOG database system was shut down at 2014-11-18 18:05:03 CET
LOG autovacuum launcher started
LOG database system is ready to accept connections

So no crash recovery was performed, probably because the startup process
failed *after* it completed the end-of-recovery checkpoint.

Now the main fork files for all five unlogged tables are gone; the init fork files
are still there.

Obviously the main forks got nuked during recovery, but the startup process died
before it could recreate them:

    /*
     * Preallocate additional log files, if wanted.
     */
    PreallocXlogFiles(EndOfLog);

    /*
     * Reset initial contents of unlogged relations. This has to be done
     * AFTER recovery is complete so that any unlogged relations created
     * during recovery also get picked up.
     */
    if (InRecovery)
        ResetUnloggedRelations(UNLOGGED_RELATION_INIT);

It seems to me that the right fix would be to recreate the unlogged
relations *before* the checkpoint.
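
To make the ordering argument concrete, here is a toy model of the startup
sequence. It is only a sketch and not PostgreSQL source: ToyCluster, ToyOrder,
toy_startup and toy_main_fork_survives are made-up names, and the "cluster" is
reduced to three flags for a single unlogged table. It assumes that crash
recovery removes the main fork, that the end-of-recovery checkpoint marks the
cluster as no longer needing recovery, and that the startup process can die
right after that checkpoint.

```c
#include <stdbool.h>

/* Toy on-disk state; none of these names exist in PostgreSQL. */
typedef struct {
    bool main_fork_present;  /* main fork of one unlogged table */
    bool init_fork_present;  /* init fork; never removed in this model */
    bool needs_recovery;     /* true if the next start must run recovery */
} ToyCluster;

typedef enum {
    RESET_AFTER_CHECKPOINT,  /* ordering shown in the snippet above */
    RESET_BEFORE_CHECKPOINT  /* proposed ordering */
} ToyOrder;

/*
 * Simulate one server start. If fail_after_checkpoint is true, the startup
 * process dies right after the end-of-recovery checkpoint (for example
 * because the disk is still full). Returns true if startup completed.
 */
bool toy_startup(ToyCluster *c, ToyOrder order, bool fail_after_checkpoint)
{
    bool in_recovery = c->needs_recovery;

    if (in_recovery) {
        /* Recovery begins by removing main forks that have an init fork. */
        if (c->init_fork_present)
            c->main_fork_present = false;

        /* Proposed ordering: recreate the main fork before the checkpoint. */
        if (order == RESET_BEFORE_CHECKPOINT && c->init_fork_present)
            c->main_fork_present = true;

        /* End-of-recovery checkpoint: later starts will skip recovery. */
        c->needs_recovery = false;
    }

    if (fail_after_checkpoint)
        return false;

    /* Current ordering: recreate the main fork only after the checkpoint. */
    if (in_recovery && order == RESET_AFTER_CHECKPOINT && c->init_fork_present)
        c->main_fork_present = true;

    c->needs_recovery = true;  /* running server; a crash needs recovery */
    return true;
}

/* One failed start after a crash, then a normal start once space is freed. */
bool toy_main_fork_survives(ToyOrder order)
{
    ToyCluster c = { true, true, true };

    toy_startup(&c, order, true);
    toy_startup(&c, order, false);
    return c.main_fork_present;
}
```

In this model toy_main_fork_survives(RESET_AFTER_CHECKPOINT) returns false,
which matches what I see on disk, while
toy_main_fork_survives(RESET_BEFORE_CHECKPOINT) returns true.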

Yours,
Laurenz Albe


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Unlogged tables can vanish after a crash
Date: 2014-11-19 11:42:49
Message-ID: 20141119114249.GE17845@awork2.anarazel.de

Hi,

On 2014-11-19 11:26:56 +0000, Albe Laurenz wrote:
> I observed an interesting (and I think buggy) behaviour today after one of
> our clusters crashed due to an "out of space" condition in the data directory.

Hah, just a couple of days ago I pushed a fix for that ;)

http://archives.postgresql.org/message-id/20140912112246.GA4984%40alap3.anarazel.de
and
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=d3586fc8aa5d9365a5c50cb5e555971eb633a4ec

> So no crash recovery was performed, probably because the startup process
> failed *after* it completed the end-of-recovery checkpoint.
>
> Now the main fork files for all five unlogged tables are gone; the init fork files
> are still there.

You can "recover" them by stopping the server with -m immediate and starting
it again, which forces another crash recovery.

> It seems to me that the right fix would be to recreate the unlogged
> relations *before* the checkpoint.

Yep, that's what we're doing now.

Greetings,

Andres Freund


From: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Andres Freund *EXTERN*" <andres(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Unlogged tables can vanish after a crash
Date: 2014-11-19 11:56:20
Message-ID: A737B7A37273E048B164557ADEF4A58B17D9FC8D@ntex2010a.host.magwien.gv.at

Andres Freund wrote:
> On 2014-11-19 11:26:56 +0000, Albe Laurenz wrote:
>> I observed an interesting (and I think buggy) behaviour today after one of
>> our clusters crashed due to an "out of space" condition in the data directory.
>
> Hah, just a couple of days ago I pushed a fix for that ;)
>
> http://archives.postgresql.org/message-id/20140912112246.GA4984%40alap3.anarazel.de
> and
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=d3586fc8aa5d9365a5c50cb5e555971eb633a4ec

Thanks, I didn't see that.
PostgreSQL, the database system where your bugs get fixed before you report them!

Yours,
Laurenz Albe