Re: Hard limit on WAL space used (because PANIC sucks)

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Christian Ullrich <chris(at)chrullrich(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hard limit on WAL space used (because PANIC sucks)
Date: 2013-06-08 21:10:34
Message-ID: CA+U5nMLYakgymU6LZqSinnYmqqD6h4AQHdRMoZUEkt2HEChqWg@mail.gmail.com
Lists: pgsql-hackers

On 7 June 2013 10:02, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:
> On 07.06.2013 00:38, Andres Freund wrote:
>>
>> On 2013-06-06 23:28:19 +0200, Christian Ullrich wrote:
>>>
>>> * Heikki Linnakangas wrote:
>>>
>>>> The current situation is that if you run out of disk space while writing
>>>> WAL, you get a PANIC, and the server shuts down. That's awful. We can
>>>
>>>
>>>> So we need to somehow stop new WAL insertions from happening, before
>>>> it's too late.
>>>
>>>
>>>> A naive idea is to check if there's enough preallocated WAL space, just
>>>> before inserting the WAL record. However, it's too late to check that in
>>>
>>>
>>> There is a database engine, Microsoft's "Jet Blue" aka the Extensible
>>> Storage Engine, that just keeps some preallocated log files around,
>>> specifically so it can get consistent and halt cleanly if it runs out of
>>> disk space.
>>>
>>> In other words, the idea is not to check over and over again that there
>>> is enough already-reserved WAL space, but to make sure there always is by
>>> having a preallocated segment that is never used outside a disk space
>>> emergency.
>>
>>
>> That's not a bad technique. I wonder how reliable it would be in
>> postgres.
>
>
> That's no different from just having a bit more WAL space in the first
> place. We need a mechanism to stop backends from writing WAL, before you run
> out of it completely. It doesn't matter if the reservation is done by
> stashing away a WAL segment for emergency use, or by a variable in shared
> memory. Either way, backends need to stop using it up, by blocking or
> throwing an error before they enter the critical section.
>
> I guess you could use the stashed away segment to ensure that you can
> recover after PANIC. At recovery, there are no other backends that could use
> up the emergency segment. But that's not much better than what we have now.

Christian's idea seems good to me. It looks like you may be dismissing
it too early, especially since no better idea has emerged.

I doubt that we're going to think of something others didn't already
face. It's pretty clear that most other DBMS do this with their logs.

For your fast WAL insert patch to work well, we need a simple and fast
technique to detect an out-of-space condition.
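
A minimal sketch of what such a check might look like, combining the
stashed-away emergency space with a shared counter that is tested before
the critical section. This is Python for brevity, not PostgreSQL code, and
every name in it (WalSpaceAccount, reserve, the emergency flag,
insert_record) is hypothetical:

```python
import threading


class WalSpaceExhausted(Exception):
    """Raised before the critical section, so it can be an ERROR, not a PANIC."""


class WalSpaceAccount:
    """Hypothetical shared counter of preallocated WAL space, in bytes."""

    def __init__(self, total_bytes, emergency_reserve_bytes):
        self._lock = threading.Lock()
        self._free = total_bytes
        self._reserve = emergency_reserve_bytes  # the stashed-away space

    @property
    def free_bytes(self):
        with self._lock:
            return self._free

    def reserve(self, nbytes, emergency=False):
        """Claim nbytes, or raise if that would eat into the emergency reserve.

        Only emergency work (for example a clean shutdown or recovery) would
        pass emergency=True and be allowed to use the reserved space.
        """
        with self._lock:
            floor = 0 if emergency else self._reserve
            if self._free - nbytes < floor:
                raise WalSpaceExhausted(f"need {nbytes} bytes, {self._free} free")
            self._free -= nbytes

    def release(self, nbytes):
        """Return space, for example after old segments have been recycled."""
        with self._lock:
            self._free += nbytes


def insert_record(account, wal, record):
    # The check happens first, where failing is still safe.
    account.reserve(len(record))
    # From here on is the "critical section": the space is already guaranteed.
    wal.append(record)
```

Ordinary inserts fail with an error once only the reserve is left, while
the reserve itself stays available for getting consistent and halting
cleanly. The lock here is only for clarity; a real implementation would
need something much cheaper on the insert path.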

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
