Re: WAL sync behaviour

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Scott Marlowe <smarlowe(at)g2switchworks(dot)com>
Cc: Michael Stone <mstone+postgres(at)mathom(dot)us>, pgsql-performance(at)postgresql(dot)org
Subject: Re: WAL sync behaviour
Date: 2005-11-10 16:39:34
Message-ID: 9791.1131640774@sss.pgh.pa.us
Lists: pgsql-performance

Scott Marlowe <smarlowe(at)g2switchworks(dot)com> writes:
> On Thu, 2005-11-10 at 08:43, Michael Stone wrote:
>> There's no reason to use a journaled filesystem for the wal. Use ext2 in
>> preference to ext3.

> Not from what I understood. Ext2 can't guarantee that your data will
> even be there in any form after a crash. I believe only metadata
> journaling is needed though.

No, Mike is right: for WAL you shouldn't need any journaling. This is
because we zero out *and fsync* an entire WAL file before we ever
consider putting live WAL data in it. During live use of a WAL file,
its metadata is not changing. As long as the filesystem follows
the minimal rule of syncing metadata about a file when it fsyncs the
file, all the live WAL files should survive crashes OK.
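To make the sequence above concrete, here is a minimal sketch in Python rather than the actual PostgreSQL C code. The function names, the 8 kB write block, and the 16 MB default size are assumptions chosen for illustration. The file is filled with zeros and fsynced once, and later writes only overwrite bytes inside the already allocated range, so the file's length and block allocation stay as they were.

```python
import os

SEGMENT_SIZE = 16 * 1024 * 1024  # assumed segment size, for illustration only
BLOCK = 8192                     # assumed write block size


def _write_all(fd, data):
    """Write all of `data`, looping because os.write may write only part of it."""
    view = memoryview(data)
    while len(view) > 0:
        written = os.write(fd, view)
        view = view[written:]


def preallocate_segment(path, size=SEGMENT_SIZE, block=BLOCK):
    """Create a file of `size` zero bytes and fsync it before any live use."""
    zeros = bytes(block)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        remaining = size
        while remaining > 0:
            chunk = min(block, remaining)
            _write_all(fd, zeros[:chunk])
            remaining -= chunk
        os.fsync(fd)  # data and file length reach disk before the file is used
    finally:
        os.close(fd)


def write_record(path, offset, payload):
    """Overwrite bytes inside the preallocated range, then fsync."""
    fd = os.open(path, os.O_WRONLY)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        _write_all(fd, payload)
        os.fsync(fd)
    finally:
        os.close(fd)
```

Calling preallocate_segment() once and then write_record() any number of times within the preallocated range leaves os.path.getsize() unchanged, which is the property the argument relies on.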

We can afford to do this mainly because WAL files can normally be
recycled instead of created afresh, so the zero-out overhead doesn't
get paid during normal operation.
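Recycling can be sketched the same way. This is only an illustration under assumed names (recycle_segment and the paths passed to it are hypothetical), not how the server itself is coded: an old, fully allocated file is renamed for reuse, so no new blocks have to be allocated or zeroed. The directory fsync is an assumption of this sketch about how one would make the rename durable on a POSIX system.

```python
import os


def recycle_segment(old_path, new_path):
    """Reuse an already allocated file under a new name instead of creating one."""
    os.rename(old_path, new_path)
    # Assumption of this sketch: fsync the containing directory so the
    # rename itself survives a crash (works on POSIX systems).
    dir_fd = os.open(os.path.dirname(new_path) or ".", os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return new_path
```

The renamed file keeps its length and its disk blocks; only the directory entry changes.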

You do need metadata journaling for all non-WAL PG files, since we don't
fsync them every time we extend them. Without journaling, the filesystem
could lose track of which disk blocks belong to such a file.

regards, tom lane
