Re: pg_xlog on separate drive

Lists: pgsql-sql
From: "Travis Whitton" <tinymountain(at)gmail(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: pg_xlog on separate drive
Date: 2006-11-30 15:50:46
Message-ID: cf9b4f3e0611300750o29aa04d0if2a44e26252cfddd@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

Hey guys, sorry if this is slightly OT for this list, but I figure it's a
simple question. If I'm storing pg_xlog on a second non-redundant drive
using the symlink method and the journal drive were to crash, how difficult
is recovery? Will Postgresql simply be able to reinitialize the journal on a
new drive and carry on, or is there more to it than that? I realize any
pending transactions would be lost, but that's not a huge concern for me
because everything I'm importing comes from raw data.

Thanks,
Travis


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Travis Whitton" <tinymountain(at)gmail(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: pg_xlog on separate drive
Date: 2006-11-30 16:45:09
Message-ID: 14248.1164905109@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

"Travis Whitton" <tinymountain(at)gmail(dot)com> writes:
> Hey guys, sorry if this is slightly OT for this list, but I figure it's a
> simple question. If I'm storing pg_xlog on a second non-redundant drive
> using the symlink method and the journal drive were to crash, how difficult
> is recovery? Will Postgresql simply be able to reinitialize the journal on a
> new drive and carry on, or is there more to it than that? I realize any
> pending transactions would be lost, but that's not a huge concern for me
> because everything I'm importing comes from raw data.

Losing xlog is pretty bad: there's a serious risk of data corruption, in
that transactions made since your last checkpoint may be only partially
applied. I wouldn't recommend a setup in which xlog is less redundant
than your main storage array.

regards, tom lane


From: Markus Schaber <schabi(at)logix-tt(dot)com>
To: Travis Whitton <tinymountain(at)gmail(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: pg_xlog on separate drive
Date: 2006-12-04 10:56:32
Message-ID: 4573FEE0.2010406@logix-tt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

Hi, Travis,

Travis Whitton wrote:
> Hey guys, sorry if this is slightly OT for this list, but I figure it's
> a simple question. If I'm storing pg_xlog on a second non-redundant
> drive using the symlink method and the journal drive were to crash, how
> difficult is recovery? Will Postgresql simply be able to reinitialize
> the journal on a new drive and carry on, or is there more to it than
> that? I realize any pending transactions would be lost, but that's not a
> huge concern for me because everything I'm importing comes from raw data.

The problem is that you risk inconsistency at data and structural level.

When the server crashes, it might happen that some pages in the data
files are written only partially (because most disks have a much smaller
blocksize than the PostgreSQL page size (which is 8k by default)).

Now, when the server cannot reply the WAL log, those half-written pages
will not be repaired, and your data may be inconsistent at a very low
sematic level (duplicate rows, missing rows, broken rows, backend
crashes etc.) with no way to repair.

HTH,
Markus
--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org


From: "Travis Whitton" <tinymountain(at)gmail(dot)com>
To: "Markus Schaber" <schabi(at)logix-tt(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: pg_xlog on separate drive
Date: 2006-12-04 14:04:57
Message-ID: cf9b4f3e0612040604h320c2d87maa33e209545b74d3@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

Thanks for the replies guys. I think I may be ok in my case though because
I'll be importing data as a single daily batch from raw data. I'll be taking
nightly backups, and in the event of a crash, I can simply restore from a
recent backup and then reimport the raw data. I can now see why losing
pg_xlog would be a big problem if I were inserting and updating data
continuously throughout the day though.

Thanks,
Travis

On 12/4/06, Markus Schaber <schabi(at)logix-tt(dot)com> wrote:
>
> Hi, Travis,
>
> Travis Whitton wrote:
> > Hey guys, sorry if this is slightly OT for this list, but I figure it's
> > a simple question. If I'm storing pg_xlog on a second non-redundant
> > drive using the symlink method and the journal drive were to crash, how
> > difficult is recovery? Will Postgresql simply be able to reinitialize
> > the journal on a new drive and carry on, or is there more to it than
> > that? I realize any pending transactions would be lost, but that's not a
> > huge concern for me because everything I'm importing comes from raw
> data.
>
> The problem is that you risk inconsistency at data and structural level.
>
> When the server crashes, it might happen that some pages in the data
> files are written only partially (because most disks have a much smaller
> blocksize than the PostgreSQL page size (which is 8k by default)).
>
> Now, when the server cannot reply the WAL log, those half-written pages
> will not be repaired, and your data may be inconsistent at a very low
> sematic level (duplicate rows, missing rows, broken rows, backend
> crashes etc.) with no way to repair.
>
> HTH,
> Markus
> --
> Markus Schaber | Logical Tracking&Tracing International AG
> Dipl. Inf. | Software Development GIS
>
> Fight against software patents in Europe! www.ffii.org
> www.nosoftwarepatents.org
>