Re: WAL documentation changes

Lists: pgsql-hackers
From: Michael Renner <michael(dot)renner(at)amd(dot)co(dot)at>
To: bruce(at)momjian(dot)us
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: WAL documentation changes
Date: 2008-12-07 16:19:00
Message-ID: 493BF774.5070000@amd.co.at

Hi,

the comment WRT WAL recovery and FS journals [1] is a bit misleading in
its current form.

First, none of the general purpose filesystems I've seen so far do data
journalling per default, since it's a huge performance penalty, even for
non-RDBMS workloads. The feature you talk about is ext3 specific (and
should be pointed out as such) and only disables write ordering, meaning
that metadata and file content updates are not synchronized.

best regards,
Michael

[1] 64b3d98baaf96afea815b0c37ff918f02fda11c9


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Michael Renner <michael(dot)renner(at)amd(dot)co(dot)at>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WAL documentation changes
Date: 2008-12-10 11:08:45
Message-ID: 200812101108.mBAB8jd00978@momjian.us

Michael Renner wrote:
> Hi,
>
> the comment WRT WAL recovery and FS journals [1] is a bit misleading in
> its current form.
>
> First, none of the general purpose filesystems I've seen so far do data
> journalling per default, since it's a huge performance penalty, even for
> non-RDBMS workloads. The feature you talk about is ext3 specific (and
> should be pointed out as such) and only disables write ordering, meaning
> that metadata and file content updates are not synchronized.

You are right that my docs were misleading. I have improved them by
mentioning that it is the _data_ flush done as part of journalling that
can be a problem, and documented that the mount option listed is
ext3-specific, not Linux-specific.

Updated docs attached. Please let me know if I can improve them
further.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

Attachment: /rtmp/diff (text/x-diff, 1.5 KB)

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Michael Renner <michael(dot)renner(at)amd(dot)co(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WAL documentation changes
Date: 2008-12-10 18:01:29
Message-ID: 494003F9.5050906@agliodbs.com


>> First, none of the general purpose filesystems I've seen so far do data
>> journalling per default, since it's a huge performance penalty, even for
>> non-RDBMS workloads. The feature you talk about is ext3 specific (and
>> should be pointed out as such) and only disables write ordering, meaning
>> that metadata and file content updates are not synchronized.
>
> You are right that my docs were misleading. I have improved them by
> mentioning that it is the _data_ flush done as part of journalling that
> can be a problem, and documented that the mount option listed is
> ext3-specific, not Linux-specific.

Actually, I think that some of the other journalling filesystems allow
data journalling (I know ReiserFS does), they just don't default to it.
For that matter, a few (ZFS in particular) have data journalling which
can't be turned off. While it's not a tuning parameter, users should be
warned that they'll take a performance hit from it.

--Josh


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Michael Renner <michael(dot)renner(at)amd(dot)co(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WAL documentation changes
Date: 2008-12-16 03:44:50
Message-ID: 200812160344.mBG3iov14875@momjian.us

Josh Berkus wrote:
>
> >> First, none of the general purpose filesystems I've seen so far do data
> >> journalling per default, since it's a huge performance penalty, even for
> >> non-RDBMS workloads. The feature you talk about is ext3 specific (and
> >> should be pointed out as such) and only disables write ordering, meaning
> >> that metadata and file content updates are not synchronized.
> >
> > You are right that my docs were misleading. I have improved them by
> > mentioning that it is the _data_ flush done as part of journalling that
> > can be a problem, and documented that the mount option listed is
> > ext3-specific, not Linux-specific.
>
> Actually, I think that some of the other journalling filesystems allow
> data journalling (I know ReiserFS does), they just don't default to it.
> For that matter, a few (ZFS in particular) have data journalling which
> can't be turned off. While it's not a tuning parameter, users should be
> warned that they'll take a performance hit from it.

So I assume you are saying the docs are fine now.


From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: bruce(at)momjian(dot)us
Cc: josh(at)agliodbs(dot)com, michael(dot)renner(at)amd(dot)co(dot)at, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WAL documentation changes
Date: 2008-12-17 09:22:01
Message-ID: 20081217.182201.132921276.t-ishii@sraoss.co.jp

Bruce,

In your document change, which one can be placed on a non-journalling
file system: data, WAL, or both?

To me it is not clear.
--
Tatsuo Ishii
SRA OSS, Inc. Japan


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: josh(at)agliodbs(dot)com, michael(dot)renner(at)amd(dot)co(dot)at, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WAL documentation changes
Date: 2008-12-18 22:22:05
Message-ID: 200812182222.mBIMM5p07309@momjian.us

Tatsuo Ishii wrote:
> Bruce,
>
> In your document change, which one can be placed on a non-journalling
> file system: data, WAL, or both?

Both. I have updated the docs to mention this, patch attached.

Attachment: /rtmp/diff (text/x-diff, 1.2 KB)

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Bruce Momjian" <bruce(at)momjian(dot)us>, "Tatsuo Ishii" <ishii(at)postgresql(dot)org>
Cc: <josh(at)agliodbs(dot)com>,<michael(dot)renner(at)amd(dot)co(dot)at>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL documentation changes
Date: 2008-12-18 22:26:50
Message-ID: 494A79CA.EE98.0025.0@wicourts.gov

>>> Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> Tatsuo Ishii wrote:
>> In your document change, which one can be placed on a non-journalling
>> file system: data, WAL, or both?
>
> Both. I have updated the docs to mention this, patch attached.

Did you mean to say that journaled file systems are *not* necessary?

-Kevin


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Tatsuo Ishii <ishii(at)postgresql(dot)org>, josh(at)agliodbs(dot)com, michael(dot)renner(at)amd(dot)co(dot)at, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WAL documentation changes
Date: 2008-12-18 22:29:31
Message-ID: 200812182229.mBIMTVA08748@momjian.us

Kevin Grittner wrote:
> >>> Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > Tatsuo Ishii wrote:
> >> In your document change, which one can be placed on a non-journalling
> >> file system: data, WAL, or both?
> >
> > Both. I have updated the docs to mention this, patch attached.
>
> Did you mean to say that journaled file systems are *not* necessary?

Yes, not needed for database reliability. The patch text was attached;
was it unclear?


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Bruce Momjian" <bruce(at)momjian(dot)us>
Cc: <josh(at)agliodbs(dot)com>,<michael(dot)renner(at)amd(dot)co(dot)at>, "Tatsuo Ishii" <ishii(at)postgresql(dot)org>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL documentation changes
Date: 2008-12-18 22:32:14
Message-ID: 494A7B0E.EE98.0025.0@wicourts.gov

>>> Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> Kevin Grittner wrote:
>> Did you mean to say that journaled file systems are *not* necessary?
>
> Yes, not needed for database reliability. The patch text was attached;
> was it unclear?

I think you accidentally left out the word "not".

-Kevin


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: josh(at)agliodbs(dot)com, michael(dot)renner(at)amd(dot)co(dot)at, Tatsuo Ishii <ishii(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WAL documentation changes
Date: 2008-12-18 22:34:59
Message-ID: 200812182234.mBIMYxH09531@momjian.us

Kevin Grittner wrote:
> >>> Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > Kevin Grittner wrote:
> >> Did you mean to say that journaled file systems are *not* necessary?
> >
> > Yes, not needed for database reliability. The patch text was attached;
> > was it unclear?
>
> I think you accidentally left out the word "not".

Oops, right, added. Good catch. Warping that sentence into something
that allowed the mention of WAL and data files was obviously too much
for me. ;-)
