Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

From: Anthony Iliopoulos <ailiop(at)altatus(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Greg Stark <stark(at)mit(dot)edu>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Geoff Winkless <pgsqladmin(at)geoff(dot)dj>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Catalin Iacob <iacobcatalin(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date: 2018-04-09 19:44:31
Message-ID: 20180409194431.GD18969@technoir
Lists: pgsql-hackers

On Mon, Apr 09, 2018 at 12:29:16PM -0700, Andres Freund wrote:
> On 2018-04-09 21:26:21 +0200, Anthony Iliopoulos wrote:
> > What about having buffered IO with implied fsync() atomicity via
> > O_SYNC?
>
> You're kidding, right? We could also just add sleep(30)'s all over the
> tree, and hope that that'll solve the problem. There's a reason we
> don't permanently fsync everything. Namely that it'll be way too slow.

I am assuming you could apply the same principle selectively, using O_SYNC
only at the times and places where you currently call fsync().

I am also assuming that you'd want a backwards-compatible solution for
all those kernels that don't keep the dirty pages around after a failed
writeback, irrespective of future fixes. Short of loading a kernel module
and dealing with the problem directly, the only other available options
seem to be O_SYNC, O_DIRECT, or ignoring the issue.
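
To make the O_SYNC option concrete, here is a minimal sketch, not
PostgreSQL code. The helper name sync_write_at() and its signature are
hypothetical, invented for illustration. It assumes a POSIX system where
a descriptor opened with O_SYNC makes each write return only after the
data has been handed to stable storage, so an I/O error would be reported
by the write call itself rather than by a later fsync().

```c
#define _XOPEN_SOURCE 700

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (an illustration, not an existing API): write len
 * bytes from buf at the given offset through a descriptor opened with
 * O_SYNC.  Returns 0 on success, or -1 with errno set on failure.
 */
static int
sync_write_at(const char *path, const void *buf, size_t len, off_t offset)
{
    const char *p = buf;
    int         fd;
    int         saved_errno;

    assert(path != NULL && buf != NULL);

    /* O_SYNC applies only to writes issued through this descriptor. */
    fd = open(path, O_WRONLY | O_CREAT | O_SYNC, 0600);
    if (fd < 0)
        return -1;

    while (len > 0)
    {
        ssize_t     n = pwrite(fd, p, len, offset);

        if (n < 0 && errno == EINTR)
            continue;           /* interrupted before writing; retry */

        if (n <= 0)
        {
            /* Treat a zero-length write as an I/O error to avoid looping. */
            saved_errno = (n == 0) ? EIO : errno;
            close(fd);
            errno = saved_errno;
            return -1;
        }

        /* Short write: advance and write the remainder. */
        p += n;
        len -= (size_t) n;
        offset += n;
    }

    return close(fd);
}
```

A caller would use this only for the writes it currently follows with
fsync(), and keep ordinary buffered writes everywhere else. Whether the
per-write cost is acceptable is exactly the performance objection quoted
above.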

Best regards,
Anthony
