Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, Anthony Iliopoulos <ailiop(at)altatus(dot)com>, Greg Stark <stark(at)mit(dot)edu>
Cc: Geoff Winkless <pgsqladmin(at)geoff(dot)dj>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Catalin Iacob <iacobcatalin(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date: 2018-04-09 19:54:05
Message-ID: c60ac320-7390-80c4-d012-5b149d59aa40@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 04/09/2018 09:37 PM, Andres Freund wrote:
>
>
> On April 9, 2018 12:26:21 PM PDT, Anthony Iliopoulos <ailiop(at)altatus(dot)com> wrote:
>
>> I honestly do not expect that keeping around the failed pages will
>> be an acceptable change for most kernels, and as such the
>> recommendation
>> will probably be to coordinate in userspace for the fsync().
>
> Why is that required? You could very well just keep per inode
> information about fatal failures that occurred around. Report errors
> until that bit is explicitly cleared. Yes, that keeps some memory
> around until unmount if nobody clears it. But it's orders of
> magnitude less, and results in usable semantics.
>

Isn't the expectation that when a fsync call fails, the next one will
retry writing the pages in the hope that it succeeds?

Of course, it's also possible to do what you suggested, and simply mark
the inode as failed. In which case the next fsync can't possibly retry
the writes (e.g. after freeing some space on thin-provisioned system),
but we'd get reliable failure mode.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2018-04-09 19:59:34 Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Previous Message Anthony Iliopoulos 2018-04-09 19:51:12 Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS