Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

From: Andres Freund <andres(at)anarazel(dot)de>
To: Anthony Iliopoulos <ailiop(at)altatus(dot)com>,Greg Stark <stark(at)mit(dot)edu>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>,Geoff Winkless <pgsqladmin(at)geoff(dot)dj>,Craig Ringer <craig(at)2ndquadrant(dot)com>,Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>,Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>,Bruce Momjian <bruce(at)momjian(dot)us>,Robert Haas <robertmhaas(at)gmail(dot)com>,Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>,Catalin Iacob <iacobcatalin(at)gmail(dot)com>,PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date: 2018-04-09 19:37:03
Message-ID: 7E06F66B-E0AD-4C51-AC02-4AAFA1593923@anarazel.de
Lists: pgsql-hackers

On April 9, 2018 12:26:21 PM PDT, Anthony Iliopoulos <ailiop(at)altatus(dot)com> wrote:

>I honestly do not expect that keeping around the failed pages will
>be an acceptable change for most kernels, and as such the recommendation
>will probably be to coordinate in userspace for the fsync().

Why is that required? You could very well just keep per-inode information around about fatal failures that occurred, and report errors until that bit is explicitly cleared. Yes, that keeps some memory around until unmount if nobody clears it. But it's orders of magnitude less than keeping the failed pages, and it results in usable semantics.
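
To make the proposed semantics concrete, here is a minimal user-space sketch, not kernel code. The struct and function names (inode_err_state, record_writeback_failure, sketch_fsync, clear_writeback_failure) are hypothetical and exist only for illustration; the sketch assumes a single sticky flag per inode and says nothing about how a real kernel would store or clear it.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical per-inode state: one sticky flag is kept instead of
 * the failed pages themselves. All names here are assumptions.
 */
struct inode_err_state {
    bool fatal_wb_error;   /* set when writeback fails, stays set */
};

/* In this sketch, called when writeback of any page of the inode fails. */
static void record_writeback_failure(struct inode_err_state *st)
{
    st->fatal_wb_error = true;
}

/*
 * Stand-in for fsync() on the inode: it keeps reporting the failure
 * to every caller until the flag has been explicitly cleared.
 */
static int sketch_fsync(const struct inode_err_state *st)
{
    return st->fatal_wb_error ? -EIO : 0;
}

/* Explicit acknowledgement of the failure; nothing else resets the flag. */
static void clear_writeback_failure(struct inode_err_state *st)
{
    st->fatal_wb_error = false;
}
```

The point of the design is that the cost is one flag per affected inode rather than the dirty pages, and that a later fsync() from any process still sees the error instead of a spurious success.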

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
