Re: Unexpected VACUUM FULL failure

From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Unexpected VACUUM FULL failure
Date: 2007-08-09 07:32:22
Message-ID: 1186644742.4208.66.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 2007-08-08 at 23:23 -0400, Tom Lane wrote:
> I wrote:
> > ... Since we've whacked the tqual.c logic around recently,
> > the problem might actually lie there...
>
> In fact, I bet this is a result of the async-commit patch. The places
> where vacuum.c bleats "HEAP_MOVED_OFF was expected" are all places where
> it is looking at a tuple not marked XMIN_COMMITTED; it expects that
> after its first pass over the table, *every* tuple is either
> XMIN_COMMITTED or one that it moved. Async commit changed tqual.c
> so that tuples that are in fact known committed might not get marked
> XMIN_COMMITTED right away. The patch tries to prevent this from
> happening within VACUUM FULL by means of
>
> /*
> * VACUUM FULL assumes that all tuple states are well-known prior to
> * moving tuples around --- see comment "known dead" in repair_frag(),
> * as well as simplifications in tqual.c. So before we start we must
> * ensure that any asynchronously-committed transactions with changes
> * against this table have been flushed to disk. It's sufficient to do
> * this once after we've acquired AccessExclusiveLock.
> */
> XLogAsyncCommitFlush();
>
> but I bet lunch that that's not good enough. I still haven't reproduced
> it, but I'm thinking that the inexact bookkeeping that we created for
> clog page LSNs allows tuples to not get marked if the right sort of
> timing of concurrent transactions happens.
>
> Not sure about the best solution for this.

Good hunch. I plugged this hole earlier, but on further inspection I can
see the plug wasn't wide enough. XLogAsyncCommitFlush() is good enough,
but HeapTupleSatisfiesVacuum() still allowed the inexact bookkeeping to
sometimes skip hint bit setting, when executed with concurrent
transactions touching other tables.

ISTM that if we call HeapTupleSatisfiesVacuum() with an additional
boolean parameter, force, we can tell VF to always set the hint bits in
every case, not just HEAP_MOVED_IN and HEAP_MOVED_OUT.

Patch enclosed, but a little crufty. Gotta run now, talk later.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

Attachment Content-Type Size
async_commit_bug.v1.patch text/x-patch 6.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Adrian Maier 2007-08-09 07:47:21 Compilation of pg 7.4.17 fails on HP-UX
Previous Message Simon Riggs 2007-08-09 07:24:37 Re: HOT patch, missing things