Re: Incomplete freezing when truncating a relation during vacuum

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Incomplete freezing when truncating a relation during vacuum
Date: 2013-11-27 09:15:10
Message-ID: 20131127091510.GC28863@alap2.anarazel.de
Lists: pgsql-hackers

On 2013-11-27 11:01:55 +0200, Heikki Linnakangas wrote:
> On 11/27/13 01:21, Andres Freund wrote:
> >On 2013-11-26 13:32:44 +0100, Andres Freund wrote:
> >>This seems to be the case since
> >>b4b6923e03f4d29636a94f6f4cc2f5cf6298b8c8. I suggest we go back to using
> >>scan_all to determine whether we can set new_frozen_xid. That's a slight
> >>pessimization when we scan a relation fully without explicitly scanning
> >>it in its entirety, but given this isn't the first bug around
> >>scanned_pages/rel_pages I'd rather go that way. The aforementioned
> >>commit wasn't primarily concerned with that.
> >>Alternatively we could just compute new_frozen_xid et al before the
> >>lazy_truncate_heap.
> >
> >I've gone for the latter in this preliminary patch. Not increasing
> >relfrozenxid after an initial data load seems like a bit of a shame.
> >
> >I wonder if we should just do scan_all || vacrelstats->scanned_pages <
> >vacrelstats->rel_pages?
>
> Hmm, you did (scan_all || vacrelstats->scanned_pages <
> vacrelstats->rel_pages) for relminmxid, and just (vacrelstats->scanned_pages
> < vacrelstats->rel_pages) for relfrozenxid. That was probably not what you
> meant to do, the thing you did for relfrozenxid looks good to me.

I said it's a preliminary patch ;). Really, I wasn't sure which of the
two to go for.
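
To illustrate why the check has to happen before lazy_truncate_heap, here
is a minimal standalone sketch. It is not the patch itself: VacStats,
truncate_rel(), scanned_whole_rel() and choose_new_frozen_xid() are
hypothetical stand-ins, and only the scanned_pages/rel_pages comparison
is taken from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Hypothetical, trimmed-down stand-in for the vacuum bookkeeping struct. */
typedef struct VacStats
{
    uint32_t rel_pages;     /* total pages in the relation */
    uint32_t scanned_pages; /* pages actually examined by this vacuum */
} VacStats;

/* Stand-in for truncation: drops trailing pages, which shrinks rel_pages. */
static void
truncate_rel(VacStats *stats, uint32_t new_rel_pages)
{
    assert(new_rel_pages <= stats->rel_pages);
    stats->rel_pages = new_rel_pages;
}

/* True only if no page was skipped, judged against the current rel_pages. */
static bool
scanned_whole_rel(const VacStats *stats)
{
    return stats->scanned_pages >= stats->rel_pages;
}

/* Advance the frozen xid only if the whole relation was really scanned. */
static TransactionId
choose_new_frozen_xid(bool scanned_all, TransactionId freeze_limit)
{
    return scanned_all ? freeze_limit : InvalidTransactionId;
}
```

With rel_pages = 100 and scanned_pages = 60, the check correctly says
"not fully scanned" before truncation. If the relation is then truncated
to 50 pages, the same check run afterwards would wrongly report a full
scan, so the result has to be captured first and passed on.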

> Does the attached look correct to you?

Looks good.

I wonder if we need to integrate any mitigating logic? Currently the
corruption may only become apparent long after it occurred, which is
pretty bad. And instructing people to run a vacuum after the upgrade will
cause the corrupted data to be lost if the affected xids are already 2^31
xids in the past. But integrating logic to fix things into
heap_page_prune() looks somewhat ugly as well.
Afaics the likelihood of the issue occurring on non-all-visible pages is
pretty low, since they'd need to be skipped due to lock contention
repeatedly.
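
For readers less familiar with the 2^31 limit: xids are compared
circularly, so an unfrozen xid that falls more than 2^31 behind suddenly
looks like it is in the future. A small sketch of that comparison follows;
xid_precedes() is a hypothetical name, modeled on the signed-difference
logic used for normal xids, and it ignores the special reserved xids.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/*
 * Circular "a is older than b" test for normal xids: take the 32-bit
 * difference and look at its sign. This relies on the usual
 * two's-complement conversion from uint32_t to int32_t.
 */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    int32_t diff = (int32_t) (a - b);

    return diff < 0;
}
```

An xmin of 1000 still precedes a current xid of 1000 + 2^31 - 1, but once
the current xid reaches 1000 + 2^31 + 1 the comparison flips, and the old
row is treated as not yet visible.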

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
