Re: On-the-fly index tuple deletion vs. hot_standby

From: Noah Misch <noah(at)leadboat(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Subject: Re: On-the-fly index tuple deletion vs. hot_standby
Date: 2011-06-14 05:08:27
Message-ID: 20110614050827.GE11441@tornado.leadboat.com
Lists: pgsql-hackers

On Mon, Jun 13, 2011 at 04:41:11PM +0100, Simon Riggs wrote:
> On Thu, Jun 9, 2011 at 10:38 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> > On Fri, Apr 22, 2011 at 11:10:34AM -0400, Noah Misch wrote:
> >> In an attempt to resuscitate this thread, here's my own shot at that.  Apologies
> >> in advance if it's just an already-burning straw man.
> > [full proposal at http://archives.postgresql.org/message-id/20110422151034.GA8150@tornado.gateway.2wire.net]
> >
> > Anyone care to comment?  On this system, which has vacuum_defer_cleanup_age set
> > to 3 peak hours worth of xid consumption, the problem caps recovery conflict
> > hold off at 10-20 minutes.  It will have the same effect on standby feedback in
> > 9.1.  I think we should start by using RecentGlobalXmin instead of RecentXmin as
> > the reuse horizon when wal_level = hot_standby, and backpatch that.  Then,
> > independently consider for master a bloat-avoidance improvement like I outlined
> > most recently; I'm not sure whether that's worth it.  In any event, I'm hoping
> > to get some consensus on the way forward.
>
> I like your ideas.
>
> (Also, I note that using xids in this way unnecessarily keeps bloat
> around for a long time, if we have periods of mostly read-only
> activity. Interesting point.)

Thanks. Yes, the current approach can mean two long-running-transaction
lifetimes before full space reclamation. Would be nice to avoid.

> I think we would only get away with this approach on leaf pages of the
> index. It doesn't seem worth trying for the locks if we were higher
> up.

Don't we only delete leaf pages? (And, one might say, pages that have become
leaves by having all children deleted.)

> On the standby side, it's possible this could generate additional
> buffer pin deadlocks and/or contention. So I would also want to look
> at some deferral mechanism, so that we can mark the block removed, but
> not actually do so until some time later, or we really need to, for
> example when we write new data to that page.

That could be handy. I do wonder what high pin-hold durations arise regularly
in the field. Preventing standby buffer pin deadlocks would be a nice win in
any case.

> Got time for a patch in this coming CF?

Not for the June CF, unfortunately. Making a suitable test suite will be a
large portion of the work. The logic in question closes race conditions that
probably only arise under the heaviest field concurrency, so I'll need to
systematically visit every concurrency variation to have confidence that it's
correct.

nm
