Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <

From: Andres Freund <andres(at)anarazel(dot)de>
To: Kevin Grittner <kgrittn(at)gmail(dot)com>
Cc: Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Ants Aasma <ants(dot)aasma(at)eesti(dot)ee>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)postgresql(dot)org>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <
Date: 2016-07-08 18:52:21
Message-ID: 20160708185221.ez2hnj4yalxaoqnm@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

On 2016-07-08 13:32:35 -0500, Kevin Grittner wrote:
> On Fri, Jul 8, 2016 at 12:53 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > On 2016-07-08 11:00:50 -0500, Kevin Grittner wrote:
> >> On Wed, Jul 6, 2016 at 4:55 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> >>
> >> > So I don't think that approach still allows old snapshot related
> >> > cleanups for toast triggered vacuums? Is that an acceptable
> >> > restriction?
> >>
> >> What I would rather see is that if the heap is vacuumed (whether or
> >> not by autovacuum) then the related TOAST table is also vacuumed
> >> (using the same horizon the heap used), but if the TOAST relation
> >> is chosen for vacuum by itself that it does not attempt to adjust
> >> the horizon based on old_snapshot_threshold.
> >
> > Uh, wouldn't that quote massively regress the autovacuum workload in
> > some cases? There's a reason they're considered separately after
> > all. And in many cases, even if there's lots of updates in the heap
> > table, the toast table doesn't get any updates. And the toast table is
> > often a lot larger than the data.
>
> Of course, the toast table has only one index, and it is narrow.

But that index and the table are often large...

> With the visibility map, it should visit only the needed pages in
> the toast's heap area, so any regression would be in the case that:
>
> (1) old_snapshot_threshold >= 0
> (2) the "normal" heap met the conditions for vacuum, but the heap
> didn't
> (3) when passing the toast heap based on visibility map, *some*
> cleanup was done (otherwise the TID list would be empty, so no
> index pass is needed)

Unfortunately btree performs an index scan, even if there's no tids to
clean up. See the unconditional calls to
lazy_cleanup_index()->amvacuumcleanup(). C.f.
/*
* If btbulkdelete was called, we need not do anything, just return the
* stats from the latest btbulkdelete call. If it wasn't called, we must
* still do a pass over the index, to recycle any newly-recyclable pages
* and to obtain index statistics.
*
* Since we aren't going to actually delete any leaf items, there's no
* need to go through all the vacuum-cycle-ID pushups.

> but I'm also sure that by containing toast size
> when it would otherwise grow for weeks or months, it could be a
> very large performance gain.

That's an argument for changing autovacuum heuristics, not for making
this change as a side-effect of a bugfix.

I'm a bit confused, why aren't we simply adding LSN interlock checks for
toast? Doesn't look that hard? Seems like a much more natural course of
fixing this issue?

Regards,

Andres

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Tom Lane 2016-07-08 20:38:29 pgsql: Add more temporary code to record stack usage at server process
Previous Message Kevin Grittner 2016-07-08 18:32:35 Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <

Browse pgsql-hackers by date

  From Date Subject
Next Message Merlin Moncure 2016-07-08 18:57:57 Re: MVCC overheads
Previous Message Peter Geoghegan 2016-07-08 18:49:26 Re: MVCC overheads