Avoiding bloat in the presence of a long-running transaction (Re: Freezing without write I/O)

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Avoiding bloat in the presence of a long-running transaction (Re: Freezing without write I/O)
Date: 2013-06-07 19:44:52
Message-ID: 51B23834.5020803@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 07.06.2013 22:15, Robert Haas wrote:
> On Fri, Jun 7, 2013 at 3:10 PM, Simon Riggs<simon(at)2ndquadrant(dot)com> wrote:
>> The long running query problem hasn't ever been looked at, it seems,
>> until here and now.
>
> For what it's worth (and that may not be much), I think most people
> will die a horrible death due to bloat after holding a transaction
> open for a tiny fraction of 2B XIDs. :-(

Yeah, we should fix that too ;-).

While we're at it: I've been thinking that we should try harder to
vacuum dead tuples that are no longer visible to any snapshot, even if
there's an even old snapshot. The stereotypical scenario is a table with
a single row that's updated very very frequently. Like a counter.
Autovacuum can normally keep it in check, but if you have a long-running
transaction, it will bloat horrendously. But if you only have one
long-running transaction with one really old snapshot, and everything
else is recent, you'd really only need to keep one old tuple around for
the old snapshot to see, and a recent version or two for the rest. At
worst, the database needs to bloat to double the size, but not more than
that.

To know which tuples are dead at such a fine-grained level, vacuum would
need to know in more detail what snapshots the backends have. I'm really
excited about Ants Aasma's proposal to use a CSN for snapshots, or more
precisely the variant using commit record's LSN for that. If a snapshot
is just a single integer, it becomes easier for backends to share their
snapshots, in limited amount of shared memory.

- Heikki

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2013-06-07 19:46:06 Re: Redesigning checkpoint_segments
Previous Message Tom Lane 2013-06-07 19:31:33 Re: Bad error message on valuntil