Re: Freezing without write I/O

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Freezing without write I/O
Date: 2013-06-07 18:56:27
Message-ID: 51B22CDB.7050903@vmware.com
Lists: pgsql-hackers

On 07.06.2013 21:33, Simon Riggs wrote:
> Now that I consider Greg's line of thought, the idea we focused on
> here was about avoiding freezing. But Greg makes me think that we may
> also wish to look at allowing queries to run longer than one epoch as
> well, if the epoch wrap time is likely to come down substantially.
>
> To do that I think we'll need to hold epoch for relfrozenxid as well,
> amongst other things.

The biggest problem I see with that is that if a snapshot can be older
than 2 billion XIDs, it must be possible to store XIDs on the same page
that are more than 2 billion XIDs apart. All the discussed schemes where
we store the epoch at the page level, either explicitly or derived from
the LSN, rely on the fact that it's not currently necessary to do that.
Currently, when one XID on a page is older than 2 billion XIDs, that old
XID can always be replaced with FrozenXid, because there cannot be a
snapshot old enough to not see it.
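
To illustrate the 2 billion limit, here is a minimal sketch of a
modulo-2^32 XID comparison. The names xid_t and xid_precedes are
stand-ins invented for this example, and the special reserved XIDs are
ignored. The int32 cast relies on two's-complement conversion, which
holds on common platforms.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in name for a 32-bit transaction ID (an assumption of this sketch). */
typedef uint32_t xid_t;

/*
 * Returns true if a is logically older than b, comparing modulo 2^32.
 * The answer is only meaningful when a and b are less than 2^31
 * (about 2 billion) XIDs apart; beyond that the sign of the
 * difference flips and the ordering is reported backwards.
 */
static bool xid_precedes(xid_t a, xid_t b)
{
    int32_t diff = (int32_t)(a - b);   /* unsigned subtraction wraps around */
    return diff < 0;
}
```

With this comparison, xid_precedes(4294967000u, 100u) is true, since the
counter wrapped in between. But for XID 100 and an XID 2^31 + 1 later
(2147483749), the function reports the later one as older, which is why
two such XIDs cannot coexist on a page without extra epoch information.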

I agree that it would be nice if you could find a way around that. You
had a suggestion on making room on the tuple header for the epoch. I'm
not sure I like that particular proposal, but we would need something
like that. If we settle for snapshots that can be at most, say, 512
billion transactions old, instead of 2 billion, then we would only need
one byte to store an epoch "offset" in the tuple header. That is, deduce the
epoch of tuples on the page from the LSN on the page header, but allow
individual tuples to specify an offset from that deduced epoch.
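
A rough sketch of how such a per-tuple offset might be resolved, under
stated assumptions: the page's epoch has already been deduced from the
LSN (how that is done is not shown), the offset counts epochs backwards
from it, and the struct and function names are hypothetical, not
anything in the tree.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-tuple fields; not an actual tuple header layout. */
typedef struct {
    uint32_t xid;           /* 32-bit XID as stored today */
    uint8_t  epoch_offset;  /* assumed: epochs behind the page's deduced epoch */
} tuple_xid_t;

/*
 * Combine the epoch deduced for the page with the tuple's one-byte
 * offset to get a 64-bit XID that orders correctly across epochs.
 * page_epoch is assumed to come from the LSN on the page header.
 */
static uint64_t tuple_full_xid(uint32_t page_epoch, tuple_xid_t t)
{
    assert(t.epoch_offset <= page_epoch);   /* cannot predate epoch 0 */
    uint64_t epoch = (uint64_t)page_epoch - t.epoch_offset;
    return (epoch << 32) | t.xid;
}
```

With a page epoch of 5, a tuple with offset 3 and XID 4000000000 resolves
to a smaller 64-bit value than a tuple with offset 0 and XID 100, so the
two can share a page and still be ordered correctly. One byte gives 256
distinct offsets.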

In practice, I think we're still quite far from people running into that
2 billion XID limit on snapshot age. But maybe in a few years, after
we've solved all the more pressing vacuum and wraparound issues that
people currently run into before reaching that stage...

- Heikki
