Re: Freezing without write I/O

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Freezing without write I/O
Date: 2013-06-07 19:10:55
Message-ID: CA+U5nMKDFdy_1yAh-jpYbra27L3HW=Pk5qP_qQ=iUtDwrvwzKQ@mail.gmail.com
Lists: pgsql-hackers

On 7 June 2013 19:56, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:
> On 07.06.2013 21:33, Simon Riggs wrote:
>>
>> Now that I consider Greg's line of thought, the idea we focused on
>> here was about avoiding freezing. But Greg makes me think that we may
>> also wish to look at allowing queries to run longer than one epoch as
>> well, if the epoch wrap time is likely to come down substantially.
>>
>> To do that I think we'll need to hold epoch for relfrozenxid as well,
>> amongst other things.

> The biggest problem I see with that is that if a snapshot can be older than
> 2 billion XIDs, it must be possible to store XIDs on the same page that are
> more than 2 billion XIDs apart. All the discussed schemes where we store the
> epoch at the page level, either explicitly or derived from the LSN, rely on
> the fact that it's not currently necessary to do that. Currently, when one
> XID on a page is older than 2 billion XIDs, that old XID can always be
> replaced with FrozenXid, because there cannot be a snapshot old enough to
> not see it.

It does seem that there are two problems: avoiding freezing AND long-running queries.

The long-running query problem hasn't ever really been looked at, it seems, until here and now.
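
For anyone following along, here is a minimal sketch of the wraparound arithmetic; it is simplified from what TransactionIdPrecedes() does and is not the actual PostgreSQL source. Once two XIDs are more than 2^31 apart, the signed 32-bit difference wraps and the ordering flips, which is why a page cannot meaningfully carry XIDs more than ~2 billion apart:

#include <stdint.h>
#include <stdio.h>
#include <stdbool.h>

typedef uint32_t TransactionId;

/* Simplified modulo-2^32 comparison: "does id1 precede id2?" */
static bool
xid_precedes(TransactionId id1, TransactionId id2)
{
    int32_t diff = (int32_t) (id1 - id2);

    return diff < 0;
}

int
main(void)
{
    TransactionId old_xid  = 1000;
    TransactionId near_xid = old_xid + 1000000000U;  /* ~1 billion XIDs later */
    TransactionId far_xid  = old_xid + 3000000000U;  /* ~3 billion XIDs later */

    printf("old before near: %d\n", xid_precedes(old_xid, near_xid));  /* 1: correct   */
    printf("old before far:  %d\n", xid_precedes(old_xid, far_xid));   /* 0: wrapped!  */
    return 0;
}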

> I agree that it would be nice if you could find a way around that. You had a
> suggestion on making room on the tuple header for the epoch. I'm not sure I
> like that particular proposal, but we would need something like that. If we
> settle for snapshots that can be at most, say, 512 billion transactions old,
> instead of 2 billion, then we would only need one byte to store an epoch
> "offset" in the tuple header. Ie. deduce the epoch of tuples on the page
> from the LSN on the page header, but allow individual tuples to specify an
> offset from that deduced epoch.

I like the modification you propose. And I like it even more because
it uses just 1 byte, which is more easily squeezed into the existing
tuple header, whether we go with my proposed squeezing route or not.
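
To sketch what that 1-byte offset could look like in practice: every name below (PageBaseEpochFromLSN, TupleFullXid, epoch_offset) is hypothetical, and the direction of the offset wasn't pinned down above; I've shown it as a subtraction since tuples will normally predate the page's current LSN.

#include <stdint.h>

typedef uint32_t TransactionId;
typedef uint64_t FullTransactionId;   /* epoch in the high 32 bits, xid in the low 32 */

static uint32_t
PageBaseEpochFromLSN(uint64_t page_lsn)
{
    /*
     * Placeholder.  In the real scheme this would be derived from where
     * the page LSN falls relative to known epoch boundaries.
     */
    (void) page_lsn;
    return 5;   /* pretend base epoch for the example */
}

/*
 * Reconstruct a tuple's full 64-bit xid from the page LSN (which gives
 * the base epoch), the tuple's 1-byte epoch offset, and the 32-bit xid
 * already stored in the tuple header.
 */
static FullTransactionId
TupleFullXid(uint64_t page_lsn, uint8_t epoch_offset, TransactionId xid)
{
    uint32_t epoch = PageBaseEpochFromLSN(page_lsn) - epoch_offset;

    return ((FullTransactionId) epoch << 32) | xid;
}

int
main(void)
{
    /* A tuple whose epoch is one behind the page's base epoch. */
    FullTransactionId fxid = TupleFullXid(0x1000000, 1, 42);

    return (int) (fxid & 0xFF);
}

With 8 bits of offset per tuple, that buys a window of 256 epochs, i.e. the ~512 billion transactions you mention.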

> In practice, I think we're still quite far from people running into that 2
> billion XID limit on snapshot age. But maybe in a few years, after we've
> solved all the more pressing vacuum and wraparound issues that people
> currently run into before reaching that stage...

Your WALInsert lock patch will fix that. ;-)

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
