Re: Proposal for CSN based snapshots

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Rajeev rastogi <rajeev(dot)rastogi(at)huawei(dot)com>, Ants Aasma <ants(at)cybertec(dot)at>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Markus Wanner <markus(at)bluegap(dot)ch>
Subject: Re: Proposal for CSN based snapshots
Date: 2014-05-15 19:40:06
Message-ID: CA+TgmoYiqga4KWz_1=O8D6ivmTAJfY=-fxO6Nk+D-o3M0M60=Q@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 15, 2014 at 2:34 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> On Mon, May 12, 2014 at 06:01:59PM +0300, Heikki Linnakangas wrote:
>> >Some of the stuff in here will influence whether your freezing
>> >replacement patch gets in. Do you plan to further pursue that one?
>>
>> Not sure. I got to the point where it seemed to work, but I got a
>> bit of cold feet proceeding with it. I used the page header's LSN
>> field to define the "epoch" of the page, but I started to feel
>> uneasy about it. I would be much more comfortable with an extra
>> field in the page header, even though that uses more disk space. And
>> requires dealing with pg_upgrade.
>
> FYI, pg_upgrade copies pg_clog from the old cluster, so there will be a
> pg_upgrade issue anyway.
>
> I am not excited about a 32x increase in clog size, especially since we
> already do freezing at 200M transactions to allow for more aggressive
> clog trimming. Extrapolating that out, it means we would freeze every
> 6.25M transactions.

It seems better to allow clog to grow larger than to force
more-frequent freezing.
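For reference, Bruce's figures work out as follows, assuming clog today stores 2 status bits per transaction and a commit-LSN entry would take a full 64 bits (both widths are assumptions for illustration, not taken from a patch):

```c
#include <stdint.h>

/* Back-of-the-envelope check of the quoted figures. The entry widths
 * below are assumptions for illustration: 2 status bits per xact in
 * clog today versus a hypothetical 64-bit commit LSN per xact. */
#define BITS_PER_XACT_NOW 2
#define BITS_PER_XACT_CSN 64

/* Growth factor of clog if entries widen from status bits to an LSN. */
static long
clog_growth_factor(void)
{
    return BITS_PER_XACT_CSN / BITS_PER_XACT_NOW;   /* 32x */
}

/* Freeze horizon that keeps clog at its current on-disk size after
 * the entries widen: divide the current horizon by the growth factor. */
static long
scaled_freeze_horizon(long current_horizon)
{
    return current_horizon / clog_growth_factor();
}
```

Plugging in the current 200M-transaction horizon gives the 6.25M figure quoted above.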

If the larger clog size is a show-stopper (and I'm not sure I have an
intelligent opinion on that just yet), one way to get around the
problem would be to summarize CLOG entries after-the-fact. Once an
XID precedes the xmin of every snapshot, we don't need to know the
commit LSN any more. So we could read the old pg_clog files and write
new summary files. Since we don't need to care about subcommitted
transactions either, we could get by with just 1 bit per transaction,
1 = committed, 0 = aborted. Once we've written and fsync'd the
summary files, we could throw away the original files. That might
leave us with a smaller pg_clog than what we have today.
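A minimal sketch of that summarization step might look like this; the names and the status encoding are hypothetical, not PostgreSQL's actual clog API:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical final-outcome codes for an XID, after subcommitted
 * transactions have been resolved to their parent's outcome. */
enum xact_status { XS_ABORTED = 0, XS_COMMITTED = 1 };

/* Pack one bit per XID into a summary bitmap: 1 = committed,
 * 0 = aborted. This assumes every XID in the range already precedes
 * the xmin of every snapshot, so only the commit/abort outcome
 * matters and the commit LSN can be discarded. */
static void
summarize_clog(const uint8_t *status, size_t nxacts, uint8_t *bitmap)
{
    memset(bitmap, 0, (nxacts + 7) / 8);
    for (size_t i = 0; i < nxacts; i++)
        if (status[i] == XS_COMMITTED)
            bitmap[i / 8] |= (uint8_t) (1u << (i % 8));
}

/* Look up an XID's outcome in the summary bitmap. */
static int
xid_did_commit(const uint8_t *bitmap, size_t xid)
{
    return (bitmap[xid / 8] >> (xid % 8)) & 1;
}
```

After writing and fsyncing files built this way, the original wide clog segments for that XID range could be removed, as described above.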

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
