Re: tracking commit timestamps

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: tracking commit timestamps
Date: 2013-12-10 09:56:45
Message-ID: 52A6E55D.6020603@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-www

On 10/23/2013 01:16 AM, Alvaro Herrera wrote:
> There has been some interest in keeping track of timestamp of
> transaction commits. This patch implements that.
>
> There are some seemingly curious choices here. First, this module can
> be disabled, and in fact it's turned off by default. At startup, we
> verify whether it's enabled, and create the necessary SLRU segments if
> so. And if the server is started with this disabled, we set the oldest
> value we know about to avoid trying to read the commit TS of
> transactions of which we didn't keep record. The ability to turn this
> off is there to avoid imposing the overhead on systems that don't need
> this feature.
>
> Another thing of note is that we allow for some extra data alongside the
> timestamp proper. This might be useful for a replication system that
> wants to keep track of the origin node ID of a committed transaction,
> for example. Exactly what will we do with the bit space we have is
> unclear, so I have kept it generic and called it "commit extra data".
>
> This offers the chance for outside modules to set the commit TS of a
> transaction; there is support for WAL-logging such values. But the core
> user of the feature (RecordTransactionCommit) doesn't use it, because
> xact.c's WAL logging itself is enough. For systems that are replicating
> transactions from remote nodes, it is useful.
>
> We also keep track of the latest committed transaction. This is
> supposed to be useful to calculate replication lag.

Generally speaking, I'm not in favor of adding dead code, even if it
might be useful to someone in the future. For one, it's going to get
zero testing. Once someone comes up with an actual use case, let's add
that stuff at that point. Otherwise there's a good chance that we build
something that's almost but not quite useful.

Speaking of the functionality this does offer, it seems pretty limited.
A commit timestamp is nice, but it isn't very interesting on its own.
You really also want to know what the transaction did, who ran it, etc.
ISTM some kind of a auditing or log-parsing system that could tell you
all that would be much more useful, but this patch doesn't get us any
closer to that.

Does this handle XID wraparound correctly? SLRU has a maximum of 64k
segments with 32 SLRU pages each. With 12 bytes per each commit entry,
that's not enough to hold the timestamp and "commit extra data" of the
whole 2^31 XID range: (8192 * 32 * 65536) / 12 = 1431655765. And that's
with the default page size, with smaller pages you run into the limit
quicker.

It would be nice to teach SLRU machinery how to deal with more than 64k
segments. SSI code in twophase.c ran into the same limit, and all you
get is a warning there.

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2013-12-10 11:18:22 Re: tracking commit timestamps
Previous Message Andres Freund 2013-12-10 09:38:48 Re: Time-Delayed Standbys

Browse pgsql-www by date

  From Date Subject
Next Message Achim Eisele 2013-12-10 10:05:37 Re: Karlsruhe PUG
Previous Message Fujii Masao 2013-12-10 05:16:38 Re: pgfoundry.org down