Re: tracking commit timestamps

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Subject: Re: tracking commit timestamps
Date: 2014-11-05 16:23:15
Message-ID: 545A4EF3.7040009@BlueTreble.com
Lists: pgsql-hackers pgsql-www

On 11/5/14, 6:10 AM, Michael Paquier wrote:
> > In addition, I wonder if this feature would be misused. Record
> > transaction ids to a table to find out commit order (use case could be
> > storing historical row versions for example). Do a dump and restore on
> > another cluster, and all the transaction ids are completely meaningless
> > to the system.
>
> I think you are forgetting that it is possible to take a consistent dump using an exported snapshot. In that case the commit order may not be that meaningless.

Anssi's point is that you can't use xmin because it can change, but I think anyone working with this feature would understand that.

> > Having the ability to record commit order into an audit table would be
> > extremely welcome, but as is, this feature doesn't provide it.
>
> That's something that can actually be achieved with this feature if the SQL interface is able to query all the timestamps in a xid range, with for example a background worker that tracks this data periodically. Now the thing is as well: how much timestamp history do we want to keep? The patch truncating SLRU files with frozenID may cover a sufficient range...

Except that commit time is not guaranteed unique *even on a single system*. That's my whole point. If we're going to bother with all the commit time machinery, it seems really silly not to provide a way to uniquely order every commit.

Clearly trying to uniquely order commits across multiple systems is a far larger problem, and I'm not suggesting we attempt that. But for a single system AIUI all we need to do is expose the LSN of each commit record and that will give you the exact and unique order in which transactions committed.
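
To make the single-system argument concrete, here is a minimal sketch in Python. It assumes each commit's timestamp and the LSN of its commit record have somehow been exposed together; the `Commit` tuple, the sample values, and both helper functions are hypothetical illustrations and not part of the patch. The only PostgreSQL fact relied on is the textual LSN format, two hexadecimal numbers separated by a slash (high and low 32 bits of the WAL position).

```python
from typing import List, NamedTuple


class Commit(NamedTuple):
    xid: int            # transaction id
    committed_at: str   # commit timestamp (hypothetical sample data)
    lsn: str            # LSN of the commit record, in "hi/lo" hex text form


def lsn_to_int(lsn: str) -> int:
    """Convert an LSN such as '0/16B3A40' into a 64-bit WAL position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def commit_order(commits: List[Commit]) -> List[Commit]:
    """Order commits by the WAL position of their commit records."""
    return sorted(commits, key=lambda c: lsn_to_int(c.lsn))


# Two of these commits share a timestamp, so ordering by time alone
# cannot say which came first; their commit-record LSNs can.
commits = [
    Commit(1003, "2014-11-05 16:23:15.000123", "0/16B3F98"),
    Commit(1001, "2014-11-05 16:23:15.000123", "0/16B3A40"),
    Commit(1002, "2014-11-05 16:23:14.999871", "0/16B2C10"),
]

ordered = commit_order(commits)
print([c.xid for c in ordered])  # [1002, 1001, 1003]
```

The LSNs are compared as integers rather than as text because a textual sort would mis-order values such as 0/9 and 0/10.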

This isn't a hypothetical feature either; if we had this, logical replication systems wouldn't have to try and fake this via batches. You could actually recreate exactly what data was visible at what time to all transactions, not just repeatable read ones (as long as you kept snapshot data as well, which isn't hard).

As for how much data to keep, if you have a process that's doing something to record this information permanently all it needs to do is keep an old enough snapshot around. That's not that hard to do, even from user space.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
