Re: tracking commit timestamps

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Subject: Re: tracking commit timestamps
Date: 2014-11-01 04:45:44
Message-ID: CAB7nPqSC7GDjMV=u5PW0N4ZKreduLAVz8gc7wMu78ig-HaycBQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-www

On Sat, Nov 1, 2014 at 12:46 AM, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote:

> Hi,
>
> On 28/10/14 13:25, Simon Riggs wrote:
>
>> On 13 October 2014 10:05, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote:
>>
>> I worked bit on this patch to make it closer to committable state.
>>>>
>>>
>> Here is updated version that works with current HEAD for the October
>>> committfest.
>>>
>>
>> I've reviewed this and it looks good to me. Clean, follows existing
>> code neatly, documented and tested.
>>
>>
> Thanks for looking at this.
>
> Please could you document a few things
>>
>> * ExtendCommitTS() works only because commit_ts_enabled can only be
>> set at server start.
>> We need that documented so somebody doesn't make it more easily
>> enabled and break something.
>> (Could we make it enabled at next checkpoint or similar?)
>>
>
> Maybe we could, but that means some kind of two step enabling facility and
> I don't want to write that as part of the initial patch since that will
> need separate discussion (i.e. do we want to have new GUC flag for this, or
> hack solution just for committs?). So maybe later in a follow-up patch.
>
> * The SLRU tracks timestamps of both xids and subxids. We need to
>> document that it does this because Subtrans SLRU is not persistent. If
>> we made Subtrans persistent we might need to store only the top level
>> xid's commitTS, but that's very useful for typical use cases and
>> wouldn't save much time at commit.
>>
>
> Attached version with the above comments near the relevant code.
>
On a personal note, I think that this is a useful feature, particularly
useful for replication solutions to resolve commit conflicts by using the
method of the first-transaction-that-commits-wins, but this has already
been mentioned on this thread. So yes I am a fan of it, and yes let's keep
the GUC controlling it at off by default.

Now here are a couple of comments at code level, this code seems not enough
baked for a commit:
1) The following renaming should be done:
- pg_get_transaction_committime to pg_get_transaction_commit_time
- pg_get_transaction_extradata to pg_get_transaction_extra_data
- pg_get_transaction_committime_data to pg_get_transaction_commit_time_data
- pg_get_latest_transaction_committime_data to
pg_get_latest_transaction_commit_time_data
2) This patch adds a new option -c in pg_resetxlog to set the transaction
XID of the transaction from which can be retrieved a commit timestamp, but
the documentation of pg_resetxlog is not updated.
3) General remark: committs is not a name suited IMO (ts for
transaction??). What if the code is changed to use commit_time instead?
This remark counts as well for the file names committs.c and committs.h,
and for pg_committs.
4) Nitpicky remark in pg_resetxlog, let's try to respect the alphabetical
order (not completely related to this patch), so not:
+ while ((c = getopt(argc, argv, "D:fl:m:no:O:x:e:c:")) != -1)
but:
+ while ((c = getopt(argc, argv, "c:e:D:fl:m:no:O:x:")) != -1)
5) --help message should be reworked (alphabetical order of the entries),
this will avoid some cycles of Peter as he usually spends time revisiting
and cleaning up such things:
printf(_(" -x XID set next transaction ID\n"));
+ printf(_(" -c XID set the oldest retrievable commit
timestamp\n"));
printf(_(" -?, --help show this help, then exit\n"));
6) To be consistent with everything, shouldn't track_commit_timestamp be
renamed to track_commit_time
7) This documentation portion should be reworked from that:
+ <para>
+ Record commit time of transactions. This parameter
+ can only be set in
+ the <filename>postgresql.conf</> file or on the server command
line.
+ The default value is off.
+ </para>
To roughly that (not the rewording and the use of <literal>):
+ <para>
+ Record commit time of transactions. This parameter
+ can only be set in <filename>postgresql.conf</> or on the server
command line.
+ The default value is <literal>off</literal>.
+ </para>
8) Let's update this file list more consistently:
-OBJS = clogdesc.o dbasedesc.o gindesc.o gistdesc.o hashdesc.o heapdesc.o \
+OBJS = clogdesc.o committsdesc.o dbasedesc.o gindesc.o gistdesc.o
hashdesc.o \
+ heapdesc.o \
mxactdesc.o nbtdesc.o relmapdesc.o seqdesc.o smgrdesc.o
spgdesc.o \
standbydesc.o tblspcdesc.o xactdesc.o xlogdesc.o
9) Hm?! "oldest commit time xid", no?
- "oldest running xid %u;
%s",
+ "oldest CommitTs xid: %u;
oldest running xid %u; %s",
10) I don't see why this diff is in the patch:
/*
- * See how many subxids, if any, are on the same page as the
parent, if
- * any.
+ * See how many subxids, if any, are on the same page as the parent.
*/
11) contrib/Makefile has not been updated with the new module test_committs
that this patch introduces.
12) In committs_desc(at)committsdesc(dot)c, isn't this block overthinking a bit:
+ else
+ appendStringInfo(buf, "UNKNOWN");
It may be better to remove it, no?
13) Isn't that 2014?
+ * Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
14) I'd put the two checks in the reverse order:
+ Assert(xid != InvalidTransactionId);
+
+ if (!commit_ts_enabled)
+ return;
15) The multiple calls to PG_FUNCTION_INFO_V1 in committs.c are not
necessary. Those functions are already defined in pg_proc.h.
16) make installcheck (test committs_off) fails on a server that has
track_committs set to on. You should use an alternate output. I would
recommend removing as well the _off suffix in the test name. Let's use
commit_time. Also, it should be mentioned in parallel_schedule with a
comment that this test should always run alone and never in parallel with
other tests. Honestly, I also think that test_committs brings no additional
value and results in duplication code between src/test/regress and
contrib/test_committs. So I'd just rip it off. On top of that, I think that
"SHOW track_committs" should be added in the list of commands run in the
test. We actually want to check of commit time are really registered if the
feature switch is on or off.

I am still planning to do more extensive tests, and study a bit more
committs.c (with even more comments) as it is the core part of the feature.
For now I'd recommend to hold on commit fire for this patch.
Regards,
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2014-11-01 05:01:43 Re: group locking: incomplete patch, just for discussion
Previous Message Eric Ridge 2014-11-01 04:21:07 Re: Let's drop two obsolete features which are bear-traps for novices

Browse pgsql-www by date

  From Date Subject
Next Message Petr Jelinek 2014-11-01 11:19:32 Re: tracking commit timestamps
Previous Message Michael Paquier 2014-10-31 22:46:41 Re: tracking commit timestamps