Re: [REVIEW] pg_last_xact_insert_timestamp

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, "masao(dot)fujii" <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [REVIEW] pg_last_xact_insert_timestamp
Date: 2011-10-03 19:07:54
Message-ID: CA+TgmoZP5SOKW7hYD0vs+ZZ2m64S5Kytc_zoZry_tU1+yxtpDw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Oct 2, 2011 at 8:21 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> The problem is to find the replication delay, even when the system is quiet.
>
> What I have proposed finds the replication delay more accurately even
> than looking at the last commit, since often there are writes but no
> commits.
>
> If we focus on the problem, rather than the first suggested solution
> to that problem, we'll come out on top.

Sorry, but I still don't really think it's fair to say that you've
proposed a solution to this problem. Or if you have, neither I nor
Fujii Masao understand that proposal well enough to decide whether we
like it. You said "maybe we could WAL log something once per
checkpoint cycle" or "maybe we could add a new protocol message".
We've both replied with various emails saying "we don't understand how
that would solve the problem". If you want to add some detail to your
proposal, then we can weigh the pros and cons as compared with what
the patch does - but right now all you've provided is a theory that
there might be a better solution to this problem out there, not any
details about how it would work. Or if you have, then please post a
link to the message where those details are written out, because I
cannot find them on the thread.

I do, however, agree that that the case where the system is quiet is
the problem case for computing replication delay. It seems to me
that, even without this patch, if the system has a continuous stream
of commits, you can easily find the replication delay by differencing
the current time on the master with the value returned by
pg_last_xact_replay_timestamp(). However, if the master goes quiet,
then the slave will appear to be progressively farther behind. With
the addition of this patch, that problem goes away: you can now
difference the return value of pg_last_xact_insert_timestamp() on the
master with the return value of pg_last_xact_replay_timestamp() on the
slave. If the master goes quiet, then pg_last_xact_insert_timestamp()
will stop advancing, and so the two values you are comparing will be
equal once the slave has caught up, and remain equal until activity
resumes on the master.

Now, there is a more subtle remaining problem, which is that when
activity *does* resume on the master, there will be a (probably quite
short) window of time during which the slave will have a much earlier
timestamp than the one on the master. When the master has a commit
after a long idle period but the slave has not yet replayed the commit
record, the replication delay will appear to be equal to the length of
the idle period. But that doesn't seem like a sufficient reason to
reject the whole approach, because there are several ways around it.
First, you could simply decide that the large computed lag value,
although counterintuitive, is accurate under some definition, because,
well, that really is the lag between the last transaction committed on
the master and the last transaction committed on the standby, and if
you don't like the fact that timestamps behave that way, you should
compare using WAL positions instead. If you don't like that approach,
then a second, also viable approach is to teach your monitoring
software that the replication delay can never increase faster than the
rate at which clock time is passing. So if you were caught up a
minute ago, then you can't be more than a minute behind now.

Another point I want to make here is that there's probably more than
one useful definition of replication delay. The previous question
presupposes that you're trying to answer the question "if I have a
transaction that committed N seconds ago on the master, will it be
visible on the standby?". It's also a reasonable time-based
substitute for measuring the difference in master and standby WAL
positions, although certainly it's going to work better if the rate of
WAL generation is relatively even. But for a lot of people, it may be
that what they really want to know is "what is the expected time for
the standby to replay all generated but not yet applied WAL?" - or
maybe some third thing that I'm not thinking of - and this function
won't provide that. I think we can ultimately afford to provide more
than one mechanism here, so I don't see doing this as foreclosing any
other also-useful calculation that someone may wish to add in the
future.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2011-10-03 19:08:16 Re: Bug with pg_ctl -w/wait and config-only directories
Previous Message Bruce Momjian 2011-10-03 19:03:47 Re: Bug with pg_ctl -w/wait and config-only directories