Re: Keepalive for max_standby_delay

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Keepalive for max_standby_delay
Date: 2010-05-15 17:24:30
Message-ID: 1273944270.308.9032.camel@ebony
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, 2010-05-15 at 20:05 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Sat, 2010-05-15 at 19:30 +0300, Heikki Linnakangas wrote:
> >> Doesn't feel right to me either. If you want to expose the
> >> keepalive-time to queries, it should be a separate field, something like
> >> lastMasterKeepaliveTime and a pg_last_master_keepalive() function to
> >> read it.
> >
> > That wouldn't be good because then you couldn't easily monitor the
> > delay? You'd have to run two different functions depending on the state
> > of replication (for which we would need yet another function). Users
> > would just wrap that back up into a single function.
>
> What exactly is the user trying to monitor? If it's "how far behind is
> the standby", the difference between pg_current_xlog_insert_location()
> in the master and pg_last_xlog_replay_location() in the standby seems
> more robust and well-defined to me. It's a measure of XLOG location (ie.
> bytes) instead of time, but time is a complicated concept.

Maybe, but its meaningful to users and that is the point.

> Also note that as the patch stands, if you receive a keep-alive from the
> master at point X, it doesn't mean that the standby is fully up-to-date.
> It's possible that the walsender just finished sending a huge batch of
> accumulated WAL, say 1 GB, and it took 1 hour for the batch to be sent.
> During that time, a lot more WAL has accumulated, yet walsender sends a
> keep-alive with the current timestamp.

Not at all. The timestamp for the keepalive is calculated after the
pq_flush for the main WAL data. So it takes 10 years to send the next
blob of WAL data the timestamp will be current.

However, a point you made in an earlier thread is still true here. It
sounds like it would be best if startup process didn't re-access shared
memory for the next location until it had fully replayed all the WAL up
to the point it last accessed. That way the changing value of the shared
timestamp would have no effect on the calculated value at any point in
time. I will recode using that concept.

> In general, the purpose of a keep-alive is to keep the connection alive,
> but you're trying to accomplish something else too, and I don't fully
> understand what it is.

That surprises me. If nothing else, its in the title of the thread,
though since you personally added this to the Hot Standby todo more than
6 months ago I'd hope you of all people would understand the purpose.

--
Simon Riggs www.2ndQuadrant.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Treat 2010-05-15 17:58:03 Re: [HACKERS] List traffic
Previous Message Heikki Linnakangas 2010-05-15 17:05:41 Re: Keepalive for max_standby_delay