From: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
---|---|
To: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Hot standby, race condition between recovery snapshot and commit |
Date: | 2009-11-15 14:19:24 |
Message-ID: | 1258294764.14054.1379.camel@ebony |
Lists: | pgsql-hackers |
On Sun, 2009-11-15 at 14:43 +0200, Heikki Linnakangas wrote:
> This isn't absolutely necessary for the first version, but it's
> something to keep in mind...
Do I take that as agreement to the phased plan?
> In general, I'd like to remove as many as possible of those cases
> where the standby starts up, and can't open up for connections. It
> makes the standby a lot less useful if you can't rely on it being
> open. So I'd like to make it so that the standby can *always* open up.
Yes, of course. The only reason for restrictions being acceptable is
that we have 99% of what we want, yet may lose everything if we play for
100% too quickly.
The standby will open quickly in many cases, as is. There is also a
range of other ways of doing this.
> There are currently three cases where that can happen:
>
> 1. If the subxid cache has overflowed.
>
> 2. If there's no running-xacts record after the checkpoint record for
> some reason. For example, one was written but not archived yet, or
> because the master crashed before it was written.
>
> 3. If too many AccessExclusiveLocks were being held.
>
> Case 3 should be pretty easy to handle. Just need to WAL log all the
> AccessExclusiveLocks, perhaps as separate WAL records (we already have
> a new WAL record type for logging locks) if we're worried about the
> running-xacts record growing too large. I think we could handle case 2
> if we wrote the running-xacts record *before* the checkpoint record.
> Then it would be always between the REDO pointer of the checkpoint
> record, and the checkpoint record itself, so it would always be seen
> by the WAL recovery. To handle case 1, we could scan pg_subtrans. It
> would add some amount of code and would add some more work to taking
> the running-xacts snapshot, but it could be done.
"Some amount of code" requires some amount of thought, followed by some
amount of review, which takes some amount of time.
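
For readers following the three cases quoted above, here is a minimal
sketch of the decision being discussed. It is a toy model in Python, not
PostgreSQL code: the names WalRecord, can_open_for_connections, and the
two flags are hypothetical and chosen only for illustration. It assumes
recovery replays records starting at the checkpoint's REDO pointer and
that the standby can open only once it has seen a usable running-xacts
record.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical record kinds for this sketch; not PostgreSQL's real names.
RUNNING_XACTS = "running_xacts"
CHECKPOINT = "checkpoint"
OTHER = "other"


@dataclass
class WalRecord:
    lsn: int                         # position in the toy WAL stream
    kind: str
    subxid_overflowed: bool = False  # case 1: subxid cache overflowed
    locks_truncated: bool = False    # case 3: too many AccessExclusiveLocks


def can_open_for_connections(wal: List[WalRecord], redo_lsn: int) -> Tuple[bool, str]:
    """Decide whether the toy standby may accept connections.

    `wal` holds only the records the standby has actually received;
    replay starts at `redo_lsn`.
    """
    snapshot = next(
        (r for r in wal if r.lsn >= redo_lsn and r.kind == RUNNING_XACTS),
        None,
    )
    if snapshot is None:
        return False, "no running-xacts record seen (case 2)"
    if snapshot.subxid_overflowed:
        return False, "subxid cache overflowed (case 1)"
    if snapshot.locks_truncated:
        return False, "too many AccessExclusiveLocks (case 3)"
    return True, "ok"


# Running-xacts record written after the checkpoint record: if it was
# never written or never archived, the standby sees only this much.
wal_after = [WalRecord(90, OTHER), WalRecord(100, CHECKPOINT)]

# Proposed ordering: the running-xacts record sits between the REDO
# pointer (90) and the checkpoint record (100), so replay always sees it.
wal_before = [
    WalRecord(90, OTHER),
    WalRecord(95, RUNNING_XACTS),
    WalRecord(100, CHECKPOINT),
]

print(can_open_for_connections(wal_after, redo_lsn=90))
print(can_open_for_connections(wal_before, redo_lsn=90))
```

The sketch only shows why the ordering matters for case 2. It says
nothing about the cost of handling cases 1 and 3, which is the point
under discussion in this message.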
--
Simon Riggs www.2ndQuadrant.com