Preliminary review of Synchronous Replication patches

Lists: pgsql-hackers
From: Yeb Havinga <yebhavinga(at)gmail(dot)com>
To: PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Subject: Preliminary review of Synchronous Replication patches
Date: 2010-07-21 14:11:07
Message-ID: 4C46FFFB.3080307@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello Zoltán, Fujii and list,

Kevin asked me to do a preliminary review on both synchronous
replication patches. Relevant posts on -hackers are:

(A) http://archives.postgresql.org/pgsql-hackers/2010-04/msg01516.php
(B)
http://archives.postgresql.org/message-id/AANLkTilgyL3Y1jkDVHX02433COq7JLmqicsqmOsbuyA1@mail.gmail.com
(1) http://archives.postgresql.org/pgsql-hackers/2010-05/msg00746.php
(2) http://archives.postgresql.org/pgsql-hackers/2010-05/msg01047.php
(3)
http://wiki.postgresql.org/wiki/Streaming_Replication#Synchronization_capability

The first patch (A) was posted by Zoltán Böszörményi three months ago,
with comments on -hackers in thread (1). The second patch by Fujii Masao
a few days ago (B).

Since both patches overlap in functionality, applying one in core means
not applying the other. Initially I set out to do a complete review of
both patches and let the difficult choice of preferring one over the
other to fellow reviewers. However, for the following reasons I believe
that patch (A) should probably be withdrawn and the review effort
continued on (B).

* patch (A) was designed and programmed without prior community
involvement. This in itself doesn't make it a bad patch nor a bad way of
contributing source code, however thread (1) shows that some issues were
raised and more ideas existed.
* one of the leafs of thread (A) was (4) where Zoltán Böszörményi hints
there might be a new version of the patch (replacing XIDs with LSNs).
However to date no new version was posted. Also this in itself is not
ground for rejection, but together with the existence of patch (B) gives
rise to the idea that work on (A) might have halted.
* the work on patch (B) started actually with the post (1) where Fujii
Masao indicates he is going to write a patch too, and proposes to work
together with Zoltán Böszörményi on the design.
* patch (B) encompasses functionality of (A) and more, it also addresses
some, if not all ideas on the design that were raised in the comments on
patch (A)

Adding this up I have the impression that patch (A) will not get a newer
version, based on the fact that a newer patch (B) exists which has more
functionality and is partly based on community feedback on patch (A),
where patch (A) itself is not. Therefore I think that the focus and
review time during this commitfest should be on patch (B), unless Zoltán
Böszörményi disagrees and supplies a new version of this patch.

Depending on a reaction of Zoltán Böszörményi I think patch (A) should
be set to either "Returned With Feedback", if a new version is in the
making, or "Rejected" if not.

regards,
Yeb Havinga


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Boszormenyi Zoltan" <zb(at)cybertec(dot)at>, "Fujii Masao" <masao(dot)fujii(at)gmail(dot)com>, "Yeb Havinga" <yebhavinga(at)gmail(dot)com>, "PostgreSQL-development Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Preliminary review of Synchronous Replication patches
Date: 2010-07-21 15:34:26
Message-ID: 4C46CD320200002500033A9C@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Yeb Havinga <yebhavinga(at)gmail(dot)com> wrote:

> Kevin asked me to do a preliminary review on both synchronous
> replication patches.

Thanks for doing so.

BTW, Yeb has emailed me off-list that he has more specific notes on
both patches, but has run into high priority items on his "day job"
which will prevent him from posting those for another day or two.

> Since both patches overlap in functionality, applying one in core
> means not applying the other. Initially I set out to do a complete
> review of both patches and let the difficult choice of preferring
> one over the other to fellow reviewers. However, for the following
> reasons I believe that patch (A) should probably be withdrawn and
> the review effort continued on (B).

Unless there are objections, I will mark the patch by Zoltán
Böszörményi as Returned with Feedback in a couple days, and ask that
everyone interested in this feature focus on advancing the patch by
Fujii Masao. Given the scope and importance of this area, I think
we could benefit from another person or two signing on officially as
Reviewers.

-Kevin


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Yeb Havinga <yebhavinga(at)gmail(dot)com>
Cc: PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Subject: Re: Preliminary review of Synchronous Replication patches
Date: 2010-07-22 04:37:20
Message-ID: AANLkTik=nNas3StXCV24y+h6oBU0KGfFKdJVS16oVmV5@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jul 21, 2010 at 11:11 PM, Yeb Havinga <yebhavinga(at)gmail(dot)com> wrote:
> Kevin asked me to do a preliminary review on both synchronous replication patches. Relevant posts on -hackers are:

Thanks for the review!

> * patch (B) encompasses functionality of (A) and more, it also addresses
> some, if not all ideas on the design that were raised in the comments on
> patch (A)

Yeah, I'd like to cover many good functionalities in (A) as much as possible.
If you found something when you compared the patches, please feel free to
comment.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Yeb Havinga <yebhavinga(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Preliminary review of Synchronous Replication patches
Date: 2010-07-23 15:47:59
Message-ID: 4C49B9AF.1020107@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Kevin Grittner wrote:
> Unless there are objections, I will mark the patch by Zoltán
> Böszörményi as Returned with Feedback in a couple days, and ask that
> everyone interested in this feature focus on advancing the patch by
> Fujii Masao. Given the scope and importance of this area, I think
> we could benefit from another person or two signing on officially as
> Reviewers.
>
Hello Zoltán, Fujii, Kevin and list,

Here follows a severly time-boxed review of both synchronous replication
patches. Expect this review to be incomplete, though I believe the
general remarks to be valid. Off the list I've received word from Zoltan
that work on a new patch is planned. It would be great if ideas from
both patches could be merged into one.

regards,
Yeb Havinga

Patch A: Zoltán Böszörményi
Patch B: Fujii Masao

* Is the patch in context diff format?
A: Yes
B: Yes

* Does it apply cleanly to the current CVS HEAD?
A: No
B: Yes

* Does it include reasonable tests, necessary doc patches, etc?
A: Doc patches, and a contrib module for debugging purposes.
B: Doc patches.

For neither patch the documentation is final, see e.g.
* 25.2 - The section starting with "It should be noted that the log
shipping is asynchronous, i.e.," should be updated.
* 25.2.5 - Dito for "Streaming replication is asynchronous, so there..."

Testing any replication setup is not possible with the current
regression tool suite, and adding that is beyond the scope of any
synchronous replication patch. Perhaps the work of Markus Wanner with
dtester (that adds a make dcheck) can be assimilated for this purpose.

Both patches add synchronous replication to the current single-master
streaming replication features in 9.0, which means that there now is a
mechanism where a commit on the master does not finish until 'some or
more of the replicas are in sync'

* Does the patch actually implement that?
A: Yes
B: No, if no standbys are connected commits to not wait. (If I
understand the code from WaitXLogSend correct)

* Do we want that?
For database users who do never want to loose a single committed record,
synchronous replication is a relatively easy setup to achieve a high
level of 'security' (where secure means: less chance of losing data).
The easiness is relative to current solutions (the whole range of
mirrored disks, controllers, backups, log shipping etc).

* Do we already have it?
No

* Does it follow SQL spec, or the community-agreed behavior?
A: Unknown, though the choices in guc parameters suggest the patch's
been made to support actual use cases.
B: Design of patch B has been discussed on the mailing list as well as
the wiki (for links I refer to my previous mail)

* Are there dangers?
Besides the usual errors causing transactions to fail, in my opinion the
single largest danger would be that the master thinks a standby is in
sync, where in fact it isn't.

* Have all the bases been covered?
Patch A seems to cover two-phase commit where patch B does not. (I'm
time constrained and currently do not have a replicated cluster with
patch A anymore, so I cannot test).

# Does the feature work as advertised?
Patch A: yes
Patch B: yes

I ran a few tests with failures and had some recoverable problems, of
which I'm unsure they are related to the sync rep patches or streaming
replication in general. Work in the direction of a replication
regression test would be useful.

# Are there any assertion failures or crashes?
No

A note: reading source code of both patches makes it clear that these
are patches from experienced PostgreSQL programmers. The source code
reads just like 'normal' high quality postgresql source.

Differences:
Patch A synchronizes by sending back the Xids that have been received
and written to disk. When the master receives the xids, the respective
backends with having those xids active are unlocked and signalled to
continue. This means some locking of the procarray. The libpq protocol
was adapted so a client (the walreceiver) can report back the xids.
Patch B synchronizes by waiting to send wal data, until the sync
replicas have sending back LSN pointers in the wal files they have
synced to, in one of three ways. The libpq protocol was adapted so the
walreceiver can report back WAL file positions.

Perhaps the weakness of both patches is that they are not one. In my
opinion, both patches have their strengths. It would be great if these
strenght could be unified in a single patch.

patch A strengths:
* design of the guc parameters - meaning of
min_sync_replication_clients. As I understand it, it is not possible
with patch B to define a minimum number of synchronous standby servers a
transaction must be replicated to, before the commit succeeds. Perhaps
the name could be changed (quorum_min_sync_standbys?), but in my opinion
the definitive sync replication patch needs a parameter with this number
and exact meaning. The 'synchronous_slave' guc in boolean is also a good
one in my opinion.

patch B strengths:
* having the walsender synced by waiting for acked LSN, instead of
signalling respective backends with a specific XID active, seems like a
simpler and therefore better solution.
* the three different ways when to sync (recv,flush and replay), and
also that this is configurable per standby.

In patch B there's probably some work to do in WaitXLogSend(), after a
long day I'm having troubles understanding it's meaning but somehow I
believe that a more intuitive set of guc parameters can be found,
accompanied by clear code in this function. The way patch A synchronizes
and count synchronous standbys for a commit is clearer.

end of review.


From: Yeb Havinga <yebhavinga(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Preliminary review of Synchronous Replication patches
Date: 2010-07-23 20:39:09
Message-ID: 4C49FDED.8070505@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Kevin Grittner wrote:
> Unless there are objections, I will mark the patch by Zoltán
> Böszörményi as Returned with Feedback in a couple days, and ask that
> everyone interested in this feature focus on advancing the patch by
> Fujii Masao. Given the scope and importance of this area, I think
> we could benefit from another person or two signing on officially as
> Reviewers.
>
Hello Zoltán, Fujii, Kevin and list,

Here follows a severly time-boxed review of both synchronous replication
patches. Expect this review to be incomplete, though I believe the
general remarks to be valid. Off the list I've received word from Zoltan
that work on a new patch is planned. It would be great if ideas from
both patches could be merged into one.

regards,
Yeb Havinga

Patch A: Zoltán Böszörményi
Patch B: Fujii Masao

* Is the patch in context diff format?
A: Yes
B: Yes

* Does it apply cleanly to the current CVS HEAD?
A: No
B: Yes

* Does it include reasonable tests, necessary doc patches, etc?
A: Doc patches, and a contrib module for debugging purposes.
B: Doc patches.

For neither patch the documentation is final, see e.g.
* 25.2 - The section starting with "It should be noted that the log
shipping is asynchronous, i.e.," should be updated.
* 25.2.5 - Dito for "Streaming replication is asynchronous, so there..."

Testing any replication setup is not possible with the current
regression tool suite, and adding that is beyond the scope of any
synchronous replication patch. Perhaps the work of Markus Wanner with
dtester (that adds a make dcheck) can be assimilated for this purpose.

Both patches add synchronous replication to the current single-master
streaming replication features in 9.0, which means that there now is a
mechanism where a commit on the master does not finish until 'some or
more of the replicas are in sync'

* Does the patch actually implement that?
A: Yes
B: No, if no standbys are connected commits to not wait. (If I
understand the code from WaitXLogSend correct)

* Do we want that?
For database users who do never want to loose a single committed record,
synchronous replication is a relatively easy setup to achieve a high
level of 'security' (where secure means: less chance of losing data).
The easiness is relative to current solutions (the whole range of
mirrored disks, controllers, backups, log shipping etc).

* Do we already have it?
No

* Does it follow SQL spec, or the community-agreed behavior?
A: Unknown, though the choices in guc parameters suggest the patch's
been made to support actual use cases.
B: Design of patch B has been discussed on the mailing list as well as
the wiki (for links I refer to my previous mail)

* Are there dangers?
Besides the usual errors causing transactions to fail, in my opinion the
single largest danger would be that the master thinks a standby is in
sync, where in fact it isn't.

* Have all the bases been covered?
Patch A seems to cover two-phase commit where patch B does not. (I'm
time constrained and currently do not have a replicated cluster with
patch A anymore, so I cannot test).

# Does the feature work as advertised?
Patch A: yes
Patch B: yes

I ran a few tests with failures and had some recoverable problems, of
which I'm unsure they are related to the sync rep patches or streaming
replication in general. Work in the direction of a replication
regression test would be useful.

# Are there any assertion failures or crashes?
No

A note: reading source code of both patches makes it clear that these
are patches from experienced PostgreSQL programmers. The source code
reads just like 'normal' high quality postgresql source.

Differences:
Patch A synchronizes by sending back the Xids that have been received
and written to disk. When the master receives the xids, the respective
backends with having those xids active are unlocked and signalled to
continue. This means some locking of the procarray. The libpq protocol
was adapted so a client (the walreceiver) can report back the xids.
Patch B synchronizes by waiting to send wal data, until the sync
replicas have sending back LSN pointers in the wal files they have
synced to, in one of three ways. The libpq protocol was adapted so the
walreceiver can report back WAL file positions.

Perhaps the weakness of both patches is that they are not one. In my
opinion, both patches have their strengths. It would be great if these
strenght could be unified in a single patch.

patch A strengths:
* design of the guc parameters - meaning of
min_sync_replication_clients. As I understand it, it is not possible
with patch B to define a minimum number of synchronous standby servers a
transaction must be replicated to, before the commit succeeds. Perhaps
the name could be changed (quorum_min_sync_standbys?), but in my opinion
the definitive sync replication patch needs a parameter with this number
and exact meaning. The 'synchronous_slave' guc in boolean is also a good
one in my opinion.

patch B strengths:
* having the walsender synced by waiting for acked LSN, instead of
signalling respective backends with a specific XID active, seems like a
simpler and therefore better solution.
* the three different ways when to sync (recv,flush and replay), and
also that this is configurable per standby.

In patch B there's probably some work to do in WaitXLogSend(), after a
long day I'm having troubles understanding it's meaning but somehow I
believe that a more intuitive set of guc parameters can be found,
accompanied by clear code in this function. The way patch A synchronizes
and count synchronous standbys for a commit is clearer.

end of review.