Re: Support for N synchronous standby servers

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Rajeev rastogi <rajeev(dot)rastogi(at)huawei(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for N synchronous standby servers
Date: 2014-08-28 07:10:56
Message-ID: CAB7nPqRFSLmHbYonra0=p-X8MJ-XTL7oxjP_QXDJGsjpvWRXPA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 27, 2014 at 2:46 PM, Rajeev rastogi
<rajeev(dot)rastogi(at)huawei(dot)com> wrote:
> I have done some more review, below are my comments:
Thanks!

> 1. There are currently two loops on *num_sync, Can we simplify the function SyncRepGetSynchronousNodes by moving the priority calculation inside the upper loop
> if (*num_sync == allowed_sync_nodes)
> {
> for (j = 0; j < *num_sync; j++)
> {
> Anyway we require priority only if *num_sync == allowed_sync_nodes condition matches.
> So in this loop itself, we can calculate the priority as well as assigment of new standbys with lower priority.
> Let me know if you see any issue with this.

OK, I see, yes this can minimize process a bit so I refactored the
code by integrating the second loop to the first. This has needed the
removal of the break portion as we need to find the highest priority
value among the nodes already determined as synchronous.

> 2. Comment inside the function SyncRepReleaseWaiters,
> /*
> * We should have found ourselves at least, except if it is not expected
> * to find any synchronous nodes.
> */
> Assert(num_sync > 0);
>
> I think "except if it is not expected to find any synchronous nodes" is not required.
> As if it has come till this point means at least this node is synchronous.
Yes, removed.

> 3. Document says that s_s_num should be lesser than max_wal_senders but code wise there is no protection for the same.
> IMHO, s_s_num should be lesser than or equal to max_wal_senders otherwise COMMIT will never return back the console without
> any knowledge of user.
> I see that some discussion has happened regarding this but I think just adding documentation for this is not enough.
> I am not sure what issue is observed in adding check during GUC initialization but if there is unavoidable issue during GUC initialization then can't we try to add check at later points.

The trick here is that you cannot really return a warning at GUC
loading level to the user as a warning could be easily triggered if
for example s_s_num is present before max_wal_senders in
postgresql.conf. I am open to any solutions if there are any (like an
error when initializing WAL senders?!). Documentation seems enough for
me to warn the user.

> 4. Similary interaction between parameters s_s_names and s_s_num. I see some discussion has happened regarding this and it is acceptable
> to have s_s_num more than s_s_names. But I was thinking should not give atleast some notice message to user for such case along with
> some documentation.

Done. I added the following in the paragraph "Server will wait":
Hence it is recommended to not set <varname>synchronous_standby_num</>
to a value higher than the number of elements in
<varname>synchronous_standby_names</>.

> 5. "At any one time there will be at a number of active synchronous standbys": this sentence is not proper.
What about that:
"At any one time there can be a number of active synchronous standbys
up to the number defined by <xref
linkend="guc-synchronous-standby-num">"

> 6. When this parameter is set to <literal>0</>, all the standby
> nodes will be considered as asynchronous.
>
> Can we make this as
> When this parameter is set to <literal>0</>, all the standby
> nodes will be considered as asynchronous irrespective of value of synchronous_standby_names.

Done. This seems proper for the user as we do not care at all about
s_s_names if _num = 0.

> 7. Are considered as synchronous the first elements of
> <xref linkend="guc-synchronous-standby-names"> in number of
> <xref linkend="guc-synchronous-standby-num"> that are
> connected.
>
> Starting of this sentence looks to be incomplete.
OK, I reworked this part as well. I hope it is clearer.

> 8. Standbys listed after this will take over the role
> of synchronous standby if the first one should fail.
>
> Should not we make it as:
>
> Standbys listed after this will take over the role
> of synchronous standby if any of the first synchronous-standby-num standby fails.
Fixed as proposed.

At the same I found a bug with pg_stat_get_wal_senders caused by a
NULL pointer that was freed when s_s_num = 0. Updated patch addressing
comments is attached. On top of that the documentation has been
reworked a bit by replacing the too-high amount of <xref> blocks by
<varname>, having a link to a given variable specified only once.
Regards,
--
Michael

Attachment Content-Type Size
20140828_multi_syncrep_v8.patch text/x-diff 28.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2014-08-28 07:15:35 pgsql: Allow units to be specified in relation option setting value.
Previous Message Fabien COELHO 2014-08-28 06:27:26 Re: postgresql latency & bgwriter not doing its job