Re: Sync Rep v19

From: Yeb Havinga <yebhavinga(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sync Rep v19
Date: 2011-03-04 11:24:06
Message-ID: 4D70CBD6.2020004@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2011-03-03 11:53, Simon Riggs wrote:
> Latest version of Sync Rep, which includes substantial internal changes
> and simplifications from previous version. (25-30 changes).
Testing more with the post v19 version from github with HEAD

commit 009875662e1b47012e1f4b7d30eb9e238d1937f6
Author: Simon Riggs <simon(at)2ndquadrant(dot)com>
Date: Fri Mar 4 06:13:43 2011 +0000

Allow SIGTERM messages in ProcessInterrupts() even when interrupts are
held, if WaitingForSyncRep

1) unexpected behaviour
- master has synchronous_standby_names = 'standby1,standby2,standby3'
- standby with 'standby2' connects first.
- LOG: 00000: standby "standby2" is now the synchronous standby with
priority 2

I'm still confused by the priority numbers. At first I thought that
priority 1 meant: this is the one that is currently waited for. Now I'm
not sure if this is the first potential standby that is not used, or
that it is actually the one waited for.
What I expected was that it would be connected with priority 1. And then
if the standby1 connect, it would become the one with prio1 and standby2
with prio2.

2) unexpected behaviour
- continued from above
- standby with 'asyncone' name connects next
- no log message on master

I expected a log message along the lines 'standby "asyncone" is now an
asynchronous standby'

3) more about log messages
- didn't get a log message that the asyncone standby stopped
- didn't get a log message that standby1 connected with priority 1
- after stop / start master, again only got a log that standby2
connectied with priority 2
- pg_stat_replication showed both standb1 and standby2 with correct prio#

4) More about the priority stuff. At this point I figured out prio 2 can
also be 'the real sync'. Still I'd prefer in pg_stat_replication a
boolean that clearly shows 'this is the one', with a source that is
intimately connected to the syncrep implemenation, instead of a
different implementation of 'if lowest connected priority and > 0, then
sync is true. If there are two different implementations, there is room
for differences, which doesn't feel right.

5) performance.
Seems to have dropped a a few dozen %. With v17 I earlier got ~650 tps
and after some more tuning over 900 tps. Now with roughly the same setup
I get ~ 550 tps. Both versions on the same hardware, both compiled
without debugging, and I used the same postgresql.conf start config.

I'm currently thinking about a failure test that would check if a commit
has really waited for the standby. What's the worst thing to do to a
master server? Ideas are welcome :-)

#!/bin/sh
psql -c "create a big table with generate_series"
echo 1 > /proc/sys/kernel/sysrq ; echo b > /proc/sysrq-trigger

regards,
Yeb Havinga

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2011-03-04 11:45:41 Re: Sync Rep v19
Previous Message Simon Riggs 2011-03-04 11:08:17 Re: Sync Rep v19