Re: WIP: Failover Slots

From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: WIP: Failover Slots
Date: 2016-02-15 17:23:01
Message-ID: CALLjQTSCHvcsF6y7=ZhmdMjJUMGLqt1-6Pz2rtb7PfFLxFfBOw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

here is my code level review:

0001:
This one looks ok except for broken indentation in the new notes in
xlogreader.c and .h. It's maybe slightly overdocumented but given the
complicated way the timeline reading works it's probably warranted.

0002:
+ /*
+ * No way to wait for the process since it's not a child
+ * of ours and there's no latch to set, so poll.
+ *
+ * We're checking this without any locks held, but
+ * we'll recheck when we attempt to drop the slot.
+ */
+ while (slot->in_use && slot->active_pid == active_pid
+ && max_sleep_micros > 0)
+ {
+ usleep(micros_per_sleep);
+ max_sleep_micros -= micros_per_sleep;
+ }

Not sure I buy this, what about postmaster crashes and fast shutdown
requests etc. Also you do usleep for 10s which is quite long. I'd
prefer the classic wait for latch with timeout and pg crash check
here. And even if we go with usleep, then it should be 1s not 10s and
pg_usleep instead of usleep.

0003:
There is a lot of documentation improvements here that are not related
to failover slots or timeline following, it might be good idea to
split those into separate patch as they are separately useful IMHO.

Other than that it looks good to me.

About other things discussed in this thread. Yes it makes sense in
certain situations to handle this outside of WAL and that does require
notions of nodes, etc. That being said, the timeline following is
needed even if this is handled outside of WAL. And once timeline
following is in, the slots can be handled by the replication solution
itself which is good. But I think the failover slots are still a good
thing to have - it provides HA solution for anything that uses slots,
and that includes physical replication as well. If the specific
logical replication solution wants to handle it for some reason itself
outside of WAL, it can create non-failover slot so in my opinion we
ideally need both types of slots (and that's exactly what this patch
gives us).

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2016-02-15 17:29:26 New pg_upgrade data directory inside old one?
Previous Message Fujii Masao 2016-02-15 17:18:15 Re: Existence check for suitable index in advance when concurrently refreshing.