Re: Timeline following for logical slots

From: Andres Freund <andres(at)anarazel(dot)de>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Timeline following for logical slots
Date: 2016-03-31 08:09:07
Message-ID: 20160331080907.GI13305@awork2.anarazel.de
Lists: pgsql-hackers

Hi,

On 2016-03-31 08:52:34 +0800, Craig Ringer wrote:
> On 31 March 2016 at 07:15, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
>
>
> > > Available attached or at
> > >
> > > https://github.com/2ndQuadrant/postgres/tree/dev/logical-decoding-timeline-following
> >
> > And pushed this too.
> >
>
> Much appreciated. Marked as committed at
> https://commitfest.postgresql.org/9/568/ .
>
> This gives us an option for failover of logical replication in 9.6, even if
> it's a bit cumbersome and complex for the client, in case failover slots
> don't make the cut. And, of course, it's a pre-req for failover slots,
> which I'll rebase on top of it shortly.

FWIW, I think it's dangerous to use this that way. If people manipulate
slots that way we'll have hellishly hard-to-debug issues. The test code needs
a big disclaimer to never ever be used in production, and we should
"disclaim any warranty" if somebody does that. To the point of not
fixing issues around it in back branches.

> Andres, I tried to address your comments as best I could. The main one that
> I think stayed open was about the loop that finds the last timeline on a
> segment. If you think that's better done by directly scanning the List* of
> timeline history entries I'm happy to prep a follow-up.

Have to look again.

+ * We start reading xlog from the restart lsn, even though in
+ * CreateDecodingContext we set the snapshot builder up using the
+ * slot's confirmed_flush. This means we might read xlog we don't
+ * actually decode rows from, but the snapshot builder might need it
+ * to get to a consistent point. The point we start returning data to
+ * *users* at is the confirmed_flush lsn set up in the decoding
+ * context.
+ */
still seems pretty misleading - and pretty much unrelated to the
callsite.

Greetings,

Andres Freund
