Re: Changeset Extraction v7.6.1

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Erik Rijkers <er(at)xs4all(dot)nl>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Changeset Extraction v7.6.1
Date: 2014-02-21 11:40:51
Message-ID: 20140221114051.GZ28858@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2014-02-19 13:07:11 -0500, Robert Haas wrote:
> On Tue, Feb 18, 2014 at 4:07 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> >> 2. I think the snapshot-export code is fundamentally misdesigned. As
> >> I said before, the idea that we're going to export one single snapshot
> >> at one particular point in time strikes me as extremely short-sighted.
> >
> > I don't think so. It's precisely what you need to implement a simple
> > replication solution. Yes, there are usecases that could benefit from
> > more possibilities, but that's always the case.
> >
> >> For example, consider one-to-many replication where clients may join
> >> or depart the replication group at any time. Whenever somebody joins,
> >> we just want a <snapshot, LSN> pair such that they can apply all
> >> changes after the LSN except for XIDs that would have been visible to
> >> the snapshot.
> >
> > And? They need to create individual replication slots, which each
> > will get a snapshot.
>
> So we have to wait for startup N times, and transmit the change stream
> N times, instead of once? Blech.

I can't get too excited about this. If we later want to add a command to
clone an existing slot, sure, that's perfectly fine with me. That will
then stream at exactly the same position. Easy, less than 20 LOC + docs
probably.

We have much more waiting e.g. in the CONCURRENTLY commands and it's not
causing that many problems.

Note that it'd be a *significant* overhead to contiuously be able to
export snapshots that are useful for looking at normal relations. Bot
for computing snapshots and for not being able to remove those rows.

> >> And in fact, we don't even need any special machinery
> >> for that; the client can just make a connection and *take a snapshot*
> >> once decoding is initialized enough.
> >
> > No, they can't. Two reasons: For one the commit order between snapshots
> > and WAL isn't necessarily the same.

> So what?

So you can't just use a plain snapshot and dump using it, without
getting into inconsistencies.

> > For another, clients now need logic
> > to detect whether a transaction's contents has already been applied or
> > has not been applied yet, that's nontrivial.

> My point is, I think we should be trying to *make* that trivial,
> rather than doing this.

I think *this* *is* making it trivial.

Maybe I've missed it, but I haven't seen any alternative that comes even
*close* to being as easy to implement in a replication
solution. Currently you can use it like:

CREATE_REPLICATION_SLOT <name> LOGICAL
copy data using the exported snapshot
START_REPLICATION SLOT <name> LOGICAL
stream changes.

Where you can do the START_REPLICATION as soon as some other sesion has
imported the snapshot. Really not much to worry about additionally.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Florian Pflug 2014-02-21 11:54:19 Re: [PATCH] Negative Transition Aggregate Functions (WIP)
Previous Message Atri Sharma 2014-02-21 11:28:22 Re: Proposal: IMPORT FOREIGN SCHEMA statement.