Re: replication identifier format

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: replication identifier format
Date: 2014-06-23 16:20:29
Message-ID: CA+TgmobR7uFxKqUdC7UbQcqP-pQfrQuCUh72cwVzkFrp2dfjGA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jun 23, 2014 at 11:28 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> Oh, great. Somehow I missed the fact that that had been addressed. I
>> had assumed that we still needed global identifiers in which case I
>> think they'd need to be 64+ bits (preferably more like 128). If they
>> only need to be locally significant that makes things much better.
>
> Well, I was just talking about the 'short ids' here and how they are
> used in crash recovery/shmem et al. Those indeed don't need to be
> coordinated.
> If you ever use logical decoding on a system that receives changes from
> other systems (cascading replication, multimaster) you'll likely want to
> add the *long* form of that identifier to the output in the output
> plugin so the downstream nodes can identify the source. How one
> specific replication solution deals with coordinating this between
> systems is essentially that suite's problem.

OK.

> The external identifier currently is a 'text' column, so essentially
> unlimited. (Well, I just noticed that the table currently doesn't have a
> toast table assigned, so it's only a couple kb right now, but ...)

OK. I have no clear reason to dislike that.

>> Is there any real reason to add a pg_replication_identifier table, or
>> should we just let individual replication solutions manage the
>> identifiers within their own configuration tables?
>
> I don't think that'd work. During crash recovery the short/internal IDs
> are read from WAL records and need to be unique across *all*
> databases. Since there's no way for different replication solutions or
> even the same to coordinate this across databases (as there's no way to
> add shared relations) it has to be builtin.

That makes sense.
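
To make the short/long split concrete, here is a minimal sketch of a
single cluster-wide registry that hands out locally unique short IDs for
long external identifiers. It is not the patch's code: the class name,
the method names, and the 16-bit width of the short ID are all
assumptions made for illustration, and a Python dict stands in for the
shared catalog.

```python
from dataclasses import dataclass, field
from typing import Dict

# Assumption for illustration: short (internal) ids fit in 16 bits.
MAX_INTERNAL_ID = 0xFFFF


@dataclass
class ReplicationIdentifierRegistry:
    """Hypothetical stand-in for one registry shared by all databases."""

    by_external: Dict[str, int] = field(default_factory=dict)
    by_internal: Dict[int, str] = field(default_factory=dict)

    def create(self, external_id: str) -> int:
        """Return the short id for external_id, allocating one if needed."""
        if external_id in self.by_external:
            return self.by_external[external_id]
        # Because there is a single registry, the lowest free short id is
        # unique no matter which replication solution asks for it.
        for candidate in range(1, MAX_INTERNAL_ID + 1):
            if candidate not in self.by_internal:
                self.by_external[external_id] = candidate
                self.by_internal[candidate] = external_id
                return candidate
        raise RuntimeError("no free internal identifiers")

    def external_for(self, internal_id: int) -> str:
        """Map a short id (e.g. one read from a WAL record) to its long form."""
        return self.by_internal[internal_id]


registry = ReplicationIdentifierRegistry()
a = registry.create("solution_a/node1")  # hypothetical external names
b = registry.create("solution_b/node7")
```

Two independent solutions calling create() against the same registry
cannot collide on the short ID, which is the property that per-solution
configuration tables could not provide across databases.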

> It's also useful so we can have stuff like the
> 'pg_replication_identifier_progress' view which tells you internal_id,
> external_id, remote_lsn, local_lsn. Just showing the internal ID would
> imo be bad.

OK.

>> I guess one
>> question is: What happens if there are multiple replication solutions
>> in use on a single server? How do they coordinate?
>
> What's your concern here? You're wondering how they can make sure the
> identifiers they create are non-overlapping?

Yeah, I was just thinking that might be why you installed a catalog
table for this, but now I see that there are several other reasons
also.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
