Quick Links

Re: A Modest Upgrade Proposal

Lists:	pgsql-hackers

From:	David Fetter <david(at)fetter(dot)org>
To:	PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	A Modest Upgrade Proposal
Date:	2016-05-17 00:52:33
Message-ID:	20160517005233.GA26009@fetter.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Folks,

We have a problem.

With all due respect to the valiant efforts of people over the years
who have tried to make an upgrade-in-place system work, I would like
to note that such a system faces what I believe are insurmountable
barriers to being done correctly. I will then propose an alternative.

We have seen each one of the following on multiple occasions:

- It's extraordinarily unglamorous work. This further restricts the
already tiny pool of people who might work on it. If somebody has a
sustainable way to increase the glamour, that might help, but...

- To do correctly, it requires broad and intimate knowledge of the
storage system and the systems below it (what is and isn't actually
invariant across filesystems and kernels, e.g.) at a level that even
most core engine hackers do not possess.

- It's always done under extreme time pressure, namely between feature
freeze (more properly, all-other-code-freeze, if it's to be actually
correct) and release. We haven't even attempted the "properly"
version for what I hope are pretty obvious reasons.

- It's extraordinarily difficult to test even normal cases, let alone
corner cases, especially in light of the time pressure.

- Failure modes tend to be silent (or at least whispering) data
corruption, not infrequently permanent.

That all sounds grim because it is.

HOWEVER

All is not lost.

We can relax the in-place requirement because of the economics of
computing. The components of a node have been getting drastically
cheaper for decades while (amazingly, if you think about it)
increasing in quality. Rented ("cloud") nodes have gotten steadily
cheaper and better, too, although not over quite as long a haul.

In light of the above, it is perfectly reasonable to require, at least
temporarily, setting up duplicate storage, or another node.

I am aware that some cases exist where this is not possible, but I
don't think we should twist ourselves into pretzels to accommodate a
tiny minority of our users, which my experience in the field leads me
to believe is the case.

As a relatively (to our users) minor course correction, I would like
to propose the following:

- Keep the current pg_upgrade code, but put loud deprecation warnings
all over it, most emphatically all over its documentation.

- Develop a logical upgrade path as a part of the (Yay! Sexy!) logical
replication that's already in large part built.

This path would, of course, run either locally or across a network,
and be testable in both cases. There would be a downgrade path,
namely switching origin nodes.

What say?

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

From:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To:	David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-05-17 01:20:34
Message-ID:	573A71E2.3050500@commandprompt.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 05/16/2016 05:52 PM, David Fetter wrote:
> Folks,

> This path would, of course, run either locally or across a network,
> and be testable in both cases. There would be a downgrade path,
> namely switching origin nodes.
>
> What say?

What happens when the database is 5TB in size and you only have 500GB
available but that 500GB won't exhaust before the 18 month lease expiry?

>
> Cheers,
> David.
>

--
Command Prompt, Inc. http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	David Fetter <david(at)fetter(dot)org>
Cc:	PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-05-17 01:22:45
Message-ID:	20160517012245.GA125296@alvherre.pgsql
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

David Fetter wrote:

> As a relatively (to our users) minor course correction, I would like
> to propose the following:

> - Develop a logical upgrade path as a part of the (Yay! Sexy!) logical
> replication that's already in large part built.
>
> This path would, of course, run either locally or across a network,
> and be testable in both cases.

This is one use case that pglogical intends to fulfill. If you're able
to contribute to that project, I'm sure many would appreciate it. Right
now the hottest question seems to be: is this something that should be
an extension, or should it be part of core with its own set of DDL etc?
The current patch is geared towards the former, so if the community at
large prefers to have it as the latter and would oppose the former, now
is the time to speak up so that the course can be corrected.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>
Cc:	PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-05-17 01:28:23
Message-ID:	573A73B7.8020805@commandprompt.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 05/16/2016 06:22 PM, Alvaro Herrera wrote:
> David Fetter wrote:
>
>> As a relatively (to our users) minor course correction, I would like
>> to propose the following:
>
>> - Develop a logical upgrade path as a part of the (Yay! Sexy!) logical
>> replication that's already in large part built.
>>
>> This path would, of course, run either locally or across a network,
>> and be testable in both cases.
>
> This is one use case that pglogical intends to fulfill. If you're able
> to contribute to that project, I'm sure many would appreciate it. Right
> now the hottest question seems to be: is this something that should be
> an extension, or should it be part of core with its own set of DDL etc?
> The current patch is geared towards the former, so if the community at
> large prefers to have it as the latter and would oppose the former, now
> is the time to speak up so that the course can be corrected.

Alvaro,

Thank you for bringing this to light. Is there a contributor FAQ for
PgLogical so that people can help?

Sincerely,

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc:	David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-05-17 01:32:41
Message-ID:	20160517013241.GA126159@alvherre.pgsql
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Joshua D. Drake wrote:

> Alvaro,
>
> Thank you for bringing this to light. Is there a contributor FAQ for
> PgLogical so that people can help?

Hmm, I don't think there's any contributor FAQ. It's supposed to be a
regular patch submission, after all -- it needs user interface review, a
review of the communication protocol, tests, code-level review, etc.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-05-17 01:34:28
Message-ID:	573A7524.908@commandprompt.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 05/16/2016 06:32 PM, Alvaro Herrera wrote:
> Joshua D. Drake wrote:
>
>> Alvaro,
>>
>> Thank you for bringing this to light. Is there a contributor FAQ for
>> PgLogical so that people can help?
>
> Hmm, I don't think there's any contributor FAQ. It's supposed to be a
> regular patch submission, after all -- it needs user interface review, a
> review of the communication protocol, tests, code-level review, etc.

O.k. so we should discuss all PgLogical things here and not on Github? I
am just trying to figure out what the proper mode here is. I don't think
anybody wants us to double up efforts.

From:	David Fetter <david(at)fetter(dot)org>
To:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc:	PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-05-17 01:48:27
Message-ID:	20160517014827.GA1782@fetter.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, May 16, 2016 at 06:20:34PM -0700, Joshua D. Drake wrote:
> On 05/16/2016 05:52 PM, David Fetter wrote:
> > Folks,
>
> > This path would, of course, run either locally or across a
> > network, and be testable in both cases. There would be a
> > downgrade path, namely switching origin nodes.
> >
> > What say?
>
> What happens when the database is 5TB in size and you only have
> 500GB available but that 500GB won't exhaust before the 18 month
> lease expiry?

We cannot prepare for every eventuality.

The downside risk of a binary upgrade in the type of case you describe
is in no conceivable instance better than "rent or borrow another
server with more storage attached and replicate to it."

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc:	David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-05-17 01:59:27
Message-ID:	20160517015927.GA127080@alvherre.pgsql
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Joshua D. Drake wrote:
> On 05/16/2016 06:32 PM, Alvaro Herrera wrote:
> >Joshua D. Drake wrote:
> >
> >>Alvaro,
> >>
> >>Thank you for bringing this to light. Is there a contributor FAQ for
> >>PgLogical so that people can help?
> >
> >Hmm, I don't think there's any contributor FAQ. It's supposed to be a
> >regular patch submission, after all -- it needs user interface review, a
> >review of the communication protocol, tests, code-level review, etc.
>
> O.k. so we should discuss all PgLogical things here and not on Github? I am
> just trying to figure out what the proper mode here is. I don't think
> anybody wants us to double up efforts.

As far as I am concerned, by all means ignore Github and discuss issues
in pgsql-hackers. Github is being used only because it provides a
convenient Git mirror, which is said to be easier to use than attaching
huge patches back and forth. I think it may be more convenient also to
keep track of issues people have reported so that they can be marked as
fixed in commit messages, etc.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-07 20:01:45
Message-ID:	CA+TgmoYYM+Jsf6J66z2-ALVhXmg+Cc0tsUybB_nYw4szkVUgGA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, May 16, 2016 at 9:22 PM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> David Fetter wrote:
>> As a relatively (to our users) minor course correction, I would like
>> to propose the following:
>
>> - Develop a logical upgrade path as a part of the (Yay! Sexy!) logical
>> replication that's already in large part built.
>>
>> This path would, of course, run either locally or across a network,
>> and be testable in both cases.
>
> This is one use case that pglogical intends to fulfill. If you're able
> to contribute to that project, I'm sure many would appreciate it. Right
> now the hottest question seems to be: is this something that should be
> an extension, or should it be part of core with its own set of DDL etc?
> The current patch is geared towards the former, so if the community at
> large prefers to have it as the latter and would oppose the former, now
> is the time to speak up so that the course can be corrected.

There was an unconference session on this topic at PGCon and quite a
number of people there stated that they found DDL to be an ease-of-use
feature and wanted to have it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	David Fetter <david(at)fetter(dot)org>
Cc:	PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-07 20:10:30
Message-ID:	CA+TgmoaA8xmCEUYYiXo2tZu2epofrRc35fN1as7dtGAdOpQLsQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, May 16, 2016 at 8:52 PM, David Fetter <david(at)fetter(dot)org> wrote:
> In light of the above, it is perfectly reasonable to require, at least
> temporarily, setting up duplicate storage, or another node.
>
> I am aware that some cases exist where this is not possible, but I
> don't think we should twist ourselves into pretzels to accommodate a
> tiny minority of our users, which my experience in the field leads me
> to believe is the case.

So, on the one hand, I agree that logical replication is a great way
to facilitate major version upgrades. On the other hand, I think it's
completely wrong to suppose that only a tiny minority of people can't
use it. In some cases, hardware availability is definitely an issue.
But even when people have the hardware, being able to cleanly do a
cutover from one master to another is not necessarily something people
are set up to do. Getting that to work well requires more brainpower
than many users are willing to give to their database. A lot of
people want to just shut the database down, upgrade it, and start it
back up.

pg_upgrade does that, kinda. I'd like to have something better, but
in the absence of that, I think it's quite wrong to think about
deprecating it, even if we had logical replication fully integrated
into core today. Which we by no means do.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	Simon Riggs <simon(at)2ndquadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-07 23:15:19
Message-ID:	CANP8+jKH2x25ABoZ8hudt3h7iOT0xVq2CJbHRY9ws-gpB=mcfA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 7 July 2016 at 21:01, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Mon, May 16, 2016 at 9:22 PM, Alvaro Herrera
> <alvherre(at)2ndquadrant(dot)com> wrote:
> > David Fetter wrote:
> >> As a relatively (to our users) minor course correction, I would like
> >> to propose the following:
> >
> >> - Develop a logical upgrade path as a part of the (Yay! Sexy!) logical
> >> replication that's already in large part built.
> >>
> >> This path would, of course, run either locally or across a network,
> >> and be testable in both cases.
> >
> > This is one use case that pglogical intends to fulfill. If you're able
> > to contribute to that project, I'm sure many would appreciate it. Right
> > now the hottest question seems to be: is this something that should be
> > an extension, or should it be part of core with its own set of DDL etc?
> > The current patch is geared towards the former, so if the community at
> > large prefers to have it as the latter and would oppose the former, now
> > is the time to speak up so that the course can be corrected.
>
> There was an unconference session on this topic at PGCon and quite a
> number of people there stated that they found DDL to be an ease-of-use
> feature and wanted to have it.
>

Yes, I ran the unconference session. It was a shame you weren't able to
stay for the whole discussion.

We all agreed that an in-core solution was desirable, if only for wider
adoption.

About half the people wanted DDL and about half the people didn't. When we
discussed why we wanted DDL there wasn't any answers apart from the thought
that we want to be able to backup the replication configurations, which
seemed to be possible with or without DDL. Any such backup would need to be
easily removed from the objects themselves, to avoid external dependencies
on making recovery work.

Chris Browne finally summed it up by saying we could wait on having DDL
until some time later, once we've decided on things like how we configure
it, how we secure it and what/how to store it in the catalog. "We could
probably live without DDL in the first version."

Personally, I'm in the group of people that don't see the need for DDL.
There are already many successful features that don't utilize DDL, such as
backup, advisory locks and some features that use DDL that don't really
need to such as LISTEN/NOTIFY, full text search etc.. Also note that both
Oracle and SQLServer have moved away from DDL in favour of function APIs,
most NoSQL databases and almost all languages prefer functional interfaces
over parsed text languages, so I don't see a huge industry revival for DDL
as means of specifying things.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Simon Riggs <simon(at)2ndquadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-07 23:18:28
Message-ID:	CANP8+jL=2fMUq3CoRPJqTBdzn_tE_D7Orh2m8EZSFK-nNJHa+A@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 7 July 2016 at 21:10, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> pg_upgrade does that, kinda. I'd like to have something better, but
> in the absence of that, I think it's quite wrong to think about
> deprecating it, even if we had logical replication fully integrated
> into core today. Which we by no means do.
>

I don't see any problem with extending pg_upgrade to use logical
replication features under the covers.

It seems very smooth to be able to just say

pg_upgrade --online

and then specify whatever other parameters that requires.

It would be much easier to separate out that as a use-case so we can be
sure we get that in 10.0, even if nothing else lands.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>
Cc:	PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-07 23:48:28
Message-ID:	577EEA4C.60104@commandprompt.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 07/07/2016 01:10 PM, Robert Haas wrote:
> On Mon, May 16, 2016 at 8:52 PM, David Fetter <david(at)fetter(dot)org> wrote:
>> In light of the above, it is perfectly reasonable to require, at least
>> temporarily, setting up duplicate storage, or another node.

I would much rather see more brain power put into pg_upgrade or in place
upgrades than logical replication (as a upgrade solution).

From:	Simon Riggs <simon(at)2ndquadrant(dot)com>
To:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-08 00:14:00
Message-ID:	CANP8+jJ3HHSxUSWv0x3RhjdrQH_EPqDKTXOAaOF4op0CoAC9Dw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 8 July 2016 at 00:48, Joshua D. Drake <jd(at)commandprompt(dot)com> wrote:

> On 07/07/2016 01:10 PM, Robert Haas wrote:
>
>> On Mon, May 16, 2016 at 8:52 PM, David Fetter <david(at)fetter(dot)org> wrote:
>>
>>> In light of the above, it is perfectly reasonable to require, at least
>>> temporarily, setting up duplicate storage, or another node.
>>>
>>
> pg_upgrade does that, kinda. I'd like to have something better, but
>> in the absence of that, I think it's quite wrong to think about
>> deprecating it, even if we had logical replication fully integrated
>> into core today. Which we by no means do.
>>
>
> I would much rather see more brain power put into pg_upgrade or in place
> upgrades than logical replication (as a upgrade solution).

Why is that?

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-08 00:47:47
Message-ID:	577EF833.4010601@commandprompt.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 07/07/2016 05:14 PM, Simon Riggs wrote:

> I would much rather see more brain power put into pg_upgrade or in
> place upgrades than logical replication (as a upgrade solution).
>
>
> Why is that?

First, let me state that I don't have a problem with logical replication
as an upgrade solution. I have used one form or another many times. I
have also used pg_upgrade and will use pg_upgrade every single time I
can over replication (even pg_logical which is reasonably simple) if I
can. *KISS* is the mantra.

I certainly think logical replication has an absolute place (especially
if upgrading from something like 9.2 -> 9.5). I just don't think it is
as useful (generally) as a solid pg_upgrade or in-place upgrade solution.

We have had logical replication as a solution for over a decade. First
there was slony then londiste and then others. They all suffered from
various issues and limitations.

* Horrible overhead
* Long running transaction
* Need for lots of extra space

It is true that something like pg_logical doesn't suffer from those
three things but it does suffer from others:

* No DDL - Agreed, not "required" but certainly a very nice feature.

* Lack of simplicity

Users, like simple. It is one of the key reasons there is a migration to
the cloud, simplicity. Everything from scaling, to pricing, to
provisioning etc...

If I take a step back and say to myself, "What would *really* rock in
terms of PostgreSQL upgrades?" The answer is pretty simple:

apt-get update; apt-get upgrade;
service postgresql upgrade;

Which would pass a flag to "insert technology here" that started
PostgreSQL in a mode that told it, "Hey, you are going to need to check
a few things and probably modify a few things before you enter "ready
for transactions"".

I am fully aware that what I am saying is not easy. There are a whole
ton of issues (what if we are replicating to a slave?).

Anyway, that's why. I am by far more a consultant than an engineer now
and I can only relay what I run into when I speak either at conferences
or clients.

Sincerely,

From:	Simon Riggs <simon(at)2ndquadrant(dot)com>
To:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-08 01:17:34
Message-ID:	CANP8+jL4SVd3Var1ofZMhEmnC-=bMqM8rYzDTgiy_qJVMbvtBQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 8 July 2016 at 01:47, Joshua D. Drake <jd(at)commandprompt(dot)com> wrote:

> It is true that something like pg_logical doesn't suffer from those three
> things but it does suffer from others:
>
> * No DDL - Agreed, not "required" but certainly a very nice
> feature.
>
> * Lack of simplicity
>
> Users, like simple. It is one of the key reasons there is a migration to
> the cloud, simplicity. Everything from scaling, to pricing, to provisioning
> etc...
>

Well, you can't run DDL during pg_upgrade either. I've never seen a
solution that supported that, and if it did, it would certainly violate the
"simple" rule you advocate.

Simplicity is key, I agree. But that's just a user interface feature, not a
comment on what's underneath the covers. pg_upgrade is not simple and is
never likely to be so, under the covers.

Anyway, I'm cool if you don't want to use it, for while or never. Options
are good.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Simon Riggs <simon(at)2ndquadrant(dot)com>
To:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-08 01:27:55
Message-ID:	CANP8+jLK+3U4ind50r2wauLWA0GTSkYAzVhf1YekOBHGtxdn-A@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 8 July 2016 at 01:47, Joshua D. Drake <jd(at)commandprompt(dot)com> wrote:

> * Long running transaction
>

And of course you can't run any transactions at all during pg_upgrade, not
just long running ones.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-08 01:41:52
Message-ID:	CA+TgmobFt_zN__sd0296hgUELc=8O1Oj8vpkgRvaDf+YvZ3_dA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, Jul 7, 2016 at 7:15 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> Yes, I ran the unconference session. It was a shame you weren't able to stay
> for the whole discussion.

I thought I sat through, at least, most of it, but you barely gave
anyone else a chance to talk, which kind of misses the point of an
unconference. The portion which I attended was not about how to move
the development of the feature forward, but just involved describing
it. I thought it was a shame that the time wasn't used better.

> We all agreed that an in-core solution was desirable, if only for wider
> adoption.

Yep.

> About half the people wanted DDL and about half the people didn't. When we
> discussed why we wanted DDL there wasn't any answers apart from the thought
> that we want to be able to backup the replication configurations, which
> seemed to be possible with or without DDL. Any such backup would need to be
> easily removed from the objects themselves, to avoid external dependencies
> on making recovery work.

I really don't think that's accurate. There might have been 50% of
people who thought that not having DDL was acceptable, but I think
there were very few people who found it preferable.

> Chris Browne finally summed it up by saying we could wait on having DDL
> until some time later, once we've decided on things like how we configure
> it, how we secure it and what/how to store it in the catalog. "We could
> probably live without DDL in the first version."

Right. In other words, DDL would be desirable, but he'd be willing to
live without it if that somehow made things easier. But it really
doesn't. Adding new DDL commands is not particularly difficult.

> Personally, I'm in the group of people that don't see the need for DDL.
> There are already many successful features that don't utilize DDL, such as
> backup, advisory locks and some features that use DDL that don't really need
> to such as LISTEN/NOTIFY, full text search etc.. Also note that both Oracle
> and SQLServer have moved away from DDL in favour of function APIs, most
> NoSQL databases and almost all languages prefer functional interfaces over
> parsed text languages, so I don't see a huge industry revival for DDL as
> means of specifying things.

DDL is our standard way of getting things into the system catalogs.
We have no system catalog metadata that is intended to be populated by
any means other than DDL. If you want to add a column to a table, you
say ALTER TABLE .. ADD COLUMN. If you want to add a column to an
extension, you say ALTER EXTENSION .. ADD TABLE. If you want to add
an option to a foreign table, you say ALTER FOREIGN TABLE .. OPTIONS
(ADD ..). Therefore, I think it is entirely reasonable and obviously
consistent with existing practice that if you want to add a table to a
replication set, you should write ALTER REPLICATION SET .. ADD TABLE.
I don't understand why logical replication should be the one feature
that departs from the way that all of our other features work. Sure,
we have other features that do not involve DDL, but (1) one of your
examples is full text search, which of course does have DDL, and was
moved from an interface that did not involve DDL to one that did
because the latter is better and (2) your other examples don't involve
defining catalog contents, which makes them apples-to-oranges
comparisons.

The burden of proof isn't on me to demonstrate why this feature "needs
DDL"; it's on you to explain why replication-related operations that
establish persistent database state don't need to behave just like all
other commands. Really, where this jumped the shark for me is when
you argued that this stuff didn't even need pg_dump support. Come on.
This feature doesn't get a pass from handling all of the things that
every existing similar feature needs to deal with.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	Simon Riggs <simon(at)2ndquadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-08 01:53:53
Message-ID:	CANP8+j+iNsuBgu126ROofdjDyECQ2WD7Aa3St257WM6em79wJQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 8 July 2016 at 02:41, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Thu, Jul 7, 2016 at 7:15 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > Yes, I ran the unconference session. It was a shame you weren't able to
> stay
> > for the whole discussion.
>
> I thought I sat through, at least, most of it, but you barely gave
> anyone else a chance to talk, which kind of misses the point of an
> unconference. The portion which I attended was not about how to move
> the development of the feature forward, but just involved describing
> it. I thought it was a shame that the time wasn't used better.

I think the problem was that I gave everybody an even shot at commenting,
rather than focusing on a few key developers.

There were twenty people actively involved in that discussion.

> > We all agreed that an in-core solution was desirable, if only for wider
> > adoption.
>
> Yep.
>
> > About half the people wanted DDL and about half the people didn't. When
> we
> > discussed why we wanted DDL there wasn't any answers apart from the
> thought
> > that we want to be able to backup the replication configurations, which
> > seemed to be possible with or without DDL. Any such backup would need to
> be
> > easily removed from the objects themselves, to avoid external
> dependencies
> > on making recovery work.
>
> I really don't think that's accurate. There might have been 50% of
> people who thought that not having DDL was acceptable, but I think
> there were very few people who found it preferable.

Without being in the room, its kinda hard for you to know, right?

> > Chris Browne finally summed it up by saying we could wait on having DDL
> > until some time later, once we've decided on things like how we configure
> > it, how we secure it and what/how to store it in the catalog. "We could
> > probably live without DDL in the first version."
>
> Right. In other words, DDL would be desirable, but he'd be willing to
> live without it if that somehow made things easier. But it really
> doesn't. Adding new DDL commands is not particularly difficult.
>
> > Personally, I'm in the group of people that don't see the need for DDL.
>

> The burden of proof isn't on me to demonstrate why this feature "needs
> DDL"; it's on you to explain why replication-related operations that
> establish persistent database state don't need to behave just like all
> other commands. Really, where this jumped the shark for me is when
> you argued that this stuff didn't even need pg_dump support. Come on.
> This feature doesn't get a pass from handling all of the things that
> every existing similar feature needs to deal with.

I don't agree, not least because I wasn't the only one saying it.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Simon Riggs <simon(at)2ndquadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-08 02:25:52
Message-ID:	CANP8+j+UH-GVra5-p_citjw3YHas9JXVPKSm2LZbO+5bte=Hqw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 8 July 2016 at 02:41, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> > Personally, I'm in the group of people that don't see the need for DDL.
> > There are already many successful features that don't utilize DDL, such
> as
> > backup, advisory locks and some features that use DDL that don't really
> need
> > to such as LISTEN/NOTIFY, full text search etc.. Also note that both
> Oracle
> > and SQLServer have moved away from DDL in favour of function APIs, most
> > NoSQL databases and almost all languages prefer functional interfaces
> over
> > parsed text languages, so I don't see a huge industry revival for DDL as
> > means of specifying things.
>
> DDL is our standard way of getting things into the system catalogs.
> We have no system catalog metadata that is intended to be populated by
> any means other than DDL. If you want to add a column to a table, you
> say ALTER TABLE .. ADD COLUMN. If you want to add a column to an
> extension, you say ALTER EXTENSION .. ADD TABLE. If you want to add
> an option to a foreign table, you say ALTER FOREIGN TABLE .. OPTIONS
> (ADD ..). Therefore, I think it is entirely reasonable and obviously
> consistent with existing practice that if you want to add a table to a
> replication set, you should write ALTER REPLICATION SET .. ADD TABLE.
> I don't understand why logical replication should be the one feature
> that departs from the way that all of our other features work. Sure,
> we have other features that do not involve DDL, but (1) one of your
> examples is full text search, which of course does have DDL, and was
> moved from an interface that did not involve DDL to one that did
> because the latter is better and (2) your other examples don't involve
> defining catalog contents, which makes them apples-to-oranges
> comparisons.
>

pg_am has existed for decades without supporting DDL and we have gone to
great lengths over many years to allow catalog tables to be
inserted/updated/deleted by normal SQL rather than DDL, so not all catalog
access is via DDL. One of my examples was full text search and it does have
DDL, but that was an anti-example; all the feedback I have is that it was
much easier to use before it had DDL and that forcing it to use DDL pretty
much killed it for most users.

Anyway, backups and replication slots don't use DDL because they need to
work on standbys. So if you are arguing in favour of forcing logical
replication to never work on standbys, I'm interested in why that
restriction is useful and sensible, especially since we already agreed that
a failover mechanism for use of logical replication on standbys was
desirable. It seems likely that we're discussing this at too high a level
and that we each see things the other does not.

As I already said, I accept that there needs to be some way to backup
replication config; the meeting continued after you left.

I note also that replication slots aren't backed up by pg_dump; I see
analogy here and think that at least some parts of logical replication will
be similar and not require DDL at all, just as slots do not.

pg_dump support doesn't require DDL, in any case, nor is it certain yet
that pg_dump is the right utility for backup.

The main point I see is that the user interface mechanisms have very little
to do with DDL or not. Having a command called ALTER REPLICATION SLOT or a
function called pg_alter_replication_slot() makes little real difference to
a user.

We have much to discuss in terms of security, the way it should work and
what options to support and a sidetrack into syntax isn't warranted at this
early stage. Please lets discuss those important things first, then return
to whether DDL makes sense or not; it may do, or may not, or more likely
which parts of it need DDL and which do not.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-08 02:55:17
Message-ID:	10216.1467946517@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> On 8 July 2016 at 02:41, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> DDL is our standard way of getting things into the system catalogs.
>> We have no system catalog metadata that is intended to be populated by
>> any means other than DDL.

> pg_am has existed for decades without supporting DDL

That argument has been obsoleted by events ;-) ... and in any case, the
reason we went without CREATE ACCESS METHOD for so long was not that we
encouraged "INSERT INTO pg_am" but that non-core index AMs were
effectively unsupported anyway, until we thought of a reasonable way to
let them generate WAL. Without the WAL stumbling block, I'm sure we would
have built CREATE ACCESS METHOD long ago. It's just way too hard to deal
with issues like cross-version changes otherwise.

> and we have gone to
> great lengths over many years to allow catalog tables to be
> inserted/updated/deleted by normal SQL rather than DDL, so not all catalog
> access is via DDL.

But *all* of that is on "if you break it you get to keep both pieces"
terms. In particular we do not promise the catalogs will remain stable
across versions, so that inserts or updates on catalogs are very likely
to break in new versions. I think that all such operations should be
understood as emergency procedures, not recommended standard practice.

> One of my examples was full text search and it does have
> DDL, but that was an anti-example; all the feedback I have is that it was
> much easier to use before it had DDL and that forcing it to use DDL pretty
> much killed it for most users.

That's just unsupported FUD. I would say that most of the problems we've
had with text search DDL came from the fact that previously people had
done things in other ways and transitioning was hard. That experience
doesn't make me want to repeat it; but building a feature that's supposed
to be managed by direct catalog updates is precisely repeating that
mistake.

I'm okay with things like replication configuration being managed outside
the system catalogs entirely (as indeed they are now). But if a feature
has a catalog, it should have DDL to manipulate the catalog. Direct SQL
on a catalog should *never* become standard operating procedure.

regards, tom lane

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-08 03:18:24
Message-ID:	20160708031824.GA711849@alvherre.pgsql
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:

> > pg_am has existed for decades without supporting DDL
>
> That argument has been obsoleted by events ;-) ... and in any case, the
> reason we went without CREATE ACCESS METHOD for so long was not that we
> encouraged "INSERT INTO pg_am" but that non-core index AMs were
> effectively unsupported anyway, until we thought of a reasonable way to
> let them generate WAL. Without the WAL stumbling block, I'm sure we would
> have built CREATE ACCESS METHOD long ago.

Note that the alternative to DDL-based replication handling is not
INSERT INTO pg_replication, but a function-based interface such as
SELECT pg_replication_node_create(foo, bar); so there's no need to
hardcode catalog definitions; nor there is a need to skip backup-ability
of logical replication config: pg_dump support can be added by having it
output function calls -- not catalog INSERTs!

The only difference between DDL and no DDL is that a function-based
interface can be added with a few pg_proc.h entries, whereas the DDL
stuff requires gram.y additions, new nodes, etc.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Simon Riggs <simon(at)2ndquadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-08 08:59:07
Message-ID:	CANP8+jJVbXu7bqVs7-krH3YYUxbNbXTDc4DBTOa8Uvwo5Bqtgw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 8 July 2016 at 03:55, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> > One of my examples was full text search and it does have
> > DDL, but that was an anti-example; all the feedback I have is that it was
> > much easier to use before it had DDL and that forcing it to use DDL
> pretty
> > much killed it for most users.
>
> That's just unsupported FUD.

No, its me relaying opinions I have heard back to this list, for the
purposes of understanding them.

("Fear, Uncertainty and Doubt" or FUD is doesn't apply here, unless its
meant in the same way as "that's rubbish, I disagree".)

> I would say that most of the problems we've
> had with text search DDL came from the fact that previously people had
> done things in other ways and transitioning was hard. That experience
> doesn't make me want to repeat it; but building a feature that's supposed
> to be managed by direct catalog updates is precisely repeating that
> mistake.
>
> I'm okay with things like replication configuration being managed outside
> the system catalogs entirely (as indeed they are now). But if a feature
> has a catalog, it should have DDL to manipulate the catalog.

It's a good rule. In this case all it does is move the discussion to
"should it have a catalog?".

Let me return to my end point from last night: it's becoming clear that
asking the question "DDL or not?" is too high level a thought and is
leading to argument. The most likely answer is "some", but still not sure.
I am looking at this in more detail and will return in a few days with a
much more specific design that we can use to answer the question in detail.

> Direct SQL
> on a catalog should *never* become standard operating procedure.
>

Agreed, but it has always been considered to be something we should
consider when making DDL work.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Petr Jelinek <petr(at)2ndquadrant(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-08 10:09:17
Message-ID:	577F7BCD.20305@2ndquadrant.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 08/07/16 10:59, Simon Riggs wrote:
> On 8 July 2016 at 03:55, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us
> <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>> wrote:
>
> > One of my examples was full text search and it does have
> > DDL, but that was an anti-example; all the feedback I have is that it was
> > much easier to use before it had DDL and that forcing it to use DDL pretty
> > much killed it for most users.
>
> That's just unsupported FUD.
>
>
> No, its me relaying opinions I have heard back to this list, for the
> purposes of understanding them.
>
> ("Fear, Uncertainty and Doubt" or FUD is doesn't apply here, unless its
> meant in the same way as "that's rubbish, I disagree".)
>
> I would say that most of the problems we've
> had with text search DDL came from the fact that previously people had
> done things in other ways and transitioning was hard. That experience
> doesn't make me want to repeat it; but building a feature that's
> supposed
> to be managed by direct catalog updates is precisely repeating that
> mistake.
>
> I'm okay with things like replication configuration being managed
> outside
> the system catalogs entirely (as indeed they are now). But if a feature
> has a catalog, it should have DDL to manipulate the catalog.
>
>
> It's a good rule. In this case all it does is move the discussion to
> "should it have a catalog?".
>

I think it should have catalog for most things. Otherwise it will be
really hard to track mapping of tables to replication sets, etc. We'd
have to invent mechanism of tracking dependencies outside of catalog,
which is not endeavor that I want to go through. The feature is complex
enough without this. Which means there should be DDL for those things as
well. So the work I am doing now is based on this assumption. The DDL
versus function from the point of implementation means tons of
additional boilerplate code but it's not complex thing to do.

One interesting thing will be making sure we can replicate from physical
standby in the future as you mentioned elsewhere in the thread but I
think that should be possible as long as you define the catalogs on
master (not really sure yet if there are any barriers or not).

About the pg_dump support. While I don't think we'll necessarily want to
dump all the information related to logical replication (like
subscriptions), I definitely think we should dump replication sets and
their membership info for example.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Simon Riggs <simon(at)2ndquadrant(dot)com>
To:	Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-08 10:19:09
Message-ID:	CANP8+jJnDupbYGv9oQ0dqFG8xfwF8tE23GJVOCgk9A80WcmmkQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 8 July 2016 at 11:09, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote:

>
> One interesting thing will be making sure we can replicate from physical
> standby in the future as you mentioned elsewhere in the thread but I think
> that should be possible as long as you define the catalogs on master (not
> really sure yet if there are any barriers or not).
>

Agreed, after having spent the morning working on the details.

> About the pg_dump support. While I don't think we'll necessarily want to
> dump all the information related to logical replication (like
> subscriptions), I definitely think we should dump replication sets and
> their membership info for example.

Agreed

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Craig Ringer <craig(at)2ndquadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-08 10:47:15
Message-ID:	CAMsr+YFTw-i1nkoCDbWT8HaUTh_m1Bbuh43DBCcC6m_8atC1wA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 8 July 2016 at 09:41, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

>
> > Personally, I'm in the group of people that don't see the need for DDL.
> > There are already many successful features that don't utilize DDL, such
> as
> > backup, advisory locks and some features that use DDL that don't really
> need
> > to such as LISTEN/NOTIFY, full text search etc.. Also note that both
> Oracle
> > and SQLServer have moved away from DDL in favour of function APIs, most
> > NoSQL databases and almost all languages prefer functional interfaces
> over
> > parsed text languages, so I don't see a huge industry revival for DDL as
> > means of specifying things.
>
> DDL is our standard way of getting things into the system catalogs.
> We have no system catalog metadata that is intended to be populated by
> any means other than DDL.

Replication slots? (Arguably not catalogs, I guess)

Replication origins?

> If you want to add a column to a table, you
> say ALTER TABLE .. ADD COLUMN. If you want to add a column to an
> extension, you say ALTER EXTENSION .. ADD TABLE. If you want to add
> an option to a foreign table, you say ALTER FOREIGN TABLE .. OPTIONS
> (ADD ..). Therefore, I think it is entirely reasonable and obviously
> consistent with existing practice that if you want to add a table to a
> replication set, you should write ALTER REPLICATION SET .. ADD TABLE.
> I don't understand why logical replication should be the one feature
> that departs from the way that all of our other features work.

Because unlike all the other features, it can work usefully *across
versions*.

We have no extension points for DDL.

For function interfaces, we do.

That, alone, makes a function based interface overwhelmingly compelling
unless there are specific things we *cannot reasonably do* without DDL.

> Really, where this jumped the shark for me is when
> you argued that this stuff didn't even need pg_dump support. Come on.
> This feature doesn't get a pass from handling all of the things that
> every existing similar feature needs to deal with.
>

Well, replication slots and replication origins aren't handled by pg_dump
(or pg_basebackup). So not quite. Nor, for that matter, is streaming
physical replication handled by pg_dumpall. What makes this different?

That said, with pg_dump, the question to me isn't one of "support vs don't
support", it's how it should work and look.

In many cases it's actively undesirable to dump and restore logical
replication state. Most, I'd say. There probably are cases where it's
desirable to retain logical replication state such that restoring a dump
resumes replication, but I challenge you to come up with any sensible and
sane way that can actually be implemented. Especially since you must
obviously consider the possibility of both upstream and downstream being
restored from dumps.

IMO the problem mostly devolves to making sure dumps taken of different DBs
are consistent so new replication sessions can be established safely. And
really, I think it's a separate feature to logical replication its self.

To what extent are you approaching this from the PoV of wanting to use this
in FDW sharding? It's unclear what vision for users you have behind the
things you say must be done, and I'd like to try to move to more concrete
ground. You want DDL? OK, what should it look like? What does it add over a
function based interface? What's cluster-wide and what's DB-local? etc.

FWIW, Petr is working on some code in the area, but I don't know how far
along the work is.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Craig Ringer <craig(at)2ndquadrant(dot)com>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-08 10:58:17
Message-ID:	CAMsr+YGEQXxMX9ta0WBcn-ms8=MMmASLykt0CqSNwfE8OhWA7w@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 8 July 2016 at 11:18, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:

> Tom Lane wrote:
> > Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
>
> > > pg_am has existed for decades without supporting DDL
> >
> > That argument has been obsoleted by events ;-) ... and in any case, the
> > reason we went without CREATE ACCESS METHOD for so long was not that we
> > encouraged "INSERT INTO pg_am" but that non-core index AMs were
> > effectively unsupported anyway, until we thought of a reasonable way to
> > let them generate WAL. Without the WAL stumbling block, I'm sure we
> would
> > have built CREATE ACCESS METHOD long ago.
>
> Note that the alternative to DDL-based replication handling is not
> INSERT INTO pg_replication, but a function-based interface such as
> SELECT pg_replication_node_create(foo, bar); so there's no need to
> hardcode catalog definitions; nor there is a need to skip backup-ability
> of logical replication config: pg_dump support can be added by having it
> output function calls -- not catalog INSERTs!
>

Yeah. Direct DDL on the catalogs is crazy-talk, I can't imagine anyone
seriously suggesting that as an alternative. The only ways ahead are a
function-based interface or DDL.

Personally I strongly favour function-based for this. With named parameters
and default parameters it's nicely readable and self-documenting, so I
don't really buy the usability argument. You get slightly better output
from \h for DDL than from \df for a function, but only marginally, and
that's about it. Now, if we find that there are areas where a function
based interface is actually limiting, sure, lets use DDL. But not just for
the sake of DDL.

Note that you can implement a function based version in extensions for
older versions. This matters for logical replication because one of the
major appeals of it is up-version migration. If we rely on a bunch of new
DDL there isn't going to be a sane way to implement the decoding upstream
side in a way that'll work for connecting to old versions where the output
plugin has been backported as an extension.

Take pg_dump. Can you imagine pg_dump not supporting dumping from older
versions? Well, why should we not try to make it easy and practical to
stream from older versions?

Now, if the consensus here is that "we" don't care about supporting
decoding from the versions of Pg people actually use in the wild and making
it easier for them to move up to newer ones, well, that's why pglogical was
done as an extension. It'll be hard to get enthusiastic about some
re-imagined logical replication in-core that does much less than pglogical
for fewer target versions and fewer use cases though. Especially since "we
should use DDL" seems to have stayed at the hand-waving stage so far, with
no concrete proposals for what that DDL should look like and why it's
better.

The only difference between DDL and no DDL is that a function-based
> interface can be added with a few pg_proc.h entries, whereas the DDL
> stuff requires gram.y additions, new nodes, etc.
>

... and unlike DDL, a function based interface can be exposed for older
versions by extensions.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Petr Jelinek <petr(at)2ndquadrant(dot)com>
To:	Craig Ringer <craig(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-08 11:37:23
Message-ID:	577F9073.1020003@2ndquadrant.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 08/07/16 12:47, Craig Ringer wrote:
> On 8 July 2016 at 09:41, Robert Haas <robertmhaas(at)gmail(dot)com
> <mailto:robertmhaas(at)gmail(dot)com>> wrote:
>
> If you want to add a column to a table, you
> say ALTER TABLE .. ADD COLUMN. If you want to add a column to an
> extension, you say ALTER EXTENSION .. ADD TABLE. If you want to add
> an option to a foreign table, you say ALTER FOREIGN TABLE .. OPTIONS
> (ADD ..). Therefore, I think it is entirely reasonable and obviously
> consistent with existing practice that if you want to add a table to a
> replication set, you should write ALTER REPLICATION SET .. ADD TABLE.
> I don't understand why logical replication should be the one feature
> that departs from the way that all of our other features work.
>
>
> Because unlike all the other features, it can work usefully *across
> versions*.

I don't see how that matters for definitions in catalogs though. It's
not like we want to do any kind of RPC to add table to replication set
on the remote node.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-13 18:45:06
Message-ID:	CA+Tgmoa_yYoizSjVUCbNCKa7exP20Vm11qpLKn3OXYsewZ13hQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, Jul 7, 2016 at 8:53 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> I thought I sat through, at least, most of it, but you barely gave
>> anyone else a chance to talk, which kind of misses the point of an
>> unconference. The portion which I attended was not about how to move
>> the development of the feature forward, but just involved describing
>> it. I thought it was a shame that the time wasn't used better.
>
> I think the problem was that I gave everybody an even shot at commenting,
> rather than focusing on a few key developers.

If that had been what happened, I wouldn't consider it a problem, but
I don't think that's what happened.

>> I really don't think that's accurate. There might have been 50% of
>> people who thought that not having DDL was acceptable, but I think
>> there were very few people who found it preferable.
> Without being in the room, its kinda hard for you to know, right?

I was in the room for that part.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-13 19:06:23
Message-ID:	CA+TgmoZ7CaTptpPH14Qm-yCVdwv0Hg0X5mxq7OBj2RjTJHfLpQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, Jul 7, 2016 at 9:25 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> I note also that replication slots aren't backed up by pg_dump; I see
> analogy here and think that at least some parts of logical replication will
> be similar and not require DDL at all, just as slots do not.

I agree with that. Of course, it's *impossible* to usefully back up a
slot because the key ingredient in a slot is the LSN after which WAL
should be preserved - and it's meaningless to preserve that across a
dump and restore cycle. But, for example, replication set definitions
can be preserved across a dump and restore and I am quite sure users
will find it very unfortunate if they aren't.

> We have much to discuss in terms of security, the way it should work and
> what options to support and a sidetrack into syntax isn't warranted at this
> early stage. Please lets discuss those important things first, then return
> to whether DDL makes sense or not; it may do, or may not, or more likely
> which parts of it need DDL and which do not.

We've sort of hijacked this whole thread which was originally about
something different, so maybe it would be better to start a new thread
specifically to talk about the design of logical replication. For my
money, though, I don't find the designs I've seen so far to be
particularly compelling - and I think that the problem is that we tend
to think about this from the point of view of the capabilities that
must be available within a single instance. Physical replication has
the same issue. Users don't want to configure archive_command and
wal_keep_segments and max_wal_senders and wal_level and set up an
archive and create recovery.conf on the standby. They want to spin up
a new standby - and we don't provide any way to just do that.
pg_basebackup's -X stream and -R options represent significant
progress in that direction, but I don't think we've really taken it as
far as it can go yet, which is not to say I know exactly what's
missing. Similarly, when the master fails, users want to promote a
standby (either one they choose or the one that is determined to be
furthest ahead) and remaster the others and that's not something you
can "just do".

Similarly, for logical replication, users will want to do things like
(1) spin up a new logical replication slave out of thin air,
replicating an entire database or several databases or selected
replication sets within selected databases; or (2) subscribe an
existing database to another server, replicating an entire database or
several databases; or (3) repoint an existing subscription at a new
server after a master change or dump/reload, resynchronizing table
contents if necessary; or (4) stop replication, either with or without
dropping the local copies of the replicated tables. (This is not an
exhaustive list, I'm sure.)

I don't mean to imply that the existing designs are bad as far as they
go. In each case, the functionality that has been built is good. But
it's all focused, as it seems to me, on providing capabilities rather
than on providing a way for users to manage a group of database
servers using high-level primitives. That higher-level stuff largely
gets left to add-on tools, which I don't think is serving us
particularly well. Those add-on tools often find that the core
support doesn't quite do everything they'd like it to do: that's why
WAL-E and repmgr, for example, end up having to do some creative
things to deliver certain features. We need to start thinking of
groups of servers rather than individual servers as the unit of
deployment.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-13 19:06:27
Message-ID:	57869133.1090103@commandprompt.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 07/07/2016 01:01 PM, Robert Haas wrote:

> There was an unconference session on this topic at PGCon and quite a
> number of people there stated that they found DDL to be an ease-of-use
> feature and wanted to have it.

Yeah, I haven't meet anyone yet that would like to have:

select replicate_these_relations('['public']);

vs:

ALTER SCHEMA public ENABLE REPLICATION;

(or something like that).

Sincerely,

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-13 19:26:23
Message-ID:	CA+TgmoZzf__YWt9qnOw7jAKUuydPRKLPE0cic7tJb7F_qzYrxA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Fri, Jul 8, 2016 at 5:47 AM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
>> DDL is our standard way of getting things into the system catalogs.
>> We have no system catalog metadata that is intended to be populated by
>> any means other than DDL.
>
> Replication slots? (Arguably not catalogs, I guess)
>
> Replication origins?

Those things aren't catalogs, are they? I mean, as I said in the
other email I just sent in reply to Simon, if you did a pg_dump and a
pg_restore, I don't think it would be useful to preserve replication
slot LSNs afterwards. If I'm wrong, and that is a useful thing to do,
then we should have a pg_dump flag to do it. Either way, I think we
do have some work to do figuring out how you can dump, restore, and
then resume logical replication, probably by establishing a new slot
and then incrementally resynchronizing without having to copy
unchanged rows.

That having been said, I think the choice not to use DDL for slots was
somewhat unfortunate. We now have CREATE_REPLICATION_SLOT that can be
used via the replication protocol but there is no corresponding CREATE
REPLICATION SLOT for the regular protocol; I think that's kinda
strange.

>> If you want to add a column to a table, you
>> say ALTER TABLE .. ADD COLUMN. If you want to add a column to an
>> extension, you say ALTER EXTENSION .. ADD TABLE. If you want to add
>> an option to a foreign table, you say ALTER FOREIGN TABLE .. OPTIONS
>> (ADD ..). Therefore, I think it is entirely reasonable and obviously
>> consistent with existing practice that if you want to add a table to a
>> replication set, you should write ALTER REPLICATION SET .. ADD TABLE.
>> I don't understand why logical replication should be the one feature
>> that departs from the way that all of our other features work.
>
> Because unlike all the other features, it can work usefully *across
> versions*.

So what?

> We have no extension points for DDL.
>
> For function interfaces, we do.
>
> That, alone, makes a function based interface overwhelmingly compelling
> unless there are specific things we *cannot reasonably do* without DDL.

I don't understand this. We add new DDL in new releases, and we avoid
changing the meaning existing of DDL. Using function interfaces won't
make it possible to change the meaning of existing syntax, and it
won't make it any more possible to add new syntax. It will just make
replication commands be spelled differently from everything else.

> In many cases it's actively undesirable to dump and restore logical
> replication state. Most, I'd say. There probably are cases where it's
> desirable to retain logical replication state such that restoring a dump
> resumes replication, but I challenge you to come up with any sensible and
> sane way that can actually be implemented. Especially since you must
> obviously consider the possibility of both upstream and downstream being
> restored from dumps.

Yes, these issues need lots of thought, but I think that replication
set definitions, at least, are sensible to dump and reload.

> IMO the problem mostly devolves to making sure dumps taken of different DBs
> are consistent so new replication sessions can be established safely. And
> really, I think it's a separate feature to logical replication its self.

I think what is needed has more to do with coping with the situation
when the snapshots aren't consistent. Having a way to make sure they
are consistent is a great idea, but there WILL be situations when
replication between two 10TB databases gets broken and it will not be
good if the only way to recover is to reclone.

> To what extent are you approaching this from the PoV of wanting to use this
> in FDW sharding? It's unclear what vision for users you have behind the
> things you say must be done, and I'd like to try to move to more concrete
> ground. You want DDL? OK, what should it look like? What does it add over a
> function based interface? What's cluster-wide and what's DB-local? etc.

I've thought about that question, a little bit, but it's not really
what underlies my concerns here. I'm concerned about dump-and-restore
preserving as much state as is usefully possible, because I think
that's critical for the user experience, and I'm concerned with having
the commands we use to manage replication not be spelled totally
differently than our other commands.

However, as far as sharding is concerned, no matter how it gets
implemented, I think logical replication is a key feature.
Postgres-XC/XL has the idea of "replicated" tables which are present
on every data node, and that's very important for efficient
implementation of joins. If you do a join between a little table and
a big sharded table, you want to be able to push that down to the
shards, and you can only do that if the entirety of the little table
is present on every shard or by creating a temporary copy on every
shard. In many cases, the former will be preferable. So, think it's
important for sharding that logical replication is fully integrated
into core in such a manner as to be available as a building block for
other features.

At the least, I'm guessing that we'll want a way for whatever code is
planning join execution to figure out which tables have up-to-date
copies on servers that are involved in the query. As far as the
FDW-based approach to sharding is concerned, one thing to think about
is whether postgres_fdw and logical replication could share one notion
of where the remote servers are.

> FWIW, Petr is working on some code in the area, but I don't know how far
> along the work is.

OK, thanks.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	Christopher Browne <cbbrowne(at)gmail(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-13 20:42:05
Message-ID:	CAFNqd5XG80rHKddb4x2rZPpQ4cD5wZBx=WcDT4ueOdy+Pcmang@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 13 July 2016 at 15:06, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Thu, Jul 7, 2016 at 9:25 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > I note also that replication slots aren't backed up by pg_dump; I see
> > analogy here and think that at least some parts of logical replication
> will
> > be similar and not require DDL at all, just as slots do not.
>
> I agree with that. Of course, it's *impossible* to usefully back up a
> slot because the key ingredient in a slot is the LSN after which WAL
> should be preserved - and it's meaningless to preserve that across a
> dump and restore cycle. But, for example, replication set definitions
> can be preserved across a dump and restore and I am quite sure users
> will find it very unfortunate if they aren't.
>

There should be some way of dumping and restoring these sorts of structures,
and if I were thinking of the name of a tool to dump them, it seems to me
that pg_dump is a pretty good name for it... (Look for slonikconfdump.sh
for the latest iteration of the Slony variation, if interested...)

I have implemented "slony_dump" a couple of times; if that had become a
built-in, I sure hope a pg_dump flag could have been the thing to request
such.

The same seems likely true of FDW configuration; it sure would be nice to
be able to dump that in a consistent, reusable way. Again, nice to have
that be an extension of pg_dump.

Replication configuration should be able to be dumped out in a form that
can be readily loaded somewhere else. It might not be something to have
pg_dump do by default, but it should sure be somewhere; if it isn't, then
that's a reasonably serious shortcoming. Slony didn't have such until
2009; a serious implementation of Logical Replication shouldn't wait
that long.

If what gets spit out is a series of
select replicate_these_relations('['public']');
requests, well, I can actually live with that.

In the long run, it's preferable to have
ALTER SCHEMA PUBLIC ENABLE REPLICATION;

but if the desired syntax isn't clear, at the start, we can surely live with
having functions, initially, as long as:
a) We know that's not intended as being the ultimate solution;
b) There's *some* sort of upgrade path that is helpful to indicate the
syntax that falls out;
c) There's tooling to dump out cluster information, whatever the syntax
form.

I'm getting quoted on being OK with not having syntax, initially...
I'm still fine with that, but take the above caveats to see my intent.
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"

From:	Petr Jelinek <petr(at)2ndquadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-13 20:48:14
Message-ID:	5786A90E.6060806@2ndquadrant.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 13/07/16 21:06, Robert Haas wrote:
>
>> We have much to discuss in terms of security, the way it should work and
>> what options to support and a sidetrack into syntax isn't warranted at this
>> early stage. Please lets discuss those important things first, then return
>> to whether DDL makes sense or not; it may do, or may not, or more likely
>> which parts of it need DDL and which do not.
>
> We've sort of hijacked this whole thread which was originally about
> something different, so maybe it would be better to start a new thread
> specifically to talk about the design of logical replication. For my
> money, though, I don't find the designs I've seen so far to be
> particularly compelling - and I think that the problem is that we tend
> to think about this from the point of view of the capabilities that
> must be available within a single instance.
> ...
>
> Similarly, for logical replication, users will want to do things like
> (1) spin up a new logical replication slave out of thin air,
> replicating an entire database or several databases or selected
> replication sets within selected databases; or (2) subscribe an
> existing database to another server, replicating an entire database or
> several databases; or (3) repoint an existing subscription at a new
> server after a master change or dump/reload, resynchronizing table
> contents if necessary; or (4) stop replication, either with or without
> dropping the local copies of the replicated tables. (This is not an
> exhaustive list, I'm sure.)
>

Well this all can be done using pglogical so I don't really understand
what you mean when you say that you don't like the design or what's the
actual problem here (although I don't plan to implement everything in
the first patch submission).

> I don't mean to imply that the existing designs are bad as far as they
> go. In each case, the functionality that has been built is good. But
> it's all focused, as it seems to me, on providing capabilities rather
> than on providing a way for users to manage a group of database
> servers using high-level primitives. That higher-level stuff largely
> gets left to add-on tools, which I don't think is serving us
> particularly well. Those add-on tools often find that the core
> support doesn't quite do everything they'd like it to do: that's why
> WAL-E and repmgr, for example, end up having to do some creative
> things to deliver certain features. We need to start thinking of
> groups of servers rather than individual servers as the unit of
> deployment.
>

You can't build the highlevel management parts without first having the
per node lower level parts done. It would be nice to have highlevel
parts as well but nobody wrote that so far. I hope you don't expect
logical replication patch to do all that, because if you do, you'll be
disappointed.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Craig Ringer <craig(at)2ndquadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-14 06:06:31
Message-ID:	CAMsr+YGXJJhwyrK=NZyT8aoTdQzQfvUQyFfZdobQyLt6zJVAPQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 14 July 2016 at 03:06, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> Physical replication has
> the same issue. Users don't want to configure archive_command and
> wal_keep_segments and max_wal_senders and wal_level and set up an
> archive and create recovery.conf on the standby. They want to spin up
> a new standby - and we don't provide any way to just do that. [...snip...]

Similarly, when the master fails, users want to promote a
> standby (either one they choose or the one that is determined to be
> furthest ahead) and remaster the others and that's not something you
> can "just do".
>

Oh, I absolutely agree. But that's some pretty epic scope creep, and
weren't you just saying we should cut logical replication to the bone to
get the bare minimum in, letting users deal with keeping table definitions
in sync, etc? We've got a development process where it takes a year to get
even small changes in - mostly for good reasons, but it means it makes
little sense to tie one feature to much bigger ones.

I often feel like with PostgreSQL we give users a box and some assembly
instructions, rather than a complete system. But rather than getting a bike
in a box with a manual, you get the frame in the box, and a really good
manual on how to use the frame, plus some notes to take a look elsewhere to
find wheels, brakes, and a seat, plus an incomplete list of eight different
wheel, brake and seat types. Many of which won't work well together or only
work for some conditions.

But damn, we make a good bike frame, and we document the exact stress
tolerances of the forks!

Similarly, for logical replication, users will want to do things like
> (1) spin up a new logical replication slave out of thin air,
> replicating an entire database or several databases or selected
> replication sets within selected databases; or (2) subscribe an
> existing database to another server, replicating an entire database or
> several databases; or (3) repoint an existing subscription at a new
> server after a master change or dump/reload, resynchronizing table
> contents if necessary; or (4) stop replication, either with or without
> dropping the local copies of the replicated tables. (This is not an
> exhaustive list, I'm sure.)
>

Yep, and all of that's currently either fiddly or impossible.

To do some of those things even remotely well takes a massive amount more
infrastructure though. Lots of people here will dismiss that, like they
always do for things like connection pooling, by saying "an external tool
can do that". Yeah, it can, but it sucks, you get eight different tools
that each solve 80% of the problem (a different subset each), with erratic
docs and maintenance and plenty of bugs. But OTOH even if we all agreed Pg
should have magic self-healing auto-provisioning auto-discovering
auto-scaling logical replication magic, there's a long path from that to
delivering even the basics. Look at how long 2ndQ people have been working
on just getting the basic low level mechanisms in place. There have been
process issues there too, but most of it comes down to sheer scale and the
difficulty of doing it in a co-ordinated, community friendly way that
produces a robust result.

In addition to host management, you've also got little things like a way to
dump schemas from multiple DBs and unify them in a predictable, consistent
way, then keep them up to date as the schemas on each upstream change.
While blocking changes that can't be merged into the downstream or allowing
the downstream to fail. Since right now our schema copy mechanism is "run
this binary and feed the SQL it produces into the other DB" we're rather a
long way from there!

I don't mean to imply that the existing designs are bad as far as they
> go. In each case, the functionality that has been built is good. But
> it's all focused, as it seems to me, on providing capabilities rather
> than on providing a way for users to manage a group of database
> servers using high-level primitives.

100% agree.

BDR tried to get part-way there, but has as many problems as solutions, and
to get that far it imposes a lot of restrictions. It's great for one set of
use cases but has to be used carefully and with a solid understanding.

Many of the limtiations and restrictions imposed by BDR are because of
limitations in the core server that make a smoother, more transparent
solution unfeasable. Like with DDL management, our issues with full table
rewrites, cluster-wide vs database-specific DDL, etc etc etc.

That higher-level stuff largely
> gets left to add-on tools, which I don't think is serving us
> particularly well.

> Those add-on tools often find that the core
> support doesn't quite do everything they'd like it to do: that's why
> WAL-E and repmgr, for example, end up having to do some creative
> things to deliver certain features. We need to start thinking of
> groups of servers rather than individual servers as the unit of
> deployment.

Yes... but it's a long path there, and we'll need to progressively build
server infrastructure to make that posible.

There's also the issue that most companies who work in the PostgreSQL space
have their own tools and have their own interests to protect. We could
pretend that wasn't the case, but we'd still trip over the elephant we're
refusing to see.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Craig Ringer <craig(at)2ndquadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-14 06:29:36
Message-ID:	CAMsr+YErNYTHy3qdoyR2=A_tNPgMk-kVZsVd=aV4X=1sCej0qw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 14 July 2016 at 03:26, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Fri, Jul 8, 2016 at 5:47 AM, Craig Ringer <craig(at)2ndquadrant(dot)com>
> wrote:
> >> DDL is our standard way of getting things into the system catalogs.
> >> We have no system catalog metadata that is intended to be populated by
> >> any means other than DDL.
> >
> > Replication slots? (Arguably not catalogs, I guess)
> >
> > Replication origins?
>
> Those things aren't catalogs, are they? I mean, as I said in the
> other email I just sent in reply to Simon, if you did a pg_dump and a
> pg_restore, I don't think it would be useful to preserve replication
> slot LSNs afterwards. If I'm wrong, and that is a useful thing to do,
> then we should have a pg_dump flag to do it. Either way, I think we
> do have some work to do figuring out how you can dump, restore, and
> then resume logical replication, probably by establishing a new slot
> and then incrementally resynchronizing without having to copy
> unchanged rows.

Yes, I'd like that too. I'd also like to have fully parallized writeable
queries right now. But we can't build everything all at once.

Before doing parallelized writes, things like dsm, dsm queues, group
locking, worker management, and read parallelism were all necessary.

It's the same with cluster-wide management, dump and restore of replication
state to re-create a replication setup elsewhere, etc. We have to build the
groundwork first. Trying to pour the top storey concrete when the bottom
storey isn't even there yet isn't going to work out. You've argued
effectively the same thing elsewhere, saying that the pglogical submission
tried to do too much and should be further cut down.

I think we're in broad agreement about the desirable direction. What I'm
trying to say is that dump and restore of a logical replication
configuration's state is way harder than you probably expect it to be, and
is not something it's realistic to do at the same time as introducing the
bare bones of logical replication.

We absolutely should dump

> > We have no extension points for DDL.
> >
> > For function interfaces, we do.
> >
> > That, alone, makes a function based interface overwhelmingly compelling
> > unless there are specific things we *cannot reasonably do* without DDL.
>
> I don't understand this. We add new DDL in new releases, and we avoid
> changing the meaning existing of DDL. Using function interfaces won't
> make it possible to change the meaning of existing syntax, and it
> won't make it any more possible to add new syntax. It will just make
> replication commands be spelled differently from everything else.
>

Say you want to upgrade from 9.4 to 10.0 using the new logical replication
features. How would that be possible if you can't add the required
interfaces for setting up the downstream side to 9.4 as an extension?

I think what we're leaning toward here is "don't do that". Tools like
pglogical will have to carry that load until the Pg versions with built-in
replication become the "old" versions to be upgraded _from_.

Ideally the new infrastructure won't have to make normal (non-walsender)
libpq connections and will work entirely over the walsender protocol.
That's not extensible at all, so the point becomes kind of moot, it just
can't be used for downversion upgrades. Pity, but cleaner in the long run.

It does make me wonder if we should look at extension points for the
walsender protocol though, now we're like to have a future desire for newer
versions to connect to older versions - it'd be great if we could do
something like pg_upgrade_support to allow an enhanced logical migration
from 10.0 to 11.0 by installing some extension in 10.0 first.

>
> > In many cases it's actively undesirable to dump and restore logical
> > replication state. Most, I'd say. There probably are cases where it's
> > desirable to retain logical replication state such that restoring a dump
> > resumes replication, but I challenge you to come up with any sensible and
> > sane way that can actually be implemented. Especially since you must
> > obviously consider the possibility of both upstream and downstream being
> > restored from dumps.
>
> Yes, these issues need lots of thought, but I think that replication
> set definitions, at least, are sensible to dump and reload.
>

Yes, I agree that replication set definitions should be able to be dumped
and reloaded.

> I'm concerned about dump-and-restore
> preserving as much state as is usefully possible, because I think
> that's critical for the user experience
>

Right. See the pain points caused by our current dump issues like the
brokenness around dumping security labels, grants, etc on the database its
self. It certainly matters.

The keyword there is "usefully" though. Replication sets: definitely
useful. Knowledge about what peers we were connected to and what we were up
to on those peers: possibly useful, if we have some way to meaningfully
encode that knowledge, but far from crucial, especially since we can't
actually resume replay from them without replication slots and replication
identifiers we can't dump.

It seems we were mostly crossing wires about different assumptions about
what dump and restore would include.

> However, as far as sharding is concerned, no matter how it gets
> implemented, I think logical replication is a key feature.
> Postgres-XC/XL has the idea of "replicated" tables which are present
> on every data node, and that's very important for efficient
> implementation of joins. If you do a join between a little table and
> a big sharded table, you want to be able to push that down to the
> shards, and you can only do that if the entirety of the little table
> is present on every shard or by creating a temporary copy on every
> shard. In many cases, the former will be preferable. So, think it's
> important for sharding that logical replication is fully integrated
> into core in such a manner as to be available as a building block for
> other features.
>

Yep.

I've been looking at ways to integrate pglogical into XL, but it's very far
from easy. It's one area where moving it in core would be helpful.

> At the least, I'm guessing that we'll want a way for whatever code is
> planning join execution to figure out which tables have up-to-date
> copies on servers that are involved in the query. As far as the
> FDW-based approach to sharding is concerned, one thing to think about
> is whether postgres_fdw and logical replication could share one notion
> of where the remote servers are.
>

Yeah, that brings us back to the whole node management concept.

I think there are a few folks doing some preliminary work on that who may
be able to chime in.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-17 18:08:25
Message-ID:	42bcaeeb-80f5-1eae-5cc8-7f36f97f0485@BlueTreble.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 7/13/16 2:06 PM, Joshua D. Drake wrote:
> On 07/07/2016 01:01 PM, Robert Haas wrote:
>
>> There was an unconference session on this topic at PGCon and quite a
>> number of people there stated that they found DDL to be an ease-of-use
>> feature and wanted to have it.
>
> Yeah, I haven't meet anyone yet that would like to have:
>
> select replicate_these_relations('['public']);
>
> vs:
>
> ALTER SCHEMA public ENABLE REPLICATION;
>
> (or something like that).

I generally agree, but I think the more important question is "Why?". Is
it becouse DDL looks more like a sentence? Is it because arrays are a
PITA? Is it too hard to call functions?
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461

From:	Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-17 18:49:43
Message-ID:	0b1b89f2-67b4-9271-29ac-56efcaf010f0@BlueTreble.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 7/7/16 8:17 PM, Simon Riggs wrote:
> Simplicity is key, I agree. But that's just a user interface feature,
> not a comment on what's underneath the covers. pg_upgrade is not simple
> and is never likely to be so, under the covers.

Right, and what I'd prefer effort put into is making managing
replication in all forms easier. Replication has a lot of uses outside
of upgrades.

FWIW, I've actually never used pg_upgrade because I view it as high-risk
for the environments I've dealt with. There's no ability to fall back to
the old version without losing data, and because of it's binary nature
the odds of some kind of a corruption event happening are far higher
than with something like londiste. Certainly many environments don't
have those concerns though. Having options are good.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-17 18:50:52
Message-ID:	CA+TgmoY+U6saSVgqjCVyP=3e3k8zSZKz0KP4wzB-miU6kFSxAg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, Jul 14, 2016 at 2:29 AM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
> Yes, I'd like that too. I'd also like to have fully parallized writeable
> queries right now. But we can't build everything all at once.

I agree.

> Before doing parallelized writes, things like dsm, dsm queues, group
> locking, worker management, and read parallelism were all necessary.

Yep.

> It's the same with cluster-wide management, dump and restore of replication
> state to re-create a replication setup elsewhere, etc. We have to build the
> groundwork first. Trying to pour the top storey concrete when the bottom
> storey isn't even there yet isn't going to work out. You've argued
> effectively the same thing elsewhere, saying that the pglogical submission
> tried to do too much and should be further cut down.

Absolutely. I didn't mean to imply that the scope of that submission
should be expanded. What I'm a bit concerned about is that maybe we
haven't given enough thought to how some of this stuff is going to
work. Petr replied earlier with an assertion that all of the things
that I mentioned could be done using pglogical, but I'm not convinced.
I don't see how you can use pglogical to build a self-healing
replicated cluster, which is ultimately what people want. Maybe
that's just because I'm not personally deeply enmeshed in that
project, but I do try to read all of the relevant threads on
pgsql-hackers and keep tabs on what is happening.

Suppose server A is publishing to server B. Well, clearly, A needs to
have a slot for server B, but does that slot correspond precisely to
the publication, or is that represented in some other way? How is the
subscription represented on server B? What happens if either A or B
undergoes a dump-and-restore cycle? It's just fuzzy to me how this
stuff is supposed to work in detail.

>> I don't understand this. We add new DDL in new releases, and we avoid
>> changing the meaning existing of DDL. Using function interfaces won't
>> make it possible to change the meaning of existing syntax, and it
>> won't make it any more possible to add new syntax. It will just make
>> replication commands be spelled differently from everything else.
>
> Say you want to upgrade from 9.4 to 10.0 using the new logical replication
> features. How would that be possible if you can't add the required
> interfaces for setting up the downstream side to 9.4 as an extension?
>
> I think what we're leaning toward here is "don't do that". Tools like
> pglogical will have to carry that load until the Pg versions with built-in
> replication become the "old" versions to be upgraded _from_.

That may be true, but it's hard to tell whether that's going to be
feasible anyway without a fleshed-out proposal for how this is all
going to work. If this can be made to work for upgrades from 9.4 with
only an extension, that would IMHO be worth trying to do. But, just
for example, adding a replication set capability to 10 isn't going to
affect that one way or the other. For upgrades, you'll want to
replicate the whole database.

> Ideally the new infrastructure won't have to make normal (non-walsender)
> libpq connections and will work entirely over the walsender protocol. That's
> not extensible at all, so the point becomes kind of moot, it just can't be
> used for downversion upgrades. Pity, but cleaner in the long run.

Yeah. I'm entirely willing to leave downgrades to earlier versions to
extensions. "Cleaner in the long run" has got to be a high priority
for core features; if we had not followed that policy in the past,
we'd have an unmaintainable mess now.

> It does make me wonder if we should look at extension points for the
> walsender protocol though, now we're like to have a future desire for newer
> versions to connect to older versions - it'd be great if we could do
> something like pg_upgrade_support to allow an enhanced logical migration
> from 10.0 to 11.0 by installing some extension in 10.0 first.

Maybe, but let's get something that can work from >=10.0 to >=10.0 first.

>> I'm concerned about dump-and-restore
>> preserving as much state as is usefully possible, because I think
>> that's critical for the user experience
>
> Right. See the pain points caused by our current dump issues like the
> brokenness around dumping security labels, grants, etc on the database its
> self. It certainly matters.
>
> The keyword there is "usefully" though. Replication sets: definitely useful.
> Knowledge about what peers we were connected to and what we were up to on
> those peers: possibly useful, if we have some way to meaningfully encode
> that knowledge, but far from crucial, especially since we can't actually
> resume replay from them without replication slots and replication
> identifiers we can't dump.
>
> It seems we were mostly crossing wires about different assumptions about
> what dump and restore would include.

Yes, that may be the case.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	Jan Wieck <jan(at)wi3ck(dot)info>
To:	Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-17 18:55:20
Message-ID:	CAGBW59djFRsJjBELyDJMPQxnTRzFTqxYhE0=DDPuOT=YcvxgFQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sun, Jul 17, 2016 at 2:08 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:

> On 7/13/16 2:06 PM, Joshua D. Drake wrote:
>
>> On 07/07/2016 01:01 PM, Robert Haas wrote:
>>
>> There was an unconference session on this topic at PGCon and quite a
>>> number of people there stated that they found DDL to be an ease-of-use
>>> feature and wanted to have it.
>>>
>>
>> Yeah, I haven't meet anyone yet that would like to have:
>>
>> select replicate_these_relations('['public']);
>>
>> vs:
>>
>> ALTER SCHEMA public ENABLE REPLICATION;
>>
>> (or something like that).
>>
>
> I generally agree, but I think the more important question is "Why?". Is
> it becouse DDL looks more like a sentence? Is it because arrays are a PITA?
> Is it too hard to call functions?

Once you get fine grained enough to support replicating different sets
of possibly overlapping objects/namespaces to different groups of
recipients, the DDL approach becomes just as convoluted as calling
functions and nobody will memorize the entire syntax.

Jan

>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
> 855-TREBLE2 (855-873-2532) mobile: 512-569-9461
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

--
Jan Wieck
Senior Postgres Architect
http://pgblog.wi3ck.info

From:	Petr Jelinek <petr(at)2ndquadrant(dot)com>
To:	Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-17 19:22:00
Message-ID:	578BDAD8.4060703@2ndquadrant.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 17/07/16 20:08, Jim Nasby wrote:
> On 7/13/16 2:06 PM, Joshua D. Drake wrote:
>> On 07/07/2016 01:01 PM, Robert Haas wrote:
>>
>>> There was an unconference session on this topic at PGCon and quite a
>>> number of people there stated that they found DDL to be an ease-of-use
>>> feature and wanted to have it.
>>
>> Yeah, I haven't meet anyone yet that would like to have:
>>
>> select replicate_these_relations('['public']);
>>
>> vs:
>>
>> ALTER SCHEMA public ENABLE REPLICATION;
>>
>> (or something like that).
>
> I generally agree, but I think the more important question is "Why?". Is
> it becouse DDL looks more like a sentence? Is it because arrays are a
> PITA? Is it too hard to call functions?

For me it's many small reasons. I want to store it in catalogs and some
things there are nicer when you manipulate using standard DDL processing
(like dependencies for example). The syntax is also bit nicer. Our
documentation works better for DDLs than functions (that's something we
should fix but I am not doing it as part of this patch). Same goes for
psql tab completion. We automatically gain things like event triggers.
The support in pg_dump is also more straightforward with DDL.

It might make sense to have functions for manipulating slots and origins
as those are just primitives which user should not have to fiddle with
but for things that are directly meant for user interaction DDL just
feels better.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Petr Jelinek <petr(at)2ndquadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-17 20:02:26
Message-ID:	578BE452.6040007@2ndquadrant.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 17/07/16 20:50, Robert Haas wrote:
>
>> It's the same with cluster-wide management, dump and restore of replication
>> state to re-create a replication setup elsewhere, etc. We have to build the
>> groundwork first. Trying to pour the top storey concrete when the bottom
>> storey isn't even there yet isn't going to work out. You've argued
>> effectively the same thing elsewhere, saying that the pglogical submission
>> tried to do too much and should be further cut down.
>
> Absolutely. I didn't mean to imply that the scope of that submission
> should be expanded. What I'm a bit concerned about is that maybe we
> haven't given enough thought to how some of this stuff is going to
> work. Petr replied earlier with an assertion that all of the things
> that I mentioned could be done using pglogical, but I'm not convinced.
> I don't see how you can use pglogical to build a self-healing
> replicated cluster, which is ultimately what people want. Maybe
> that's just because I'm not personally deeply enmeshed in that
> project, but I do try to read all of the relevant threads on
> pgsql-hackers and keep tabs on what is happening.
>

That really depends on what you call self-healing replicated cluster.

> Suppose server A is publishing to server B. Well, clearly, A needs to
> have a slot for server B, but does that slot correspond precisely to
> the publication, or is that represented in some other way? How is the
> subscription represented on server B? What happens if either A or B
> undergoes a dump-and-restore cycle? It's just fuzzy to me how this
> stuff is supposed to work in detail.

Yeah, that's because it is. The dump/restore cycle would work provided
you stopped the replication before doing it. That might not be perfect
but it's still more than physical can do. Solving more complex scenarios
is something for the future. Logical PITR might be the answer for that,
not sure yet.

About slots. Slots are just primitive which helps us to get snapshot
mapped to LSN, keep the historical catalog and wal files. There is no
reason for single replication path to be forever tied to single slot
though. In fact in pglogical (and my plan for core is same) we already
create limited lifespan slots to get new snapshots when either adding
new table or re-syncing the existing one. This is one of the reasons why
I don't really see much usefulness in being able to do snapshot inside
the slot once it started replicating already btw, using multiple slots
has also advantage of parallelism (replication of other tables does not
lag because we are syncing another one).

>
>> It does make me wonder if we should look at extension points for the
>> walsender protocol though, now we're like to have a future desire for newer
>> versions to connect to older versions - it'd be great if we could do
>> something like pg_upgrade_support to allow an enhanced logical migration
>> from 10.0 to 11.0 by installing some extension in 10.0 first.
>
> Maybe, but let's get something that can work from >=10.0 to >=10.0 first.
>

Agreed.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To:	Jan Wieck <jan(at)wi3ck(dot)info>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-17 21:16:41
Message-ID:	578BF5B9.5010301@commandprompt.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 07/17/2016 11:55 AM, Jan Wieck wrote:

> Yeah, I haven't meet anyone yet that would like to have:
>
> select replicate_these_relations('['public']);
>
> vs:
>
> ALTER SCHEMA public ENABLE REPLICATION;
>
> (or something like that).
>
>
> I generally agree, but I think the more important question is
> "Why?". Is it becouse DDL looks more like a sentence? Is it because
> arrays are a PITA? Is it too hard to call functions?

IMO, because it isn't code. I think that people forget that many, many
DBAs are not developers, they are business analysts that happen to also
be DBAs. Similarly, there is a reason why MongoDB/NoSQL will never be as
popular as good old fashion SQL.

>
>
> Once you get fine grained enough to support replicating different sets
> of possibly overlapping objects/namespaces to different groups of
> recipients, the DDL approach becomes just as convoluted as calling
> functions and nobody will memorize the entire syntax.
>

Ehh, partially true. For example, I don't know every single nuance of
ALTER TABLE but that is what the \h is for. Replication would be no
different.

From:	Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To:	Petr Jelinek <petr(at)2ndquadrant(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-18 22:17:36
Message-ID:	c00c1a9f-a488-8d58-da17-ff15bfbc9a1b@BlueTreble.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 7/17/16 2:22 PM, Petr Jelinek wrote:
>> I generally agree, but I think the more important question is "Why?". Is
>> it becouse DDL looks more like a sentence? Is it because arrays are a
>> PITA? Is it too hard to call functions?
>
> For me it's many small reasons. I want to store it in catalogs and some
> things there are nicer when you manipulate using standard DDL processing
> (like dependencies for example).

Fair point.

> The syntax is also bit nicer. Our
> documentation works better for DDLs than functions (that's something we
> should fix but I am not doing it as part of this patch). Same goes for
> psql tab completion. We automatically gain things like event triggers.

I'd think all of those we'd want to be able to support for functions as
well...

> The support in pg_dump is also more straightforward with DDL.

Hmm... not sure why that is. It does seem to me that support for
extension configuration isn't as strong as it could be.

> It might make sense to have functions for manipulating slots and origins
> as those are just primitives which user should not have to fiddle with
> but for things that are directly meant for user interaction DDL just
> feels better.

I do agree that DDL "feels better" (which I think is what JD was
alluding too).

I had a secret agenda in asking why it's better though: can we find a
way to allow extensions to do "DDL-ish" things in a better way than how
they're stuck doing them today. I suspect it will never be practical to
have extensions modifying grammar willy-nilly, but maybe there's some
other things we could do to make life easier. One thought is an
"extension command" mode you can enter that means everything you're
typing gets treated as a call to a function in that extension:

EXTENSION MODE citus;
master_create_distributed_table 'github_events', 'created_at', 'append';
EXTENSION MODE;

instead of SELECT master_create_distributed_table('github_events',
'created_at', 'append');

obviously that's completely pointless for a single command, but if you
needed to do a bunch of things it starts saving typing.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461

From:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To:	Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-18 22:33:30
Message-ID:	578D593A.3000002@commandprompt.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 07/18/2016 03:17 PM, Jim Nasby wrote:
> On 7/17/16 2:22 PM, Petr Jelinek wrote:

> I do agree that DDL "feels better" (which I think is what JD was
> alluding too).

Yes and no. It reads better and is more clear to those who are not
developers or have a developer background which is, many in the database
field. It is also easier to type. I type 120 a minute on a roll, that is
until I have to do this ('[ ..... Simple wording base command structure
is much more efficient.

ALTER TABLE FOO ENABLE REPLICATION ON SLAVE 0;

select enable_replication_for_table('['foo']',0);

Guess which one was typed without a single error and more quickly.

Sincerely,

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-27 20:35:06
Message-ID:	20160727203506.GG27321@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Fri, Jul 8, 2016 at 12:18:28AM +0100, Simon Riggs wrote:
> On 7 July 2016 at 21:10, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> pg_upgrade does that, kinda. I'd like to have something better, but
> in the absence of that, I think it's quite wrong to think about
> deprecating it, even if we had logical replication fully integrated
> into core today. Which we by no means do.
>
> I don't see any problem with extending pg_upgrade to use logical replication
> features under the covers.
>
> It seems very smooth to be able to just say
>
> pg_upgrade --online
>
> and then specify whatever other parameters that requires.
>
> It would be much easier to separate out that as a use-case so we can be sure we
> get that in 10.0, even if nothing else lands.

Uh, while "pg_upgrade --online" looks cool, I am not sure a solution
based on logical replication would share _any_ code with the existing
pg_upgrade tool, so it seems best to use another binary for this.

I guess we could use the pg_dump/pg_restore pg_upgrade code to create
the objects, and use logical replication to copy the rows, but what does
this gain us that pg_dump/pg_restore doesn't? Wouldn't you just create
the standby using logical replication and just switch-over? Why use
pg_upgrade at all? Am I missing something?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +

From:	Craig Ringer <craig(at)2ndquadrant(dot)com>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-28 01:22:17
Message-ID:	CAMsr+YFz63drDzjTwmTWSzvop7V+axrK1Ncq56aSNhqbULJmow@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 28 July 2016 at 04:35, Bruce Momjian <bruce(at)momjian(dot)us> wrote:

> On Fri, Jul 8, 2016 at 12:18:28AM +0100, Simon Riggs wrote:
> > On 7 July 2016 at 21:10, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >
> > pg_upgrade does that, kinda. I'd like to have something better, but
> > in the absence of that, I think it's quite wrong to think about
> > deprecating it, even if we had logical replication fully integrated
> > into core today. Which we by no means do.
> >
> > I don't see any problem with extending pg_upgrade to use logical
> replication
> > features under the covers.
> >
> > It seems very smooth to be able to just say
> >
> > pg_upgrade --online
> >
> > and then specify whatever other parameters that requires.
> >
> > It would be much easier to separate out that as a use-case so we can be
> sure we
> > get that in 10.0, even if nothing else lands.
>
> Uh, while "pg_upgrade --online" looks cool, I am not sure a solution
> based on logical replication would share _any_ code with the existing
> pg_upgrade tool, so it seems best to use another binary for this.
>

It might, actually. One approach for online upgrade is to:

* pg_basebackup the master
* start the replica and let it catch up
* create a logical replication slot on the master
* replace the replication.conf on the basebackup so it stops recovery at
the lsn of the replication slot's confirmed_flush_lsn
* stop the replica and pg_upgrade it
* have the upgraded replica, now a master, replay from the old master over
logical replication
* once caught up, switch over

This means a full dump and reload with a full rebuild of all indexes, etc,
isn't needed. All shared catalog stuff is copied (until we switch to
logical rep for the final catch-up).

I guess we could use the pg_dump/pg_restore pg_upgrade code to create
> the objects, and use logical replication to copy the rows, but what does
> this gain us that pg_dump/pg_restore doesn't?

A consistent switch-over point, where the upgrade can happen while the
master is still writing.

We create a slot, dump from the slot's exported snapshot, and switch over
to logical replication consistently at the end of the dump.

That's pretty much what BDR and pglogical do.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To:	Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc:	Bruce Momjian <bruce(at)momjian(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-28 02:38:33
Message-ID:	CAB7nPqQorVtGeCGL3ro-21N7iUh7XYeyW_4wqOwZ3uaZQq-b-w@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, Jul 28, 2016 at 10:22 AM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
> It might, actually. One approach for online upgrade is to:
>
> * pg_basebackup the master
> * start the replica and let it catch up
> * create a logical replication slot on the master
> * replace the replication.conf on the basebackup so it stops recovery at the
> lsn of the replication slot's confirmed_flush_lsn
> * stop the replica and pg_upgrade it
> * have the upgraded replica, now a master, replay from the old master over
> logical replication
> * once caught up, switch over
>
> This means a full dump and reload with a full rebuild of all indexes, etc,
> isn't needed. All shared catalog stuff is copied (until we switch to logical
> rep for the final catch-up).

This is a per-database logic to perform an upgrade of a single
database, right? If a cluster has multiple databases you need one
logical slot per database to complete an upgrade, which is where
sync_synchronous_names which is able to take now multiple entries
helps as well to ensure that the former master is in sync with the all
the logical slots in place.
--
Michael

From:	David Fetter <david(at)fetter(dot)org>
To:	Jan Wieck <jan(at)wi3ck(dot)info>
Cc:	Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-28 04:34:51
Message-ID:	20160728043451.GA3013@fetter.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sun, Jul 17, 2016 at 02:55:20PM -0400, Jan Wieck wrote:
> On Sun, Jul 17, 2016 at 2:08 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
> > On 7/13/16 2:06 PM, Joshua D. Drake wrote:
> >> On 07/07/2016 01:01 PM, Robert Haas wrote:
> >> There was an unconference session on this topic at PGCon and quite a
> >>> number of people there stated that they found DDL to be an ease-of-use
> >>> feature and wanted to have it.
> >>
> >> Yeah, I haven't meet anyone yet that would like to have:
> >>
> >> select replicate_these_relations('['public']);
> >>
> >> vs:
> >>
> >> ALTER SCHEMA public ENABLE REPLICATION;
> >>
> >> (or something like that).
> >>
> >
> > I generally agree, but I think the more important question is "Why?". Is
> > it becouse DDL looks more like a sentence? Is it because arrays are a PITA?
> > Is it too hard to call functions?
>
> Once you get fine grained enough to support replicating different
> sets of possibly overlapping objects/namespaces to different groups
> of recipients, the DDL approach becomes just as convoluted as
> calling functions and nobody will memorize the entire syntax.

I don't see this as an actual problem. I've written parts of the
SELECT syntax, but I haven't memorized even all of that.

DDL doesn't promise to be more complicated or easier to get wrong than
function calls, as far as I can tell. The opposite could well be
true.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A Modest Upgrade Proposal
Date:	2016-07-29 02:00:16
Message-ID:	20160729020016.GB12804@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, Jul 28, 2016 at 09:22:17AM +0800, Craig Ringer wrote:
> It might, actually. One approach for online upgrade is to:
>
> * pg_basebackup the master
> * start the replica and let it catch up
> * create a logical replication slot on the master
> * replace the replication.conf on the basebackup so it stops recovery at the
> lsn of the replication slot's confirmed_flush_lsn
> * stop the replica and pg_upgrade it
> * have the upgraded replica, now a master, replay from the old master over
> logical replication
> * once caught up, switch over
>
> This means a full dump and reload with a full rebuild of all indexes, etc,
> isn't needed. All shared catalog stuff is copied (until we switch to logical
> rep for the final catch-up).

Right, using pg_upgrade as part of a logical upgrade procedure makes
sense. I was referring to having pg_upgrade --online do all of those
bullets plus what is does now --- that just seems like it would fail as
too complex, and if someone wanted to do just pg_upgrade without the
logical, the manual page would be incomprehensible.

In summary, I think we need to keep pg_upgrade doing what it does well,
and come up with another tool that does those bullets. Heck, people
already can't find --link.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +