Re: disposition of remaining patches

Lists: pgsql-hackers
From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: disposition of remaining patches
Date: 2011-02-18 22:47:42
Message-ID: AANLkTim=Ne5ECeoBzEW=8z4DJcZWTO9EVLWgzvhMC_PD@mail.gmail.com

The CommitFest application currently reflects 17 remaining patches for
CommitFest 2011-01.

1. Change pg_last_xlog_receive_location not to move backwards. We
don't have complete consensus on what to do here. If we can agree on
a way forward, I think we can finish this one up pretty quickly. It's
partially being held up by #2.
2. Synchronous replication. Splitting up this patch has allowed some
progress to be made here, but there is a lot left to do, and I fear
that trying to hash out the design issues at this late date is not
going to lead to a great final product. The proposed timeout to make
the server boot out clients that don't seem to be responding is not
worth committing, as it will only work when the server isn't
generating WAL, which can't be presumed to be the normal state of
affairs. The patch to avoid ever letting the WAL sender status go
backward from catchup to streaming was committed without discussion,
and needs to be reverted for reasons discussed on that thread. An
updated version of the main patch has yet to be posted.
3, 4, 5. SQL/MED. Tom has picked up the main FDW API patch, which I
expect means it'll go in. I am not so sure about the FDW patches,
though: in particular, based on Heikki's comments, the postgresql_fdw
patch seems to be badly in need of some more work. The file_fdw patch
may be in better shape (I'm not 100% sure), but it needs the encoding
fix patch Itagaki Takahiro recently proposed. For this to be
worthwhile, we presumably need to get at least one FDW committed along
with the API patch.
6. Writeable CTEs. Tom said he'd look at this one.
7. contrib/btree_gist KNN. Needs updating as a result of the
extensions patch. This ball is really in Teodor and Oleg's court.
8, 9, 10, 11, 12, 13, 14. PL/python patches. I believe Peter was
working on these, but I haven't seen any updates in a while.
15. Fix snapshot taking inconsistencies. Tom said he'd look at this one.
16. synchronized snapshots. Alvaro is working on this one.
17. determining client_encoding from client locale. This is Peter's
patch. Peter, are you planning to commit this?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: disposition of remaining patches
Date: 2011-02-18 23:04:06
Message-ID: 27371.1298070246@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> 3, 4, 5. SQL/MED. Tom has picked up the main FDW API patch, which I
> expect means it'll go in. I am not so sure about the FDW patches,
> though: in particular, based on Heikki's comments, the postgresql_fdw
> patch seems to be badly in need of some more work. The file_fdw patch
> may be in better shape (I'm not 100% sure), but it needs the encoding
> fix patch Itagaki Takahiro recently proposed. For this to be
> worthwhile, we presumably need to get at least one FDW committed along
> with the API patch.

FWIW, my thought is to try to get the API patch committed and then do
the file_fdw patch. Maybe I'm hopelessly ASCII-centric, but I do not
see encoding considerations as a blocking factor for this. If we define
that files are read in the database encoding, it's still a pretty damn
useful feature. We can look at whether that can be improved after we
have some kind of feature at all.
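
To make that trade-off concrete, here is a minimal Python sketch of what
"files are read in the database encoding" implies. It is not file_fdw code:
the helper name read_in_database_encoding and the sample bytes are made up
for illustration. Pure-ASCII input decodes fine, while a file written in
some other encoding may simply be rejected.

```python
def read_in_database_encoding(raw: bytes, database_encoding: str) -> str:
    """Hypothetical helper: assume the file is already in the database encoding."""
    return raw.decode(database_encoding)


# Two made-up input files: one pure ASCII, one written in Latin-1.
ascii_file = b"1,widget\n2,gadget\n"
latin1_file = "1,caf\u00e9\n".encode("latin-1")

# ASCII is a subset of UTF-8, so this file reads cleanly.
ascii_rows = read_in_database_encoding(ascii_file, "utf-8")

# The Latin-1 byte 0xE9 is not valid UTF-8 here, so decoding fails.
try:
    read_in_database_encoding(latin1_file, "utf-8")
    latin1_ok = True
except UnicodeDecodeError:
    latin1_ok = False

print(ascii_rows.splitlines())
print(latin1_ok)
```

In this model the feature is still usable for any file that already matches
the database encoding; converting other files is the part that could be
improved later.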

postgresql_fdw may have to live as an external project for the 9.1
cycle, unless it's in much better shape than you suggest above.
I won't feel too bad about that as long as the core support exists.
More than likely, people would want to improve it on a faster release
cycle than the core anyway.

regards, tom lane


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: disposition of remaining patches
Date: 2011-02-18 23:06:06
Message-ID: 4D5EFB5E.30601@agliodbs.com
Lists: pgsql-hackers

On 2/18/11 2:47 PM, Robert Haas wrote:
> The CommitFest application currently reflects 17 remaining patches for
> CommitFest 2011-01.

I'm impressed, actually. This is way further along than I expected us
to be.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: disposition of remaining patches
Date: 2011-02-18 23:07:35
Message-ID: 4D5EFBB7.6000209@agliodbs.com
Lists: pgsql-hackers

On 2/18/11 3:04 PM, Tom Lane wrote:
> postgresql_fdw may have to live as an external project for the 9.1
> cycle, unless it's in much better shape than you suggest above.
> I won't feel too bad about that as long as the core support exists.
> More than likely, people would want to improve it on a faster release
> cycle than the core anyway.

FDWs seem like perfect candidates for Extensions. We'll eventually want
postgresql_fdw in core, but most FDWs will never be there.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: disposition of remaining patches
Date: 2011-02-18 23:13:46
Message-ID: 4D5EFD2A.8080905@dunslane.net
Lists: pgsql-hackers

On 02/18/2011 05:47 PM, Robert Haas wrote:
> 3, 4, 5. SQL/MED. Tom has picked up the main FDW API patch, which I
> expect means it'll go in. I am not so sure about the FDW patches,
> though: in particular, based on Heikki's comments, the postgresql_fdw
> patch seems to be badly in need of some more work. The file_fdw patch
> may be in better shape (I'm not 100% sure), but it needs the encoding
> fix patch Itagaki Takahiro recently proposed. For this to be
> worthwhile, we presumably need to get at least one FDW committed along
> with the API patch.

I'm not sure it's not useful without, but it would be better with it. I
agree we need some actual uses.

If people want more I'm prepared to put some hurried effort into making
one just for copy to text array, since the consensus didn't seem to be
in favor of piggybacking this onto the file_fdw. That would exercise the
part of the new COPY API that would not otherwise be exercised by
file_fdw. If not, I'll eventually contribute that for 9.2.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: disposition of remaining patches
Date: 2011-02-18 23:25:02
Message-ID: 27878.1298071502@sss.pgh.pa.us
Lists: pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> On 2/18/11 3:04 PM, Tom Lane wrote:
>> postgresql_fdw may have to live as an external project for the 9.1
>> cycle, unless it's in much better shape than you suggest above.
>> I won't feel too bad about that as long as the core support exists.
>> More than likely, people would want to improve it on a faster release
>> cycle than the core anyway.

> FDWs seem like perfect candidates for Extensions. We'll eventually want
> postgresql_fdw in core, but most FDWs will never be there.

Yeah, agreed as to both points. I would imagine that we'd absorb
postgresql_fdw into core late in the 9.2 devel cycle, which would still
leave quite a few months where it could be improved on a rapid release
cycle.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: disposition of remaining patches
Date: 2011-02-19 01:20:14
Message-ID: AANLkTinAKx2AxRH3i=E0e0xTppFMzFvoT1MCH+vvmvkU@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 18, 2011 at 6:04 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> FWIW, my thought is to try to get the API patch committed and then do
> the file_fdw patch.  Maybe I'm hopelessly ASCII-centric, but I do not
> see encoding considerations as a blocking factor for this.  If we define
> that files are read in the database encoding, it's still a pretty damn
> useful feature.  We can look at whether that can be improved after we
> have some kind of feature at all.

Sure. OTOH, Itagaki Takahiro's solution wasn't a lot of code, so if
he feels reasonably confident in it, I'd like to see it committed.

> postgresql_fdw may have to live as an external project for the 9.1
> cycle, unless it's in much better shape than you suggest above.
> I won't feel too bad about that as long as the core support exists.
> More than likely, people would want to improve it on a faster release
> cycle than the core anyway.

I think as long as we have one implementation in contrib, we're OK to release.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: disposition of remaining patches
Date: 2011-02-23 17:54:02
Message-ID: AANLkTik7rxoX5cNwLRa4wer+qeDe2WHVma_AiBczOrXW@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 18, 2011 at 5:47 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> The CommitFest application currently reflects 17 remaining patches for
> CommitFest 2011-01.

Now we're down to 12. As usual, the last few patches take the longest...

> 1. Change pg_last_xlog_receive_location not to move backwards.  We
> don't have complete consensus on what to do here.  If we can agree on
> a way forward, I think we can finish this one up pretty quickly.  It's
> partially being held up by #2.

No change.

> 2. Synchronous replication.  Splitting up this patch has allowed some
> progress to be made here, but there is a lot left to do, and I fear
> that trying to hash out the design issues at this late date is not
> going to lead to a great final product.  The proposed timeout to make
> the server boot out clients that don't seem to be responding is not
> worth committing, as it will only work when the server isn't
> generating WAL, which can't be presumed to be the normal state of
> affairs.  The patch to avoid ever letting the WAL sender status go
> backward from catchup to streaming was committed without discussion,
> and needs to be reverted for reasons discussed on that thread.  An
> updated version of the main patch has yet to be posted.

This has gotten a bunch of review, on several different threads. I
assume Simon will publish an update when he gets back to his
keyboard...

> 3, 4, 5. SQL/MED.  Tom has picked up the main FDW API patch, which I
> expect means it'll go in.  I am not so sure about the FDW patches,
> though: in particular, based on Heikki's comments, the postgresql_fdw
> patch seems to be badly in need of some more work.  The file_fdw patch
> may be in better shape (I'm not 100% sure), but it needs the encoding
> fix patch Itagaki Takahiro recently proposed.  For this to be
> worthwhile, we presumably need to get at least one FDW committed along
> with the API patch.

The core and file_fdw patches are in; postgresql_fdw is being reworked
by the author.

> 6. Writeable CTEs.  Tom said he'd look at this one.
> 7. contrib/btree_gist KNN.  Needs updating as a result of the
> extensions patch.  This ball is really in Teodor and Oleg's court.

No change on these.

> 8, 9, 10, 11, 12, 13, 14.  PL/python patches.  I believe Peter was
> working on these, but I haven't seen any updates in a while.

Peter committed two of these seven, leaving five to be addressed.

> 15. Fix snapshot taking inconsistencies.  Tom said he'd look at this one.

No change on this one.

> 16. synchronized snapshots.  Alvaro is working on this one.

Lots of discussion of this one, but current status is not clear to me.
Alvaro, are you working on this actively?

> 17. determining client_encoding from client locale.  This is Peter's
> patch.  Peter, are you planning to commit this?

Peter committed this one.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: disposition of remaining patches
Date: 2011-02-23 18:05:53
Message-ID: 1298484264-sup-7818@alvh.no-ip.org
Lists: pgsql-hackers

Excerpts from Robert Haas's message of Wed Feb 23 14:54:02 -0300 2011:
> On Fri, Feb 18, 2011 at 5:47 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> > 16. synchronized snapshots.  Alvaro is working on this one.
>
> Lots of discussion of this one, but current status is not clear to me.
> Alvaro, are you working on this actively?

I am. I'm not sure if it's still reasonable to get into 9.1, given that
it needs to be rewritten almost completely from scratch.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: disposition of remaining patches
Date: 2011-02-23 18:14:04
Message-ID: AANLkTi=aqFtbFqTszfsVoQG0qquYyHrE3eY537SbaV3q@mail.gmail.com
Lists: pgsql-hackers

On Wed, Feb 23, 2011 at 1:05 PM, Alvaro Herrera
<alvherre(at)commandprompt(dot)com> wrote:
> Excerpts from Robert Haas's message of Wed Feb 23 14:54:02 -0300 2011:
>> On Fri, Feb 18, 2011 at 5:47 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>> > 16. synchronized snapshots.  Alvaro is working on this one.
>>
>> Lots of discussion of this one, but current status is not clear to me.
>>  Alvaro, are you working on this actively?
>
> I am.  I'm not sure if it's still reasonable to get into 9.1, given that
> it needs to be rewritten almost completely from scratch.

Well, if it gets punted, I won't be too sad, since the pg_dump patch
to actually use this functionality for something useful already got
pushed off. If you can commit something in a timely fashion that is
also high quality, great, but if not, I don't see it as a
show-stopper. The highest priorities IMO are writeable CTEs and
synchronous replication. I doubt that there would be majority support
for prolonging this on the basis of any other single patch, though I
might be wrong about that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: disposition of remaining patches
Date: 2011-02-23 18:34:32
Message-ID: 1298485977-sup-7411@alvh.no-ip.org
Lists: pgsql-hackers

Excerpts from Robert Haas's message of Wed Feb 23 15:14:04 -0300 2011:
> On Wed, Feb 23, 2011 at 1:05 PM, Alvaro Herrera
> <alvherre(at)commandprompt(dot)com> wrote:
> > Excerpts from Robert Haas's message of Wed Feb 23 14:54:02 -0300 2011:
> >> On Fri, Feb 18, 2011 at 5:47 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >
> >> > 16. synchronized snapshots.  Alvaro is working on this one.
> >>
> >> Lots of discussion of this one, but current status is not clear to me.
> >>  Alvaro, are you working on this actively?
> >
> > I am.  I'm not sure if it's still reasonable to get into 9.1, given that
> > it needs to be rewritten almost completely from scratch.
>
> Well, if it gets punted, I won't be too sad, since the pg_dump patch
> to actually use this functionality for something useful already got
> pushed off.

Oh, I thought that patch was committed which is why I was in a bit of a
hurry. I will mark this one "returned with feedback" too, then.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: disposition of remaining patches
Date: 2011-02-23 18:56:29
Message-ID: AANLkTinnkRhEoSNs1Ye6r4K_QPkYY0MRLjvO5A5PEBiX@mail.gmail.com
Lists: pgsql-hackers

On Wed, Feb 23, 2011 at 1:34 PM, Alvaro Herrera
<alvherre(at)commandprompt(dot)com> wrote:
> Excerpts from Robert Haas's message of Wed Feb 23 15:14:04 -0300 2011:
>> On Wed, Feb 23, 2011 at 1:05 PM, Alvaro Herrera
>> <alvherre(at)commandprompt(dot)com> wrote:
>> > Excerpts from Robert Haas's message of Wed Feb 23 14:54:02 -0300 2011:
>> >> On Fri, Feb 18, 2011 at 5:47 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> >
>> >> > 16. synchronized snapshots.  Alvaro is working on this one.
>> >>
>> >> Lots of discussion of this one, but current status is not clear to me.
>> >>  Alvaro, are you working on this actively?
>> >
>> > I am.  I'm not sure if it's still reasonable to get into 9.1, given that
>> > it needs to be rewritten almost completely from scratch.
>>
>> Well, if it gets punted, I won't be too sad, since the pg_dump patch
>> to actually use this functionality for something useful already got
>> pushed off.
>
> Oh, I thought that patch was committed which is why I was in a bit of a
> hurry.  I will mark this one "returned with feedback" too, then.

No, the directory archive format patch was committed, but the parallel
pg_dump one got pushed off.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: disposition of remaining patches
Date: 2011-02-23 19:49:29
Message-ID: 4D6564C9.2040306@2ndquadrant.com
Lists: pgsql-hackers

Robert Haas wrote:
>> 2. Synchronous replication. Splitting up this patch has allowed some
>>
> This has gotten a bunch of review, on several different threads. I
> assume Simon will publish an update when he gets back to his
> keyboard...
>

That was the idea. If anyone has any serious concerns about the current
patch, please don't hold off just because you know Simon is away for a
bit. We've been trying to keep that from impacting community progress
too badly this week.

On top of 4 listed reviewers I know Dan Farina is poking at the last
update, so we may see one more larger report on top of what's already
shown up. And Jaime keeps kicking the tires too. What Simon was hoping
is that a week of others looking at this would produce enough feedback
that it might be possible to sweep the remaining issues up soon after
he's back. It looks to me like that's about when everything else that's
still open will probably settle too.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us


From: Daniel Farina <daniel(at)heroku(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: disposition of remaining patches
Date: 2011-02-25 08:14:18
Message-ID: AANLkTiny3piBqS_FAf9gN-Ws5Ok06Y7xwoMD4iRQqL=_@mail.gmail.com
Lists: pgsql-hackers

On Wed, Feb 23, 2011 at 11:49 AM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> Robert Haas wrote:
>>>
>>> 2. Synchronous replication.  Splitting up this patch has allowed some
> On top of 4 listed reviewers I know Dan Farina is poking at the last update,
> so we may see one more larger report on top of what's already shown up.  And
> Jaime keeps kicking the tires too.  What Simon was hoping is that a week of
> others looking at this would produce enough feedback that it might be
> possible to sweep the remaining issues up soon after he's back.  It looks to
> me like that's about when everything else that's still open will probably
> settle too.

Besides some of the fixable issues, I am going to have to echo
Robert's sentiments about a few kinks that go beyond mechanism in the
syncrep patch: in particular, it will *almost* solve the use case I
was hoping to solve: a way to cleanly perform planned switchovers
between machines with minimal downtime and no lost data. But there are
a couple of holes I have thought of so far:

1. The 2-safe methodology supported is not really compatible with
performing planned-HA-switchover of a cluster with its own syncrep
guarantees on top of that. For example:

Server A syncreps to Server B

Now I want to provision server A-prime, which will eventually take the
place of A.

Server A syncreps to Server B
Server A syncreps to Server A-prime

Right now, as it stands, the syncrep patch will be happy as soon as
the data has been fsynced to either B or A-prime; I don't think we can
guarantee at any point that A-prime can become the leader, and feed B.

2. The unprivileged user can disable syncrep, in any situation. This
flexibility is *great*, but you don't really want people to do it when
one is performing the switchover. Rather, in a magical world we'd hope
that disabling syncrep would just result in not having to
synchronously commit to B (but, in this case, still synchronously
commit to A-prime).

In other words, to my mind, you can use syncrep as-is to provide
2-safe durability xor a scheduled switchover: as soon as someone wants
both, I think they'll have some trouble. I do want both, though.
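
Hole 1 can be illustrated with a small Python model. This is only a sketch
under assumptions: the Standby class, the integer WAL positions, and both
function names are invented here and are not taken from the patch. It
encodes the rule described above (a commit is acknowledged once any one
standby has flushed it) and shows that an acknowledged commit need not be
on A-prime yet.

```python
from dataclasses import dataclass


@dataclass
class Standby:
    name: str
    flushed_lsn: int = 0  # highest WAL position fsynced (hypothetical integer LSN)


def commit_acknowledged(commit_lsn, standbys):
    """'Any one standby' rule: the commit returns once some standby has flushed it."""
    return any(s.flushed_lsn >= commit_lsn for s in standbys)


def safe_to_promote(candidate, last_acknowledged_lsn):
    """A standby can take over without data loss only if it holds every acknowledged commit."""
    return candidate.flushed_lsn >= last_acknowledged_lsn


b = Standby("B", flushed_lsn=100)
a_prime = Standby("A-prime", flushed_lsn=60)  # freshly provisioned, still catching up

# The master (A) acknowledges the commit at LSN 100 because B has flushed it.
acked = commit_acknowledged(100, [b, a_prime])

# A-prime has not flushed that commit, so promoting it now would lose data.
promotable = safe_to_promote(a_prime, 100)

print(acked, promotable)
```

Under these assumptions the commit is acknowledged while A-prime is not yet
a safe promotion target, which is the gap described in point 1.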

--
fdr


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Daniel Farina <daniel(at)heroku(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: disposition of remaining patches
Date: 2011-02-25 12:43:06
Message-ID: AANLkTikO1MWWg3Qce2GoSh_W4hrjj7QB4dLfeNXud5+5@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 25, 2011 at 3:14 AM, Daniel Farina <daniel(at)heroku(dot)com> wrote:
> On Wed, Feb 23, 2011 at 11:49 AM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
>> Robert Haas wrote:
>>>>
>>>> 2. Synchronous replication.  Splitting up this patch has allowed some
>> On top of 4 listed reviewers I know Dan Farina is poking at the last update,
>> so we may see one more larger report on top of what's already shown up.  And
>> Jaime keeps kicking the tires too.  What Simon was hoping is that a week of
>> others looking at this would produce enough feedback that it might be
>> possible to sweep the remaining issues up soon after he's back.  It looks to
>> me like that's about when everything else that's still open will probably
>> settle too.
>
> Besides some of the fixable issues, I am going to have to echo
> Robert's sentiments about a few kinks that go beyond mechanism in the
> syncrep patch: in particular, it will *almost* solve the use case I
> was hoping to solve: a way to cleanly perform planned switchovers
> between machines with minimal downtime and no lost data. But there are
> a couple of holes I have thought of so far:

Well, just because the patch doesn't solve every use case isn't a
reason not to go forward with it - we can always add more options
later - but I have to admit that I'm kind of alarmed about the number
of bugs reported so far.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: marcin mank <marcin(dot)mank(at)gmail(dot)com>
To: Daniel Farina <daniel(at)heroku(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: disposition of remaining patches
Date: 2011-02-25 13:25:09
Message-ID: AANLkTinGf4-EFh+HEyiD+q1hR95VCtVE+egn-O7uq711@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 25, 2011 at 9:14 AM, Daniel Farina <daniel(at)heroku(dot)com> wrote:
>
> Right now, as it stands, the syncrep patch will be happy as soon as
> the data has been fsynced to either B or A-prime; I don't think we can
> guarantee at any point that A-prime can become the leader, and feed B.
>

- start A` up, replicating from A
- shut down B (now A and A` are synchronous)
now real quick:
- shut down A
- shut down A`
- change configuration
- start up A`
- start up B

Doesn't this work?

Greetings
Marcin


From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Daniel Farina <daniel(at)heroku(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: sync rep design architecture (was "disposition of remaining patches")
Date: 2011-02-25 16:40:27
Message-ID: 4D67DB7B.8030801@2ndquadrant.com
Lists: pgsql-hackers

Daniel Farina wrote:
> Server A syncreps to Server B
>
> Now I want to provision server A-prime, which will eventually take the
> place of A.
>
> Server A syncreps to Server B
> Server A syncreps to Server A-prime
>
> Right now, as it stands, the syncrep patch will be happy as soon as
> the data has been fsynced to either B or A-prime; I don't think we can
> guarantee at any point that A-prime can become the leader, and feed B.
>

One of the very fundamental breaks between how this patch implements
sync rep and what some people might expect is this concern. Having such
tight control over the exact order of failover isn't quite here yet, so
sometimes people will need to be creative to work within the
restrictions of what is available. The path for this case is probably:

1) Wait until A' is caught up
2) Switchover to B as the right choice to be the new master, with A' as
its standby and A going off-line at the same time.
3) Switchover the master role from B to A'. Bring up B as its standby.

There are other possible transition plans available too.
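
As a rough illustration of that two-hop path, here is a minimal Python
sketch. It only tracks which node holds which role; the Cluster class, the
switchover function, and the caught_up set are hypothetical names made up
for this example and do not correspond to any PostgreSQL command or API.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    master: str
    standbys: set = field(default_factory=set)
    offline: set = field(default_factory=set)


def switchover(cluster, new_master, caught_up):
    """Promote a standby; refuse unless it has caught up with the current master."""
    if new_master not in cluster.standbys:
        raise ValueError(f"{new_master} is not a running standby")
    if new_master not in caught_up:
        raise ValueError(f"{new_master} has not caught up yet")
    cluster.standbys.discard(new_master)
    cluster.offline.add(cluster.master)  # the old master goes off-line
    cluster.master = new_master


cluster = Cluster(master="A", standbys={"B", "A'"})

# Step 1: wait until A' is caught up (modeled as membership in a set).
caught_up = {"B", "A'"}

# Step 2: B becomes the master, A' stays on as its standby, A goes off-line.
switchover(cluster, "B", caught_up)
after_step2 = (cluster.master, sorted(cluster.standbys), sorted(cluster.offline))

# Step 3: hand the master role from B to A', then bring B back up as its standby.
switchover(cluster, "A'", caught_up={"A'"})
cluster.offline.discard("B")
cluster.standbys.add("B")
final = (cluster.master, sorted(cluster.standbys), sorted(cluster.offline))

print(after_step2)
print(final)
```

The intermediate state after step 2, with B as master, is the middle point
where a possibly inferior node is temporarily in charge.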

I appreciate that you would like to do this as an atomic operation,
rather than handling it as two steps--one of which puts you in a middle
point where B, a possibly inferior standby, is operating as the master.
There are a dozen other complicated "my use case says I want <X> and it
must be done as <Y>" requests for Sync Rep floating around here, too.
They're all getting ignored in favor of something smaller that can get
built today.

The first question I'd ask is whether you could settle for this
more-cumbersome-than-you'd-prefer switchover plan for now. The second is
whether implementing what this feature currently does would get in the
way of coding what you really want eventually.

I didn't get the Streaming Rep + Hot Standby features I wanted in 9.0
either. But committing what was reasonable to include in that version
let me march forward with very useful new code, doing another year of
development on my own projects and getting some new things fixed in
core. And so far it looks like 9.1 will sort out all of the kinks I was
unhappy about. The same sort of thing will need to happen to get Sync
Rep committed and then appropriate for more use cases. There isn't any
margin left for discussions of scope creep here; really it's "is
this subset useful for some situations and stable enough to commit" now.

> 2. The unprivileged user can disable syncrep, in any situation. This
> flexibility is *great*, but you don't really want people to do it when
> one is performing the switchover.

For the moment you may have to live with a situation where user
connections must be blocked during the brief moment of switchover to
eliminate this issue. That's what I end up doing with 9.0 production
systems to get a really clean switchover; there's a second of hiccup
even in the best case. I'm not sure yet of the best way to build a
UI to make that more transparent in the sync rep case. It's sure not a
problem that's going to get solved in this release though.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us


From: Daniel Farina <daniel(at)heroku(dot)com>
To: marcin mank <marcin(dot)mank(at)gmail(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: disposition of remaining patches
Date: 2011-02-25 19:24:31
Message-ID: AANLkTi=260oeNvn9QBPV60N8TJ9Bd79aVJWPFPjQz8kx@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 25, 2011 at 5:25 AM, marcin mank <marcin(dot)mank(at)gmail(dot)com> wrote:
> On Fri, Feb 25, 2011 at 9:14 AM, Daniel Farina <daniel(at)heroku(dot)com> wrote:
>>
>> Right now, as it stands, the syncrep patch will be happy as soon as
>> the data has been fsynced to either B or A-prime; I don't think we can
>> guarantee at any point that A-prime can become the leader, and feed B.
>>
>
> - start A` up, replicating from A
> - shut down B (now A and A` are synchronous)
> now real quick:
> - shut down A
> - shut down A`
> - change configuration
> - start up A`
> - start up B
>
> Doesn't this work?

This dance does work, but it would be very nice to not have to take
the standby ('B' in my case) offline.

--
fdr


From: Daniel Farina <daniel(at)heroku(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: disposition of remaining patches
Date: 2011-02-25 19:33:28
Message-ID: AANLkTi=JVBLoRqpy7AsKHw5RqGG6Xf0HMdox40C8KdmM@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 25, 2011 at 4:43 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Feb 25, 2011 at 3:14 AM, Daniel Farina <daniel(at)heroku(dot)com> wrote:
>> On Wed, Feb 23, 2011 at 11:49 AM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
>>> Robert Haas wrote:
>>>>>
>>>>> 2. Synchronous replication.  Splitting up this patch has allowed some
>>> On top of 4 listed reviewers I know Dan Farina is poking at the last update,
>>> so we may see one more larger report on top of what's already shown up.  And
>>> Jaime keeps kicking the tires too.  What Simon was hoping is that a week of
>>> others looking at this would produce enough feedback that it might be
>>> possible to sweep the remaining issues up soon after he's back.  It looks to
>>> me like that's about when everything else that's still open will probably
>>> settle too.
>>
>> Besides some of the fixable issues, I am going to have to echo
>> Robert's sentiments about a few kinks that go beyond mechanism in the
>> syncrep patch: in particular, it will *almost* solve the use case I
>> was hoping to solve: a way to cleanly perform planned switchovers
>> between machines with minimal downtime and no lost data. But there are
>> a couple of holes I have thought of so far:
>
> Well, just because the patch doesn't solve every use case isn't a
> reason not to go forward with it - we can always add more options
> later - but I have to admit that I'm kind of alarmed about the number
> of bugs reported so far.

True: the relevance of any use case to acceptance is up to some
debate. I haven't thought about how to remedy this; I am just thinking
aloud about a problem I would have with the patch as-is, and one that
is important to me. It is true that later accretion of options can
occur, but sometimes the initial choice of semantics can make growing
those easier or harder. I haven't yet thought ahead as to how the
current scheme would impact that.

I know I got hit by a backend synchronization (in the sense of locks,
etc.) bug; do you think it is possible yours (sending SIGSTOP) could
have the same root cause? I haven't followed all the other bugs cleared
up by inspection.

--
fdr


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Daniel Farina <daniel(at)heroku(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: disposition of remaining patches
Date: 2011-02-25 22:25:26
Message-ID: AANLkTinXH=GGkz8v663A+5zsrkN8bUsaoEE6ab7Swjoy@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 25, 2011 at 2:33 PM, Daniel Farina <daniel(at)heroku(dot)com> wrote:
> I know I got hit by a backend synchronization (in the sense of locks,
> etc.) bug; do you think it is possible yours (sending SIGSTOP) could
> have the same root cause? I haven't followed all the other bugs cleared
> up by inspection.

I believe that the queue management logic is just totally busted and
needs to be rewritten. I doubt there is much point in speculating
about details until that's done.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Daniel Farina <daniel(at)heroku(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: disposition of remaining patches
Date: 2011-02-25 23:44:39
Message-ID: 4D683EE7.4090507@agliodbs.com
Lists: pgsql-hackers


> Right now, as it stands, the syncrep patch will be happy as soon as
> the data has been fsynced to either B or A-prime; I don't think we can
> guarantee at any point that A-prime can become the leader, and feed B.

Yeah, I think that's something we said months ago is going to be a 9.2
feature, no sooner.

> 2. The unprivileged user can disable syncrep, in any situation. This
> flexibility is *great*, but you don't really want people to do it when
> one is performing the switchover. Rather, in a magical world we'd hope
> that disabling syncrep would just result in not having to
> synchronously commit to B (but, in this case, still synchronously
> commit to A-prime)
>
> In other words, to my mind, you can use syncrep as-is to provide
> 2-safe durability xor a scheduled switchover: as soon as someone wants
> both, I think they'll have some trouble. I do want both, though.

Hmmm, I don't follow this. The user can only disable syncrep for their
own transactions. If they don't care about the persistence of their
transaction post-failover, why should the DBA care?

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: Daniel Farina <daniel(at)heroku(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: disposition of remaining patches
Date: 2011-02-26 00:10:03
Message-ID: AANLkTi=0L9w2FiyR9tQ62e7UkgP01B9qEeNkC1vMADvZ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 25, 2011 at 3:44 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>
>> Right now, as it stands, the syncrep patch will be happy as soon as
>> the data has been fsynced to either B or A-prime; I don't think we can
>> guarantee at any point that A-prime can become the leader, and feed B.
>
> Yeah, I think that's something we said months ago is going to be a 9.2
> feature, no sooner.

Ah, okay, I had missed that discussion. I also did not know it got so
specific as to address this case (are you sure?) rather than something
more general, say quorum or N-safe durability.

>> 2. The unprivileged user can disable syncrep, in any situation. This
>> flexibility is *great*, but you don't really want people to do it when
>> one is performing the switchover. Rather, in a magical world we'd hope
>> that disabling syncrep would just result in not having to
>> synchronously commit to B (but, in this case, still synchronously
>> commit to A-prime)
>>
>> In other words, to my mind, you can use syncrep as-is to provide
>> 2-safe durability xor a scheduled switchover: as soon as someone wants
>> both, I think they'll have some trouble. I do want both, though.
>
> Hmmm, I don't follow this.  The user can only disable syncrep for their
> own transactions.   If they don't care about the persistence of their
> transaction post-failover, why should the DBA care?

The user may have their own level of durability guarantee they want to
attain (that's why machine "B" is syncrepped in my example), but when
doing the switchover I think an override to enable a smooth handoff
(meaning: everything syncrepped) would be best. What I want to avoid
is an ack from "COMMIT" from the primary (machine "A"), and then, post
switchover, the data isn't there on machine A-Prime (or "B", provided
it was able to follow successfully at all, as in the current case it
might get ahead of A-prime in the WAL).

--
fdr


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Daniel Farina <daniel(at)heroku(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: disposition of remaining patches
Date: 2011-02-26 00:36:12
Message-ID: 4D684AFC.1080503@agliodbs.com
Lists: pgsql-hackers

Daniel,

> Ah, okay, I had missed that discussion, I also did not know it got so
> specific as to address this case (are you sure?) rather than something
> more general, say quorum or N-safe durability.

The way we address that case is through n-safe durability.

> The user may have their own level of durability guarantee they want to
> attain (that's why machine "B" is syncrepped in my example), but when
> doing the switchover I think an override to enable a smooth handoff
> (meaning: everything syncrepped) would be best. What I want to avoid
> is an ack from "COMMIT" from the primary (machine "A"), and then, post
> switchover, the data isn't there on machine A-Prime (or "B", provided
> it was able to follow successfully at all, as in the current case it
> might get ahead of A-prime in the WAL).

Yeah, when I think about your use case, I can understand why it's an
issue. It would be nice to have a superuser setting (or similar) which
could override user preferences and make all transactions syncrep
temporarily. I'm not sure that's going to be reasonable to do for 9.1
though.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: Daniel Farina <daniel(at)heroku(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: disposition of remaining patches
Date: 2011-02-26 00:44:02
Message-ID: AANLkTinch936CcHyxee4wRz9TCHjZBjO2wXVbE3jmUg9@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 25, 2011 at 4:36 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> Daniel,
>
>> Ah, okay, I had missed that discussion, I also did not know it got so
>> specific as to address this case (are you sure?) rather than something
>> more general, say quorum or N-safe durability.
>
> The way we address that case is through n-safe durability.

How is this exposed? The simple "count the number of fsyncs()"
approach is not quite good enough (one has no control to make sure one
or more nodes are definitely up-to-date) unless one wants to just make
it go to *all* syncrep standbys for a while. That seems like overkill;
so I imagine something else is in the thoughts. I'll search the
archives...
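
To make the gap concrete, here is a minimal Python sketch. It is only an
illustration under assumed names: `count_safe`, `named_safe`, and the
standby labels are hypothetical and nothing here comes from the syncrep
patch. It contrasts a rule that counts fsync acknowledgements with one
that requires specific standbys to have acknowledged.

```python
def count_safe(acked, n):
    """Count-based rule: release the commit once any n standbys have fsynced."""
    return len(acked) >= n


def named_safe(acked, required):
    """Name-based rule: release the commit only once every standby in
    `required` has fsynced (set subset test)."""
    return required <= acked


# Hypothetical situation: standby "B" has acknowledged the commit,
# but the intended switchover target "A-prime" has not.
acked = {"B"}

print(count_safe(acked, 1))            # the count rule is satisfied by B alone
print(named_safe(acked, {"A-prime"}))  # the name rule still waits for A-prime
```

With the count rule, the commit is acknowledged even though A-prime may
be behind, which is exactly the case that hurts during a planned
switchover.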

>> The user may have their own level of durability guarantee they want to
>> attain (that's why machine "B" is syncrepped in my example), but when
>> doing the switchover I think an override to enable a smooth handoff
>> (meaning: everything syncrepped) would be best.  What I want to avoid
>> is an ack from "COMMIT" from the primary (machine "A"), and then, post
>> switchover, the data isn't there on machine A-Prime (or "B", provided
>> it was able to follow successfully at all, as in the current case it
>> might get ahead of A-prime in the WAL).
>
> Yeah, when I think about your use case, I can understand why it's an
> issue.  It would be nice to have a superuser setting (or similar) which
> could override user preferences and make all transactions syncrep
> temporarily.  I'm not sure that's going to be reasonable to do for 9.1
> though.

Agreed; I'd be happy to take any syncrep functionality, although it
wouldn't compose well as-is. I wanted to raise this so that we didn't
make any configuration decisions that got in the way of making
composition possible later. Again, I haven't thought ahead yet,
partially because I thought there may be some existing thoughts in
play to consider.

With that, I will try to give syncrep a more structured review Real
Soon, although the late date of this is leaving me queasy as to the
odds of git-commit.

--
fdr


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Daniel Farina <daniel(at)heroku(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: disposition of remaining patches
Date: 2011-02-26 00:57:03
Message-ID: 1298681823.6945.0.camel@jdavis-ux.asterdata.local
Lists: pgsql-hackers

On Fri, 2011-02-25 at 15:44 -0800, Josh Berkus wrote:
> Hmmm, I don't follow this. The user can only disable syncrep for their
> own transactions. If they don't care about the persistence of their
> transaction post-failover, why should the DBA care?

I think that's the difference between failover and switchover, right? At
least Slony makes such a distinction, as well.

Regards,
Jeff Davis


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: disposition of remaining patches
Date: 2011-02-26 01:21:50
Message-ID: 4D6855AE.4000401@agliodbs.com
Lists: pgsql-hackers

On 2/25/11 4:57 PM, Jeff Davis wrote:
> On Fri, 2011-02-25 at 15:44 -0800, Josh Berkus wrote:
>> Hmmm, I don't follow this. The user can only disable syncrep for their
>> own transactions. If they don't care about the persistence of their
>> transaction post-failover, why should the DBA care?
>
> I think that's the difference between failover and switchover, right? At
> least Slony makes such a distinction, as well.

Yeah. Actually, what would be even simpler and more to the point would
be a command that says "flush all transactions from Server A to Server
B, then fail over".
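
As a rough sketch of what such a command would have to do, assuming a toy
model in Python: the `Node` class, `controlled_switchover`, and
`ship_one_record` are all hypothetical names, and WAL positions are plain
integers rather than real xlog locations. The order is the point: stop
new commits, drain everything to the standby, and only then promote.

```python
class Node:
    """Toy server: tracks the highest WAL position it has durably flushed."""

    def __init__(self, name, flushed=0, accepting_writes=False):
        self.name = name
        self.flushed = flushed
        self.accepting_writes = accepting_writes


def ship_one_record(primary, standby):
    """Hypothetical transport: the standby flushes one more WAL record."""
    standby.flushed += 1


def controlled_switchover(primary, standby, ship):
    """Flush all of the primary's WAL to the standby, then fail over."""
    primary.accepting_writes = False           # 1. stop acknowledging new commits
    while standby.flushed < primary.flushed:   # 2. drain outstanding WAL
        ship(primary, standby)
    standby.accepting_writes = True            # 3. promote only once caught up
    return standby


a = Node("A", flushed=5, accepting_writes=True)
b = Node("B", flushed=2)
new_primary = controlled_switchover(a, b, ship_one_record)
print(new_primary.name, new_primary.flushed)
```

Because the old primary stops accepting writes before the drain starts,
no commit can be acknowledged on A and then be missing on B afterwards,
regardless of any per-transaction syncrep setting.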

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: Daniel Farina <daniel(at)heroku(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: disposition of remaining patches
Date: 2011-02-26 01:35:17
Message-ID: AANLkTi=abvH1A22mBy=qFUBLSdKqs59d6EgwfRE9n6uP@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 25, 2011 at 5:21 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 2/25/11 4:57 PM, Jeff Davis wrote:
>> On Fri, 2011-02-25 at 15:44 -0800, Josh Berkus wrote:
>>> Hmmm, I don't follow this.  The user can only disable syncrep for their
>>> own transactions.   If they don't care about the persistence of their
>>> transaction post-failover, why should the DBA care?
>>
>> I think that's the difference between failover and switchover, right? At
>> least Slony makes such a distinction, as well.
>
> Yeah.  Actually, what would be even simpler and more to the point would
> be a command that says "flush all transactions from Server A to Server
> B, then fail over".

That would be nice; I'm basically abusing syncrep for this purpose. At
the same time, someone may need to be notified of such a switchover
occurring, and in the event of failure, it'd be nice to bounce back to
the primary. Tangentially relevant: Virtual IP is not always an option,
such as on Amazon EC2.

But I digress. Such a command is unlikely to make it into 9.1; maybe
we can circle around on that in 9.2.

--
fdr


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Daniel Farina <daniel(at)heroku(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: disposition of remaining patches
Date: 2011-02-26 20:22:56
Message-ID: 4D696120.7030200@agliodbs.com
Lists: pgsql-hackers


> That would be nice; I'm basically abusing syncrep for this purpose. At
> the same time, someone may need to be notified of such a switchover
> occurring, and in the event of failure, it'd be nice to bounce back to
> the primary. Tangentially relevant: Virtual IP is not always an option,
> such as on Amazon EC2.

Well, let's comprehensively address replication in a cloud environment
for 9.2. You can start a wiki page.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: Daniel Farina <daniel(at)heroku(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: sync rep design architecture (was "disposition of remaining patches")
Date: 2011-02-27 03:22:58
Message-ID: AANLkTi=dS1-WQNyQRm--pSe8z8c=Wi0Ey3MNUL3iSTgL@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 25, 2011 at 8:40 AM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> I didn't get the Streaming Rep + Hot Standby features I wanted in 9.0 either.  But committing what was reasonable to include in that version let me march forward with very useful new code, doing another year of development on my own projects and getting some new things fixed in core.  And so far it looks like 9.1 will sort out all of the kinks I was unhappy about.  The same sort of thing will need to happen to get Sync Rep committed and then appropriate for more use cases.  There isn't any margin left for discussions of scope creep here; really it's "is this subset useful for some situations and stable enough to commit" now.

I mostly wanted to raise the issue not as a blocker, but as an attempt
to avoid boxing ourselves in for growing such a feature in 9.2. If
9.1 ships with the syncrep patch as-conceived, it'll just mean that
it'll be hard or impossible to offer syncrep to users as well as at the
"infrastructure service provider" level...which is, actually, quite
fine -- most current users likely don't want to take the performance
hit of syncrep all the time, but living with it during a switchover
is quite fine. I just wanted to make a reasonable effort to ensure
its possibility in a 9.2-like timeframe.

>> 2. The unprivileged user can disable syncrep, in any situation. This
>> flexibility is *great*, but you don't really want people to do it when
>> one is performing the switchover.
>
> For the moment you may have to live with a situation where user connections must be blocked during the brief moment of switchover to eliminate this issue.  That's what I end up doing with 9.0 production systems to get a really clean switchover; there's a second of hiccup even in the best case.  I'm not sure yet of the best way to build a UI to make that more transparent in the sync rep case.  It's sure not a problem that's going to get solved in this release though.

I'm totally okay killing all backends during the switchover between
9.1 and 9.2 releases, unless I get super clever with pgbouncer...which
I will have to do anyway.

--
fdr