Lists: pgsql-hackers
From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: management of large patches
Date: 2011-01-02 05:32:10
Message-ID: AANLkTim_7kohLBVXZN4aMq3bcbB_zFg-Zm=k4oBVuAkS@mail.gmail.com

We're coming to the end of the 9.1 development cycle, and I think that
there is a serious danger of insufficient bandwidth to handle the
large patches we have outstanding. For my part, I am hoping to find
the bandwidth for two, MAYBE three major commits between now and the
end of 9.1CF4, but I am not positive that I will be able to find even
that much time, and the number of major patches vying for attention is
considerably greater than that. Quick estimate:

- SQL/MED - probably needs >~3 large commits: foreign table scan, file
FDW, postgresql FDW, plus whatever else gets submitted in the next two
weeks
- MERGE
- checkpoint improvements
- SE-Linux integration
- extensions - may need 2 or more commits
- true serializability - not entirely sure of the status of this
- writeable CTEs (Tom has indicated he will look at this)
- PL/python patches (Peter has indicated he will look at this)
- snapshot taking inconsistencies (Tom has indicated he will look at this)
- per-column collation (Peter)
- synchronous replication (Simon, and, given the level of interest in
and complexity of this feature, probably others as well)

I guess my basic question is - is it realistic to think that we're
going to get all of the above done in the next 45 days? Is there
anything we can do to make the process more efficient? If a few more
large patches drop into the queue in the next two weeks, will we have
bandwidth for those as well? If we don't think we can get everything
done in the time available, what's the best way to handle that? I
would hate to discourage people from continuing to hack away, but I
think it would be even worse to give people the impression that
there's a chance of getting work reviewed and committed if there
really isn't.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: management of large patches
Date: 2011-01-02 09:29:27
Message-ID: AANLkTikqUEjg6LEAjvFTcfZ11tbjzw8-fuFemfj6sQw-@mail.gmail.com

On Sun, Jan 2, 2011 at 06:32, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> We're coming to the end of the 9.1 development cycle, and I think that
> there is a serious danger of insufficient bandwidth to handle the
> large patches we have outstanding.  For my part, I am hoping to find
> the bandwidth for two, MAYBE three major commits between now and the
> end of 9.1CF4, but I am not positive that I will be able to find even
> that much time, and the number of major patches vying for attention is
> considerably greater than that.  Quick estimate:
>
> - SQL/MED - probably needs >~3 large commits: foreign table scan, file
> FDW, postgresql FDW, plus whatever else gets submitted in the next two
> weeks
> - MERGE
> - checkpoint improvements
> - SE-Linux integration
> - extensions - may need 2 or more commits
> - true serializability - not entirely sure of the status of this
> - writeable CTEs (Tom has indicated he will look at this)
> - PL/python patches (Peter has indicated he will look at this)
> - snapshot taking inconsistencies (Tom has indicated he will look at this)
> - per-column collation (Peter)
> - synchronous replication (Simon, and, given the level of interest in
> and complexity of this feature, probably others as well)
>
> I guess my basic question is - is it realistic to think that we're
> going to get all of the above done in the next 45 days?  Is there
> anything we can do to make the process more efficient?  If a few more
> large patches drop into the queue in the next two weeks, will we have
> bandwidth for those as well?  If we don't think we can get everything
> done in the time available, what's the best way to handle that?  I

Well, we've always (well, since we had CFs) said that large patches
shouldn't be submitted for the last CF; they should be submitted for
one of the first. So if something *new* gets dumped on us for the last
one, giving priority to the existing ones in the queue seems like the
only fair option.

As for priority between those that *were* submitted earlier, and have
been reworked (which is how the system is supposed to work), it's a
lot harder. And TBH, I think we're going to have a problem getting all
those done. But the question is - are they all ready enough, or are a
couple going to need the "returned with feedback" status *regardless*
of whether this is the last CF or not?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: management of large patches
Date: 2011-01-02 12:41:58
Message-ID: AANLkTinEJQmPnm4s+JdnGNMTvUuFqt5xFr3ESNwkwtTN@mail.gmail.com

On Sun, Jan 2, 2011 at 4:29 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> As for priority between those that *were* submitted earlier, and have
> been reworked (which is how the system is supposed to work), it's a
> lot harder. And TBH, I think we're going to have a problem getting all
> those done. But the question is - are they all ready enough, or are a
> couple going to need the "returned with feedback" status *regardless*
> of whether this is the last CF or not?

Well, that all depends on how much work people are willing to put into
reviewing and committing them, which I think is what we need to
determine. None of those patches are going to be as simple as "patch
-p1 < $F && git commit -a && git push". Having done a couple of these
now, I'd say that doing final review and commit of a patch of this
scope takes me ~20 hours of work, but it obviously varies a lot based
on how good the patch is to begin with and how much review has already
been done. So I guess the question is - who is willing to step up to
the plate, either as reviewer or as final reviewer/committer?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: management of large patches
Date: 2011-01-02 12:54:18
Message-ID: 4D20757A.6080000@kaigai.gr.jp

(2011/01/02 14:32), Robert Haas wrote:
> We're coming to the end of the 9.1 development cycle, and I think that
> there is a serious danger of insufficient bandwidth to handle the
> large patches we have outstanding. For my part, I am hoping to find
> the bandwidth for two, MAYBE three major commits between now and the
> end of 9.1CF4, but I am not positive that I will be able to find even
> that much time, and the number of major patches vying for attention is
> considerably greater than that. Quick estimate:
>
:
> - SE-Linux integration

How feasible is it to commit this 3K-line patch in the remaining 45 days?

At least, the security provider idea lets us maintain the set of hooks
and the access-control decision logic independently. I can provide the
sources for this module at git.postgresql.org, so a working module can
always be obtained from there. The worst scenario for us would be no
progress at all, despite the large man-power spent on review and
discussion.

It may be more productive to keep the features to be committed in the
last CF as small as possible, such as hooks to support a subset of the
DDL permissions, or a pg_regress enhancement to run the regression
tests.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>


From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: management of large patches
Date: 2011-01-03 08:54:53
Message-ID: 4D218EDD.9020407@2ndquadrant.com

Robert Haas wrote:
> - MERGE
> - checkpoint improvements
>

As far as these two go, the state of MERGE is still rougher than I would
like. The code itself isn't too hard to read, and the fact that the
errors popping up tend to be caught by assertions (rather than just
being mysterious crashes) makes me feel a little better that there's
some defensive coding in there. It's still a 3648-line patch that
touches grammar, planner, and executor bits though, and I've been doing
mainly functional and coding-style review so far. I'm afraid there
aren't too many committers in a good position to actually consume the
whole scope of this thing for a commit-level review. And the way larger
patches tend to work here, I'd be surprised to find it passes through
such a review without some as-yet-unidentified major beef appearing.
We'll see what we can do to help move this forward more before the CF
starts.

The checkpoint changes I'm reworking are not really large from a code
complexity or size perspective--I estimate around 350 lines of diff,
with the rough version I submitted to CF2010-11 at 258. I suspect it
will actually be the least complicated patch to consume from that list,
from a committer perspective. The complexity there is mainly in the
performance testing. I've been gearing up infrastructure the last
couple weeks to automate and easily publish all the results I collect
there. The main part that hasn't gone through any serious testing yet,
auto-tuning the spread interval, will also be really easy to revert if a
problem is found there. With Simon and me both reviewing each other's
work on this already, I hope we can keep this one from clogging the
committer critical path you're worried about here.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services and Support www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: management of large patches
Date: 2011-01-03 10:28:26
Message-ID: 874o9q9t85.fsf@hi-media-techno.com

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> - extensions - may need 2 or more commits

I'm now basically done with coding; I'm writing the docs for the upgrade
patch and preparing the upgrade SQL files for pre-9.1-to-9.1 upgrades of
the contrib modules.

Doing that, I've been cleaning up or reorganising some code: I will
backport some of those changes to the main extension patch. So I expect
to send both extension.v23.patch and extension-upgrade.v1.patch this
week.

As the main extension patch has received lots of detailed review (both
user-level and code-level) from committers already, I'm not expecting
big surprises for the last commitfest. The upgrade patch design has
been discussed in detail on-list too. The dust has settled here.

Meanwhile, there's this bugfix for HEAD that I've sent:

http://archives.postgresql.org/pgsql-hackers/2011-01/msg00078.php

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support