Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]

Lists: pgsql-hackers
From: David Fetter <david(at)fetter(dot)org>
To: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Department of Redundancy Department: makeNode(FuncCall) division
Date: 2013-02-10 07:59:22
Message-ID: 20130210075922.GB14203@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Folks,

Per suggestions and lots of help from Andrew Gierth, please find
attached a patch to clean up the call sites for FuncCall nodes, which
I'd like to expand centrally rather than in each of the 37 (or 38, but
I only redid 37) places where it's called. The remaining one is in
src/backend/nodes/copyfuncs.c, which has to be modified for any
changes in the that struct anyhow.

The immediate purpose is two-fold: to reduce some redundancies, which
I believe is worth doing in and of itself, and to prepare for adding
FILTER on aggregates from the spec, and possibly other things in
the <aggregate function> part of the spec.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachment Content-Type Size
makeFuncArgs_001.patch text/plain 26.9 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Fetter <david(at)fetter(dot)org>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Department of Redundancy Department: makeNode(FuncCall) division
Date: 2013-02-10 15:09:19
Message-ID: 19670.1360508959@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

David Fetter <david(at)fetter(dot)org> writes:
> Per suggestions and lots of help from Andrew Gierth, please find
> attached a patch to clean up the call sites for FuncCall nodes, which
> I'd like to expand centrally rather than in each of the 37 (or 38, but
> I only redid 37) places where it's called. The remaining one is in
> src/backend/nodes/copyfuncs.c, which has to be modified for any
> changes in the that struct anyhow.

TBH, I don't think this is an improvement.

The problem with adding a new field to any struct is that you have to
run around and examine (and, usually, modify) every place that
manufactures that type of struct. With a makeFuncCall defined like
this, you'd still have to do that; it would just become a lot easier
to forget to do so.

If the subroutine were defined like most other makeXXX subroutines,
ie you have to supply *all* the fields, that argument would go away,
but the notational advantage is then dubious.

The bigger-picture point is that you're proposing to make the coding
conventions for building FuncCalls different from what they are for
any other grammar node. I don't think that's a great idea; it will
mostly foster confusion.

regards, tom lane


From: David Fetter <david(at)fetter(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Department of Redundancy Department: makeNode(FuncCall) division
Date: 2013-02-10 15:39:26
Message-ID: 20130210153926.GE14203@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Feb 10, 2013 at 10:09:19AM -0500, Tom Lane wrote:
> David Fetter <david(at)fetter(dot)org> writes:
> > Per suggestions and lots of help from Andrew Gierth, please find
> > attached a patch to clean up the call sites for FuncCall nodes, which
> > I'd like to expand centrally rather than in each of the 37 (or 38, but
> > I only redid 37) places where it's called. The remaining one is in
> > src/backend/nodes/copyfuncs.c, which has to be modified for any
> > changes in the that struct anyhow.
>
> TBH, I don't think this is an improvement.
>
> The problem with adding a new field to any struct is that you have to
> run around and examine (and, usually, modify) every place that
> manufactures that type of struct. With a makeFuncCall defined like
> this, you'd still have to do that; it would just become a lot easier
> to forget to do so.

So you're saying that I've insufficiently put the choke point there?

It seems to me that needing to go back to each and every jot and
tittle of the code to add a new default parameter in each place is a
recipe for mishaps, where in this one, the new correct defaults will
just appear in all the places they need to.

> If the subroutine were defined like most other makeXXX subroutines,
> ie you have to supply *all* the fields, that argument would go away,
> but the notational advantage is then dubious.
>
> The bigger-picture point is that you're proposing to make the coding
> conventions for building FuncCalls different from what they are for
> any other grammar node. I don't think that's a great idea; it will
> mostly foster confusion.

I find this a good bit less confusing than the litany of repeated
parameter settings, the vast majority of which in most cases are along
the lines of NULL/NIL/etc.

This way, anything set that's not the default stands out.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: David Fetter <david(at)fetter(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Department of Redundancy Department: makeNode(FuncCall) division
Date: 2013-02-11 18:48:38
Message-ID: 20130211184838.GA26392@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Feb 10, 2013 at 10:09:19AM -0500, Tom Lane wrote:
> David Fetter <david(at)fetter(dot)org> writes:
> > Per suggestions and lots of help from Andrew Gierth, please find
> > attached a patch to clean up the call sites for FuncCall nodes, which
> > I'd like to expand centrally rather than in each of the 37 (or 38, but
> > I only redid 37) places where it's called. The remaining one is in
> > src/backend/nodes/copyfuncs.c, which has to be modified for any
> > changes in the that struct anyhow.
>
> TBH, I don't think this is an improvement.
>
> The problem with adding a new field to any struct is that you have to
> run around and examine (and, usually, modify) every place that
> manufactures that type of struct. With a makeFuncCall defined like
> this, you'd still have to do that; it would just become a lot easier
> to forget to do so.
>
> If the subroutine were defined like most other makeXXX subroutines,
> ie you have to supply *all* the fields, that argument would go away,
> but the notational advantage is then dubious.
>
> The bigger-picture point is that you're proposing to make the coding
> conventions for building FuncCalls different from what they are for
> any other grammar node. I don't think that's a great idea; it will
> mostly foster confusion.

The major difference between FuncCalls and others is that `while most
raw-parsetree nodes are constructed only in their own syntax
productions, FuncCall is constructed in many places unrelated to
actual function call syntax.

This really will make things a good bit easier on our successors.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: David Fetter <david(at)fetter(dot)org>
To: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-02-13 14:45:31
Message-ID: 20130213144531.GH22576@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Feb 09, 2013 at 11:59:22PM -0800, David Fetter wrote:
> Folks,
>
> Per suggestions and lots of help from Andrew Gierth, please find
> attached a patch to clean up the call sites for FuncCall nodes, which
> I'd like to expand centrally rather than in each of the 37 (or 38, but
> I only redid 37) places where it's called. The remaining one is in
> src/backend/nodes/copyfuncs.c, which has to be modified for any
> changes in the that struct anyhow.
>
> The immediate purpose is two-fold: to reduce some redundancies, which
> I believe is worth doing in and of itself, and to prepare for adding
> FILTER on aggregates from the spec, and possibly other things in
> the <aggregate function> part of the spec.
>
> Cheers,
> David.

Folks,

Please find attached two versions of a patch which provides optional
FILTER clause for aggregates (T612, "Advanced OLAP operations").

The first is intended to be applied on top of the previous patch, the
second without it. The first is, I believe, clearer in what it's
doing. Rather than simply mechanically visiting every place a
function call might be constructed, it visits a central one to change
the default, then goes only to the places where it's relevant.

The patches are both early WIP as they contain no docs or regression
tests yet.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachment Content-Type Size
filter_mfa_001.diff text/plain 30.1 KB
filter_no_mfa_001.diff text/plain 33.1 KB

From: David Fetter <david(at)fetter(dot)org>
To: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-02-14 06:55:18
Message-ID: 20130214065518.GA23503@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 13, 2013 at 06:45:31AM -0800, David Fetter wrote:
> On Sat, Feb 09, 2013 at 11:59:22PM -0800, David Fetter wrote:
> > Folks,
> >
> > Per suggestions and lots of help from Andrew Gierth, please find
> > attached a patch to clean up the call sites for FuncCall nodes, which
> > I'd like to expand centrally rather than in each of the 37 (or 38, but
> > I only redid 37) places where it's called. The remaining one is in
> > src/backend/nodes/copyfuncs.c, which has to be modified for any
> > changes in the that struct anyhow.
> >
> > The immediate purpose is two-fold: to reduce some redundancies, which
> > I believe is worth doing in and of itself, and to prepare for adding
> > FILTER on aggregates from the spec, and possibly other things in
> > the <aggregate function> part of the spec.
> >
> > Cheers,
> > David.
>
> Folks,
>
> Please find attached two versions of a patch which provides optional
> FILTER clause for aggregates (T612, "Advanced OLAP operations").
>
> The first is intended to be applied on top of the previous patch, the
> second without it.

I'll find a brown paper back to wear over my head at some point, but
meanwhile, here's a cleaned-up version of the patch that doesn't use
makeFuncArgs, now without merge artifacts and with the ability to
actually compile. It's still WIP in the sense previously mentioned.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachment Content-Type Size
filter_no_mfa_002.diff text/plain 31.6 KB

From: David Fetter <david(at)fetter(dot)org>
To: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-02-26 21:09:30
Message-ID: 20130226210930.GA24791@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 13, 2013 at 06:45:31AM -0800, David Fetter wrote:
> On Sat, Feb 09, 2013 at 11:59:22PM -0800, David Fetter wrote:
> > Folks,
> >
> > Per suggestions and lots of help from Andrew Gierth, please find
> > attached a patch to clean up the call sites for FuncCall nodes, which
> > I'd like to expand centrally rather than in each of the 37 (or 38, but
> > I only redid 37) places where it's called. The remaining one is in
> > src/backend/nodes/copyfuncs.c, which has to be modified for any
> > changes in the that struct anyhow.
> >
> > The immediate purpose is two-fold: to reduce some redundancies, which
> > I believe is worth doing in and of itself, and to prepare for adding
> > FILTER on aggregates from the spec, and possibly other things in
> > the <aggregate function> part of the spec.
> >
> > Cheers,
> > David.
>
> Folks,
>
> Please find attached two versions of a patch which provides optional
> FILTER clause for aggregates (T612, "Advanced OLAP operations").
>
> The first is intended to be applied on top of the previous patch, the
> second without it. The first is, I believe, clearer in what it's
> doing. Rather than simply mechanically visiting every place a
> function call might be constructed, it visits a central one to change
> the default, then goes only to the places where it's relevant.
>
> The patches are both early WIP as they contain no docs or regression
> tests yet.

Docs and regression tests added, makeFuncArgs approached dropped for
now, will re-visit later.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachment Content-Type Size
filter_no_mfa_003.diff text/plain 44.8 KB

From: David Fetter <david(at)fetter(dot)org>
To: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-04-28 20:29:41
Message-ID: 20130428202941.GC32284@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 26, 2013 at 01:09:30PM -0800, David Fetter wrote:
> On Wed, Feb 13, 2013 at 06:45:31AM -0800, David Fetter wrote:
> > On Sat, Feb 09, 2013 at 11:59:22PM -0800, David Fetter wrote:
> > > Folks,
> > >
> > > Per suggestions and lots of help from Andrew Gierth, please find
> > > attached a patch to clean up the call sites for FuncCall nodes, which
> > > I'd like to expand centrally rather than in each of the 37 (or 38, but
> > > I only redid 37) places where it's called. The remaining one is in
> > > src/backend/nodes/copyfuncs.c, which has to be modified for any
> > > changes in the that struct anyhow.
> > >
> > > The immediate purpose is two-fold: to reduce some redundancies, which
> > > I believe is worth doing in and of itself, and to prepare for adding
> > > FILTER on aggregates from the spec, and possibly other things in
> > > the <aggregate function> part of the spec.
> > >
> > > Cheers,
> > > David.
> >
> > Folks,
> >
> > Please find attached two versions of a patch which provides optional
> > FILTER clause for aggregates (T612, "Advanced OLAP operations").
> >
> > The first is intended to be applied on top of the previous patch, the
> > second without it. The first is, I believe, clearer in what it's
> > doing. Rather than simply mechanically visiting every place a
> > function call might be constructed, it visits a central one to change
> > the default, then goes only to the places where it's relevant.
> >
> > The patches are both early WIP as they contain no docs or regression
> > tests yet.
>
> Docs and regression tests added, makeFuncArgs approached dropped for
> now, will re-visit later.

Regression tests added to reflect bug fixes in COLLATE.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachment Content-Type Size
filter_no_mfa_004.diff text/plain 46.8 KB

From: David Fetter <david(at)fetter(dot)org>
To: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-17 05:36:55
Message-ID: 20130617053655.GB19441@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Apr 28, 2013 at 01:29:41PM -0700, David Fetter wrote:
> On Tue, Feb 26, 2013 at 01:09:30PM -0800, David Fetter wrote:
> > On Wed, Feb 13, 2013 at 06:45:31AM -0800, David Fetter wrote:
> > > On Sat, Feb 09, 2013 at 11:59:22PM -0800, David Fetter wrote:
> > > > Folks,
> > > >
> > > > Per suggestions and lots of help from Andrew Gierth, please find
> > > > attached a patch to clean up the call sites for FuncCall nodes, which
> > > > I'd like to expand centrally rather than in each of the 37 (or 38, but
> > > > I only redid 37) places where it's called. The remaining one is in
> > > > src/backend/nodes/copyfuncs.c, which has to be modified for any
> > > > changes in the that struct anyhow.
> > > >
> > > > The immediate purpose is two-fold: to reduce some redundancies, which
> > > > I believe is worth doing in and of itself, and to prepare for adding
> > > > FILTER on aggregates from the spec, and possibly other things in
> > > > the <aggregate function> part of the spec.
> > > >
> > > > Cheers,
> > > > David.
> > >
> > > Folks,
> > >
> > > Please find attached two versions of a patch which provides optional
> > > FILTER clause for aggregates (T612, "Advanced OLAP operations").
> > >
> > > The first is intended to be applied on top of the previous patch, the
> > > second without it. The first is, I believe, clearer in what it's
> > > doing. Rather than simply mechanically visiting every place a
> > > function call might be constructed, it visits a central one to change
> > > the default, then goes only to the places where it's relevant.
> > >
> > > The patches are both early WIP as they contain no docs or regression
> > > tests yet.
> >
> > Docs and regression tests added, makeFuncArgs approached dropped for
> > now, will re-visit later.
>
> Regression tests added to reflect bug fixes in COLLATE.

Rebased vs. master.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachment Content-Type Size
filter_005.diff text/plain 46.8 KB

From: David Fetter <david(at)fetter(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Department of Redundancy Department: makeNode(FuncCall) division
Date: 2013-06-17 05:43:04
Message-ID: 20130617054304.GC19441@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 11, 2013 at 10:48:38AM -0800, David Fetter wrote:
> On Sun, Feb 10, 2013 at 10:09:19AM -0500, Tom Lane wrote:
> > David Fetter <david(at)fetter(dot)org> writes:
> > > Per suggestions and lots of help from Andrew Gierth, please find
> > > attached a patch to clean up the call sites for FuncCall nodes, which
> > > I'd like to expand centrally rather than in each of the 37 (or 38, but
> > > I only redid 37) places where it's called. The remaining one is in
> > > src/backend/nodes/copyfuncs.c, which has to be modified for any
> > > changes in the that struct anyhow.
> >
> > TBH, I don't think this is an improvement.
> >
> > The problem with adding a new field to any struct is that you have to
> > run around and examine (and, usually, modify) every place that
> > manufactures that type of struct. With a makeFuncCall defined like
> > this, you'd still have to do that; it would just become a lot easier
> > to forget to do so.
> >
> > If the subroutine were defined like most other makeXXX subroutines,
> > ie you have to supply *all* the fields, that argument would go away,
> > but the notational advantage is then dubious.
> >
> > The bigger-picture point is that you're proposing to make the coding
> > conventions for building FuncCalls different from what they are for
> > any other grammar node. I don't think that's a great idea; it will
> > mostly foster confusion.
>
> The major difference between FuncCalls and others is that `while most
> raw-parsetree nodes are constructed only in their own syntax
> productions, FuncCall is constructed in many places unrelated to
> actual function call syntax.
>
> This really will make things a good bit easier on our successors.

Here's a rebased patch with comments illustrating how best to employ.

In my previous message, I characterized the difference between
FuncCalls and other raw-parsetree nodes. Is there some flaw in that
logic? If there isn't, is there some reason not to treat them in a
less redundant, more uniform manner as this patch does?

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachment Content-Type Size
makeFuncArgs_002.diff text/plain 27.6 KB

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: David Fetter <david(at)fetter(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Department of Redundancy Department: makeNode(FuncCall) division
Date: 2013-06-17 12:42:25
Message-ID: 20130617124225.GF23363@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* David Fetter (david(at)fetter(dot)org) wrote:
> On Mon, Feb 11, 2013 at 10:48:38AM -0800, David Fetter wrote:
> > On Sun, Feb 10, 2013 at 10:09:19AM -0500, Tom Lane wrote:
> > > David Fetter <david(at)fetter(dot)org> writes:
> > > > Per suggestions and lots of help from Andrew Gierth, please find
> > > > attached a patch to clean up the call sites for FuncCall nodes, which
> > > > I'd like to expand centrally rather than in each of the 37 (or 38, but
> > > > I only redid 37) places where it's called. The remaining one is in
> > > > src/backend/nodes/copyfuncs.c, which has to be modified for any
> > > > changes in the that struct anyhow.
> > >
> > > TBH, I don't think this is an improvement.
> > >
> > > The problem with adding a new field to any struct is that you have to
> > > run around and examine (and, usually, modify) every place that
> > > manufactures that type of struct. With a makeFuncCall defined like
> > > this, you'd still have to do that; it would just become a lot easier
> > > to forget to do so.

I don't really see how finding all callers of makeFuncCall is
particularly harder than finding the callers of makeNode(Func). If
there were cases where we still wanted to use makeNode(Func), perhaps
that would be annoying since you'd have to look for both- but, iiuc,
this patch changes all of the callers to use makeFuncCall and it seems
reasonable for all callers to do so in the future as well.

It looks to me like the advantage of this patch is exactly that you
*don't* have to run around and modify things which are completely
unrelated to the new feature being added (eg: FILTER). Add the new
field, set up the default/no-op case in makeFuncCall() and then only
change those places that actually need to worry about your new field.

> > > If the subroutine were defined like most other makeXXX subroutines,
> > > ie you have to supply *all* the fields, that argument would go away,
> > > but the notational advantage is then dubious.

Having to supply all the fields certainly wouldn't make things any
better. Providing the base set of fields which are required to be set
for any FuncCall node does make sense though, which is what the patch
does. The rest of the fields are all special cases for which a default
value works perfectly fine (when the field isn't involved in the
specific case being handled).

> > > The bigger-picture point is that you're proposing to make the coding
> > > conventions for building FuncCalls different from what they are for
> > > any other grammar node. I don't think that's a great idea; it will
> > > mostly foster confusion.
> >
> > The major difference between FuncCalls and others is that `while most
> > raw-parsetree nodes are constructed only in their own syntax
> > productions, FuncCall is constructed in many places unrelated to
> > actual function call syntax.

Yeah, FuncCall's are already rather special and they're built all over
the place.

That's my 2c on it anyhow. I don't see it as some kind of major
milestone but it looks like improvement to me and likely to make things
a bit easier on patch authors and reviewers who otherwise have to ponder
a bunch of repeated 'x->q = false;' statements in areas which are
completely unrelated to the new feature itself.

Thanks,

Stephen


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-20 19:59:27
Message-ID: CAEZATCWLqg8CYTuGOrzjKNzw6fDWfAPf21iLUjwzyekehMWJ8Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 17 June 2013 06:36, David Fetter <david(at)fetter(dot)org> wrote:
>> > > Please find attached two versions of a patch which provides optional
>> > > FILTER clause for aggregates (T612, "Advanced OLAP operations").
>> > >
>> > > The first is intended to be applied on top of the previous patch, the
>> > > second without it. The first is, I believe, clearer in what it's
>> > > doing. Rather than simply mechanically visiting every place a
>> > > function call might be constructed, it visits a central one to change
>> > > the default, then goes only to the places where it's relevant.
>> > >
>> > > The patches are both early WIP as they contain no docs or regression
>> > > tests yet.
>> >
>> > Docs and regression tests added, makeFuncArgs approached dropped for
>> > now, will re-visit later.
>>
>> Regression tests added to reflect bug fixes in COLLATE.
>
> Rebased vs. master.
>

Hi,

I've been looking at this patch, which adds support for the SQL
standard feature of applying a filter to rows used in an aggregate.
The syntax matches the spec:

agg_fn ( <args> [ ORDER BY <sort_clause> ] ) [ FILTER ( WHERE <qual> ) ]

and this patch makes FILTER a new reserved keyword, usable as a
function or type, but not in other contexts, e.g., as a table name or
alias.

I'm not sure if that might be a problem for some people, but I can't
see any way round it, since otherwise there would be no way for the
parser to distinguish a filter clause from an alias expression.

The feature appears to work per-spec, and the code and doc changes
look reasonable. However, it looks a little light on regression tests,
and so I think some necessary changes have been overlooked.

In my testing of sub-queries in the FILTER clause (an extension to the
spec), I was able to produce the following error:

CREATE TABLE t1(a1 int);
CREATE TABLE t2(a2 int);
INSERT INTO t1 SELECT * FROM generate_series(1,10);
INSERT INTO t2 SELECT * FROM generate_series(3,6);

SELECT array_agg(a1) FILTER (WHERE a1 IN (SELECT a2 FROM t2)) FROM t1;

ERROR: plan should not reference subplan's variable

which looks to be related to subselect.c's handling of sub-queries in
aggregates.

Regards,
Dean


From: David Fetter <david(at)fetter(dot)org>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-21 04:01:08
Message-ID: 20130621040108.GA5395@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jun 20, 2013 at 08:59:27PM +0100, Dean Rasheed wrote:
> On 17 June 2013 06:36, David Fetter <david(at)fetter(dot)org> wrote:
> >> > > Please find attached two versions of a patch which provides optional
> >> > > FILTER clause for aggregates (T612, "Advanced OLAP operations").
> >> > >
> >> > > The first is intended to be applied on top of the previous patch, the
> >> > > second without it. The first is, I believe, clearer in what it's
> >> > > doing. Rather than simply mechanically visiting every place a
> >> > > function call might be constructed, it visits a central one to change
> >> > > the default, then goes only to the places where it's relevant.
> >> > >
> >> > > The patches are both early WIP as they contain no docs or regression
> >> > > tests yet.
> >> >
> >> > Docs and regression tests added, makeFuncArgs approached dropped for
> >> > now, will re-visit later.
> >>
> >> Regression tests added to reflect bug fixes in COLLATE.
> >
> > Rebased vs. master.
>
> Hi,
>
> I've been looking at this patch, which adds support for the SQL
> standard feature of applying a filter to rows used in an aggregate.
> The syntax matches the spec:
>
> agg_fn ( <args> [ ORDER BY <sort_clause> ] ) [ FILTER ( WHERE <qual> ) ]
>
> and this patch makes FILTER a new reserved keyword, usable as a
> function or type, but not in other contexts, e.g., as a table name or
> alias.
>
> I'm not sure if that might be a problem for some people, but I can't
> see any way round it, since otherwise there would be no way for the
> parser to distinguish a filter clause from an alias expression.
>
> The feature appears to work per-spec, and the code and doc changes
> look reasonable. However, it looks a little light on regression tests,

What tests do you think should be there that aren't?

> and so I think some necessary changes have been overlooked.
>
> In my testing of sub-queries in the FILTER clause (an extension to the
> spec), I was able to produce the following error:

Per the spec,

B) A <filter clause> shall not contain a <query expression>, a <window function>, or an outer reference.

I'd be happy to see about adding one despite this, though. That they
didn't figure out how doesn't mean we shouldn't eventually, ideally in
time for 9.4.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-21 04:10:25
Message-ID: 20130621041025.GG4724@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

David Fetter escribió:
> On Thu, Jun 20, 2013 at 08:59:27PM +0100, Dean Rasheed wrote:

> > In my testing of sub-queries in the FILTER clause (an extension to the
> > spec), I was able to produce the following error:
>
> Per the spec,
>
> B) A <filter clause> shall not contain a <query expression>, a <window function>, or an outer reference.

If this is not allowed, I think there should be a clearer error message.
What Dean showed is an internal (not user-facing) error message.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: David Fetter <david(at)fetter(dot)org>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-21 04:16:12
Message-ID: 20130621041612.GB5395@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jun 21, 2013 at 12:10:25AM -0400, Alvaro Herrera wrote:
> David Fetter escribió:
> > On Thu, Jun 20, 2013 at 08:59:27PM +0100, Dean Rasheed wrote:
>
> > > In my testing of sub-queries in the FILTER clause (an extension
> > > to the spec), I was able to produce the following error:
> >
> > Per the spec,
> >
> > B) A <filter clause> shall not contain a <query expression>, a
> > <window function>, or an outer reference.
>
> If this is not allowed, I think there should be a clearer error
> message. What Dean showed is an internal (not user-facing) error
> message.

Folding to lower isn't allowed by the spec either, and then there's
the matter of arrays...

Before going to lots of trouble to figure out how to throw an error
that says, "only the spec and no further," I'd like to investigate how
to make this work.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: David Fetter <david(at)fetter(dot)org>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-21 05:16:38
Message-ID: 20130621051638.GE5395@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jun 21, 2013 at 12:10:25AM -0400, Alvaro Herrera wrote:
> David Fetter escribió:
> > On Thu, Jun 20, 2013 at 08:59:27PM +0100, Dean Rasheed wrote:
>
> > > In my testing of sub-queries in the FILTER clause (an extension to the
> > > spec), I was able to produce the following error:
> >
> > Per the spec,
> >
> > B) A <filter clause> shall not contain a <query expression>, a <window function>, or an outer reference.
>
> If this is not allowed, I think there should be a clearer error message.
> What Dean showed is an internal (not user-facing) error message.

Please find attached a patch which allows subqueries in the FILTER
clause and adds regression testing for same.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachment Content-Type Size
filter_006.diff text/plain 41.3 KB

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-21 08:57:46
Message-ID: CAEZATCUh9C9zDH7z_WKfKfnmb7cC7H9ZeoqbJD8v1GoKwq8SUQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 21 June 2013 05:01, David Fetter <david(at)fetter(dot)org> wrote:
> What tests do you think should be there that aren't?
>

I think I expected to see more tests related to some of the specific
code changes, such as

CREATE TABLE t AS SELECT * FROM generate_series(1,10) t(x);

-- Should fail (filter can't be used for non-aggregates)
SELECT abs(x) FILTER (WHERE x > 5) FROM t;

-- Should fail (filter clause can't contain aggregates)
SELECT array_agg(x) FILTER (WHERE x > AVG(x)) t;

-- OK (aggregate in filter sub-query)
SELECT array_agg(x) FILTER (WHERE x > (SELECT AVG(x) FROM t)) FROM t;

-- OK (trivial sub-query)
SELECT array_agg(x) FILTER (WHERE (SELECT x > 5)) FROM t;

-- OK
SELECT array_agg(x) FILTER (WHERE (SELECT x > (SELECT AVG(x) FROM t))) FROM t;

-- OK
SELECT array_agg(x) FILTER (WHERE (SELECT (SELECT t.x > AVG(t2.x) FROM
t as t2))) FROM t;

-- Should fail
SELECT array_agg(x) FILTER (WHERE (SELECT (SELECT 5 > AVG(t.x) FROM t
as t2))) FROM t;

-- OK (filtered aggregate in window context)
SELECT x, AVG(x) OVER(ORDER BY x), AVG(x) FILTER (WHERE x > 5)
OVER(ORDER BY x) FROM t;

-- Should fail (non-aggregate window function with filter)
SELECT x, rank() OVER(ORDER BY x), rank() FILTER (WHERE x > 5)
OVER(ORDER BY x) FROM t;

-- View definition test
CREATE VIEW v AS SELECT array_agg(x) FILTER (WHERE (SELECT (SELECT t.x
> AVG(t2.x) FROM t as t2))) FROM t;
\d+ v

etc...

You could probably dream up better examples --- I just felt that the
current tests don't give much coverage of the code changes.

Regards,
Dean


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-21 09:02:28
Message-ID: CAEZATCVx27ZXqWe-xX_yTqKbvOQMQpftFMCJKdON3Qd2-1quhQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 21 June 2013 06:16, David Fetter <david(at)fetter(dot)org> wrote:
> On Fri, Jun 21, 2013 at 12:10:25AM -0400, Alvaro Herrera wrote:
>> David Fetter escribió:
>> > On Thu, Jun 20, 2013 at 08:59:27PM +0100, Dean Rasheed wrote:
>>
>> > > In my testing of sub-queries in the FILTER clause (an extension to the
>> > > spec), I was able to produce the following error:
>> >
>> > Per the spec,
>> >
>> > B) A <filter clause> shall not contain a <query expression>, a <window function>, or an outer reference.
>>
>> If this is not allowed, I think there should be a clearer error message.
>> What Dean showed is an internal (not user-facing) error message.
>
> Please find attached a patch which allows subqueries in the FILTER
> clause and adds regression testing for same.
>

Nice!

I should have pointed out that it was already working for some
sub-queries, just not all. It's good to see that it was basically just
a one-line fix, because I think it would have been a shame to not
support sub-queries.

I hope to take another look at it over the weekend.

Regards,
Dean


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-23 13:44:32
Message-ID: CAEZATCUceqFGQH_igF_PnkfyKY7_NiuW7UROT8qtXZyeqmOMCg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 21 June 2013 10:02, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> On 21 June 2013 06:16, David Fetter <david(at)fetter(dot)org> wrote:
>> On Fri, Jun 21, 2013 at 12:10:25AM -0400, Alvaro Herrera wrote:
>>> David Fetter escribió:
>>> > On Thu, Jun 20, 2013 at 08:59:27PM +0100, Dean Rasheed wrote:
>>>
>>> > > In my testing of sub-queries in the FILTER clause (an extension to the
>>> > > spec), I was able to produce the following error:
>>> >
>>> > Per the spec,
>>> >
>>> > B) A <filter clause> shall not contain a <query expression>, a <window function>, or an outer reference.
>>>
>>> If this is not allowed, I think there should be a clearer error message.
>>> What Dean showed is an internal (not user-facing) error message.
>>
>> Please find attached a patch which allows subqueries in the FILTER
>> clause and adds regression testing for same.
>>
>
> Nice!
>
> I should have pointed out that it was already working for some
> sub-queries, just not all. It's good to see that it was basically just
> a one-line fix, because I think it would have been a shame to not
> support sub-queries.
>
> I hope to take another look at it over the weekend.
>

I'm still not happy that this patch is making FILTER a new reserved
keyword, because I think it is a common enough English word (and an
obscure enough SQL keyword) that people may well have used it for
table names or aliases, and so their code will break.

There is another thread discussing a similar issue with lead/lag ignore nulls:
http://www.postgresql.org/message-id/CAOdE5WQnfR737OkxNXuLWnwpL7=OUssC-63ijP2=2RnRTw8emQ@mail.gmail.com

Playing around with the idea proposed there, it seems that it is
possible to make FILTER (and also OVER) unreserved keywords, by
splitting out the production of aggregate/window functions from normal
functions in gram.y. Aggregate/window functions are not allowed in
index_elem or func_table constructs, but they may appear in c_expr's.
That resolves the shift/reduce conflicts that would otherwise occur
from subsequent alias clauses, allowing FILTER and OVER to be
unreserved.

There is a drawback --- certain error messages become less specific,
for example: "SELECT * FROM rank() OVER(ORDER BY random());" is now a
syntax error, rather than the more useful error saying that window
functions aren't allowed in the FROM clause. That doesn't seem like
such a common case though.

What do you think?

Regards,
Dean

Attachment Content-Type Size
make-filter-unreserved.patch application/octet-stream 16.8 KB

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-23 14:44:26
Message-ID: 1371998666.72405.YahooMailNeo@web162903.mail.bf1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:

> I'm still not happy that this patch is making FILTER a new reserved
> keyword, because I think it is a common enough English word (and an
> obscure enough SQL keyword) that people may well have used it for
> table names or aliases, and so their code will break.

Well, if it is a useful capability with a standard syntax, I think
we should go with the standard syntax even if it might break
application code that uses, as unquoted identifiers, words reserved
by the SQL standard.  Of course, non-reserved is better than
reserved if we can manage it.

> Playing around with the idea proposed there, it seems that it is
> possible to make FILTER (and also OVER) unreserved keywords, by
> splitting out the production of aggregate/window functions from normal
> functions in gram.y. Aggregate/window functions are not allowed in
> index_elem or func_table constructs, but they may appear in c_expr's.
> That resolves the shift/reduce conflicts that would otherwise occur
> from subsequent alias clauses, allowing FILTER and OVER to be
> unreserved.
>
> There is a drawback --- certain error messages become less specific,
> for example: "SELECT * FROM rank() OVER(ORDER BY random());" is now a
> syntax error, rather than the more useful error saying that window
> functions aren't allowed in the FROM clause. That doesn't seem like
> such a common case though.
>
> What do you think?

I think it is OK if that gets a syntax error.  If that's the "worst
case" I like this approach.

This also helps keep down the size of the generated parse tables,
doesn't it?

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: David Fetter <david(at)fetter(dot)org>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-23 21:55:11
Message-ID: 20130623215511.GC13568@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Jun 23, 2013 at 07:44:26AM -0700, Kevin Grittner wrote:
> Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>
> > I'm still not happy that this patch is making FILTER a new reserved
> > keyword, because I think it is a common enough English word (and an
> > obscure enough SQL keyword) that people may well have used it for
> > table names or aliases, and so their code will break.
>
> Well, if it is a useful capability with a standard syntax, I think
> we should go with the standard syntax even if it might break
> application code that uses, as unquoted identifiers, words reserved
> by the SQL standard.  Of course, non-reserved is better than
> reserved if we can manage it.

I'm not entirely sure that I agree with this. The SQL standard
doesn't go adding keywords willy-nilly, or at least hasn't over a
fairly long stretch of time. I can get precise numbers on this if
needed. So far, only Tom and Greg have weighed in, Greg's response
being here:

http://www.postgresql.org/message-id/CAM-w4HOd3N_ozMygs==Lm5+hU8yQKKaYutGjiNp6z2hAzDrSTA@mail.gmail.com

> > Playing around with the idea proposed there, it seems that it is
> > possible to make FILTER (and also OVER) unreserved keywords, by
> > splitting out the production of aggregate/window functions from normal
> > functions in gram.y. Aggregate/window functions are not allowed in
> > index_elem or func_table constructs, but they may appear in c_expr's.
> > That resolves the shift/reduce conflicts that would otherwise occur
> > from subsequent alias clauses, allowing FILTER and OVER to be
> > unreserved.
> >
> > There is a drawback --- certain error messages become less specific,
> > for example: "SELECT * FROM rank() OVER(ORDER BY random());" is now a
> > syntax error, rather than the more useful error saying that window
> > functions aren't allowed in the FROM clause. That doesn't seem like
> > such a common case though.
> >
> > What do you think?
>
> I think it is OK if that gets a syntax error.  If that's the "worst
> case" I like this approach.

I think reducing the usefulness of error messages is something we need
to think extremely hard about before we do. Is there maybe a way to
keep the error messages even if by some magic we manage to unreserve
the words?

> This also helps keep down the size of the generated parse tables,
> doesn't it?

Could well. I suspect we may need to rethink the whole way we do
grammar at some point, but that's for a later discussion when I (or
someone else) has something choate to say about it.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Fetter <david(at)fetter(dot)org>
Cc: Kevin Grittner <kgrittn(at)ymail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-24 02:50:13
Message-ID: 15895.1372042213@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

David Fetter <david(at)fetter(dot)org> writes:
> On Sun, Jun 23, 2013 at 07:44:26AM -0700, Kevin Grittner wrote:
>> I think it is OK if that gets a syntax error. If that's the "worst
>> case" I like this approach.

> I think reducing the usefulness of error messages is something we need
> to think extremely hard about before we do. Is there maybe a way to
> keep the error messages even if by some magic we manage to unreserve
> the words?

Of the alternatives discussed so far, I don't really like anything
better than adding another special case to base_yylex(). Robert opined
in the other thread about RESPECT NULLS that the potential failure cases
with that approach are harder to reason about, which is true, but that
doesn't mean that we should accept failures we know of because there
might be failures we don't know of.

One thing that struck me while thinking about this is that it seems
like we've been going about it wrong in base_yylex() in any case.
For example, because we fold WITH followed by TIME into a special token
WITH_TIME, we will fail on a statement beginning

WITH time AS ...

even though "time" is a legal ColId. But suppose that instead of
merging the two tokens into one, we just changed the first token into
something special; that is, base_yylex would return a special token
WITH_FOLLOWED_BY_TIME and then TIME. We could then fix the above
problem by allowing either WITH or WITH_FOLLOWED_BY_TIME as the leading
keyword of a statement; and similarly for the few other places where
WITH can be followed by an arbitrary identifier.

Going on the same principle, we could probably let FILTER be an
unreserved keyword while FILTER_FOLLOWED_BY_PAREN could be a
type_func_name_keyword. (I've not tried this though.)

This idea doesn't help much for OVER because one of the alternatives for
over_clause is "OVER ColId", and I doubt we want to have base_yylex know
all the alternatives for ColId. I also had no great success with the
NULLS FIRST/LAST case: AFAICT the substitute token for NULLS still has
to be fully reserved, meaning that something like "select nulls last"
still doesn't work without quoting. We could maybe fix that with enough
denormalization of the index_elem productions, but it'd be ugly.

> Could well. I suspect we may need to rethink the whole way we do
> grammar at some point, but that's for a later discussion when I (or
> someone else) has something choate to say about it.

It'd sure be interesting to know what the SQL committee's target parsing
algorithm is. I find it hard to believe they're uninformed enough to
not know that these random syntaxes they keep inventing are hard to deal
with in LALR(1). Or maybe they really don't give a damn about breaking
applications every time they invent a new reserved word?

regards, tom lane


From: Greg Stark <stark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Fetter <david(at)fetter(dot)org>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-24 14:31:54
Message-ID: CAM-w4HOdZ=429-mXwhuJ9BdjvUYuPBfWz__dJK7StNKjYd7nDg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jun 24, 2013 at 3:50 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Or maybe they really don't give a damn about breaking
> applications every time they invent a new reserved word?

I think this is the obvious conclusion. In the standard the reserved
words are pretty explicitly reserved and not legal column names, no?

I think their model is that applications work with a certain version
of SQL and they're not expected to work with a new version without
extensive updating.

--
greg


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: David Fetter <david(at)fetter(dot)org>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-24 14:59:00
Message-ID: 13987.1372085940@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark <stark(at)mit(dot)edu> writes:
> I think their model is that applications work with a certain version
> of SQL and they're not expected to work with a new version without
> extensive updating.

Hm. We could invent a "sql_version" parameter and tweak the lexer to
return keywords added in spec versions later than that as IDENT.
However, I fear such a parameter would be a major PITA from the user's
standpoint, just like most of our backwards-compatibility GUCs have
proven to be.

regards, tom lane


From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Fetter <david(at)fetter(dot)org>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-24 21:16:01
Message-ID: 1372108561.1010.YahooMailNeo@web162906.mail.bf1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark <stark(at)mit(dot)edu> wrote:
> On Mon, Jun 24, 2013 at 3:50 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Or maybe they really don't give a damn about breaking
>> applications every time they invent a new reserved word?
>
> I think this is the obvious conclusion. In the standard the reserved
> words are pretty explicitly reserved and not legal column names, no?
>
> I think their model is that applications work with a certain version
> of SQL and they're not expected to work with a new version without
> extensive updating.

IMO it is insanity to write an application of any significant
complexity without a data abstraction layer sitting on a data
access layer, and it is silly to write such layers which are
intended to interface to SQL in a portable way without quoting
*all* identifiers sent to the server.  As a developer, new reserved
words never bothered me in the slightest.  At Wisconsin Courts the
most heavily used table has been called "Case" since 1989, and the
table to hold a row for each paper document printed to pay someone
is called "Check".  No need to change the names just because SQL
started using those words for new language features after we had
the tables.  And there is no reason to assume that any particular
word won't become reserved in the future.

I think the most likely explanation is not that they don't mind
breaking applications, but that they don't understand that there
are significant numbers of people who choose to write code in a
fashion which leaves them vulnerable to breakage when new reserved
words are added.

Being closer to the wide variety of users we know that there are
many such people out there, and we try to look after them as best
we can; which is entirely appropriate.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Department of Redundancy Department: makeNode(FuncCall) division
Date: 2013-06-25 05:41:00
Message-ID: CAM2+6=VpH+cq4VSkUf+1iNcXpyAdfC1R=HVbJ4rzM+JG9bN_mw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi David,

I hope this is the latest patch to review, right ?

I am going to review it.

I have gone through the discussion on this thread and I agree with Stephen
Frost that it don't add much improvements as such but definitely it is
going to be easy for contributors in this area as they don't need to go all
over to add any extra parameter they need to add. This patch simplifies it
well enough.

Will post my review soon.

Thanks.

On Mon, Jun 17, 2013 at 11:13 AM, David Fetter <david(at)fetter(dot)org> wrote:

> On Mon, Feb 11, 2013 at 10:48:38AM -0800, David Fetter wrote:
> > On Sun, Feb 10, 2013 at 10:09:19AM -0500, Tom Lane wrote:
> > > David Fetter <david(at)fetter(dot)org> writes:
> > > > Per suggestions and lots of help from Andrew Gierth, please find
> > > > attached a patch to clean up the call sites for FuncCall nodes, which
> > > > I'd like to expand centrally rather than in each of the 37 (or 38,
> but
> > > > I only redid 37) places where it's called. The remaining one is in
> > > > src/backend/nodes/copyfuncs.c, which has to be modified for any
> > > > changes in the that struct anyhow.
> > >
> > > TBH, I don't think this is an improvement.
> > >
> > > The problem with adding a new field to any struct is that you have to
> > > run around and examine (and, usually, modify) every place that
> > > manufactures that type of struct. With a makeFuncCall defined like
> > > this, you'd still have to do that; it would just become a lot easier
> > > to forget to do so.
> > >
> > > If the subroutine were defined like most other makeXXX subroutines,
> > > ie you have to supply *all* the fields, that argument would go away,
> > > but the notational advantage is then dubious.
> > >
> > > The bigger-picture point is that you're proposing to make the coding
> > > conventions for building FuncCalls different from what they are for
> > > any other grammar node. I don't think that's a great idea; it will
> > > mostly foster confusion.
> >
> > The major difference between FuncCalls and others is that `while most
> > raw-parsetree nodes are constructed only in their own syntax
> > productions, FuncCall is constructed in many places unrelated to
> > actual function call syntax.
> >
> > This really will make things a good bit easier on our successors.
>
> Here's a rebased patch with comments illustrating how best to employ.
>
> In my previous message, I characterized the difference between
> FuncCalls and other raw-parsetree nodes. Is there some flaw in that
> logic? If there isn't, is there some reason not to treat them in a
> less redundant, more uniform manner as this patch does?
>
> Cheers,
> David.
> --
> David Fetter <david(at)fetter(dot)org> http://fetter.org/
> Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
> Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
> iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>

--
Jeevan B Chalke
Senior Software Engineer, R&D
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Phone: +91 20 30589500

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb

This e-mail message (and any attachment) is intended for the use of the
individual or entity to whom it is addressed. This message contains
information from EnterpriseDB Corporation that may be privileged,
confidential, or exempt from disclosure under applicable law. If you are
not the intended recipient or authorized to receive this for the intended
recipient, any use, dissemination, distribution, retention, archiving, or
copying of this communication is strictly prohibited. If you have received
this e-mail in error, please notify the sender immediately by reply e-mail
and delete this message.


From: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Department of Redundancy Department: makeNode(FuncCall) division
Date: 2013-06-25 09:24:55
Message-ID: CAM2+6=UuovxFYsxw_AaG1+eK9GWekXDAbyxpiZ4rUL-G7htgvg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jun 25, 2013 at 11:11 AM, Jeevan Chalke <
jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:

> Hi David,
>
> I hope this is the latest patch to review, right ?
>
> I am going to review it.
>
> I have gone through the discussion on this thread and I agree with Stephen
> Frost that it don't add much improvements as such but definitely it is
> going to be easy for contributors in this area as they don't need to go all
> over to add any extra parameter they need to add. This patch simplifies it
> well enough.
>
> Will post my review soon.
>
>
Assuming *makeFuncArgs_002.diff* is the latest patch, here are my review
points.

About this patch and feature:
===
This patch tries to reduce redundancy when we need FuncCall expression. With
this patch it will become easier to add new parameter if needed. We just
need
to put it's default value at centralized location (I mean in this new
function)
so that all other places need not require and changes. And this new
parameter
is handled by the new feature who demanded it keep other untouched.

Review comments on patch:
===
1. Can't apply with "git apply" command but able to do it with patch -p1.
So no
issues
2. Adds one whitespace error, hopefully it will get corrected once it goes
through pgindent.
3. There is absolutely NO user visibility and thus I guess we don't need any
test case. Also no documentation are needed.
4. Also make check went smooth and thus assumes that there is no issue as
such.
Even I couldn't find any issue with my testing other than regression suite.
5. I had a code walk-through over patch and it looks good to me. However,
following line change seems unrelated (Line 186 in your patch)

! $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "~~", $1,
(Node *) n, @2);
!

Changes required from author:
===
It will be good if you remove unrelated changes from the patch and possibly
all
white-space errors.

Thanks

> Thanks.
>
> On Mon, Jun 17, 2013 at 11:13 AM, David Fetter <david(at)fetter(dot)org> wrote:
>
>> On Mon, Feb 11, 2013 at 10:48:38AM -0800, David Fetter wrote:
>> > On Sun, Feb 10, 2013 at 10:09:19AM -0500, Tom Lane wrote:
>> > > David Fetter <david(at)fetter(dot)org> writes:
>> > > > Per suggestions and lots of help from Andrew Gierth, please find
>> > > > attached a patch to clean up the call sites for FuncCall nodes,
>> which
>> > > > I'd like to expand centrally rather than in each of the 37 (or 38,
>> but
>> > > > I only redid 37) places where it's called. The remaining one is in
>> > > > src/backend/nodes/copyfuncs.c, which has to be modified for any
>> > > > changes in the that struct anyhow.
>> > >
>> > > TBH, I don't think this is an improvement.
>> > >
>> > > The problem with adding a new field to any struct is that you have to
>> > > run around and examine (and, usually, modify) every place that
>> > > manufactures that type of struct. With a makeFuncCall defined like
>> > > this, you'd still have to do that; it would just become a lot easier
>> > > to forget to do so.
>> > >
>> > > If the subroutine were defined like most other makeXXX subroutines,
>> > > ie you have to supply *all* the fields, that argument would go away,
>> > > but the notational advantage is then dubious.
>> > >
>> > > The bigger-picture point is that you're proposing to make the coding
>> > > conventions for building FuncCalls different from what they are for
>> > > any other grammar node. I don't think that's a great idea; it will
>> > > mostly foster confusion.
>> >
>> > The major difference between FuncCalls and others is that `while most
>> > raw-parsetree nodes are constructed only in their own syntax
>> > productions, FuncCall is constructed in many places unrelated to
>> > actual function call syntax.
>> >
>> > This really will make things a good bit easier on our successors.
>>
>> Here's a rebased patch with comments illustrating how best to employ.
>>
>> In my previous message, I characterized the difference between
>> FuncCalls and other raw-parsetree nodes. Is there some flaw in that
>> logic? If there isn't, is there some reason not to treat them in a
>> less redundant, more uniform manner as this patch does?
>>
>> Cheers,
>> David.
>> --
>> David Fetter <david(at)fetter(dot)org> http://fetter.org/
>> Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
>> Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
>> iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
>>
>> Remember to vote!
>> Consider donating to Postgres: http://www.postgresql.org/about/donate
>>
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>>
>>
>
>
> --
> Jeevan B Chalke
> Senior Software Engineer, R&D
> EnterpriseDB Corporation
> The Enterprise PostgreSQL Company
>
> Phone: +91 20 30589500
>
> Website: www.enterprisedb.com
> EnterpriseDB Blog: http://blogs.enterprisedb.com/
> Follow us on Twitter: http://www.twitter.com/enterprisedb
>
> This e-mail message (and any attachment) is intended for the use of the
> individual or entity to whom it is addressed. This message contains
> information from EnterpriseDB Corporation that may be privileged,
> confidential, or exempt from disclosure under applicable law. If you are
> not the intended recipient or authorized to receive this for the intended
> recipient, any use, dissemination, distribution, retention, archiving, or
> copying of this communication is strictly prohibited. If you have received
> this e-mail in error, please notify the sender immediately by reply e-mail
> and delete this message.
>

--
Jeevan B Chalke
Senior Software Engineer, R&D
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Phone: +91 20 30589500

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb

This e-mail message (and any attachment) is intended for the use of the
individual or entity to whom it is addressed. This message contains
information from EnterpriseDB Corporation that may be privileged,
confidential, or exempt from disclosure under applicable law. If you are
not the intended recipient or authorized to receive this for the intended
recipient, any use, dissemination, distribution, retention, archiving, or
copying of this communication is strictly prohibited. If you have received
this e-mail in error, please notify the sender immediately by reply e-mail
and delete this message.


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Fetter <david(at)fetter(dot)org>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-25 11:58:04
Message-ID: CAEZATCVFfrK94xp=v1x-rJKw-_294WKubCf5u87oeVV7PV=Nug@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 24 June 2013 03:50, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> David Fetter <david(at)fetter(dot)org> writes:
>> On Sun, Jun 23, 2013 at 07:44:26AM -0700, Kevin Grittner wrote:
>>> I think it is OK if that gets a syntax error. If that's the "worst
>>> case" I like this approach.
>
>> I think reducing the usefulness of error messages is something we need
>> to think extremely hard about before we do. Is there maybe a way to
>> keep the error messages even if by some magic we manage to unreserve
>> the words?
>

The flip side is that the error message you get if you don't realise a
word is now reserved is not exactly useful: "Syntax error at or near
xxx". I don't know of any way to improve that though.

> Of the alternatives discussed so far, I don't really like anything
> better than adding another special case to base_yylex(). Robert opined
> in the other thread about RESPECT NULLS that the potential failure cases
> with that approach are harder to reason about, which is true, but that
> doesn't mean that we should accept failures we know of because there
> might be failures we don't know of.
>
> One thing that struck me while thinking about this is that it seems
> like we've been going about it wrong in base_yylex() in any case.
> For example, because we fold WITH followed by TIME into a special token
> WITH_TIME, we will fail on a statement beginning
>
> WITH time AS ...
>
> even though "time" is a legal ColId. But suppose that instead of
> merging the two tokens into one, we just changed the first token into
> something special; that is, base_yylex would return a special token
> WITH_FOLLOWED_BY_TIME and then TIME. We could then fix the above
> problem by allowing either WITH or WITH_FOLLOWED_BY_TIME as the leading
> keyword of a statement; and similarly for the few other places where
> WITH can be followed by an arbitrary identifier.
>
> Going on the same principle, we could probably let FILTER be an
> unreserved keyword while FILTER_FOLLOWED_BY_PAREN could be a
> type_func_name_keyword. (I've not tried this though.)
>

I've not tried either, but wouldn't that mean that "SELECT * FROM
list_filters() filter" would be legal, whereas "SELECT * FROM
list_filters() filter(id, val)" would be a syntax error? If so, I
don't think that would be an improvement.

Regards,
Dean


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: David Fetter <david(at)fetter(dot)org>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-25 13:42:10
Message-ID: 8783.1372167730@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> writes:
> On 24 June 2013 03:50, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Going on the same principle, we could probably let FILTER be an
>> unreserved keyword while FILTER_FOLLOWED_BY_PAREN could be a
>> type_func_name_keyword. (I've not tried this though.)

> I've not tried either, but wouldn't that mean that "SELECT * FROM
> list_filters() filter" would be legal, whereas "SELECT * FROM
> list_filters() filter(id, val)" would be a syntax error? If so, I
> don't think that would be an improvement.

Hm, good point. The SQL committee really managed to choose some
unfortunate syntax here, didn't they.

I know it's heresy in these parts, but maybe we should consider
adopting a non-spec syntax for this feature? In particular, it's
really un-obvious why the FILTER clause shouldn't be inside rather
than outside the aggregate's parens, like ORDER BY.

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-25 13:50:42
Message-ID: CAFj8pRAYkCe_-0+95zXKByrs1VjPbhvBj5tznBfC7wVg0D+hZg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013/6/25 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> writes:
>> On 24 June 2013 03:50, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Going on the same principle, we could probably let FILTER be an
>>> unreserved keyword while FILTER_FOLLOWED_BY_PAREN could be a
>>> type_func_name_keyword. (I've not tried this though.)
>
>> I've not tried either, but wouldn't that mean that "SELECT * FROM
>> list_filters() filter" would be legal, whereas "SELECT * FROM
>> list_filters() filter(id, val)" would be a syntax error? If so, I
>> don't think that would be an improvement.
>
> Hm, good point. The SQL committee really managed to choose some
> unfortunate syntax here, didn't they.
>
> I know it's heresy in these parts, but maybe we should consider
> adopting a non-spec syntax for this feature? In particular, it's
> really un-obvious why the FILTER clause shouldn't be inside rather
> than outside the aggregate's parens, like ORDER BY.

the gram can be more free and final decision should be done in later stages ???

This technique was enough when I wrote prototype for CUBE ROLLUP
without CUBE ROLLUP reserwed keywords.

Regards

Pavel

>
> regards, tom lane
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Fetter <david(at)fetter(dot)org>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-25 14:59:02
Message-ID: CA+TgmoZq6SiDYgFrcSuKez_nCcW8V0FcN2_TbiymATfc1e--7Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Jun 23, 2013 at 10:50 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> David Fetter <david(at)fetter(dot)org> writes:
>> On Sun, Jun 23, 2013 at 07:44:26AM -0700, Kevin Grittner wrote:
>>> I think it is OK if that gets a syntax error. If that's the "worst
>>> case" I like this approach.
>
>> I think reducing the usefulness of error messages is something we need
>> to think extremely hard about before we do. Is there maybe a way to
>> keep the error messages even if by some magic we manage to unreserve
>> the words?
>
> Of the alternatives discussed so far, I don't really like anything
> better than adding another special case to base_yylex(). Robert opined
> in the other thread about RESPECT NULLS that the potential failure cases
> with that approach are harder to reason about, which is true, but that
> doesn't mean that we should accept failures we know of because there
> might be failures we don't know of.

Sure, that's true; but the proposal on the other thread is just to
disallow invalid syntax early enough that it benefits the parser. The
error message is different, but I don't think it's a BAD error
message.

> One thing that struck me while thinking about this is that it seems
> like we've been going about it wrong in base_yylex() in any case.
> For example, because we fold WITH followed by TIME into a special token
> WITH_TIME, we will fail on a statement beginning
>
> WITH time AS ...
>
> even though "time" is a legal ColId. But suppose that instead of
> merging the two tokens into one, we just changed the first token into
> something special; that is, base_yylex would return a special token
> WITH_FOLLOWED_BY_TIME and then TIME. We could then fix the above
> problem by allowing either WITH or WITH_FOLLOWED_BY_TIME as the leading
> keyword of a statement; and similarly for the few other places where
> WITH can be followed by an arbitrary identifier.
>
> Going on the same principle, we could probably let FILTER be an
> unreserved keyword while FILTER_FOLLOWED_BY_PAREN could be a
> type_func_name_keyword. (I've not tried this though.)

I think this whole direction is going to collapse under its own weight
VERY quickly. The problems you're describing are essentially
shift/reduce conflicts that are invisible because they're hidden
behind lexer magic. Part of the value of using a parser generator is
that it TELLS you when you've added ambiguous syntax. But it doesn't
know about lexer hacks, so stuff will just silently break. I think
this type of lexer hacks works reasonably well keyword-like things
that are used in just one place in the grammar. As soon as you get up
to two, the wheels come off - as with RESPECT NULLS vs. NULLS FIRST.

> This idea doesn't help much for OVER because one of the alternatives for
> over_clause is "OVER ColId", and I doubt we want to have base_yylex know
> all the alternatives for ColId. I also had no great success with the
> NULLS FIRST/LAST case: AFAICT the substitute token for NULLS still has
> to be fully reserved, meaning that something like "select nulls last"
> still doesn't work without quoting. We could maybe fix that with enough
> denormalization of the index_elem productions, but it'd be ugly.

I don't think that particular example is very compelling - there's a
general rule that column aliases can't be keywords of any type.
That's not wonderful, and EnterpriseDB has had bug reports filed about
it, but the real-world impact is pretty minimal, certainly compared to
what we used to do which is not allow column aliases AT ALL.

> It'd sure be interesting to know what the SQL committee's target parsing
> algorithm is. I find it hard to believe they're uninformed enough to
> not know that these random syntaxes they keep inventing are hard to deal
> with in LALR(1). Or maybe they really don't give a damn about breaking
> applications every time they invent a new reserved word?

Does the SQL committee contemplate that SELECT * FROM somefunc()
filter (id, val) should act as a table alias and that SELECT * FROM
somefunc() filter (where x > 1) is an aggregate filter? This all gets
much easier to understand if one of those constructs isn't allowed in
that particular context.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-26 00:01:39
Message-ID: 51CA2F63.9060901@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> I know it's heresy in these parts, but maybe we should consider
> adopting a non-spec syntax for this feature? In particular, it's
> really un-obvious why the FILTER clause shouldn't be inside rather
> than outside the aggregate's parens, like ORDER BY.

Well, what other DBMSes support this feature? Will being non-spec
introduce migration pain?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-26 08:21:33
Message-ID: CAEZATCXJrMpD1+DOusVNG0fTBFjf+J8kpv9i7=5+p1a0ihXjUQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 26 June 2013 01:01, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>
>> I know it's heresy in these parts, but maybe we should consider
>> adopting a non-spec syntax for this feature? In particular, it's
>> really un-obvious why the FILTER clause shouldn't be inside rather
>> than outside the aggregate's parens, like ORDER BY.
>
> Well, what other DBMSes support this feature? Will being non-spec
> introduce migration pain?
>

I can't find any, but that doesn't mean they don't exist, or aren't
being worked on.

To recap, the options currently on offer are:

1). Make FILTER a new partially reserved keyword, accepting that that
might break some users' application code.

2). Make FILTER unreserved, accepting that that will lead to syntax
errors rather than more specific error messages if the user tries to
use an aggregate/window function with FILTER or OVER in the FROM
clause of a query, or as an index expression.

3). Adopt a non-standard syntax for this feature, accepting that that
might conflict with other databases, and that we can never then claim
to have implemented T612, "Advanced OLAP operations".

4). Some other parser hack that will offer a better compromise?

My preference is for (2) as the lesser of several evils --- it's a
fairly narrow case where the quality of the error message is reduced.

Regards,
Dean


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-26 08:38:31
Message-ID: CAFj8pRAE-KRxXAgsYuCMONN-h78FeEpArOMBKJq-40JTwPZYHA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013/6/26 Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>:
> On 26 June 2013 01:01, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>>
>>> I know it's heresy in these parts, but maybe we should consider
>>> adopting a non-spec syntax for this feature? In particular, it's
>>> really un-obvious why the FILTER clause shouldn't be inside rather
>>> than outside the aggregate's parens, like ORDER BY.
>>
>> Well, what other DBMSes support this feature? Will being non-spec
>> introduce migration pain?
>>
>
> I can't find any, but that doesn't mean they don't exist, or aren't
> being worked on.
>
> To recap, the options currently on offer are:
>
> 1). Make FILTER a new partially reserved keyword, accepting that that
> might break some users' application code.
>
> 2). Make FILTER unreserved, accepting that that will lead to syntax
> errors rather than more specific error messages if the user tries to
> use an aggregate/window function with FILTER or OVER in the FROM
> clause of a query, or as an index expression.
>
> 3). Adopt a non-standard syntax for this feature, accepting that that
> might conflict with other databases, and that we can never then claim
> to have implemented T612, "Advanced OLAP operations".
>
> 4). Some other parser hack that will offer a better compromise?
>
>
> My preference is for (2) as the lesser of several evils --- it's a
> fairly narrow case where the quality of the error message is reduced.
>

@2 looks well

Pavel

> Regards,
> Dean
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-26 08:51:56
Message-ID: b3024c23db5a2aecbd2c82b189181453@news-out.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dean Rasheed said:
> To recap, the options currently on offer are:
>
> 1). Make FILTER a new partially reserved keyword, accepting that that
> might break some users' application code.
>
> 2). Make FILTER unreserved, accepting that that will lead to syntax
> errors rather than more specific error messages if the user tries to
> use an aggregate/window function with FILTER or OVER in the FROM
> clause of a query, or as an index expression.
>
> 3). Adopt a non-standard syntax for this feature, accepting that that
> might conflict with other databases, and that we can never then claim
> to have implemented T612, "Advanced OLAP operations".
>
> 4). Some other parser hack that will offer a better compromise?
>
> My preference is for (2) as the lesser of several evils --- it's a
> fairly narrow case where the quality of the error message is reduced.

Possibly significant in this context is that there is a proof-of-concept
patch in development for another part of T612, namely inverse
distribution functions (e.g. percentile_disc and percentile_cont) which
should be available by the next CF, and which will require a similar
decision with respect to the keyword WITHIN (to support doing:
select percentile_cont(0.5) within group (order by x) from ...;
which returns the median value of x).

This syntax is much more widely supported than FILTER, including by both
Oracle and MSSQL, so a non-standard alternative is much less attractive.
A solution which suits both (i.e. either option 1 or option 2) would make
a whole lot more sense than trying to handle them differently.

--
Andrew (irc:RhodiumToad)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-27 03:05:44
Message-ID: 7321.1372302344@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> Possibly significant in this context is that there is a proof-of-concept
> patch in development for another part of T612, namely inverse
> distribution functions (e.g. percentile_disc and percentile_cont) which
> should be available by the next CF, and which will require a similar
> decision with respect to the keyword WITHIN (to support doing:
> select percentile_cont(0.5) within group (order by x) from ...;
> which returns the median value of x).

Agreed, separating out the function-call-with-trailing-declaration
syntaxes so they aren't considered in FROM and index_elem seems like
the best compromise.

If we do that for window function OVER clauses as well, can we make
OVER less reserved?

regards, tom lane


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-27 08:41:59
Message-ID: 7c73f7b5630eb0e2581cbd02696b61b2@news-out.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane said:
> Agreed, separating out the function-call-with-trailing-declaration
> syntaxes so they aren't considered in FROM and index_elem seems like
> the best compromise.
>
> If we do that for window function OVER clauses as well, can we make
> OVER less reserved?

Yes.

At least, I tried it with both OVER and FILTER unreserved and there
were no grammar conflicts (and I didn't have to do anything fancy to
avoid them), and it passed regression with the exception of the
changed error message for window functions in the from-clause.

So is this the final decision on how to proceed? It seems good to me,
and I can work with David to get it done.

--
Andrew (irc:RhodiumToad)


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-27 08:47:20
Message-ID: CAFj8pRDmgBDKWjMzMHwdZxrLUUcYXXiHA8C=b=Y3EWoTXAfLJQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

2013/6/27 Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>:
> Tom Lane said:
>> Agreed, separating out the function-call-with-trailing-declaration
>> syntaxes so they aren't considered in FROM and index_elem seems like
>> the best compromise.
>>
>> If we do that for window function OVER clauses as well, can we make
>> OVER less reserved?
>
> Yes.
>
> At least, I tried it with both OVER and FILTER unreserved and there
> were no grammar conflicts (and I didn't have to do anything fancy to
> avoid them), and it passed regression with the exception of the
> changed error message for window functions in the from-clause.
>
> So is this the final decision on how to proceed? It seems good to me,
> and I can work with David to get it done.
>

Isn't dangerous do OVER unreserved keyword??

Regards

Pavel

> --
> Andrew (irc:RhodiumToad)
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


From: David Fetter <david(at)fetter(dot)org>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-27 13:56:24
Message-ID: 20130627135624.GA27765@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jun 27, 2013 at 08:41:59AM +0000, Andrew Gierth wrote:
> Tom Lane said:
> > Agreed, separating out the function-call-with-trailing-declaration
> > syntaxes so they aren't considered in FROM and index_elem seems
> > like the best compromise.
> >
> > If we do that for window function OVER clauses as well, can we
> > make OVER less reserved?
>
> Yes.
>
> At least, I tried it with both OVER and FILTER unreserved and there
> were no grammar conflicts (and I didn't have to do anything fancy to
> avoid them), and it passed regression with the exception of the
> changed error message for window functions in the from-clause.
>
> So is this the final decision on how to proceed? It seems good to
> me, and I can work with David to get it done.

If this is really the direction people want to go, I'm in. Is there
some code I can look at?

I still submit that having our reserved word ducks in a row in advance
is a saner way to go about this, and will work up a patch for that as
I have time.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-27 14:05:34
Message-ID: 20268.1372341934@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> Tom Lane said:
>> Agreed, separating out the function-call-with-trailing-declaration
>> syntaxes so they aren't considered in FROM and index_elem seems like
>> the best compromise.
>>
>> If we do that for window function OVER clauses as well, can we make
>> OVER less reserved?

> Yes.

> At least, I tried it with both OVER and FILTER unreserved and there
> were no grammar conflicts (and I didn't have to do anything fancy to
> avoid them), and it passed regression with the exception of the
> changed error message for window functions in the from-clause.

> So is this the final decision on how to proceed? It seems good to me,
> and I can work with David to get it done.

Yeah, please submit a separate patch that just refactors the existing
grammar as above; that'll simplify reviewing.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-27 14:12:42
Message-ID: 20419.1372342362@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> Tom Lane said:
>>> If we do that for window function OVER clauses as well, can we make
>>> OVER less reserved?

> Isn't dangerous do OVER unreserved keyword??

How so? The worst-case scenario is that we find we have to make it more
reserved again in some future release, as a consequence of some new
randomness from the SQL committee. That will just return us to the
status quo, in which anybody who uses OVER as a table/column name has
been broken since about 8.4. Since we still hear of people using
releases as old as 7.2.x, I'm sure there are a few out there who would
still be helped if we could de-reserve OVER again. (Not to mention
people migrating from other systems in which it's not a keyword.)

In any case, the general project policy has been to never make keywords
any more reserved than we absolutely have to. If we didn't care about
this, we wouldn't be bothering with four separate categories of keywords.

regards, tom lane


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Fetter <david(at)fetter(dot)org>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-27 14:40:36
Message-ID: 51CC4EE4.30309@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 6/23/13 10:50 PM, Tom Lane wrote:
> It'd sure be interesting to know what the SQL committee's target parsing
> algorithm is.

It's whatever Oracle and IBM implement.

> Or maybe they really don't give a damn about breaking
> applications every time they invent a new reserved word?

Well, yes, I think that policy was built into the language at the very
beginning.


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-28 08:18:29
Message-ID: CAEZATCXhcVgx+jeMPLgHTeCSYAp+as6xtRx90V0PZGWd7O989Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 27 June 2013 15:05, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
>> Tom Lane said:
>>> Agreed, separating out the function-call-with-trailing-declaration
>>> syntaxes so they aren't considered in FROM and index_elem seems like
>>> the best compromise.
>>>
>>> If we do that for window function OVER clauses as well, can we make
>>> OVER less reserved?
>
>> Yes.
>
>> At least, I tried it with both OVER and FILTER unreserved and there
>> were no grammar conflicts (and I didn't have to do anything fancy to
>> avoid them), and it passed regression with the exception of the
>> changed error message for window functions in the from-clause.
>
>> So is this the final decision on how to proceed? It seems good to me,
>> and I can work with David to get it done.
>
> Yeah, please submit a separate patch that just refactors the existing
> grammar as above; that'll simplify reviewing.
>

In that case, I'll re-review the latest FILTER patch over the weekend
on the understanding that the reserved/unreserved keyword issue will
be resolved in separate patch.

Regards,
Dean


From: David Fetter <david(at)fetter(dot)org>
To: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Department of Redundancy Department: makeNode(FuncCall) division
Date: 2013-06-28 14:25:14
Message-ID: 20130628142514.GA2500@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jun 25, 2013 at 02:54:55PM +0530, Jeevan Chalke wrote:
> On Tue, Jun 25, 2013 at 11:11 AM, Jeevan Chalke <
> jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:
>
> > Hi David,
> >
> > I hope this is the latest patch to review, right ?
> >
> > I am going to review it.
> >
> > I have gone through the discussion on this thread and I agree with Stephen
> > Frost that it don't add much improvements as such but definitely it is
> > going to be easy for contributors in this area as they don't need to go all
> > over to add any extra parameter they need to add. This patch simplifies it
> > well enough.
> >
> > Will post my review soon.
> >
> >
> Assuming *makeFuncArgs_002.diff* is the latest patch, here are my review
> points.
>
> About this patch and feature:
> ===
> This patch tries to reduce redundancy when we need FuncCall expression. With
> this patch it will become easier to add new parameter if needed. We just
> need
> to put it's default value at centralized location (I mean in this new
> function)
> so that all other places need not require and changes. And this new
> parameter
> is handled by the new feature who demanded it keep other untouched.
>
> Review comments on patch:
> ===
> 1. Can't apply with "git apply" command but able to do it with patch -p1.
> So no
> issues
> 2. Adds one whitespace error, hopefully it will get corrected once it goes
> through pgindent.
> 3. There is absolutely NO user visibility and thus I guess we don't need any
> test case. Also no documentation are needed.
> 4. Also make check went smooth and thus assumes that there is no issue as
> such.
> Even I couldn't find any issue with my testing other than regression suite.
> 5. I had a code walk-through over patch and it looks good to me. However,
> following line change seems unrelated (Line 186 in your patch)
>
> ! $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "~~", $1,
> (Node *) n, @2);
> !
>
> Changes required from author:
> ===
> It will be good if you remove unrelated changes from the patch and possibly
> all
> white-space errors.
>
> Thanks

Thanks for the review!

Please find attached the latest patch.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachment Content-Type Size
makeFuncArgs_003.diff text/plain 21.6 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Fetter <david(at)fetter(dot)org>
Cc: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Department of Redundancy Department: makeNode(FuncCall) division
Date: 2013-06-28 14:31:16
Message-ID: 8057.1372429876@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

David Fetter <david(at)fetter(dot)org> writes:
> Please find attached the latest patch.

I remain of the opinion that this is simply a bad idea. It is unlike
our habits for constructing other types of nodes, and makes it harder
not easier to find all the places that need to be updated when adding
another field to FuncCall.

regards, tom lane


From: David Fetter <david(at)fetter(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Department of Redundancy Department: makeNode(FuncCall) division
Date: 2013-06-28 14:38:08
Message-ID: 20130628143808.GB2500@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jun 28, 2013 at 10:31:16AM -0400, Tom Lane wrote:
> David Fetter <david(at)fetter(dot)org> writes:
> > Please find attached the latest patch.
>
> I remain of the opinion that this is simply a bad idea. It is unlike
> our habits for constructing other types of nodes, and makes it harder
> not easier to find all the places that need to be updated when adding
> another field to FuncCall.

With utmost respect, this is just not true.

There's exactly one place that needs updating after adding another
field to FuncCall in the general case where the default value of the
field doesn't affect most setters of FuncCall, i.e. where the default
default is the right thing for current setters. In specific cases
where such a field might need to be set to something other than its
default value, finding calls to makeFuncCall is just as easy, and with
some tools like cscope, even easier.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Fetter <david(at)fetter(dot)org>, Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Department of Redundancy Department: makeNode(FuncCall) division
Date: 2013-06-28 15:04:15
Message-ID: 20130628150415.GA925704@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jun 28, 2013 at 10:31:16AM -0400, Tom Lane wrote:
> David Fetter <david(at)fetter(dot)org> writes:
> > Please find attached the latest patch.
>
> I remain of the opinion that this is simply a bad idea. It is unlike
> our habits for constructing other types of nodes, and makes it harder
> not easier to find all the places that need to be updated when adding
> another field to FuncCall.

We have precedents in makeRangeVar() and makeDefElem().

For me, this change would make it slightly easier to visit affected code sites
after a change. I could cscope for callers of makeFuncCall() instead of doing
"git grep 'makeNode(FuncCall)'". The advantage could go either way depending
on your tooling, though.

By having each call site only mention the seldom-used fields for which it does
something special, the distinctive aspects of the call site stand out better.
That's a nice advantage.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Fetter <david(at)fetter(dot)org>, Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Department of Redundancy Department: makeNode(FuncCall) division
Date: 2013-06-28 15:30:30
Message-ID: CA+TgmoZWBfYdNmm4zUz9esjTu-e1FgNJffc5sFYER1zr-OaRZA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jun 28, 2013 at 10:31 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> David Fetter <david(at)fetter(dot)org> writes:
>> Please find attached the latest patch.
>
> I remain of the opinion that this is simply a bad idea. It is unlike
> our habits for constructing other types of nodes, and makes it harder
> not easier to find all the places that need to be updated when adding
> another field to FuncCall.

I think it's a nice code cleanup. I don't understand your objection.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Fetter <david(at)fetter(dot)org>, Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Department of Redundancy Department: makeNode(FuncCall) division
Date: 2013-06-28 17:28:35
Message-ID: 51CDC7C3.7080204@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 6/28/13 11:30 AM, Robert Haas wrote:
> On Fri, Jun 28, 2013 at 10:31 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> David Fetter <david(at)fetter(dot)org> writes:
>>> Please find attached the latest patch.
>>
>> I remain of the opinion that this is simply a bad idea. It is unlike
>> our habits for constructing other types of nodes, and makes it harder
>> not easier to find all the places that need to be updated when adding
>> another field to FuncCall.
>
> I think it's a nice code cleanup. I don't understand your objection.

Yeah, I was reading the patch thinking, yes, finally someone cleans that up.


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-06-28 20:22:52
Message-ID: CAEZATCWsqR095bNohY5QwY0vAS4k7TDx_dEpFvfTnatzouvfyg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 21 June 2013 06:16, David Fetter <david(at)fetter(dot)org> wrote:
> Please find attached a patch which allows subqueries in the FILTER
> clause and adds regression testing for same.
>

This needs re-basing/merging following Robert's recent commit to make
OVER unreserved.

Regards,
Dean


From: David Fetter <david(at)fetter(dot)org>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-07-01 00:44:09
Message-ID: 20130701004409.GA4365@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jun 28, 2013 at 09:22:52PM +0100, Dean Rasheed wrote:
> On 21 June 2013 06:16, David Fetter <david(at)fetter(dot)org> wrote:
> > Please find attached a patch which allows subqueries in the FILTER
> > clause and adds regression testing for same.
> >
>
> This needs re-basing/merging following Robert's recent commit to make
> OVER unreserved.

Please find attached. Thanks, Andrew Gierth! In this one, FILTER is
no longer a reserved word.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachment Content-Type Size
filter_007.diff text/plain 39.9 KB

From: David Fetter <david(at)fetter(dot)org>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Department of Redundancy Department: makeNode(FuncCall) division
Date: 2013-07-01 00:46:37
Message-ID: 20130701004637.GB4365@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jun 28, 2013 at 01:28:35PM -0400, Peter Eisentraut wrote:
> On 6/28/13 11:30 AM, Robert Haas wrote:
> > On Fri, Jun 28, 2013 at 10:31 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> David Fetter <david(at)fetter(dot)org> writes:
> >>> Please find attached the latest patch.
> >>
> >> I remain of the opinion that this is simply a bad idea. It is unlike
> >> our habits for constructing other types of nodes, and makes it harder
> >> not easier to find all the places that need to be updated when adding
> >> another field to FuncCall.
> >
> > I think it's a nice code cleanup. I don't understand your objection.
>
> Yeah, I was reading the patch thinking, yes, finally someone cleans that up.

Please find enclosed a patch reflecting the changes that de-reserved
OVER as a keyword.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachment Content-Type Size
makeFuncArgs_004.diff text/plain 21.5 KB

From: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Department of Redundancy Department: makeNode(FuncCall) division
Date: 2013-07-01 07:19:55
Message-ID: CAM2+6=UyJZqaip4kdFuRSxoU1NOBepkHMHXMRSYzAqiaSK8n2w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jul 1, 2013 at 6:16 AM, David Fetter <david(at)fetter(dot)org> wrote:

> On Fri, Jun 28, 2013 at 01:28:35PM -0400, Peter Eisentraut wrote:
> > On 6/28/13 11:30 AM, Robert Haas wrote:
> > > On Fri, Jun 28, 2013 at 10:31 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > >> David Fetter <david(at)fetter(dot)org> writes:
> > >>> Please find attached the latest patch.
> > >>
> > >> I remain of the opinion that this is simply a bad idea. It is unlike
> > >> our habits for constructing other types of nodes, and makes it harder
> > >> not easier to find all the places that need to be updated when adding
> > >> another field to FuncCall.
> > >
> > > I think it's a nice code cleanup. I don't understand your objection.
> >
> > Yeah, I was reading the patch thinking, yes, finally someone cleans that
> up.
>
> Please find enclosed a patch reflecting the changes that de-reserved
> OVER as a keyword.
>

I have re-validated this new patch and it looks good to go in now.

I saw that it's already marked ready for committer.

Thanks David

> Cheers,
> David.
> --
> David Fetter <david(at)fetter(dot)org> http://fetter.org/
> Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
> Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
> iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>

--
Jeevan B Chalke
Senior Software Engineer, R&D
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Phone: +91 20 30589500

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb

This e-mail message (and any attachment) is intended for the use of the
individual or entity to whom it is addressed. This message contains
information from EnterpriseDB Corporation that may be privileged,
confidential, or exempt from disclosure under applicable law. If you are
not the intended recipient or authorized to receive this for the intended
recipient, any use, dissemination, distribution, retention, archiving, or
copying of this communication is strictly prohibited. If you have received
this e-mail in error, please notify the sender immediately by reply e-mail
and delete this message.


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-07-01 16:30:38
Message-ID: CAEZATCWgD_jYY0ARXTNTLbP7FSDsfahc4y3WtSOhgXo0ajPnyg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1 July 2013 01:44, David Fetter <david(at)fetter(dot)org> wrote:
> On Fri, Jun 28, 2013 at 09:22:52PM +0100, Dean Rasheed wrote:
>> On 21 June 2013 06:16, David Fetter <david(at)fetter(dot)org> wrote:
>> > Please find attached a patch which allows subqueries in the FILTER
>> > clause and adds regression testing for same.
>> >
>>
>> This needs re-basing/merging following Robert's recent commit to make
>> OVER unreserved.
>
> Please find attached. Thanks, Andrew Gierth! In this one, FILTER is
> no longer a reserved word.
>

Looking at this patch again, it appears to be in pretty good shape.

- Applies cleanly to head.
- Compiles with no warnings.
- Includes regression test cases and doc updates.
- Compatible with the relevant part of T612, "Advanced OLAP operations".
- Includes pg_dump support.
- Code changes all look reasonable, and I can't find any corner cases
that have been missed.
- Appears to work as expected. I tested everything I could think of
and couldn't break it.

AFAICT all the bases have been covered. As mentioned upthread, I would
have preferred a few more regression test cases, and a couple of the
tests don't appear to return anything interesting, but I'll leave that
for the committer to decide whether they're sufficient for regression
tests.

I have a few suggestions to improve the docs:

1). In syntax.sgml: "The aggregate_name can also be suffixed with
FILTER as described below". It's not really a suffix to the aggregate
name, since it follows the function arguments and optional order by
clause. Perhaps it would be more consistent with the surrounding text
to say something like

<replaceable>expression</replaceable> is
any value expression that does not itself contain an aggregate
expression or a window function call, and
! <replaceable>order_by_clause</replaceable> and
! <replaceable>filter_clause</replaceable> are optional
! <literal>ORDER BY</> and <literal>FILTER</> clauses as described below.

2). In syntax.sgml: "... or when a FILTER clause is present, each row
matching same". In the context of that paragraph this suggests that
the filter clause only applies to the first form, since that paragraph
is a description of the 4 forms of the aggregate function. I don't
think it's worth mentioning FILTER in this paragraph at all --- it's
adequately described below that.

3). In syntax.sgml: "Adding a FILTER clause to an aggregate specifies
which values of the expression being aggregated to evaluate". How
about something a little more specific, along the lines of

If <literal>FILTER</> is specified, then only input rows for which
the <replaceable>filter_clause</replaceable> evaluates to true are
fed to the aggregate function; input rows for which the
<replaceable>filter_clause</replaceable> evaluates to false or the
null value are discarded. For example...

4). In select.sgml: "In the absence of a FILTER clause, aggregate
functions...". It doesn't seem right to refer to the FILTER clause at
the top level here because it's not part of the SELECT syntax being
described on this page. Also I think this should include a
cross-reference to the aggregate function syntax section, perhaps
something like:

Aggregate functions, if any are used, are computed across all rows
making up each group, producing a separate value for each group
(whereas without <literal>GROUP BY</literal>, an aggregate
produces a single value computed across all the selected rows).
+ The set of rows fed to the aggregate function can be further filtered
+ by attaching a <literal>FILTER</literal> clause to the aggregate
+ function call, see <xref ...> for more information.
When <literal>GROUP BY</literal> is present, it is not valid for
the <command>SELECT</command> list expressions to refer to
ungrouped columns except within aggregate functions or if the
ungrouped column is functionally dependent on the grouped columns,
since there would otherwise be more than one possible value to
return for an ungrouped column. A functional dependency exists if
the grouped columns (or a subset thereof) are the primary key of
the table containing the ungrouped column.

Regards,
Dean


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
Cc: David Fetter <david(at)fetter(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Department of Redundancy Department: makeNode(FuncCall) division
Date: 2013-07-01 18:50:05
Message-ID: CA+Tgmoa6vKp5OitAzSeGUOUgtU2CXSkSYHpaNP_UmrsmEVw57Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jul 1, 2013 at 3:19 AM, Jeevan Chalke
<jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:
> I have re-validated this new patch and it looks good to go in now.
>
> I saw that it's already marked ready for committer.

I don't normally like to commit things over another committer's
objections, but this has +1 votes from four other committers (Stephen,
Noah, Peter E, myself) so I think that's enough reason to move
forward. So committed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: David Fetter <david(at)fetter(dot)org>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-07-05 17:23:37
Message-ID: 20130705172337.GB29050@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jul 01, 2013 at 05:30:38PM +0100, Dean Rasheed wrote:
> On 1 July 2013 01:44, David Fetter <david(at)fetter(dot)org> wrote:
> > On Fri, Jun 28, 2013 at 09:22:52PM +0100, Dean Rasheed wrote:
> >> On 21 June 2013 06:16, David Fetter <david(at)fetter(dot)org> wrote:
> >> > Please find attached a patch which allows subqueries in the FILTER
> >> > clause and adds regression testing for same.
> >> >
> >>
> >> This needs re-basing/merging following Robert's recent commit to make
> >> OVER unreserved.
> >
> > Please find attached. Thanks, Andrew Gierth! In this one, FILTER is
> > no longer a reserved word.
> >
>
> Looking at this patch again, it appears to be in pretty good shape.
>
> - Applies cleanly to head.
> - Compiles with no warnings.
> - Includes regression test cases and doc updates.
> - Compatible with the relevant part of T612, "Advanced OLAP operations".
> - Includes pg_dump support.
> - Code changes all look reasonable, and I can't find any corner cases
> that have been missed.
> - Appears to work as expected. I tested everything I could think of
> and couldn't break it.
>
> AFAICT all the bases have been covered. As mentioned upthread, I would
> have preferred a few more regression test cases, and a couple of the
> tests don't appear to return anything interesting, but I'll leave that
> for the committer to decide whether they're sufficient for regression
> tests.
>
> I have a few suggestions to improve the docs:
>
> 1). In syntax.sgml: "The aggregate_name can also be suffixed with
> FILTER as described below". It's not really a suffix to the aggregate
> name, since it follows the function arguments and optional order by
> clause. Perhaps it would be more consistent with the surrounding text
> to say something like
>
> <replaceable>expression</replaceable> is
> any value expression that does not itself contain an aggregate
> expression or a window function call, and
> ! <replaceable>order_by_clause</replaceable> and
> ! <replaceable>filter_clause</replaceable> are optional
> ! <literal>ORDER BY</> and <literal>FILTER</> clauses as described below.
>
> 2). In syntax.sgml: "... or when a FILTER clause is present, each row
> matching same". In the context of that paragraph this suggests that
> the filter clause only applies to the first form, since that paragraph
> is a description of the 4 forms of the aggregate function. I don't
> think it's worth mentioning FILTER in this paragraph at all --- it's
> adequately described below that.
>
> 3). In syntax.sgml: "Adding a FILTER clause to an aggregate specifies
> which values of the expression being aggregated to evaluate". How
> about something a little more specific, along the lines of
>
> If <literal>FILTER</> is specified, then only input rows for which
> the <replaceable>filter_clause</replaceable> evaluates to true are
> fed to the aggregate function; input rows for which the
> <replaceable>filter_clause</replaceable> evaluates to false or the
> null value are discarded. For example...
>
> 4). In select.sgml: "In the absence of a FILTER clause, aggregate
> functions...". It doesn't seem right to refer to the FILTER clause at
> the top level here because it's not part of the SELECT syntax being
> described on this page. Also I think this should include a
> cross-reference to the aggregate function syntax section, perhaps
> something like:
>
> Aggregate functions, if any are used, are computed across all rows
> making up each group, producing a separate value for each group
> (whereas without <literal>GROUP BY</literal>, an aggregate
> produces a single value computed across all the selected rows).
> + The set of rows fed to the aggregate function can be further filtered
> + by attaching a <literal>FILTER</literal> clause to the aggregate
> + function call, see <xref ...> for more information.
> When <literal>GROUP BY</literal> is present, it is not valid for
> the <command>SELECT</command> list expressions to refer to
> ungrouped columns except within aggregate functions or if the
> ungrouped column is functionally dependent on the grouped columns,
> since there would otherwise be more than one possible value to
> return for an ungrouped column. A functional dependency exists if
> the grouped columns (or a subset thereof) are the primary key of
> the table containing the ungrouped column.

Please find attached changes based on the above.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachment Content-Type Size
filter_008.diff text/plain 38.5 KB

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-07-06 10:49:21
Message-ID: CAEZATCWdVZF6Q4FZZAR_NcLLEOXUYL_t0-8QFuv=HMpTYjnJgg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 5 July 2013 18:23, David Fetter <david(at)fetter(dot)org> wrote:
> Please find attached changes based on the above.
>

This looks good. The grammar changes are smaller and neater now on top
of the makeFuncCall() patch.

Overall I think this patch offers useful additional functionality, in
compliance with the SQL spec, which should be handy to simplify
complex grouping queries.

There's a minor typo in syntax.sgml: "for each input row, each row
matching same." --- fix attached.

I think this is ready for committer.

Regards,
Dean

Attachment Content-Type Size
filter_008b.diff application/octet-stream 38.1 KB

From: David Fetter <david(at)fetter(dot)org>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-07-08 01:37:26
Message-ID: 20130708013726.GG18753@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Jul 06, 2013 at 11:49:21AM +0100, Dean Rasheed wrote:
> On 5 July 2013 18:23, David Fetter <david(at)fetter(dot)org> wrote:
> > Please find attached changes based on the above.
> >
>
> This looks good. The grammar changes are smaller and neater now on top
> of the makeFuncCall() patch.
>
> Overall I think this patch offers useful additional functionality, in
> compliance with the SQL spec, which should be handy to simplify
> complex grouping queries.
>
> There's a minor typo in syntax.sgml: "for each input row, each row
> matching same." --- fix attached.

That is actually correct grammar, but may not be the easiest to
translate.

> I think this is ready for committer.

Thanks :)

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Noah Misch <noah(at)leadboat(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-07-15 02:15:12
Message-ID: 20130715021512.GA23930@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Jul 07, 2013 at 06:37:26PM -0700, David Fetter wrote:
> On Sat, Jul 06, 2013 at 11:49:21AM +0100, Dean Rasheed wrote:
> > Overall I think this patch offers useful additional functionality, in
> > compliance with the SQL spec, which should be handy to simplify
> > complex grouping queries.

As I understand this feature, it is syntactic sugar for the typical case of an
aggregate with a strict transition function. For example, "min(x) FILTER
(WHERE y > 0)" is rigorously equivalent to "min(CASE y > 0 THEN x END)".
Every SQL aggregate is strict (ISO 9075-2 section 4.15.4), so for standard
queries it is *only* syntactic sugar. In PostgreSQL, it lets you do novel
things with, for example, array_agg(). Is that accurate?

> > I think this is ready for committer.

The patch was thorough. I updated applicable comments, revisited some
cosmetic choices, and made these functional changes:

1. The pg_stat_statements "query jumble" should incorporate the filter.

2. The patch did not update costing. I made it add the cost of the filter
expression the same way we add the cost of the argument expressions. This
makes "min(x) FILTER (WHERE y > 0)" match "min(case y > 0 THEN x end)" in
terms of cost, which is a fair start. At some point, we could do better by
reducing the argument cost by the filter selectivity.

A few choices/standard interpretations may deserve discussion. The patch
makes filter clauses contribute to the subquery level chosen to be the
"aggregation query". This is observable through the behavior of these two
standard-conforming queries:

select (select count(outer_c)
from (values (1)) t0(inner_c))
from (values (2),(3)) t1(outer_c); -- outer query is aggregation query
select (select count(outer_c) filter (where inner_c <> 0)
from (values (1)) t0(inner_c))
from (values (2),(3)) t1(outer_c); -- inner query is aggregation query

I believe SQL (ISO 9075-2 section 6.9 SR 3,4,6) does require this. Since that
still isn't crystal-clear to me, I mention it in case anyone has a different
reading.

Distinct from that, albeit in a similar vein, SQL does not permit outer
references in a filter clause. This patch permits them; I think that
qualifies as a reasonable PostgreSQL extension.

> --- a/doc/src/sgml/keywords.sgml
> +++ b/doc/src/sgml/keywords.sgml

> @@ -3200,7 +3200,7 @@
> </row>
> <row>
> <entry><token>OVER</token></entry>
> - <entry>reserved (can be function or type)</entry>
> + <entry>non-reserved</entry>

I committed this one-line correction separately.

> --- a/src/backend/optimizer/plan/planagg.c
> +++ b/src/backend/optimizer/plan/planagg.c
> @@ -314,7 +314,7 @@ find_minmax_aggs_walker(Node *node, List **context)
> ListCell *l;
>
> Assert(aggref->agglevelsup == 0);
> - if (list_length(aggref->args) != 1 || aggref->aggorder != NIL)
> + if (list_length(aggref->args) != 1 || aggref->aggorder != NIL || aggref->agg_filter != NULL)
> return true; /* it couldn't be MIN/MAX */
> /* note: we do not care if DISTINCT is mentioned ... */

I twitched upon reading this, because neither ORDER BY nor FILTER preclude the
aggregate being MIN or MAX. Perhaps Andrew can explain why he put aggorder
there back in 2009. All I can figure is that writing max(c ORDER BY x) is so
unlikely that we'd too often waste the next syscache lookup. But the same
argument would apply to DISTINCT. With FILTER, the rationale is entirely
different. The aggregate could well be MIN/MAX; we just haven't implemented
the necessary support elsewhere in this file.

See attached patch revisions. The first patch edits find_minmax_aggs_walker()
per my comments just now. The second is an update of your FILTER patch with
the changes to which I alluded above; it applies atop the first patch. Would
you verify that I didn't ruin anything? Barring problems, will commit.

Are you the sole named author of this patch? That's what the CF page says,
but that wasn't fully clear to me from the list discussions.

Thanks,
nm

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com

Attachment Content-Type Size
minmax-ignore-orderby-v1.patch text/plain 1.3 KB
filter-v9.patch text/plain 45.2 KB

From: David Fetter <david(at)fetter(dot)org>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-07-15 03:20:51
Message-ID: 20130715032051.GB9844@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Jul 14, 2013 at 10:15:12PM -0400, Noah Misch wrote:
> On Sun, Jul 07, 2013 at 06:37:26PM -0700, David Fetter wrote:
> > On Sat, Jul 06, 2013 at 11:49:21AM +0100, Dean Rasheed wrote:
> > > Overall I think this patch offers useful additional functionality, in
> > > compliance with the SQL spec, which should be handy to simplify
> > > complex grouping queries.
>
> As I understand this feature, it is syntactic sugar for the typical case of an
> aggregate with a strict transition function. For example, "min(x) FILTER
> (WHERE y > 0)" is rigorously equivalent to "min(CASE y > 0 THEN x END)".
> Every SQL aggregate is strict (ISO 9075-2 section 4.15.4), so for standard
> queries it is *only* syntactic sugar. In PostgreSQL, it lets you do novel
> things with, for example, array_agg(). Is that accurate?
>
> > > I think this is ready for committer.
>
> The patch was thorough. I updated applicable comments, revisited some
> cosmetic choices, and made these functional changes:
>
> 1. The pg_stat_statements "query jumble" should incorporate the filter.
>
> 2. The patch did not update costing. I made it add the cost of the filter
> expression the same way we add the cost of the argument expressions. This
> makes "min(x) FILTER (WHERE y > 0)" match "min(case y > 0 THEN x end)" in
> terms of cost, which is a fair start. At some point, we could do better by
> reducing the argument cost by the filter selectivity.

Thanks!

> A few choices/standard interpretations may deserve discussion. The patch
> makes filter clauses contribute to the subquery level chosen to be the
> "aggregation query". This is observable through the behavior of these two
> standard-conforming queries:
>
> select (select count(outer_c)
> from (values (1)) t0(inner_c))
> from (values (2),(3)) t1(outer_c); -- outer query is aggregation query
> select (select count(outer_c) filter (where inner_c <> 0)
> from (values (1)) t0(inner_c))
> from (values (2),(3)) t1(outer_c); -- inner query is aggregation query
>
> I believe SQL (ISO 9075-2 section 6.9 SR 3,4,6) does require this. Since that
> still isn't crystal-clear to me, I mention it in case anyone has a different
> reading.
>
> Distinct from that, albeit in a similar vein, SQL does not permit outer
> references in a filter clause. This patch permits them; I think that
> qualifies as a reasonable PostgreSQL extension.

Thanks again.

> > --- a/doc/src/sgml/keywords.sgml
> > +++ b/doc/src/sgml/keywords.sgml
>
> > @@ -3200,7 +3200,7 @@
> > </row>
> > <row>
> > <entry><token>OVER</token></entry>
> > - <entry>reserved (can be function or type)</entry>
> > + <entry>non-reserved</entry>
>
> I committed this one-line correction separately.
>
> > --- a/src/backend/optimizer/plan/planagg.c
> > +++ b/src/backend/optimizer/plan/planagg.c
> > @@ -314,7 +314,7 @@ find_minmax_aggs_walker(Node *node, List **context)
> > ListCell *l;
> >
> > Assert(aggref->agglevelsup == 0);
> > - if (list_length(aggref->args) != 1 || aggref->aggorder != NIL)
> > + if (list_length(aggref->args) != 1 || aggref->aggorder != NIL || aggref->agg_filter != NULL)
> > return true; /* it couldn't be MIN/MAX */
> > /* note: we do not care if DISTINCT is mentioned ... */
>
> I twitched upon reading this, because neither ORDER BY nor FILTER preclude the
> aggregate being MIN or MAX. Perhaps Andrew can explain why he put aggorder
> there back in 2009. All I can figure is that writing max(c ORDER BY x) is so
> unlikely that we'd too often waste the next syscache lookup. But the same
> argument would apply to DISTINCT. With FILTER, the rationale is entirely
> different. The aggregate could well be MIN/MAX; we just haven't implemented
> the necessary support elsewhere in this file.

Excellent reasoning, and good catch.

> See attached patch revisions. The first patch edits
> find_minmax_aggs_walker() per my comments just now. The second is
> an update of your FILTER patch with the changes to which I alluded
> above; it applies atop the first patch. Would you verify that I
> didn't ruin anything? Barring problems, will commit.
>
> Are you the sole named author of this patch? That's what the CF
> page says, but that wasn't fully clear to me from the list
> discussions.

While Andrew's help was invaluable and pervasive, I wrote the patch,
and all flaws in it are my fault.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: David Fetter <david(at)fetter(dot)org>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-07-15 18:43:04
Message-ID: 20130715184304.GA3520@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Jul 14, 2013 at 10:15:12PM -0400, Noah Misch wrote:
> On Sun, Jul 07, 2013 at 06:37:26PM -0700, David Fetter wrote:
> > On Sat, Jul 06, 2013 at 11:49:21AM +0100, Dean Rasheed wrote:
> > > Overall I think this patch offers useful additional functionality, in
> > > compliance with the SQL spec, which should be handy to simplify
> > > complex grouping queries.
>
> As I understand this feature, it is syntactic sugar for the typical case of an
> aggregate with a strict transition function. For example, "min(x) FILTER
> (WHERE y > 0)" is rigorously equivalent to "min(CASE y > 0 THEN x END)".
> Every SQL aggregate is strict (ISO 9075-2 section 4.15.4), so for standard
> queries it is *only* syntactic sugar. In PostgreSQL, it lets you do novel
> things with, for example, array_agg(). Is that accurate?
>
> > > I think this is ready for committer.
>
> The patch was thorough. I updated applicable comments, revisited some
> cosmetic choices, and made these functional changes:
>
> 1. The pg_stat_statements "query jumble" should incorporate the filter.
>
> 2. The patch did not update costing. I made it add the cost of the filter
> expression the same way we add the cost of the argument expressions. This
> makes "min(x) FILTER (WHERE y > 0)" match "min(case y > 0 THEN x end)" in
> terms of cost, which is a fair start. At some point, we could do better by
> reducing the argument cost by the filter selectivity.
>
>
> A few choices/standard interpretations may deserve discussion. The patch
> makes filter clauses contribute to the subquery level chosen to be the
> "aggregation query". This is observable through the behavior of these two
> standard-conforming queries:
>
> select (select count(outer_c)
> from (values (1)) t0(inner_c))
> from (values (2),(3)) t1(outer_c); -- outer query is aggregation query
> select (select count(outer_c) filter (where inner_c <> 0)
> from (values (1)) t0(inner_c))
> from (values (2),(3)) t1(outer_c); -- inner query is aggregation query
>
> I believe SQL (ISO 9075-2 section 6.9 SR 3,4,6) does require this. Since that
> still isn't crystal-clear to me, I mention it in case anyone has a different
> reading.
>
> Distinct from that, albeit in a similar vein, SQL does not permit outer
> references in a filter clause. This patch permits them; I think that
> qualifies as a reasonable PostgreSQL extension.
>
> > --- a/doc/src/sgml/keywords.sgml
> > +++ b/doc/src/sgml/keywords.sgml
>
> > @@ -3200,7 +3200,7 @@
> > </row>
> > <row>
> > <entry><token>OVER</token></entry>
> > - <entry>reserved (can be function or type)</entry>
> > + <entry>non-reserved</entry>
>
> I committed this one-line correction separately.
>
> > --- a/src/backend/optimizer/plan/planagg.c
> > +++ b/src/backend/optimizer/plan/planagg.c
> > @@ -314,7 +314,7 @@ find_minmax_aggs_walker(Node *node, List **context)
> > ListCell *l;
> >
> > Assert(aggref->agglevelsup == 0);
> > - if (list_length(aggref->args) != 1 || aggref->aggorder != NIL)
> > + if (list_length(aggref->args) != 1 || aggref->aggorder != NIL || aggref->agg_filter != NULL)
> > return true; /* it couldn't be MIN/MAX */
> > /* note: we do not care if DISTINCT is mentioned ... */
>
> I twitched upon reading this, because neither ORDER BY nor FILTER preclude the
> aggregate being MIN or MAX. Perhaps Andrew can explain why he put aggorder
> there back in 2009. All I can figure is that writing max(c ORDER BY x) is so
> unlikely that we'd too often waste the next syscache lookup. But the same
> argument would apply to DISTINCT. With FILTER, the rationale is entirely
> different. The aggregate could well be MIN/MAX; we just haven't implemented
> the necessary support elsewhere in this file.
>
>
> See attached patch revisions. The first patch edits find_minmax_aggs_walker()
> per my comments just now. The second is an update of your FILTER patch with
> the changes to which I alluded above; it applies atop the first patch. Would
> you verify that I didn't ruin anything? Barring problems, will commit.
>
> Are you the sole named author of this patch? That's what the CF page says,
> but that wasn't fully clear to me from the list discussions.
>
> Thanks,
> nm

Tested your changes. They pass regression, etc. :)

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Noah Misch <noah(at)leadboat(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-07-15 18:56:17
Message-ID: 1430cb03838fb80e461b72461789ef74@news-out.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Noah Misch said:

> I twitched upon reading this, because neither ORDER BY nor FILTER preclude
> the aggregate being MIN or MAX. Perhaps Andrew can explain why he put
> aggorder there back in 2009.

The bottom line is that I intentionally avoided assuming that an agg with an
aggsortop could only be min() or max() and that having an order by clause
would always be harmless in such cases. I can't think of an actual use case
where it would matter, but I've seen people define some pretty strange aggs
recently.

So I mildly object to simply throwing away the ORDER BY clause in such cases.

--
Andrew (irc:RhodiumToad)


From: Noah Misch <noah(at)leadboat(dot)com>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Subject: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-07-15 21:59:22
Message-ID: 20130715215922.GA34968@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jul 15, 2013 at 06:56:17PM +0000, Andrew Gierth wrote:
> Noah Misch said:
>
> > I twitched upon reading this, because neither ORDER BY nor FILTER preclude
> > the aggregate being MIN or MAX. Perhaps Andrew can explain why he put
> > aggorder there back in 2009.
>
> The bottom line is that I intentionally avoided assuming that an agg with an
> aggsortop could only be min() or max() and that having an order by clause
> would always be harmless in such cases. I can't think of an actual use case
> where it would matter, but I've seen people define some pretty strange aggs
> recently.
>
> So I mildly object to simply throwing away the ORDER BY clause in such cases.

I can't think of another use for aggsortop as defined today. However, on
further reflection, min(x ORDER BY y) is not identical to min(x) when the
B-tree operator class of the aggsortop can find non-identical datums to be
equal. This affects at least min(numeric) and min(interval). min(x) chooses
an unspecified x among those equal to the smallest x, while min(x ORDER BY y)
can be used to narrow the choice. I will update the comments along those
lines and not change semantics after all.

Thanks,
nm

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com


From: Noah Misch <noah(at)leadboat(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]
Date: 2013-07-17 00:31:15
Message-ID: 20130717003115.GD55849@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jul 15, 2013 at 11:43:04AM -0700, David Fetter wrote:
> On Sun, Jul 14, 2013 at 10:15:12PM -0400, Noah Misch wrote:
> > See attached patch revisions. The first patch edits find_minmax_aggs_walker()
> > per my comments just now. The second is an update of your FILTER patch with
> > the changes to which I alluded above; it applies atop the first patch. Would
> > you verify that I didn't ruin anything? Barring problems, will commit.

> Tested your changes. They pass regression, etc. :)

Committed.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com