Re: Support for RANGE ... PRECEDING windows in OVER

Lists: pgsql-hackers
From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Subject: Support for RANGE ... PRECEDING windows in OVER
Date: 2013-06-21 02:24:18
Message-ID: 51C3B952.60907@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi all

Since 8.4, PostgreSQL has had extremely useful window function support -
but support for "RANGE PRECEDING / FOLLOWING" windows was dropped late
in 8.4's development in order to get the rest of the feature in, per
http://archives.postgresql.org/pgsql-hackers/2010-02/msg00540.php.

It looks like there was discussion of requiring a new opclass to be
declared for types or otherwise extending opclasses to provide the
information required for RANGE ... PRECEDING / FOLLOWING (
http://www.postgresql.org/message-id/20100211201444.GA28270@svana.org )
. I can't find any sign that it went anywhere beyond some broad
discussion:
http://www.postgresql.org/message-id/13993.1265920013@sss.pgh.pa.us at
the time.

I've missed this feature more than once, and am curious about whether
any more recent changes may have made it cleaner to tackle this, or
whether consensus can be formed on adding the new entries to btree's
opclass to avoid the undesirable explicit lookups of the '+' and '-'
oprators.

Some question seems to remain open about how ranges over
timestamps/intervals should work, but this wasn't elaborated on.

There's been interest in this, eg:

http://pgsql.hackers.free-usenet.eu/[HACKERS]-range-intervals-in-window-function-frames_T66085695_S1

http://grokbase.com/t/postgresql/pgsql-general/105a89gm2n/postgresql-9-0-support-for-range-value-preceding-window-functions

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Ian Link <ian(at)ilink(dot)io>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>
Subject: Re: Support for RANGE ... PRECEDING windows in OVER
Date: 2013-06-21 02:31:00
Message-ID: 51C3BAE4.6000909@ilink.io
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I am currently looking into this feature. However, as I am quite new to
Postgres, I think it might take me a while to get up to speed. Anyways,
I would also appreciate another round of discussion on the future of the
windowing functions.

Ian Link

> Craig Ringer <mailto:craig(at)2ndquadrant(dot)com>
> Thursday, June 20, 2013 7:24 PM
> Hi all
>
> Since 8.4, PostgreSQL has had extremely useful window function support -
> but support for "RANGE PRECEDING / FOLLOWING" windows was dropped late
> in 8.4's development in order to get the rest of the feature in, per
> http://archives.postgresql.org/pgsql-hackers/2010-02/msg00540.php.
>
> It looks like there was discussion of requiring a new opclass to be
> declared for types or otherwise extending opclasses to provide the
> information required for RANGE ... PRECEDING / FOLLOWING (
> http://www.postgresql.org/message-id/20100211201444.GA28270@svana.org )
> . I can't find any sign that it went anywhere beyond some broad
> discussion:
> http://www.postgresql.org/message-id/13993.1265920013@sss.pgh.pa.us at
> the time.
>
> I've missed this feature more than once, and am curious about whether
> any more recent changes may have made it cleaner to tackle this, or
> whether consensus can be formed on adding the new entries to btree's
> opclass to avoid the undesirable explicit lookups of the '+' and '-'
> oprators.
>
> Some question seems to remain open about how ranges over
> timestamps/intervals should work, but this wasn't elaborated on.
>
> There's been interest in this, eg:
>
> http://pgsql.hackers.free-usenet.eu/[HACKERS]-range-intervals-in-window-function-frames_T66085695_S1
>
> http://grokbase.com/t/postgresql/pgsql-general/105a89gm2n/postgresql-9-0-support-for-range-value-preceding-window-functions
>
>
>


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Ian Link <ian(at)ilink(dot)io>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Subject: Re: Support for RANGE ... PRECEDING windows in OVER
Date: 2013-06-21 02:37:49
Message-ID: 51C3BC7D.6070203@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 06/21/2013 10:31 AM, Ian Link wrote:
> I am currently looking into this feature. However, as I am quite new to
> Postgres, I think it might take me a while to get up to speed. Anyways,
> I would also appreciate another round of discussion on the future of the
> windowing functions.

Good to know, and welcome.

I hope the links to the archived discussions on the matter were useful
to you.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Ian Link <ian(at)ilink(dot)io>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Subject: Re: Support for RANGE ... PRECEDING windows in OVER
Date: 2013-06-21 02:53:01
Message-ID: 51C3C00D.9040800@ilink.io
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thanks! The discussions have been useful, although I am currently just
reviewing the code.
I think a good starting point will be to refactor/imrpove the
WinGetFuncArgInPartition and WinGetFuncArgInFrame functions.
Tom Lane wrote this about them before comitting the patch:

*

I'm not terribly happy with the changes you made in WinGetFuncArgInPartition

and WinGetFuncArgInFrame to force the window function mark to not go

past frame start in some modes. Not only is that pretty ugly, but I

think it can mask bugs in window functions: it's an error for a window

function to fetch a row before what it has set its mark to be, but in

some cases that wouldn't be detected because of this change. I think

it would be better to revert those changes and find another method of

protecting fetches needed to determine the frame head. One idea is

to create a separate read pointer that tracks the frame head whenever

actual fetches of the frame head might be needed by update_frameheadpos.

I committed it without changing that, but I think this should be

revisited before trying to add the RANGE value PRECEDING/FOLLOWING

options, because those will substantially expand the number of cases

where that hack affects the behavior.*

I am honestly not 100% certain why these functions have issues, but this
seems a good place to start investigating.

Ian Link

> Craig Ringer <mailto:craig(at)2ndquadrant(dot)com>
> Thursday, June 20, 2013 7:37 PM
>
> Good to know, and welcome.
>
> I hope the links to the archived discussions on the matter were useful
> to you.
>
> Craig Ringer <mailto:craig(at)2ndquadrant(dot)com>
> Thursday, June 20, 2013 7:24 PM
> Hi all
>
> Since 8.4, PostgreSQL has had extremely useful window function support -
> but support for "RANGE PRECEDING / FOLLOWING" windows was dropped late
> in 8.4's development in order to get the rest of the feature in, per
> http://archives.postgresql.org/pgsql-hackers/2010-02/msg00540.php.
>
> It looks like there was discussion of requiring a new opclass to be
> declared for types or otherwise extending opclasses to provide the
> information required for RANGE ... PRECEDING / FOLLOWING (
> http://www.postgresql.org/message-id/20100211201444.GA28270@svana.org )
> . I can't find any sign that it went anywhere beyond some broad
> discussion:
> http://www.postgresql.org/message-id/13993.1265920013@sss.pgh.pa.us at
> the time.
>
> I've missed this feature more than once, and am curious about whether
> any more recent changes may have made it cleaner to tackle this, or
> whether consensus can be formed on adding the new entries to btree's
> opclass to avoid the undesirable explicit lookups of the '+' and '-'
> oprators.
>
> Some question seems to remain open about how ranges over
> timestamps/intervals should work, but this wasn't elaborated on.
>
> There's been interest in this, eg:
>
> http://pgsql.hackers.free-usenet.eu/[HACKERS]-range-intervals-in-window-function-frames_T66085695_S1
>
> http://grokbase.com/t/postgresql/pgsql-general/105a89gm2n/postgresql-9-0-support-for-range-value-preceding-window-functions
>
>
>


From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for RANGE ... PRECEDING windows in OVER
Date: 2013-06-21 09:32:01
Message-ID: CAP7QgmkRUoQuhHqr_GGdgJvwi9=NarQ8Ek9zwfxXU+2C-PGd6Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jun 20, 2013 at 7:24 PM, Craig Ringer <craig(at)2ndquadrant(dot)com>wrote:
>
> I've missed this feature more than once, and am curious about whether
> any more recent changes may have made it cleaner to tackle this, or
> whether consensus can be formed on adding the new entries to btree's
> opclass to avoid the undesirable explicit lookups of the '+' and '-'
> oprators.
>
>

As far as I know the later development didn't add anything to help this
conversation. I initially thought range type or knn gist would add
something, but they were something else far from this. On the other hand,
if this makes it, it'll also open doors to range PARTITION BY for CREATE
TABLE command, so the impact will be bigger than you may think.

I also later found that we are missing not only notion of '+' or '-', but
also notion of 'zero value' in our catalog. Per spec, RANGE BETWEEN needs
to detect ERROR if the offset value is negative, but it is not always easy
if you think about interval, numeric types as opposed to int64 used in ROWS
BETWEEN.

Thanks,
--
Hitoshi Harada


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for RANGE ... PRECEDING windows in OVER
Date: 2013-06-21 10:20:29
Message-ID: 51C428ED.10902@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 06/21/2013 05:32 PM, Hitoshi Harada wrote:

> I also later found that we are missing not only notion of '+' or '-',
> but also notion of 'zero value' in our catalog. Per spec, RANGE BETWEEN
> needs to detect ERROR if the offset value is negative, but it is not
> always easy if you think about interval, numeric types as opposed to
> int64 used in ROWS BETWEEN.

Zero can be tested for with `val = (@ val)` ie `val = abs(val)`. That
should make sense for any type in which the concept of zero makes sense.

Thanks for the warning on that issue.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for RANGE ... PRECEDING windows in OVER
Date: 2013-06-21 11:35:39
Message-ID: CAP7QgmkfHGnzXderNj8m4UsgSWZj5a6A+X5vrfw7owx9yrNJ=A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jun 21, 2013 at 3:20 AM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:

> On 06/21/2013 05:32 PM, Hitoshi Harada wrote:
>
> > I also later found that we are missing not only notion of '+' or '-',
> > but also notion of 'zero value' in our catalog. Per spec, RANGE BETWEEN
> > needs to detect ERROR if the offset value is negative, but it is not
> > always easy if you think about interval, numeric types as opposed to
> > int64 used in ROWS BETWEEN.
>
> Zero can be tested for with `val = (@ val)` ie `val = abs(val)`. That
> should make sense for any type in which the concept of zero makes sense.
>
>
> Yeah, I mean, it needs to know if offset is negative or not by testing
with zero. So we need "zero value" or "is_negative function" for each type.

Thanks,
--
Hitoshi Harada


From: ian link <ian(at)ilink(dot)io>
To: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for RANGE ... PRECEDING windows in OVER
Date: 2013-06-21 19:30:19
Message-ID: CAOOwM5Lz+3ikgJQDpaebMgrZHtby0fihCXQgtLHmFjczaciyJw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Forgive my ignorance, but I don't entirely understand the problem. What
does '+' and '-' refer to exactly?
Thanks!

On Fri, Jun 21, 2013 at 4:35 AM, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>wrote:

>
>
>
> On Fri, Jun 21, 2013 at 3:20 AM, Craig Ringer <craig(at)2ndquadrant(dot)com>wrote:
>
>> On 06/21/2013 05:32 PM, Hitoshi Harada wrote:
>>
>> > I also later found that we are missing not only notion of '+' or '-',
>> > but also notion of 'zero value' in our catalog. Per spec, RANGE BETWEEN
>> > needs to detect ERROR if the offset value is negative, but it is not
>> > always easy if you think about interval, numeric types as opposed to
>> > int64 used in ROWS BETWEEN.
>>
>> Zero can be tested for with `val = (@ val)` ie `val = abs(val)`. That
>> should make sense for any type in which the concept of zero makes sense.
>>
>>
>> Yeah, I mean, it needs to know if offset is negative or not by testing
> with zero. So we need "zero value" or "is_negative function" for each type.
>
> Thanks,
> --
> Hitoshi Harada
>


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: ian link <ian(at)ilink(dot)io>
Cc: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for RANGE ... PRECEDING windows in OVER
Date: 2013-06-22 03:41:19
Message-ID: 51C51CDF.4060908@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 06/22/2013 03:30 AM, ian link wrote:
> Forgive my ignorance, but I don't entirely understand the problem. What
> does '+' and '-' refer to exactly?

Consider "RANGE 4.5 PRECEDING'.

You need to be able to test whether, for the current row 'b', any given
row 'a' is within the range (b - 4.5) < a <= b . Not 100% sure about the
< vs <= boundaries, but that's irrelevant for the example.

To test that, you have to be able to do two things: you have to be able
to test whether one value is greater than another, and you have to be
able to add or subtract a constant from one of the values.

Right now, the b-tree access method provides information on the ordering
operators < <= = > >= <> , which provides half the answer. But these
don't give any concept of *distance* - you can test ordinality but not
cardinality.

To implement the "different by 4.5" part, you have to be able to add 4.5
to one value or subtract it from the other.

The obvious way to do that is to look up the function that implements
the '+' or '-' operator, and do:

((OPERATOR(+))(a, 4.5)) > b AND (a <= b)

or

((OPERATOR(-))(b, 4.5)) < a AND (a <= b);

The problem outlined by Tom in prior discussion about this is that
PostgreSQL tries really hard not to assume that particular operator
names mean particular things. Rather than "knowing" that "+" is always
"an operator that adds two values together; is transitive, symmetric and
reflexive", PostgreSQL requires that you define an *operator class* that
names the operator that has those properties.

Or at least, it does for less-than, less-than-or-equals, equals,
greater-than-or-equals, greater-than, and not-equals as part of the
b-tree operator class, which *usually* defines these operators as < <= =
>= > <>, but you could use any operator names you wanted if you really
liked.

Right now (as far as I know) there's no operator class that lets you
identify operators for addition and subtraction in a similar way. So
it's necessary to either add such an operator class (in which case
support has to be added for it for every type), extend the existing
b-tree operator class to provide the info, or blindly assume that "+"
and "-" are always addition and subtraction.

For an example of why such assumptions are a bad idea, consider matrix
multiplication. Normally, "a * b" = "b * a", but this isn't true for
multiplication of matrices. Similarly, if someone defined a "+" operator
as an alias for string concatenation (||), we'd be totally wrong to
assume we could use that for doing range-offset windowing.

So. Yeah. Operator classes required, unless we're going to change the
rules and make certain operator names "special" in PostgreSQL, so that
if you implement them they *must* have certain properties. This seems
like a pretty poor reason to add such a big change.

I hope this explanation (a) is actually correct and (b) is helpful.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: ian link <ian(at)ilink(dot)io>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for RANGE ... PRECEDING windows in OVER
Date: 2013-06-22 23:38:49
Message-ID: CAOOwM5JMXfRSjmsa6VRN7ABeGPQXShqphFJD-coK6iQOim8-ow@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thanks Craig! That definitely does help. I probably still have some
questions but I think I will read through the rest of the code before
asking. Thanks again!

Ian

> Craig Ringer
> Friday, June 21, 2013 8:41 PM
>
> On 06/22/2013 03:30 AM, ian link wrote:
>>
>> Forgive my ignorance, but I don't entirely understand the problem. What
>> does '+' and '-' refer to exactly?
>
> Consider "RANGE 4.5 PRECEDING'.
>
> You need to be able to test whether, for the current row 'b', any given
> row 'a' is within the range (b - 4.5) < a <= b . Not 100% sure about the
> < vs <= boundaries, but that's irrelevant for the example.
>
> To test that, you have to be able to do two things: you have to be able
> to test whether one value is greater than another, and you have to be
> able to add or subtract a constant from one of the values.
>
> Right now, the b-tree access method provides information on the ordering
> operators < <= = > >= <> , which provides half the answer. But these
> don't give any concept of *distance* - you can test ordinality but not
> cardinality.
>
> To implement the "different by 4.5" part, you have to be able to add 4.5
> to one value or subtract it from the other.
>
> The obvious way to do that is to look up the function that implements
> the '+' or '-' operator, and do:
>
> ((OPERATOR(+))(a, 4.5)) > b AND (a <= b)
>
> or
>
> ((OPERATOR(-))(b, 4.5)) < a AND (a <= b);
>
> The problem outlined by Tom in prior discussion about this is that
> PostgreSQL tries really hard not to assume that particular operator
> names mean particular things. Rather than "knowing" that "+" is always
> "an operator that adds two values together; is transitive, symmetric and
> reflexive", PostgreSQL requires that you define an *operator class* that
> names the operator that has those properties.
>
> Or at least, it does for less-than, less-than-or-equals, equals,
> greater-than-or-equals, greater-than, and not-equals as part of the
> b-tree operator class, which *usually* defines these operators as < <= =
>>
>> = > <>, but you could use any operator names you wanted if you really
>
> liked.
>
> Right now (as far as I know) there's no operator class that lets you
> identify operators for addition and subtraction in a similar way. So
> it's necessary to either add such an operator class (in which case
> support has to be added for it for every type), extend the existing
> b-tree operator class to provide the info, or blindly assume that "+"
> and "-" are always addition and subtraction.
>
> For an example of why such assumptions are a bad idea, consider matrix
> multiplication. Normally, "a * b" = "b * a", but this isn't true for
> multiplication of matrices. Similarly, if someone defined a "+" operator
> as an alias for string concatenation (||), we'd be totally wrong to
> assume we could use that for doing range-offset windowing.
>
> So. Yeah. Operator classes required, unless we're going to change the
> rules and make certain operator names "special" in PostgreSQL, so that
> if you implement them they *must* have certain properties. This seems
> like a pretty poor reason to add such a big change.
>
> I hope this explanation (a) is actually correct and (b) is helpful.
>
> ian link
> Friday, June 21, 2013 12:30 PM
> Forgive my ignorance, but I don't entirely understand the problem. What
does '+' and '-' refer to exactly?
> Thanks!
>
>
>
> Hitoshi Harada
> Friday, June 21, 2013 4:35 AM
>
>
>
On 06/22/2013 03:30 AM, ian link wrote:
> Forgive my ignorance, but I don't entirely understand the problem. What
> does '+' and '-' refer to exactly?

Consider "RANGE 4.5 PRECEDING'.

You need to be able to test whether, for the current row 'b', any given
row 'a' is within the range (b - 4.5) < a <= b . Not 100% sure about the
< vs <= boundaries, but that's irrelevant for the example.

To test that, you have to be able to do two things: you have to be able
to test whether one value is greater than another, and you have to be
able to add or subtract a constant from one of the values.

Right now, the b-tree access method provides information on the ordering
operators < <= = > >= <> , which provides half the answer. But these
don't give any concept of *distance* - you can test ordinality but not
cardinality.

To implement the "different by 4.5" part, you have to be able to add 4.5
to one value or subtract it from the other.

The obvious way to do that is to look up the function that implements
the '+' or '-' operator, and do:

((OPERATOR(+))(a, 4.5)) > b AND (a <= b)

or

((OPERATOR(-))(b, 4.5)) < a AND (a <= b);

The problem outlined by Tom in prior discussion about this is that
PostgreSQL tries really hard not to assume that particular operator
names mean particular things. Rather than "knowing" that "+" is always
"an operator that adds two values together; is transitive, symmetric and
reflexive", PostgreSQL requires that you define an *operator class* that
names the operator that has those properties.

Or at least, it does for less-than, less-than-or-equals, equals,
greater-than-or-equals, greater-than, and not-equals as part of the
b-tree operator class, which *usually* defines these operators as < <= =
>= > <>, but you could use any operator names you wanted if you really
liked.

Right now (as far as I know) there's no operator class that lets you
identify operators for addition and subtraction in a similar way. So
it's necessary to either add such an operator class (in which case
support has to be added for it for every type), extend the existing
b-tree operator class to provide the info, or blindly assume that "+"
and "-" are always addition and subtraction.

For an example of why such assumptions are a bad idea, consider matrix
multiplication. Normally, "a * b" = "b * a", but this isn't true for
multiplication of matrices. Similarly, if someone defined a "+" operator
as an alias for string concatenation (||), we'd be totally wrong to
assume we could use that for doing range-offset windowing.

So. Yeah. Operator classes required, unless we're going to change the
rules and make certain operator names "special" in PostgreSQL, so that
if you implement them they *must* have certain properties. This seems
like a pretty poor reason to add such a big change.

I hope this explanation (a) is actually correct and (b) is helpful.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
image/jpeg 1.6 KB
image/jpeg 770 bytes
image/jpeg 1.5 KB

From: ian link <ian(at)ilink(dot)io>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for RANGE ... PRECEDING windows in OVER
Date: 2013-07-01 03:54:51
Message-ID: CAOOwM5KajDqb=M8EXoQaV2nmDAo8AvE8H26HnxC_obtQf2_DNw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I found some time and I think I am up to speed now. I finally figured out
how to add new operator strategies and made a little test operator for
myself.

It seems pretty clear that assuming '+' and '-' are addition and
subtraction is a bad idea. I don't think it would be too tricky to add
support for new operator strategies. Andrew Gierth suggested calling these
new strategies "offset -" and "offset +", which I think describes it pretty
well. I assigned the operator itself to be "@+" and "@-" but that can
obviously be changed. If this sounds like a good path to you guys, I will
go ahead and implement the operators for the appropriate types. Please let
me know if I am misunderstanding something - I am still figuring stuff out
:)

Aside from the opclass stuff, there were some other important issues
mentioned with the original RANGE support. I think I will address those
after the opclass stuff is done.

Thanks!
Ian

On Sat, Jun 22, 2013 at 4:38 PM, ian link <ian(at)ilink(dot)io> wrote:

> Thanks Craig! That definitely does help. I probably still have some
> questions but I think I will read through the rest of the code before
> asking. Thanks again!
>
> Ian
>
> > Craig Ringer
> > Friday, June 21, 2013 8:41 PM
>
> >
> > On 06/22/2013 03:30 AM, ian link wrote:
> >>
> >> Forgive my ignorance, but I don't entirely understand the problem. What
> >> does '+' and '-' refer to exactly?
> >
> > Consider "RANGE 4.5 PRECEDING'.
> >
> > You need to be able to test whether, for the current row 'b', any given
> > row 'a' is within the range (b - 4.5) < a <= b . Not 100% sure about the
> > < vs <= boundaries, but that's irrelevant for the example.
> >
> > To test that, you have to be able to do two things: you have to be able
> > to test whether one value is greater than another, and you have to be
> > able to add or subtract a constant from one of the values.
> >
> > Right now, the b-tree access method provides information on the ordering
> > operators < <= = > >= <> , which provides half the answer. But these
> > don't give any concept of *distance* - you can test ordinality but not
> > cardinality.
> >
> > To implement the "different by 4.5" part, you have to be able to add 4.5
> > to one value or subtract it from the other.
> >
> > The obvious way to do that is to look up the function that implements
> > the '+' or '-' operator, and do:
> >
> > ((OPERATOR(+))(a, 4.5)) > b AND (a <= b)
> >
> > or
> >
> > ((OPERATOR(-))(b, 4.5)) < a AND (a <= b);
> >
> > The problem outlined by Tom in prior discussion about this is that
> > PostgreSQL tries really hard not to assume that particular operator
> > names mean particular things. Rather than "knowing" that "+" is always
> > "an operator that adds two values together; is transitive, symmetric and
> > reflexive", PostgreSQL requires that you define an *operator class* that
> > names the operator that has those properties.
> >
> > Or at least, it does for less-than, less-than-or-equals, equals,
> > greater-than-or-equals, greater-than, and not-equals as part of the
> > b-tree operator class, which *usually* defines these operators as < <= =
> >>
> >> = > <>, but you could use any operator names you wanted if you really
> >
> > liked.
> >
> > Right now (as far as I know) there's no operator class that lets you
> > identify operators for addition and subtraction in a similar way. So
> > it's necessary to either add such an operator class (in which case
> > support has to be added for it for every type), extend the existing
> > b-tree operator class to provide the info, or blindly assume that "+"
> > and "-" are always addition and subtraction.
> >
> > For an example of why such assumptions are a bad idea, consider matrix
> > multiplication. Normally, "a * b" = "b * a", but this isn't true for
> > multiplication of matrices. Similarly, if someone defined a "+" operator
> > as an alias for string concatenation (||), we'd be totally wrong to
> > assume we could use that for doing range-offset windowing.
> >
> > So. Yeah. Operator classes required, unless we're going to change the
> > rules and make certain operator names "special" in PostgreSQL, so that
> > if you implement them they *must* have certain properties. This seems
> > like a pretty poor reason to add such a big change.
> >
> > I hope this explanation (a) is actually correct and (b) is helpful.
> >
> > ian link
> > Friday, June 21, 2013 12:30 PM
>
> > Forgive my ignorance, but I don't entirely understand the problem. What
> does '+' and '-' refer to exactly?
> > Thanks!
> >
> >
> >
> > Hitoshi Harada
> > Friday, June 21, 2013 4:35 AM
> >
> >
> >
>
> On 06/22/2013 03:30 AM, ian link wrote:
> > Forgive my ignorance, but I don't entirely understand the problem. What
> > does '+' and '-' refer to exactly?
>
> Consider "RANGE 4.5 PRECEDING'.
>
> You need to be able to test whether, for the current row 'b', any given
> row 'a' is within the range (b - 4.5) < a <= b . Not 100% sure about the
> < vs <= boundaries, but that's irrelevant for the example.
>
> To test that, you have to be able to do two things: you have to be able
> to test whether one value is greater than another, and you have to be
> able to add or subtract a constant from one of the values.
>
> Right now, the b-tree access method provides information on the ordering
> operators < <= = > >= <> , which provides half the answer. But these
> don't give any concept of *distance* - you can test ordinality but not
> cardinality.
>
> To implement the "different by 4.5" part, you have to be able to add 4.5
> to one value or subtract it from the other.
>
> The obvious way to do that is to look up the function that implements
> the '+' or '-' operator, and do:
>
> ((OPERATOR(+))(a, 4.5)) > b AND (a <= b)
>
> or
>
> ((OPERATOR(-))(b, 4.5)) < a AND (a <= b);
>
> The problem outlined by Tom in prior discussion about this is that
> PostgreSQL tries really hard not to assume that particular operator
> names mean particular things. Rather than "knowing" that "+" is always
> "an operator that adds two values together; is transitive, symmetric and
> reflexive", PostgreSQL requires that you define an *operator class* that
> names the operator that has those properties.
>
> Or at least, it does for less-than, less-than-or-equals, equals,
> greater-than-or-equals, greater-than, and not-equals as part of the
> b-tree operator class, which *usually* defines these operators as < <= =
> >= > <>, but you could use any operator names you wanted if you really
> liked.
>
> Right now (as far as I know) there's no operator class that lets you
> identify operators for addition and subtraction in a similar way. So
> it's necessary to either add such an operator class (in which case
> support has to be added for it for every type), extend the existing
> b-tree operator class to provide the info, or blindly assume that "+"
> and "-" are always addition and subtraction.
>
> For an example of why such assumptions are a bad idea, consider matrix
> multiplication. Normally, "a * b" = "b * a", but this isn't true for
> multiplication of matrices. Similarly, if someone defined a "+" operator
> as an alias for string concatenation (||), we'd be totally wrong to
> assume we could use that for doing range-offset windowing.
>
> So. Yeah. Operator classes required, unless we're going to change the
> rules and make certain operator names "special" in PostgreSQL, so that
> if you implement them they *must* have certain properties. This seems
> like a pretty poor reason to add such a big change.
>
> I hope this explanation (a) is actually correct and (b) is helpful.
>
> --
> Craig Ringer http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: ian link <ian(at)ilink(dot)io>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for RANGE ... PRECEDING windows in OVER
Date: 2013-07-01 04:56:18
Message-ID: 51D10BF2.6040701@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 06/30/2013 08:54 PM, ian link wrote:
> I found some time and I think I am up to speed now. I finally figured out
> how to add new operator strategies and made a little test operator for
> myself.
>
> It seems pretty clear that assuming '+' and '-' are addition and
> subtraction is a bad idea. I don't think it would be too tricky to add
> support for new operator strategies. Andrew Gierth suggested calling these
> new strategies "offset -" and "offset +", which I think describes it pretty
> well. I assigned the operator itself to be "@+" and "@-" but that can
> obviously be changed. If this sounds like a good path to you guys, I will
> go ahead and implement the operators for the appropriate types. Please let
> me know if I am misunderstanding something - I am still figuring stuff out
> :)
>
> Aside from the opclass stuff, there were some other important issues
> mentioned with the original RANGE support. I think I will address those
> after the opclass stuff is done.

Are these things you plan to get done this week, or for next CommitFest?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: ian link <ian(at)ilink(dot)io>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for RANGE ... PRECEDING windows in OVER
Date: 2013-07-01 07:05:36
Message-ID: CAOOwM5LZ7E0SHa7nSYs44iXqWE20CAShDekOVtnZ6R8O5h0Bgg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Definitely not this week. Hopefully for next commit fest.

On Sun, Jun 30, 2013 at 9:56 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:

> On 06/30/2013 08:54 PM, ian link wrote:
> > I found some time and I think I am up to speed now. I finally figured out
> > how to add new operator strategies and made a little test operator for
> > myself.
> >
> > It seems pretty clear that assuming '+' and '-' are addition and
> > subtraction is a bad idea. I don't think it would be too tricky to add
> > support for new operator strategies. Andrew Gierth suggested calling
> these
> > new strategies "offset -" and "offset +", which I think describes it
> pretty
> > well. I assigned the operator itself to be "@+" and "@-" but that can
> > obviously be changed. If this sounds like a good path to you guys, I will
> > go ahead and implement the operators for the appropriate types. Please
> let
> > me know if I am misunderstanding something - I am still figuring stuff
> out
> > :)
> >
> > Aside from the opclass stuff, there were some other important issues
> > mentioned with the original RANGE support. I think I will address those
> > after the opclass stuff is done.
>
> Are these things you plan to get done this week, or for next CommitFest?
>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com
>


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: ian link <ian(at)ilink(dot)io>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for RANGE ... PRECEDING windows in OVER
Date: 2013-07-01 18:39:43
Message-ID: CA+TgmoaN4LwxfaUFb=KOw+srXH4OMO8kgta=_Kuh9VWk9Os7RA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Jun 30, 2013 at 11:54 PM, ian link <ian(at)ilink(dot)io> wrote:
> I found some time and I think I am up to speed now. I finally figured out
> how to add new operator strategies and made a little test operator for
> myself.
>
> It seems pretty clear that assuming '+' and '-' are addition and subtraction
> is a bad idea. I don't think it would be too tricky to add support for new
> operator strategies. Andrew Gierth suggested calling these new strategies
> "offset -" and "offset +", which I think describes it pretty well. I
> assigned the operator itself to be "@+" and "@-" but that can obviously be
> changed. If this sounds like a good path to you guys, I will go ahead and
> implement the operators for the appropriate types. Please let me know if I
> am misunderstanding something - I am still figuring stuff out :)

I don't think I understand the design you have in mind. I'm actually
not clear that it would be all that bad to assume fixed operator
names, as we apparently do in a few places despite the existence of
operator classes. But if that is bad, then I don't know how using @+
and @- instead helps anything.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: ian link <ian(at)ilink(dot)io>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for RANGE ... PRECEDING windows in OVER
Date: 2013-07-01 19:28:42
Message-ID: 20130701192842.GM3757@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas escribió:
> On Sun, Jun 30, 2013 at 11:54 PM, ian link <ian(at)ilink(dot)io> wrote:

> > It seems pretty clear that assuming '+' and '-' are addition and subtraction
> > is a bad idea. I don't think it would be too tricky to add support for new
> > operator strategies. Andrew Gierth suggested calling these new strategies
> > "offset -" and "offset +", which I think describes it pretty well. I
> > assigned the operator itself to be "@+" and "@-" but that can obviously be
> > changed. If this sounds like a good path to you guys, I will go ahead and
> > implement the operators for the appropriate types. Please let me know if I
> > am misunderstanding something - I am still figuring stuff out :)
>
> I don't think I understand the design you have in mind. I'm actually
> not clear that it would be all that bad to assume fixed operator
> names, as we apparently do in a few places despite the existence of
> operator classes. But if that is bad, then I don't know how using @+
> and @- instead helps anything.

Yeah.

Currently, all operator classes are tied to access methods. Since
nobody seems to have any great idea about creating an access method that
requires addition and subtraction, would it make sense to have operator
classes that exist solely to support keeping track of such operators for
the various datatypes?

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: ian link <ian(at)ilink(dot)io>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for RANGE ... PRECEDING windows in OVER
Date: 2013-07-01 19:36:47
Message-ID: CA+Tgmoab=5u53D5-gc2D-T3v249pAhWxnywX6tMNr5Wy9Z1SUQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jul 1, 2013 at 3:28 PM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> Robert Haas escribió:
>> On Sun, Jun 30, 2013 at 11:54 PM, ian link <ian(at)ilink(dot)io> wrote:
>
>> > It seems pretty clear that assuming '+' and '-' are addition and subtraction
>> > is a bad idea. I don't think it would be too tricky to add support for new
>> > operator strategies. Andrew Gierth suggested calling these new strategies
>> > "offset -" and "offset +", which I think describes it pretty well. I
>> > assigned the operator itself to be "@+" and "@-" but that can obviously be
>> > changed. If this sounds like a good path to you guys, I will go ahead and
>> > implement the operators for the appropriate types. Please let me know if I
>> > am misunderstanding something - I am still figuring stuff out :)
>>
>> I don't think I understand the design you have in mind. I'm actually
>> not clear that it would be all that bad to assume fixed operator
>> names, as we apparently do in a few places despite the existence of
>> operator classes. But if that is bad, then I don't know how using @+
>> and @- instead helps anything.
>
> Yeah.
>
> Currently, all operator classes are tied to access methods. Since
> nobody seems to have any great idea about creating an access method that
> requires addition and subtraction, would it make sense to have operator
> classes that exist solely to support keeping track of such operators for
> the various datatypes?

I suppose if we really wanted to do this, it would make more sense to
have a new kind of object, maybe CREATE TYPE INTERFACE, rather than
shoehorning it into the operator class machinery. It seems like a
fairly heavyweight solution, however.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: ian link <ian(at)ilink(dot)io>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for RANGE ... PRECEDING windows in OVER
Date: 2013-07-01 23:01:10
Message-ID: 51D20A36.3050309@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 07/01/2013 12:05 AM, ian link wrote:
> Definitely not this week. Hopefully for next commit fest.
>

OK, marked "Returned with Feedback". It'll be up to you to add it to
the next commitfest if you think it's ready by then.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, ian link <ian(at)ilink(dot)io>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for RANGE ... PRECEDING windows in OVER
Date: 2013-07-01 23:24:47
Message-ID: 15862.1372721087@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Mon, Jul 1, 2013 at 3:28 PM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
>> Currently, all operator classes are tied to access methods. Since
>> nobody seems to have any great idea about creating an access method that
>> requires addition and subtraction, would it make sense to have operator
>> classes that exist solely to support keeping track of such operators for
>> the various datatypes?

We certainly could envision adding "+" and "-" items to btree opfamilies,
with the proviso that they'd have to be optional so as not to break
existing extensions that create btree opfamilies. If we went that way,
I'd suggest that what we actually add to the definition is *not*
operators, but functions --- that is, specify that function slots 2 and
3 can contain addition (resp subtraction) functions that are compatible
with the family's ordering behavior. Otherwise you'd have the planner
trying to match WHERE clauses to the extra operators, which is somewhere
between wrong and dangerous. But a function that isn't actually called
by the index AM is not going to pose a hazard of being misapplied.
Besides, it'd likely be easier not harder for the window-function
machinery to work with a function than an operator.

> I suppose if we really wanted to do this, it would make more sense to
> have a new kind of object, maybe CREATE TYPE INTERFACE, rather than
> shoehorning it into the operator class machinery. It seems like a
> fairly heavyweight solution, however.

Yeah, there's something to be said for not wedging this into the index
opclass infrastructure. I'd be happier about building such new
infrastructure if we could unify this requirement with Peter's
"transforms" feature, which also seems to need more datatype-related
knowledge than we have in the catalogs now.

regards, tom lane


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: ian link <ian(at)ilink(dot)io>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for RANGE ... PRECEDING windows in OVER
Date: 2013-07-02 00:32:30
Message-ID: 51D21F9E.2030405@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 07/02/2013 02:39 AM, Robert Haas wrote:
> I'm actually
> not clear that it would be all that bad to assume fixed operator
> names, as we apparently do in a few places despite the existence of
> operator classes. But if that is bad, then I don't know how using @+
> and @- instead helps anything.

Personally I'm not clear why it's bad to reserve certain fundamental
operators like '+' and '-', requiring that they have particular semantics.

Want to use "+" as an alias for || because your Java programmers are
used to writing + for string concatenation? Um, don't do that.

Existing code would be unaffected since RANGE couldn't ever be used in
existing code. At worst, weird user-defined implementations of "+" and
"-" would result in bizarre window function behaviour if the operators
were unsuitable. Exceeding available memory could certainly be an issue
in cases like "+" as concatenation.

The main advantage I see of adding opclass entries for this is that it
makes it explicit when the operators have semantics suitable for use in
range windows.

I don't have a strong opinion on whether we should just use "+" and "-"
or whether we really need an opclass.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, ian link <ian(at)ilink(dot)io>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for RANGE ... PRECEDING windows in OVER
Date: 2013-07-02 04:03:10
Message-ID: 21336.1372737790@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Craig Ringer <craig(at)2ndquadrant(dot)com> writes:
> On 07/02/2013 02:39 AM, Robert Haas wrote:
>> I'm actually
>> not clear that it would be all that bad to assume fixed operator
>> names, as we apparently do in a few places despite the existence of
>> operator classes. But if that is bad, then I don't know how using @+
>> and @- instead helps anything.

> Personally I'm not clear why it's bad to reserve certain fundamental
> operators like '+' and '-', requiring that they have particular semantics.

It is bad. It's against project policy, not least because we have
assorted *existing* datatypes for which "obvious" operator names like
"=" do not have all the properties you might expect.

If you need a more concrete example of why that sort of thinking is
bad, you might consider the difference between < and ~<~ for type text.
If we hard-wired knowledge about operator behavior to operator names,
it would be impossible for the system to understand that both of those
operators represent sorting-related behaviors.

Or to be even more concrete: if we allow RANGE to suppose that there's
only one possible definition of "+" for a datatype, we're effectively
supposing that there's only one possible sort ordering for that type.
Which is already a wrong assumption, and has been since Postgres was
still at Berkeley. If you go this way, you won't be able to support
both WINDOW ... ORDER BY foo USING < RANGE ... and WINDOW ... ORDER BY
foo USING ~<~ RANGE ... because you won't know which addition operator
to apply.

(And yeah, I'm aware that the SQL standard only expects RANGE to support
sort keys that are of numeric, datetime, or interval type. I would hope
that we have higher expectations than that. Even if we don't, it's not
exactly hard to credit that people might have multiple ideas about how
to sort interval values.)

There are indeed still some places where we rely on operator names to
mean something, but we need to get away from that idea not add more.
Ideally, any property the system understands about an operator or
function should be explicitly declared through opclass membership or
some similar representation. We've made substantial progress in that
direction in the last fifteen years. I don't want to reverse that
progress in the name of minor expediencies, especially not ones that
fail to support flexibility that has been in the system for a couple
or three decades already.

regards, tom lane


From: ian link <ian(at)ilink(dot)io>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for RANGE ... PRECEDING windows in OVER
Date: 2013-07-03 00:23:27
Message-ID: CAOOwM5+=bodqTQMK2bt0MLQYb6YwWY0VvAbOJEKnSzXWf85wuw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I'm fine with moving the operators over to functions. I just don't want to
implement anything that is against best practice. If we are OK with that
direction, I'll go ahead and start on the new patch.

Ian

On Mon, Jul 1, 2013 at 9:03 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Craig Ringer <craig(at)2ndquadrant(dot)com> writes:
> > On 07/02/2013 02:39 AM, Robert Haas wrote:
> >> I'm actually
> >> not clear that it would be all that bad to assume fixed operator
> >> names, as we apparently do in a few places despite the existence of
> >> operator classes. But if that is bad, then I don't know how using @+
> >> and @- instead helps anything.
>
> > Personally I'm not clear why it's bad to reserve certain fundamental
> > operators like '+' and '-', requiring that they have particular
> semantics.
>
> It is bad. It's against project policy, not least because we have
> assorted *existing* datatypes for which "obvious" operator names like
> "=" do not have all the properties you might expect.
>
> If you need a more concrete example of why that sort of thinking is
> bad, you might consider the difference between < and ~<~ for type text.
> If we hard-wired knowledge about operator behavior to operator names,
> it would be impossible for the system to understand that both of those
> operators represent sorting-related behaviors.
>
> Or to be even more concrete: if we allow RANGE to suppose that there's
> only one possible definition of "+" for a datatype, we're effectively
> supposing that there's only one possible sort ordering for that type.
> Which is already a wrong assumption, and has been since Postgres was
> still at Berkeley. If you go this way, you won't be able to support
> both WINDOW ... ORDER BY foo USING < RANGE ... and WINDOW ... ORDER BY
> foo USING ~<~ RANGE ... because you won't know which addition operator
> to apply.
>
> (And yeah, I'm aware that the SQL standard only expects RANGE to support
> sort keys that are of numeric, datetime, or interval type. I would hope
> that we have higher expectations than that. Even if we don't, it's not
> exactly hard to credit that people might have multiple ideas about how
> to sort interval values.)
>
> There are indeed still some places where we rely on operator names to
> mean something, but we need to get away from that idea not add more.
> Ideally, any property the system understands about an operator or
> function should be explicitly declared through opclass membership or
> some similar representation. We've made substantial progress in that
> direction in the last fifteen years. I don't want to reverse that
> progress in the name of minor expediencies, especially not ones that
> fail to support flexibility that has been in the system for a couple
> or three decades already.
>
> regards, tom lane
>