Re: Underspecified window queries in regression tests

Lists: pgsql-hackers
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Subject: Underspecified window queries in regression tests
Date: 2011-10-14 20:32:54
Message-ID: 20703.1318624374@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

So I'm testing a patch to make the planner use measured all-visible-page
counts for index-only scans. I was expecting to possibly see some plan
changes in the regression tests, but this diff scared me:

***************
*** 906,921 ****
FROM tenk1 WHERE unique1 < 10;
sum | unique1
-----+---------
! 4 | 4
! 6 | 2
! 3 | 1
! 7 | 6
! 15 | 9
! 17 | 8
! 13 | 5
! 8 | 3
! 10 | 7
! 7 | 0
(10 rows)

CREATE TEMP VIEW v_window AS
--- 906,921 ----
FROM tenk1 WHERE unique1 < 10;
sum | unique1
-----+---------
! 0 | 0
! 1 | 1
! 3 | 2
! 5 | 3
! 7 | 4
! 9 | 5
! 11 | 6
! 13 | 7
! 15 | 8
! 17 | 9
(10 rows)

CREATE TEMP VIEW v_window AS

On inspection, though, there's no bug. The plan did change, and that
affected the order in which the rows are fetched, and that changed the
window function outputs because this test case is effectively using
SUM(x) OVER (ROWS 1 PRECEDING) without any ordering specifier.
There are several adjacent tests that are underspecified in the same
way, but their results didn't change because they aren't candidates for
index-only scans.

We could hack around this by adding more columns to the result so that
an index-only scan doesn't work. But I wonder whether it wouldn't be
smarter to add ORDER BY clauses to the window function calls. I've been
known to argue against adding just-in-case ORDER BYs to the regression
tests in the past; but these cases bother me more because a plan change
will not just rearrange the result rows but change their contents,
making it really difficult to verify that nothing's seriously wrong.

I can't recall whether there was some good reason for underspecifying
these test queries. It looks like all the problematic ones were added in
commit ec4be2ee6827b6bd85e0813c7a8993cfbb0e6fa7 "Extend the set of frame
options supported for window functions", which means it was either me or
Hitoshi-san who wrote them that way, but memory is not serving this
afternoon.

Opinions?

regards, tom lane


From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Underspecified window queries in regression tests
Date: 2011-10-15 04:04:44
Message-ID: CAP7QgmkSpHPrAsrVtW82gcWpJeRGfyab9_j4zBUtREbOTCJFJw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2011/10/15 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> I can't recall whether there was some good reason for underspecifying
> these test queries.  It looks like all the problematic ones were added in
> commit ec4be2ee6827b6bd85e0813c7a8993cfbb0e6fa7 "Extend the set of frame
> options supported for window functions", which means it was either me or
> Hitoshi-san who wrote them that way, but memory is not serving this
> afternoon.
>

I don't remember if I wrote that part or not, but I like to add
explicit ORDER BY to the test cases. It doesn't appear that some
reason stopped us to specify it. So +1 for adding the clauses.

Regards,
--
Hitoshi Harada


From: Greg Stark <stark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Subject: Re: Underspecified window queries in regression tests
Date: 2011-10-16 18:04:08
Message-ID: CAM-w4HMPL4D4ApaEieitrXqiEaWTh3L36oQcyF5DpKJys15ENw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Oct 14, 2011 at 9:32 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> We could hack around this by adding more columns to the result so that
> an index-only scan doesn't work.  But I wonder whether it wouldn't be
> smarter to add ORDER BY clauses to the window function calls.  I've been
> known to argue against adding just-in-case ORDER BYs to the regression
> tests in the past; but these cases bother me more because a plan change
> will not just rearrange the result rows but change their contents,
> making it really difficult to verify that nothing's seriously wrong.

I'm not sure if it applies to this case but I recall I was recently
running queries on Oracle that included window functions and it
wouldn't even let me run them without ORDER BY clauses in the window
definition. I don't know if it cleverly determines that the ORDER BY
will change the results or if Oracle just requires ORDER BY on all
window definitions or what.

--
greg


From: Florian Pflug <fgp(at)phlo(dot)org>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Subject: Re: Underspecified window queries in regression tests
Date: 2011-10-16 18:36:36
Message-ID: BEAC1AFC-8A2B-48FA-9682-993AF0A68427@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Oct16, 2011, at 20:04 , Greg Stark wrote:
> On Fri, Oct 14, 2011 at 9:32 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> We could hack around this by adding more columns to the result so that
>> an index-only scan doesn't work. But I wonder whether it wouldn't be
>> smarter to add ORDER BY clauses to the window function calls. I've been
>> known to argue against adding just-in-case ORDER BYs to the regression
>> tests in the past; but these cases bother me more because a plan change
>> will not just rearrange the result rows but change their contents,
>> making it really difficult to verify that nothing's seriously wrong.
>
> I'm not sure if it applies to this case but I recall I was recently
> running queries on Oracle that included window functions and it
> wouldn't even let me run them without ORDER BY clauses in the window
> definition. I don't know if it cleverly determines that the ORDER BY
> will change the results or if Oracle just requires ORDER BY on all
> window definitions or what.

I was about to point out that whether or not ORDER BY is required probably
ought to depend on whether the window function acts on the frame, or the
whole partition. And that oracle quite likely knows that for the individual
window functions while we don't.

But, actually, our documentation states in 3.5, Window Functions, that

By default, if ORDER BY is supplied then the frame consists of all rows
from the start of the partition up through the current row, plus any
following rows that are equal to the current row according to the ORDER BY
clause. When ORDER BY is omitted the default frame consists of all rows
in the partition.

Either I'm confused, or that doesn't match the observed regression test
failure.

I did a few experiments. Assume that table "d" contains three rows with
v=1, v=2 and v=3.

This case seems to work (the frame is always the whole partition, even
though the frame_clause says "between unbounded preceding and current row"):

postgres=# select array_agg(v) over (range between unbounded preceding
and current row) from d;
array_agg
-----------
{1,2,3}
{1,2,3}
{1,2,3}

Once one adds an ORDER BY, the frame_clause is taken into account, so
that works too:

postgres=# select array_agg(v) over (order by v range between unbounded preceding
and current row) from d;
array_agg
-----------
{1}
{1,2}
{1,2,3}

But some frame clauses (row 1 preceding, for example) have an effect despite
there being no ORDER BY, like here:

postgres=# select array_agg(v) over (rows 1 preceding) from d;
array_agg
-----------
{1}
{1,2}
{2,3}

ISTM that we probably should ignore frame clauses, unless there's an ORDER BY.
Or maybe even throw an error?

best regards,
Florian Pflug


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Underspecified window queries in regression tests
Date: 2011-10-16 22:03:13
Message-ID: 2997.1318802593@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> writes:
> 2011/10/15 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>> I can't recall whether there was some good reason for underspecifying
>> these test queries. It looks like all the problematic ones were added in
>> commit ec4be2ee6827b6bd85e0813c7a8993cfbb0e6fa7 "Extend the set of frame
>> options supported for window functions", which means it was either me or
>> Hitoshi-san who wrote them that way, but memory is not serving this
>> afternoon.

> I don't remember if I wrote that part or not, but I like to add
> explicit ORDER BY to the test cases. It doesn't appear that some
> reason stopped us to specify it. So +1 for adding the clauses.

I looked at this more closely and realized that the reason for doing it
like that was to test window frames defined using ROWS rather than
RANGE. If we fully specify the window function's input ordering then
there's no very interesting distinction between the two, because no rows
will have any peers. So adding ORDER BY would in fact reduce the scope
of the tests.

At this point I'm inclined to leave it alone. Maybe we could think of
some other test cases (perhaps using some other function than SUM) which
would both exercise the difference between RANGE and ROWS mode, and not
be sensitive to the detailed input ordering. But I doubt it's really
worth the trouble.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Greg Stark <stark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Subject: Re: Underspecified window queries in regression tests
Date: 2011-10-16 22:14:03
Message-ID: 3161.1318803243@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Florian Pflug <fgp(at)phlo(dot)org> writes:
> But some frame clauses (row 1 preceding, for example) have an effect despite
> there being no ORDER BY, like here:

Yeah, why did you expect differently? Without ORDER BY, all rows are
peers in the frame ordering, so there's no way for a RANGE spec to
select less than the whole partition. But with ROWS, you can select
less than that.

In general it's not that hard to create nondeterministic window-function
queries, since the SQL standard doesn't require you to specify a unique
ordering for the window function's input rows. Even in RANGE mode there
are plenty of functions that are sensitive to the exact ordering, eg
first_value/last_value. I guess the committee felt that locking this
down would restrict the feature too much.

regards, tom lane


From: Florian Pflug <fgp(at)phlo(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <stark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Subject: Re: Underspecified window queries in regression tests
Date: 2011-10-16 22:51:22
Message-ID: 9ED73442-4196-4513-B641-52E91E2746B0@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Oct17, 2011, at 00:14 , Tom Lane wrote:
> Florian Pflug <fgp(at)phlo(dot)org> writes:
>> But some frame clauses (row 1 preceding, for example) have an effect despite
>> there being no ORDER BY, like here:
>
> Yeah, why did you expect differently? Without ORDER BY, all rows are
> peers in the frame ordering, so there's no way for a RANGE spec to
> select less than the whole partition. But with ROWS, you can select
> less than that.

I was confused by

When an aggregate function is used as a window function, it aggregates over
the rows within the current row's window frame. [3.5, Window Functions].

combined with the part I quoted before, which was

By default, if ORDER BY is supplied then the frame consists of all rows
from the start of the partition up through the current row, plus any
following rows that are equal to the current row according to the ORDER BY
clause. When ORDER BY is omitted the default frame consists of all rows
in the partition. [9.19, Window Functions, Last Paragraph]

But, reading those parts again, I realize the it says "When ORDER BY is omitted
the *default* frame consists ... ", and that the second quote is followed
by a footnote which says

There are options to define the window frame in other ways, but this tutorial
does not cover them. See Section 4.2.8 for details. [3.5, Window Functions]

So it was just me being thick. Sorry for the noise.

best regards,
Florian Pflug


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Greg Stark <stark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Subject: Re: Underspecified window queries in regression tests
Date: 2011-10-16 23:09:03
Message-ID: 19579.1318806543@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Florian Pflug <fgp(at)phlo(dot)org> writes:
> ... reading those parts again, I realize the it says "When ORDER BY is omitted
> the *default* frame consists ... ", and that the second quote is followed
> by a footnote which says

> There are options to define the window frame in other ways, but this tutorial
> does not cover them. See Section 4.2.8 for details. [3.5, Window Functions]

> So it was just me being thick. Sorry for the noise.

Hmm. Maybe the use of a <footnote> there is too subtle, and we should
instead have that text in-line (probably in parentheses)? Or we could
use a <note>, but that's probably too much emphasis.

regards, tom lane


From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Underspecified window queries in regression tests
Date: 2011-10-16 23:37:41
Message-ID: CAP7Qgmn-CQ3LMn4wfXiPf1OGPbP+dqfhM6WR1OYSKYrDw7_AeQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2011/10/17 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> writes:
>> 2011/10/15 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>>> I can't recall whether there was some good reason for underspecifying
>>> these test queries.  It looks like all the problematic ones were added in
>>> commit ec4be2ee6827b6bd85e0813c7a8993cfbb0e6fa7 "Extend the set of frame
>>> options supported for window functions", which means it was either me or
>>> Hitoshi-san who wrote them that way, but memory is not serving this
>>> afternoon.
>
>> I don't remember if I wrote that part or not, but I like to add
>> explicit ORDER BY to the test cases. It doesn't appear that some
>> reason stopped us to specify it. So +1 for adding the clauses.
>
> I looked at this more closely and realized that the reason for doing it
> like that was to test window frames defined using ROWS rather than
> RANGE.  If we fully specify the window function's input ordering then
> there's no very interesting distinction between the two, because no rows
> will have any peers.  So adding ORDER BY would in fact reduce the scope
> of the tests.
>
> At this point I'm inclined to leave it alone.  Maybe we could think of
> some other test cases (perhaps using some other function than SUM) which
> would both exercise the difference between RANGE and ROWS mode, and not
> be sensitive to the detailed input ordering.  But I doubt it's really
> worth the trouble.

Ah, you mentioned about ORDER BY in window specification (OVER
clause). I thought it was query's ORDER BY. Yes, it affects in RANGE
case, and we don't have rich frame support of RANGE (like n PRECEDING
...) so the case ORDER BY affects result is limited. Agree with
leaving it alone.

Regards,
--
Hitoshi Harada


From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Underspecified window queries in regression tests
Date: 2011-10-16 23:47:29
Message-ID: CAP7Qgm=87FA0f=ECvUYP3zF+zth+nvnsxGxMxvGnSgQVnGURZA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2011/10/17 Greg Stark <stark(at)mit(dot)edu>:
> On Fri, Oct 14, 2011 at 9:32 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> We could hack around this by adding more columns to the result so that
>> an index-only scan doesn't work.  But I wonder whether it wouldn't be
>> smarter to add ORDER BY clauses to the window function calls.  I've been
>> known to argue against adding just-in-case ORDER BYs to the regression
>> tests in the past; but these cases bother me more because a plan change
>> will not just rearrange the result rows but change their contents,
>> making it really difficult to verify that nothing's seriously wrong.
>
> I'm not sure if it applies to this case but I recall I was recently
> running queries on Oracle that included window functions and it
> wouldn't even let me run them without ORDER BY clauses in the window
> definition. I don't know if it cleverly determines that the ORDER BY
> will change the results or if Oracle just requires ORDER BY on all
> window definitions or what.

AFAIK, the current standard doesn't tell clearly if all/some window
functions require ORDER BY clause in window specifications. Some
window functions like rank and row_number is meaningless if it is
omitted, so some implementation doesn't allow it omitted. And I
believe Oracle implemented it before the standard, so that'd be why
details are different from spec. We designed it per spec and omitting
the clause doesn't violate any part of the standard.

Regards,
--
Hitoshi Harada


From: Florian Pflug <fgp(at)phlo(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <stark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Subject: Re: Underspecified window queries in regression tests
Date: 2011-10-17 09:48:38
Message-ID: D91C5B29-8B22-4EC2-803D-B7C4677E8EA5@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Oct17, 2011, at 01:09 , Tom Lane wrote:
> Florian Pflug <fgp(at)phlo(dot)org> writes:
>> ... reading those parts again, I realize the it says "When ORDER BY is omitted
>> the *default* frame consists ... ", and that the second quote is followed
>> by a footnote which says
>
>> There are options to define the window frame in other ways, but this tutorial
>> does not cover them. See Section 4.2.8 for details. [3.5, Window Functions]
>
>> So it was just me being thick. Sorry for the noise.
>
> Hmm. Maybe the use of a <footnote> there is too subtle, and we should
> instead have that text in-line (probably in parentheses)? Or we could
> use a <note>, but that's probably too much emphasis.

Inline and in parentheses sounds fine.

In addition, I think we should reword the explanation in 4.2.8 (The SQL Language
/ SQL Syntax / Value Expressions / Window Functions). Instead of that rather
long (and IMHO hard to read) paragraph about possible frame clauses and their
behaviour in the presence or absence of an ORDER BY clause, we should go with
a more algorithmic explanation I think.

Something along these lines maybe:

----------
.) PARTITION BY splits the rows into disjoint partitions. All further processing
happens only inside a single partition

.) In RANGE mode, ORDER BY then splits each partition into an ordered list of
sub-partitions, each containing rows which the ORDER BY considers to be
equivalent.

.) In ROWS mode, OTOH, each sub-partition contains only a single row. Thus, if
there are rows which are considered to be equivalent by the ORDER BY, the
ordering of the sub-partition isn't fully determined.

.) Each row's frame then consists of some consecutive range of sub-partitions.

.) In RANGE mode, that consecutive range can only start at either the first
sub-partition or the current row's sub-partition, and can only end at either
the current row's sub-partition or the last sub-partitions.

.) In ROWS mode, the consecutive range may additional start <n> sub-partitions
(or rows, it's the same thing here) before the current row, and may additionally
end <m> sub-partitions/rows after the current row.

From that, it follows that even with an underspecified sort order, the contents of
each frame are still fully determined in RANGE mode. The ordering of rows within
a frame is not determined, though. So overall, in RANGE mode, a query's result is
only non-deterministic if the window function is sensitive to the ordering of rows
within a frame.

In ROWS mode, OTOH, the contents each frame themselves are not fully determined,
so even an ordering agnostic window function may produce non-deterministic results.
----------

If you think that something along these lines would be an improvement, I can try
to come up with a patch.

best regards,
Florian Pflug


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Greg Stark <stark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Subject: Re: Underspecified window queries in regression tests
Date: 2011-10-17 15:37:56
Message-ID: 14370.1318865876@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Florian Pflug <fgp(at)phlo(dot)org> writes:
> In addition, I think we should reword the explanation in 4.2.8 (The SQL Language
> / SQL Syntax / Value Expressions / Window Functions). Instead of that rather
> long (and IMHO hard to read) paragraph about possible frame clauses and their
> behaviour in the presence or absence of an ORDER BY clause, we should go with
> a more algorithmic explanation I think.

Hm. I do not think what you're proposing is a substitute for the text
that's now in 4.2.8, because that's mostly providing syntactic details.
It might make a good addition though.

> If you think that something along these lines would be an improvement, I can try
> to come up with a patch.

Sure.

regards, tom lane


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: Underspecified window queries in regression tests
Date: 2012-08-16 15:23:11
Message-ID: 20120816152311.GK8353@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


I have used your notes below to rewrite the Window function SQL manual
section. As you said, it was very hard to read. I now understand it
better, having restructured it, and I hope others do too.

After waiting 30 minutes for our developer doc build to refresh, I am
giving up and posting my own URL for the doc changes:

http://momjian.us/tmp/pgsql/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS

Perhaps I need to go back to having my own doc build.

---------------------------------------------------------------------------

On Mon, Oct 17, 2011 at 11:48:38AM +0200, Florian Pflug wrote:
> On Oct17, 2011, at 01:09 , Tom Lane wrote:
> > Florian Pflug <fgp(at)phlo(dot)org> writes:
> >> ... reading those parts again, I realize the it says "When ORDER BY is omitted
> >> the *default* frame consists ... ", and that the second quote is followed
> >> by a footnote which says
> >
> >> There are options to define the window frame in other ways, but this tutorial
> >> does not cover them. See Section 4.2.8 for details. [3.5, Window Functions]
> >
> >> So it was just me being thick. Sorry for the noise.
> >
> > Hmm. Maybe the use of a <footnote> there is too subtle, and we should
> > instead have that text in-line (probably in parentheses)? Or we could
> > use a <note>, but that's probably too much emphasis.
>
> Inline and in parentheses sounds fine.
>
> In addition, I think we should reword the explanation in 4.2.8 (The SQL Language
> / SQL Syntax / Value Expressions / Window Functions). Instead of that rather
> long (and IMHO hard to read) paragraph about possible frame clauses and their
> behaviour in the presence or absence of an ORDER BY clause, we should go with
> a more algorithmic explanation I think.
>
> Something along these lines maybe:
>
> ----------
> .) PARTITION BY splits the rows into disjoint partitions. All further processing
> happens only inside a single partition
>
> .) In RANGE mode, ORDER BY then splits each partition into an ordered list of
> sub-partitions, each containing rows which the ORDER BY considers to be
> equivalent.
>
> .) In ROWS mode, OTOH, each sub-partition contains only a single row. Thus, if
> there are rows which are considered to be equivalent by the ORDER BY, the
> ordering of the sub-partition isn't fully determined.
>
> .) Each row's frame then consists of some consecutive range of sub-partitions.
>
> .) In RANGE mode, that consecutive range can only start at either the first
> sub-partition or the current row's sub-partition, and can only end at either
> the current row's sub-partition or the last sub-partitions.
>
> .) In ROWS mode, the consecutive range may additional start <n> sub-partitions
> (or rows, it's the same thing here) before the current row, and may additionally
> end <m> sub-partitions/rows after the current row.
>
> >From that, it follows that even with an underspecified sort order, the contents of
> each frame are still fully determined in RANGE mode. The ordering of rows within
> a frame is not determined, though. So overall, in RANGE mode, a query's result is
> only non-deterministic if the window function is sensitive to the ordering of rows
> within a frame.
>
> In ROWS mode, OTOH, the contents each frame themselves are not fully determined,
> so even an ordering agnostic window function may produce non-deterministic results.
> ----------
>
> If you think that something along these lines would be an improvement, I can try
> to come up with a patch.
>
> best regards,
> Florian Pflug
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +