Re: Windowing Function Patch Review -> Standard Conformance

Lists: pgsql-hackers
From: "David Rowley" <dgrowley(at)gmail(dot)com>
To: "'Heikki Linnakangas'" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "'Hitoshi Harada'" <umi(dot)tanuki(at)gmail(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Windowing Function Patch Review -> Standard Conformance
Date: 2008-11-30 21:19:12
Message-ID: 3C88C2309A1A472CB2BDA7F79C8B3D72@amd64
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I wrote:
> I was also reading over the standard tonight. I've discovered that the
> OFFSET in LEAD() and LAG() is optional. It should default to 1 if it is
> not present. Oracle seems to support this.
>
> SQL2008 says:
> > If <lead or lag function> is specified, then:
> > i) Let VE1 be <lead or lag extent> and let DT be the declared type of
> VE1.
> > ii) Case:
> > Scalar expressions 209
> > WD 9075-2:200w(E)
> > 6.10 <window function>
> > If <offset> is specified, then let OFF be <offset>. The declared type of
> >OFF shall be an
> > exact numeric type with scale 0 (zero).
> > 1)
> > 2) Otherwise, let OFF be 1 (one).
>
> Yet another variant of LEAD() and LAG() but I think well worth it for both
> compliance to the standard and compatibility to Oracle.

I figured this was quite simple so I've created a patch to implement this.
Can probably put this down to the fact that I'm starting to feel bad about
pointing out the mistakes and having someone else fix them. Figured it was
time to make some changes myself.

I've got limited experience with diff so please let me know if there is
something wrong with the patch. Same goes for my changes to the code.

I re-sequenced the OIDs of other window functions so it will require initdb.

Also I made some updates to the documentation. Wasn't 100% sure on the
syntax for the optional arguments there. Hitoshi had: arg1 [,optional1].
I've changed this to arg, [optional1], [optional2].

One thing I didn't do was update the regression test:
SELECT oid, proname FROM pg_proc WHERE proiswfunc;

Hopefully this patch will apply after applying Heikki's latest patch
(version 3).

If you're happy with this Heikki can you merge to your patch?

David

Attachment Content-Type Size
windowfunc_nooffset1.patch application/octet-stream 13.3 KB

From: "Hitoshi Harada" <umi(dot)tanuki(at)gmail(dot)com>
To: "David Rowley" <dgrowley(at)gmail(dot)com>
Cc: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Windowing Function Patch Review -> Standard Conformance
Date: 2008-12-01 16:13:16
Message-ID: e08cc0400812010813u10eb395ewcc55881bce633ca5@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2008/12/1 David Rowley <dgrowley(at)gmail(dot)com>:
> I wrote:
>> I was also reading over the standard tonight. I've discovered that the
>> OFFSET in LEAD() and LAG() is optional. It should default to 1 if it is
>> not present. Oracle seems to support this.
>>
>> SQL2008 says:
>> > If <lead or lag function> is specified, then:
>> > i) Let VE1 be <lead or lag extent> and let DT be the declared type of
>> VE1.
>> > ii) Case:
>> > Scalar expressions 209
>> > WD 9075-2:200w(E)
>> > 6.10 <window function>
>> > If <offset> is specified, then let OFF be <offset>. The declared type of
>> >OFF shall be an
>> > exact numeric type with scale 0 (zero).
>> > 1)
>> > 2) Otherwise, let OFF be 1 (one).
>>
>> Yet another variant of LEAD() and LAG() but I think well worth it for both
>> compliance to the standard and compatibility to Oracle.
>
> I figured this was quite simple so I've created a patch to implement this.
> Can probably put this down to the fact that I'm starting to feel bad about
> pointing out the mistakes and having someone else fix them. Figured it was
> time to make some changes myself.
>
> I've got limited experience with diff so please let me know if there is
> something wrong with the patch. Same goes for my changes to the code.
>
> I re-sequenced the OIDs of other window functions so it will require initdb.
>
> Also I made some updates to the documentation. Wasn't 100% sure on the
> syntax for the optional arguments there. Hitoshi had: arg1 [,optional1].
> I've changed this to arg, [optional1], [optional2].
>
> One thing I didn't do was update the regression test:
> SELECT oid, proname FROM pg_proc WHERE proiswfunc;
>
> Hopefully this patch will apply after applying Heikki's latest patch
> (version 3).
>
> If you're happy with this Heikki can you merge to your patch?

I merged Heikki's patch with your lead/lag, then added a few tests for
those new comer functions. Also it contains some of those Tom pointed
including:

- Remove sgml keyword modifications as it will be generated automatically.
- Remove PartitionClause and OrderClause since they can be replaced
with SortGroupClause.
- Parallel test schedule changed to fit the parallel limit.
- equalfuncs/nodefuncs are now consistent in Query/WFunc
- Fix error code, which is now 42P36. But I'm still not sure it is appropriate.

And the patch is against HEAD. The git repository now points "spool"
branch for his approach, which I suppose will be merged to the master
(trunk) of the window functions repository.

I tested the spool performance with David's earlier bigtable:

CREATE TABLE bigtable (
id SERIAL NOT NULL PRIMARY KEY,
timestamp TIMESTAMP NOT NULL
);

-- about 383MB of data
INSERT INTO bigtable (timestamp)
SELECT NOW() + (CAST(RANDOM() * 10 AS INT) || ' secs')::INTERVAL
FROM generate_series(1,10000000);

CREATE INDEX bigtable_timestamp_idx ON bigtable (timestamp);

VACUUM ANALYZE bigtable;

sample=# SELECT COUNT(*) FROM bigtable;
count
----------
10000000
(1 row)

sample=# SELECT LEAD(timestamp) OVER (ORDER BY id) FROM bigtable LIMIT 1;
lead
----------------------------
2008-12-02 00:15:10.288461
(1 row)

sample=# EXPLAIN ANALYZE SELECT LEAD(timestamp) OVER (ORDER BY id)
FROM bigtable LIMIT 1;

QUERY PLAN

----------------------------------------------------------------------------------------------
---------------------------------------------------
Limit (cost=0.00..0.04 rows=1 width=12) (actual time=0.038..0.039
rows=1 loops=1)
-> Window (cost=0.00..386612.13 rows=10000000 width=12) (actual
time=0.036..0.036 rows=1
loops=1)
-> Index Scan using bigtable_pkey on bigtable
(cost=0.00..286612.13 rows=10000000 w
idth=12) (actual time=0.018..0.021 rows=2 loops=1)
Total runtime: 0.071 ms
(4 rows)

shows quite good result. Great work.

The following query works on my build:

> SELECT depname,SUM(SUM(salary)) OVER (ORDER BY depname) FROM empsalary GROUP
> BY depname;
> ERROR: variable not found in subplan target list

Now, I am thinking about refactoring around aggregate common code, and
renaming WFunc to WinFunc, which leads pg_proc.proiswfunc be
pg_proc.proiswinfunc and so on if no objections come.

P.S. seems hit attachment limitation. Sorry if you receive duplicated mail.

Regards,

--
Hitoshi Harada

Attachment Content-Type Size
window_functions.patch.20081202-1.gz application/x-gzip 42.9 KB

From: "Hitoshi Harada" <umi(dot)tanuki(at)gmail(dot)com>
To: "David Rowley" <dgrowley(at)gmail(dot)com>
Cc: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Windowing Function Patch Review -> Standard Conformance
Date: 2008-12-01 16:13:50
Message-ID: e08cc0400812010813v64f65555t1c31858ebb4d8118@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

patch-2

2008/12/2 Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>:
> 2008/12/1 David Rowley <dgrowley(at)gmail(dot)com>:
>> I wrote:
>>> I was also reading over the standard tonight. I've discovered that the
>>> OFFSET in LEAD() and LAG() is optional. It should default to 1 if it is
>>> not present. Oracle seems to support this.
>>>
>>> SQL2008 says:
>>> > If <lead or lag function> is specified, then:
>>> > i) Let VE1 be <lead or lag extent> and let DT be the declared type of
>>> VE1.
>>> > ii) Case:
>>> > Scalar expressions 209
>>> > WD 9075-2:200w(E)
>>> > 6.10 <window function>
>>> > If <offset> is specified, then let OFF be <offset>. The declared type of
>>> >OFF shall be an
>>> > exact numeric type with scale 0 (zero).
>>> > 1)
>>> > 2) Otherwise, let OFF be 1 (one).
>>>
>>> Yet another variant of LEAD() and LAG() but I think well worth it for both
>>> compliance to the standard and compatibility to Oracle.
>>
>> I figured this was quite simple so I've created a patch to implement this.
>> Can probably put this down to the fact that I'm starting to feel bad about
>> pointing out the mistakes and having someone else fix them. Figured it was
>> time to make some changes myself.
>>
>> I've got limited experience with diff so please let me know if there is
>> something wrong with the patch. Same goes for my changes to the code.
>>
>> I re-sequenced the OIDs of other window functions so it will require initdb.
>>
>> Also I made some updates to the documentation. Wasn't 100% sure on the
>> syntax for the optional arguments there. Hitoshi had: arg1 [,optional1].
>> I've changed this to arg, [optional1], [optional2].
>>
>> One thing I didn't do was update the regression test:
>> SELECT oid, proname FROM pg_proc WHERE proiswfunc;
>>
>> Hopefully this patch will apply after applying Heikki's latest patch
>> (version 3).
>>
>> If you're happy with this Heikki can you merge to your patch?
>
> I merged Heikki's patch with your lead/lag, then added a few tests for
> those new comer functions. Also it contains some of those Tom pointed
> including:
>
> - Remove sgml keyword modifications as it will be generated automatically.
> - Remove PartitionClause and OrderClause since they can be replaced
> with SortGroupClause.
> - Parallel test schedule changed to fit the parallel limit.
> - equalfuncs/nodefuncs are now consistent in Query/WFunc
> - Fix error code, which is now 42P36. But I'm still not sure it is appropriate.
>
> And the patch is against HEAD. The git repository now points "spool"
> branch for his approach, which I suppose will be merged to the master
> (trunk) of the window functions repository.
>
> I tested the spool performance with David's earlier bigtable:
>
> CREATE TABLE bigtable (
> id SERIAL NOT NULL PRIMARY KEY,
> timestamp TIMESTAMP NOT NULL
> );
>
> -- about 383MB of data
> INSERT INTO bigtable (timestamp)
> SELECT NOW() + (CAST(RANDOM() * 10 AS INT) || ' secs')::INTERVAL
> FROM generate_series(1,10000000);
>
> CREATE INDEX bigtable_timestamp_idx ON bigtable (timestamp);
>
> VACUUM ANALYZE bigtable;
>
> sample=# SELECT COUNT(*) FROM bigtable;
> count
> ----------
> 10000000
> (1 row)
>
> sample=# SELECT LEAD(timestamp) OVER (ORDER BY id) FROM bigtable LIMIT 1;
> lead
> ----------------------------
> 2008-12-02 00:15:10.288461
> (1 row)
>
> sample=# EXPLAIN ANALYZE SELECT LEAD(timestamp) OVER (ORDER BY id)
> FROM bigtable LIMIT 1;
>
> QUERY PLAN
>
> ----------------------------------------------------------------------------------------------
> ---------------------------------------------------
> Limit (cost=0.00..0.04 rows=1 width=12) (actual time=0.038..0.039
> rows=1 loops=1)
> -> Window (cost=0.00..386612.13 rows=10000000 width=12) (actual
> time=0.036..0.036 rows=1
> loops=1)
> -> Index Scan using bigtable_pkey on bigtable
> (cost=0.00..286612.13 rows=10000000 w
> idth=12) (actual time=0.018..0.021 rows=2 loops=1)
> Total runtime: 0.071 ms
> (4 rows)
>
>
> shows quite good result. Great work.
>
> The following query works on my build:
>
>> SELECT depname,SUM(SUM(salary)) OVER (ORDER BY depname) FROM empsalary GROUP
>> BY depname;
>> ERROR: variable not found in subplan target list
>
>
> Now, I am thinking about refactoring around aggregate common code, and
> renaming WFunc to WinFunc, which leads pg_proc.proiswfunc be
> pg_proc.proiswinfunc and so on if no objections come.
>
> P.S. seems hit attachment limitation. Sorry if you receive duplicated mail.
>
> Regards,
>
>
>
> --
> Hitoshi Harada
>

--
Hitoshi Harada

Attachment Content-Type Size
window_functions.patch.20081202-2.gz application/x-gzip 93.2 KB

From: "Hitoshi Harada" <umi(dot)tanuki(at)gmail(dot)com>
To: "David Rowley" <dgrowley(at)gmail(dot)com>
Cc: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Windowing Function Patch Review -> Standard Conformance
Date: 2008-12-02 15:09:49
Message-ID: e08cc0400812020709t50be7705pf673b48f08e785b1@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2008/12/2 Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>:
> sample=# EXPLAIN ANALYZE SELECT LEAD(timestamp) OVER (ORDER BY id)
> FROM bigtable LIMIT 1;
>
> QUERY PLAN
>
> ----------------------------------------------------------------------------------------------
> ---------------------------------------------------
> Limit (cost=0.00..0.04 rows=1 width=12) (actual time=0.038..0.039
> rows=1 loops=1)
> -> Window (cost=0.00..386612.13 rows=10000000 width=12) (actual
> time=0.036..0.036 rows=1
> loops=1)
> -> Index Scan using bigtable_pkey on bigtable
> (cost=0.00..286612.13 rows=10000000 w
> idth=12) (actual time=0.018..0.021 rows=2 loops=1)
> Total runtime: 0.071 ms
> (4 rows)
>
>
> shows quite good result. Great work.
>

After more playing with the new patch, I found worse results.

sample=# explain analyze select id, row_number() OVER (order by id)
from bigtable order by id;

QUERY PLAN

----------------------------------------------------------------------------------------------
-------------------------------------------------------
Window (cost=0.00..361612.13 rows=10000000 width=4) (actual
time=0.064..105414.522 rows=1000
0000 loops=1)
-> Index Scan using bigtable_pkey on bigtable
(cost=0.00..286612.13 rows=10000000 width=4
) (actual time=0.056..16836.341 rows=10000000 loops=1)
Total runtime: 114650.074 ms
(3 rows)

sample=# explain analyze select id,LAG(timestamp,1) over (order by id)
from bigtable order by id;

QUERY PLAN

----------------------------------------------------------------------------------------------
--------------------------------------------------------
Window (cost=0.00..411612.13 rows=10000000 width=12) (actual
time=0.065..122583.331 rows=100
00000 loops=1)
-> Index Scan using bigtable_pkey on bigtable
(cost=0.00..286612.13 rows=10000000 width=1
2) (actual time=0.056..18066.829 rows=10000000 loops=1)
Total runtime: 132770.399 ms
(3 rows)

The earlier patch results are here:
http://archives.postgresql.org/pgsql-hackers/2008-11/msg01121.php

row_number(): 44s/114s
lag(): 79s/132s

I don't understand the new patch totally, and I know the row_number()
optimization is in progress, but even lag() is quite worse. Maybe
tuplestore read pointer's heavy uses cause these.

Regards,

--
Hitoshi Harada


From: "David Rowley" <dgrowley(at)gmail(dot)com>
To: "'Hitoshi Harada'" <umi(dot)tanuki(at)gmail(dot)com>
Cc: "'Heikki Linnakangas'" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Windowing Function Patch Review -> Standard Conformance
Date: 2008-12-05 21:44:44
Message-ID: 233A00BB1F264BB3B18EC9E89CC4CD14@amd64
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hitoshi Harada wrote:
> I tested the spool performance with David's earlier bigtable:
>
> CREATE TABLE bigtable (
> id SERIAL NOT NULL PRIMARY KEY,
> timestamp TIMESTAMP NOT NULL
> );
>
> -- about 383MB of data
> INSERT INTO bigtable (timestamp)
> SELECT NOW() + (CAST(RANDOM() * 10 AS INT) || ' secs')::INTERVAL
> FROM generate_series(1,10000000);
>
> CREATE INDEX bigtable_timestamp_idx ON bigtable (timestamp);
>
> VACUUM ANALYZE bigtable;
>
> sample=# SELECT COUNT(*) FROM bigtable;
> count
> ----------
> 10000000
> (1 row)
>
> sample=# SELECT LEAD(timestamp) OVER (ORDER BY id) FROM bigtable LIMIT 1;
> lead
> ----------------------------
> 2008-12-02 00:15:10.288461
> (1 row)
>
> sample=# EXPLAIN ANALYZE SELECT LEAD(timestamp) OVER (ORDER BY id)
> FROM bigtable LIMIT 1;
>
> QUERY PLAN
>
> --------------------------------------------------------------------------
> --------------------
> ---------------------------------------------------
> Limit (cost=0.00..0.04 rows=1 width=12) (actual time=0.038..0.039
> rows=1 loops=1)
> -> Window (cost=0.00..386612.13 rows=10000000 width=12) (actual
> time=0.036..0.036 rows=1
> loops=1)
> -> Index Scan using bigtable_pkey on bigtable
> (cost=0.00..286612.13 rows=10000000 w
> idth=12) (actual time=0.018..0.021 rows=2 loops=1)
> Total runtime: 0.071 ms
> (4 rows)
>
>
> shows quite good result. Great work.

Amazing improvement!

Old patch:
david=# select timestamp,lag(timestamp,1) over (order by id) from bigtable
order by id limit 1;
timestamp | lag
----------------------------+-----
2008-11-10 21:55:16.498458 |
(1 row)

Time: 105205.055 ms

New patch:
david=# select timestamp,lag(timestamp,1) over (order by id) from bigtable
order by id limit 1;
timestamp | lag
----------------------------+-----
2008-12-04 22:05:22.687975 |
(1 row)

Time: 1.640 ms

>
> The following query works on my build:
>
> > SELECT depname,SUM(SUM(salary)) OVER (ORDER BY depname) FROM empsalary
> GROUP
> > BY depname;
> > ERROR: variable not found in subplan target list
>

This works fine on my build now too.

>
> Now, I am thinking about refactoring around aggregate common code, and
> renaming WFunc to WinFunc, which leads pg_proc.proiswfunc be
> pg_proc.proiswinfunc and so on if no objections come.
>

I've spent last night and tonight trying to break the patch and I've not
managed it.

I spent 2 and a half hours on the train last night reading over the patch
mainly for my own interest. I also went over the documentation and I have a
few suggestions for improvement:

My modifications to the lead and lag syntax can be improved. Currently the
optional parameters make it look like DEFAULT can be specified without
OFFSET. This is not the case:

+ <type>any, [integer], [any]</type>

Should be:

+ <type>any [,integer [,any] ]</type>

And:

+ lag(<replaceable class="parameter">value</replaceable>,
[<replaceable
+ class="parameter">offset</replaceable>], [<replaceable
+ class="parameter">default</replaceable>])

Should be:

+ lag(<replaceable class="parameter">value</replaceable> [,
<replaceable
+ class="parameter">offset</replaceable> [,<replaceable
+ class="parameter">default</replaceable>] ])

Same for LEAD()

+ <para>
+ After <literal>WHERE</> and <literal>GROUP BY</> process,
+ rows might be windowed table, using the <literal>WINDOW</>
+ clause.
+ </para>

I think I know what you mean here. My re-write seems to have turned the
sentence into a paragraph. Please tell me if I've assumed the meaning
wrongly:

"After the <literal>WHERE</>, <literal>GROUP BY</> and <literal>HAVING</>
clauses one or more <literal>WINDOW</> clauses can be specified. This will
allow window functions to be specified in the <literal>SELECT</> clause.
These window functions can make use of the <literal>WINDOW</> clauses by
making reference to the alias name of the window rather than explicitly
specifying the properties of the window in each <literal>OVER</> clause."

+ <para>
+ Another expample shows different capability of window functions
+ from above.

Small typo: example instead of eapample

+ <para>
+ The same window definitions can be named and put togather into one
+ definition using <xref linkend="queries-window">.

Small typo: together instead of togather.

+ <para>
+ Window functions doesn't accept DISTINCT and ALL syntax, even though
+ the function is an aggregate function.
+ </para>

Small grammar error: doesn't should be replaced with don't. Or perhaps
change to:

"Window functions, unlike normal aggregate functions, do not allow DISTINCT
or ALL to be used within the function argument list."

+ Window functions are not placed in any of GROUP BY, HAVING and
+ WHERE clauses, which process values before any of the windows. If
+ there is need to qualify rows by the result of window functions,
+ whole of the query must be nested and append WHERE clause outer of
+ the current query.

I think this one maybe needs an example to back it up. It's quite an
important thing and I'm sure lots of people will need to do this. I'm not
100% happy with my new paragraph either but can't see how to word it any
better.

"Window functions cannot be used in the WHERE, GROUP BY or HAVING clauses
of the query. If there is a need to filter rows, group results or filter
rows after aggregation takes place (HAVING) then the query must be nested.
The query should contain the window functions in the inner query and apply
the additional clauses that contain the results from the window function in
the outer query, such as:

SELECT depname,
empno,
salary,
enroll_date
FROM (SELECT depname,
empno,
salary,
enroll_date,
ROW_NUMBER() OVER (PARTITION BY depname ORDER BY salary,empno)
AS pos
FROM empsalary
) AS e
WHERE pos = 1;

In the above query the we're filtering and only showing the results from the
inner query where the ROW_NUMBER() value is equal to 1."

But of course the above query would be more simple using DISTINCT ON. Maybe
there is a better example... My previous marathon getting the person in 2nd
place might be better but that's introducing another previously unknown
table to the manual.

+ * A window function is defined as a function matked as wfunc in pg_proc.
By
+ * this mark, it means the function can handle window function APIs that
allow
+ * it to access arbitrary random rows within the window.

Small typo: marked instead of matked

Maybe this fragment should read:

"A window function is a function that can be used by the window function
API. This allows access arbitrary random rows within the window. These
functions are identified by the proiswfunc column in pg_proc."

Or pg_proc.proiswfunc if you've renamed.

David.


From: "Hitoshi Harada" <umi(dot)tanuki(at)gmail(dot)com>
To: "David Rowley" <dgrowley(at)gmail(dot)com>
Cc: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Windowing Function Patch Review -> Standard Conformance
Date: 2008-12-07 08:42:33
Message-ID: e08cc0400812070042k65777c3dj90ac633f58930451@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2008/12/6 David Rowley <dgrowley(at)gmail(dot)com>:
>
> I've spent last night and tonight trying to break the patch and I've not
> managed it.
>
> I spent 2 and a half hours on the train last night reading over the patch
> mainly for my own interest. I also went over the documentation and I have a
> few suggestions for improvement:
>
> + <para>
> + After <literal>WHERE</> and <literal>GROUP BY</> process,
> + rows might be windowed table, using the <literal>WINDOW</>
> + clause.
> + </para>
>
> I think I know what you mean here. My re-write seems to have turned the
> sentence into a paragraph. Please tell me if I've assumed the meaning
> wrongly:
>
>
> "After the <literal>WHERE</>, <literal>GROUP BY</> and <literal>HAVING</>
> clauses one or more <literal>WINDOW</> clauses can be specified. This will
> allow window functions to be specified in the <literal>SELECT</> clause.
> These window functions can make use of the <literal>WINDOW</> clauses by
> making reference to the alias name of the window rather than explicitly
> specifying the properties of the window in each <literal>OVER</> clause."

The "WINDOW clause" is a clause that starts with WINDOW, containing
some window definitions, syntactically. So I rewrote it as:

>>
After the <literal>WHERE</>, <literal>GROUP BY</> and
<literal>HAVING</> clauses one or more window definitions can be
specified by the <literal>WINDOW</> clause. This will allow window
functions to be specified in the <literal>SELECT</> clause. These
window functions can make use of the <literal>WINDOW</> clauses by
making reference to the alias name of the window rather than
explicitly specifying the properties of the window in each
<literal>OVER</> clause.
<<

>
>
> + Window functions are not placed in any of GROUP BY, HAVING and
> + WHERE clauses, which process values before any of the windows. If
> + there is need to qualify rows by the result of window functions,
> + whole of the query must be nested and append WHERE clause outer of
> + the current query.
>
> I think this one maybe needs an example to back it up. It's quite an
> important thing and I'm sure lots of people will need to do this. I'm not
> 100% happy with my new paragraph either but can't see how to word it any
> better.
>
> "Window functions cannot be used in the WHERE, GROUP BY or HAVING clauses
> of the query. If there is a need to filter rows, group results or filter
> rows after aggregation takes place (HAVING) then the query must be nested.
> The query should contain the window functions in the inner query and apply
> the additional clauses that contain the results from the window function in
> the outer query, such as:
>
> SELECT depname,
> empno,
> salary,
> enroll_date
> FROM (SELECT depname,
> empno,
> salary,
> enroll_date,
> ROW_NUMBER() OVER (PARTITION BY depname ORDER BY salary,empno)
> AS pos
> FROM empsalary
> ) AS e
> WHERE pos = 1;
>
> In the above query the we're filtering and only showing the results from the
> inner query where the ROW_NUMBER() value is equal to 1."
>
> But of course the above query would be more simple using DISTINCT ON. Maybe
> there is a better example... My previous marathon getting the person in 2nd
> place might be better but that's introducing another previously unknown
> table to the manual.

I use this query:

SELECT depname,
empno,
salary,
enroll_date
FROM (SELECT depname,
empno,
salary,
enroll_date,
ROW_NUMBER() OVER (PARTITION BY depname ORDER BY
salary,empno) AS pos
FROM empsalary
) AS e
WHERE pos < 3;

This isn't emulated by DISTINCT ON, is it?

For all other issues, thanks, applied to my patch.

Regards,

--
Hitoshi Harada