Window Functions: buffering strategy

From: "Hitoshi Harada" <umi(dot)tanuki(at)gmail(dot)com>
To: "Pg Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Window Functions: buffering strategy
Date: 2008-10-20 01:32:50
Message-ID: e08cc0400810191832k8f13d73n7af3cf2f2ec730b4@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> I can find how to do it with the new (window execution model) design,
> (and the design is suitable to fix it above,) but at first before
> going into trivial specs, I would like core hackers to review the
> model is better than before or not. Thank you for your cooperation.

So no objections appeared. I am progressing with the new model.

Thanks to feedbacks against the previous patch, I am inclined to think
that we need to have different row buffering strategy for window
functions. As we discussed, depends on functions different bufferings
are needed. Then I read spec again and classified functions.

- row_number() -- row buffering
Obviously no buffering, or what you need is only where you are in the partition

- rank() -- row buffering
You need two rows, previous row and current row to determine if you
rank up or not. But it is solvable if the function cache the previous
row so row buffering is needed.

- dense_rank() -- row buffering
Same as rank().

- percent_rank() -- partition buffering
In addition to rank() requirement, you have to see the last row in
the partition to count up how many rows are contained in the
partition. The spec says: defined as (RK–1)/(NR–1), where RK is
defined to be the RANK of R and NR is defined to be the number of rows
in the window partition of R.

- cume_dist() -- partition buffering
Almost same as percent_rank(), partition buffering.

- ntile() -- partition buffering
You have to know how many rows in the partition to determine how many
rows are contained in each bucket.

- lead() -- partition buffering
The spec says: returns the value of VE evaluated on a row that is
OFFSET number of rows after R within P (partition).

- lag() -- partition buffering
Same as lead().

- first_value() -- frame buffering
The spec says: return the value of VE evaluated on the first row of
the window frame.

- last_value() -- frame buffering
Same as first_value().

- nth_value() -- frame buffering
Same as first_value()

- aggregates -- frame buffering
aggregates all rows in the frame.

So I propose three Window node buffering strategies,
row/frame/partition bufferinig so as to avoid unnecessary row
buffering. Each window functions have buffering strategy number and
planner detect which strategy is at least needed for the execution. If
the node contains only row_number() then it needs row buffering but if
row_number() and lead() then partition bufferinig is needed.
Temporarily until window function APIs are public out, the strategy
numbers for each function can be defined in the source code as macro
or something, and when they are public we must have pg_wfunc or
something to register the strategy. If the function violate the
declared strategy and touched different API, for example if the
row_number() is about to call window_paritition_rows(), error will
occur.

After reading spec closer, I found row_number(), rank(), etc. doesn't
care if there is a moving frame. Only aggregates and
first/last/nth_value()s care it.

Now I guess I know what to do but I don't know how to do it :) But I
am going to send another patch until next commit fest. I appreciate
for you comments.

Regeards,

--
Hitoshi Harada

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Eric Haszlakiewicz 2008-10-20 01:59:28 Re: two servers on the same port
Previous Message M. Edward (Ed) Borasky 2008-10-19 18:50:03 Re: Lisp as a procedural language?