Re: How to implement oracle like rownum(function or seudocolumn) ?

Lists: pgsql-hackers
From: Juan Manuel Diaz Lara <jmdiazlr(at)yahoo(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: How to implement oracle like rownum(function or seudocolumn) ?
Date: 2006-04-08 16:33:27
Message-ID: 20060408163327.11488.qmail@web60321.mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I need a rownum column, like Oracle. I have searched the mailing lists and I don't see a satisfactory solution, so I was wondering write a UDF to implement it, the requirements are:

1. Need to preserve state between calls to the function (the rowcount). Maybe using a MemoryContext with a lifespan equal to the executing query.
2. It should be called every time a new row is produced.
3. And more important, need to be called in the right place when called from subquerys:

Imagine one to man relationship between table A and tabla B. Then next query

SELECT s.id, s.rownum
FROM (SELECT a.id, rownum() AS rownum
FROM a) s,
b
WHERE a.id = b.id

In this case, the rownum function should behave as if it was an aggregated function, where the subquery is evaluated first and the results joined with table b. I think that a UDF could be evaluated last by the planner (when producing the final resultset) so it won't give the intended result.

I looking for a general solution, Any ideas? .

Thanks.


Atte.

Juan Manuel Díaz Lara

---------------------------------
New Yahoo! Messenger with Voice. Call regular phones from your PC and save big.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Juan Manuel Diaz Lara <jmdiazlr(at)yahoo(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: How to implement oracle like rownum(function or seudocolumn) ?
Date: 2006-04-08 16:46:06
Message-ID: 23603.1144514766@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Juan Manuel Diaz Lara <jmdiazlr(at)yahoo(dot)com> writes:
> I need a rownum column, like Oracle. I have searched the mailing lists and I don't see a satisfactory solution, so I was wondering write a UDF to implement it, the requirements are:

Try keeping a counter in fcinfo->flinfo->fn_extra.

> 3. And more important, need to be called in the right place when called from subquerys:

Don't expect miracles in this department. The planner will evaluate the
function where it sees fit...

regards, tom lane


From: Michael Fuhr <mike(at)fuhr(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Juan Manuel Diaz Lara <jmdiazlr(at)yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: How to implement oracle like rownum(function or seudocolumn) ?
Date: 2006-04-08 17:26:19
Message-ID: 20060408172619.GA57636@winnie.fuhr.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Apr 08, 2006 at 12:46:06PM -0400, Tom Lane wrote:
> Juan Manuel Diaz Lara <jmdiazlr(at)yahoo(dot)com> writes:
> > I need a rownum column, like Oracle. I have searched the mailing lists
> > and I don't see a satisfactory solution, so I was wondering write a
> > UDF to implement it, the requirements are:
>
> Try keeping a counter in fcinfo->flinfo->fn_extra.

Is this close to being correct?

Datum
rownum(PG_FUNCTION_ARGS)
{
int64 *row_counter;

if (fcinfo->flinfo->fn_extra == NULL) {
row_counter = (int64 *)MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
sizeof(int64));
*row_counter = 0;
fcinfo->flinfo->fn_extra = row_counter;
}

row_counter = fcinfo->flinfo->fn_extra;

PG_RETURN_INT64(++(*row_counter));
}

> > 3. And more important, need to be called in the right place when
> called from subquerys:
>
> Don't expect miracles in this department. The planner will evaluate the
> function where it sees fit...

Would OFFSET 0 be the workaround in this case?

SELECT rownum(), *
FROM (SELECT rownum() AS f_rownum, id AS f_id FROM foo) AS f,
(SELECT rownum() AS b_rownum, id AS b_id FROM bar) AS b;
rownum | f_rownum | f_id | b_rownum | b_id
--------+----------+-------+----------+-------
1 | 1 | foo-1 | 1 | bar-1
2 | 2 | foo-1 | 2 | bar-2
3 | 3 | foo-2 | 3 | bar-1
4 | 4 | foo-2 | 4 | bar-2
5 | 5 | foo-3 | 5 | bar-1
6 | 6 | foo-3 | 6 | bar-2
(6 rows)

SELECT rownum(), *
FROM (SELECT rownum() AS f_rownum, id AS f_id FROM foo OFFSET 0) AS f,
(SELECT rownum() AS b_rownum, id AS b_id FROM bar OFFSET 0) AS b;
rownum | f_rownum | f_id | b_rownum | b_id
--------+----------+-------+----------+-------
1 | 1 | foo-1 | 1 | bar-1
2 | 1 | foo-1 | 2 | bar-2
3 | 2 | foo-2 | 1 | bar-1
4 | 2 | foo-2 | 2 | bar-2
5 | 3 | foo-3 | 1 | bar-1
6 | 3 | foo-3 | 2 | bar-2
(6 rows)

--
Michael Fuhr


From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Michael Fuhr <mike(at)fuhr(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Juan Manuel Diaz Lara <jmdiazlr(at)Yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: How to implement oracle like rownum(function or seudocolumn)
Date: 2006-04-08 18:02:53
Message-ID: 4437FACD.2060800@Yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Someone correct me if I'm wrong, but I was allways under the impression
that Oracle's ROWNUM is a thing attached to a row in the final result
set, whatever (possibly random) order that happens to have. Now a) this
is something that IMHO belongs into the client or stored procedure code,
b) if I am right, the code below will break as soon as an ORDER BY is
added to the query and most importantly c) if a) cannot do the job, it
indicates that the database schema or business process definition lacks
some key/referential definition and is in need of a fix.

My humble guess is that c) is also the reason why the ANSI didn't find a
ROWNUM desirable.

Jan

On 4/8/2006 1:26 PM, Michael Fuhr wrote:
> On Sat, Apr 08, 2006 at 12:46:06PM -0400, Tom Lane wrote:
>> Juan Manuel Diaz Lara <jmdiazlr(at)yahoo(dot)com> writes:
>> > I need a rownum column, like Oracle. I have searched the mailing lists
>> > and I don't see a satisfactory solution, so I was wondering write a
>> > UDF to implement it, the requirements are:
>>
>> Try keeping a counter in fcinfo->flinfo->fn_extra.
>
> Is this close to being correct?
>
> Datum
> rownum(PG_FUNCTION_ARGS)
> {
> int64 *row_counter;
>
> if (fcinfo->flinfo->fn_extra == NULL) {
> row_counter = (int64 *)MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
> sizeof(int64));
> *row_counter = 0;
> fcinfo->flinfo->fn_extra = row_counter;
> }
>
> row_counter = fcinfo->flinfo->fn_extra;
>
> PG_RETURN_INT64(++(*row_counter));
> }
>
>> > 3. And more important, need to be called in the right place when
>> called from subquerys:
>>
>> Don't expect miracles in this department. The planner will evaluate the
>> function where it sees fit...
>
> Would OFFSET 0 be the workaround in this case?
>
> SELECT rownum(), *
> FROM (SELECT rownum() AS f_rownum, id AS f_id FROM foo) AS f,
> (SELECT rownum() AS b_rownum, id AS b_id FROM bar) AS b;
> rownum | f_rownum | f_id | b_rownum | b_id
> --------+----------+-------+----------+-------
> 1 | 1 | foo-1 | 1 | bar-1
> 2 | 2 | foo-1 | 2 | bar-2
> 3 | 3 | foo-2 | 3 | bar-1
> 4 | 4 | foo-2 | 4 | bar-2
> 5 | 5 | foo-3 | 5 | bar-1
> 6 | 6 | foo-3 | 6 | bar-2
> (6 rows)
>
> SELECT rownum(), *
> FROM (SELECT rownum() AS f_rownum, id AS f_id FROM foo OFFSET 0) AS f,
> (SELECT rownum() AS b_rownum, id AS b_id FROM bar OFFSET 0) AS b;
> rownum | f_rownum | f_id | b_rownum | b_id
> --------+----------+-------+----------+-------
> 1 | 1 | foo-1 | 1 | bar-1
> 2 | 1 | foo-1 | 2 | bar-2
> 3 | 2 | foo-2 | 1 | bar-1
> 4 | 2 | foo-2 | 2 | bar-2
> 5 | 3 | foo-3 | 1 | bar-1
> 6 | 3 | foo-3 | 2 | bar-2
> (6 rows)
>

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #


From: "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>
To: "Jan Wieck" <JanWieck(at)yahoo(dot)com>
Cc: "Michael Fuhr" <mike(at)fuhr(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Juan Manuel Diaz Lara" <jmdiazlr(at)yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: How to implement oracle like rownum(function or seudocolumn)
Date: 2006-04-08 18:20:21
Message-ID: 36e682920604081120q768ab245je099cdf32c9a332f@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 4/8/06, Jan Wieck <JanWieck(at)yahoo(dot)com> wrote:
> Someone correct me if I'm wrong, but I was allways under the impression
> that Oracle's ROWNUM is a thing attached to a row in the final result
> set, whatever (possibly random) order that happens to have. Now a) this
> is something that IMHO belongs into the client or stored procedure code,
> b) if I am right, the code below will break as soon as an ORDER BY is
> added to the query and most importantly c) if a) cannot do the job, it
> indicates that the database schema or business process definition lacks
> some key/referential definition and is in need of a fix.

Yes, a rownum is assigned at fetch time.

An example is the following PostgreSQL query:

SELECT id FROM sometable ORDER BY id LIMIT 5;

In Oracle-land is written as:

SELECT id FROM (SELECT id FROM sometable ORDER BY id) WHERE rownum <= 5;

> My humble guess is that c) is also the reason why the ANSI didn't find a
> ROWNUM desirable.

I believe this is a good assumption.

--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>
Cc: Michael Fuhr <mike(at)fuhr(dot)org>, Juan Manuel Diaz Lara <jmdiazlr(at)Yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: How to implement oracle like rownum(function or seudocolumn)
Date: 2006-04-08 19:04:40
Message-ID: 25474.1144523080@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
> My humble guess is that c) is also the reason why the ANSI didn't find a
> ROWNUM desirable.

I've never understood what the conceptual model is for Oracle's rownum.
Where along the SQL operational pipeline (FROM / WHERE / GROUP BY /
aggregate / compute output columns / ORDER BY) is it supposed to be
computed? To be useful for the often-requested purpose of nicely
labeling output with line numbers, it'd have to be assigned
post-ORDER-BY, but then it doesn't make any sense at all to use it in
WHERE, nor in sub-selects.

A function implemented as per Michael's example would not give the
results that I think people would expect for

SELECT rownum(), * FROM foo ORDER BY whatever;

unless the planner chances to do the ordering with an indexscan.
If it does it with a sort step then the rownums will be computed before
sorting :-(

regards, tom lane


From: "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Jan Wieck" <JanWieck(at)yahoo(dot)com>, "Michael Fuhr" <mike(at)fuhr(dot)org>, "Juan Manuel Diaz Lara" <jmdiazlr(at)yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: How to implement oracle like rownum(function or seudocolumn)
Date: 2006-04-08 19:19:43
Message-ID: 36e682920604081219q31adf9dct75b4c9fc65c3b3c1@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 4/8/06, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> A function implemented as per Michael's example would not give the
> results that I think people would expect for
>
> SELECT rownum(), * FROM foo ORDER BY whatever;
>

Yep, the query would have to be rewritten similar to Oracle's:

SELECT rownum(), * FROM (SELECT * FROM foo ORDER BY whatever);

IIRC, processing-wise, rownum and order-by processing is handled as follows:

SELECT id, whatever FROM foo WHERE rownum <= 10 ORDER BY id;

is the same as PostgreSQL's

SELECT id, whatever FROM (SELECT id, whatever FROM foo LIMIT 10) ORDER BY id;

--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324


From: David Fetter <david(at)fetter(dot)org>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>
Cc: Michael Fuhr <mike(at)fuhr(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Juan Manuel Diaz Lara <jmdiazlr(at)Yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: How to implement oracle like rownum(function or seudocolumn)
Date: 2006-04-08 22:07:37
Message-ID: 20060408220736.GA19392@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Apr 08, 2006 at 02:02:53PM -0400, Jan Wieck wrote:
> Someone correct me if I'm wrong, but I was allways under the impression
> that Oracle's ROWNUM is a thing attached to a row in the final result
> set, whatever (possibly random) order that happens to have. Now a) this
> is something that IMHO belongs into the client or stored procedure code,
> b) if I am right, the code below will break as soon as an ORDER BY is
> added to the query and most importantly c) if a) cannot do the job, it
> indicates that the database schema or business process definition lacks
> some key/referential definition and is in need of a fix.
>
> My humble guess is that c) is also the reason why the ANSI didn't find a
> ROWNUM desirable.

Sadly, ANSI did just that.

http://troels.arvin.dk/db/rdbms/#select-limit
http://troels.arvin.dk/db/rdbms/#select-top-n
http://troels.arvin.dk/db/rdbms/#select-limit-offset

Cheers,
D
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
phone: +1 415 235 3778 AIM: dfetter666
Skype: davidfetter

Remember to vote!


From: "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>
To: "David Fetter" <david(at)fetter(dot)org>
Cc: "Jan Wieck" <JanWieck(at)yahoo(dot)com>, "Michael Fuhr" <mike(at)fuhr(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Juan Manuel Diaz Lara" <jmdiazlr(at)yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: How to implement oracle like rownum(function or seudocolumn)
Date: 2006-04-08 22:26:33
Message-ID: 36e682920604081526s5f5129cem9369c51672d3af1f@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 4/8/06, David Fetter <david(at)fetter(dot)org> wrote:
> Sadly, ANSI did just that.

Thanks for pointing that out... I'd been using LIMIT/OFFSET for so
long, I totally forgot about the standard :(

--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324


From: Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>
To: Juan Manuel Diaz Lara <jmdiazlr(at)yahoo(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: How to implement oracle like rownum(function or seudocolumn)
Date: 2006-04-09 04:56:55
Message-ID: 44389417.4090907@calorieking.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> I need a rownum column, like Oracle. I have searched the mailing lists
> and I don't see a satisfactory solution, so I was wondering write a UDF
> to implement it, the requirements are:

+1

I would _love_ to see rownums in PostgreSQL :)

Chris


From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jan Wieck <JanWieck(at)Yahoo(dot)com>, Michael Fuhr <mike(at)fuhr(dot)org>, Juan Manuel Diaz Lara <jmdiazlr(at)Yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: How to implement oracle like rownum(function or seudocolumn)
Date: 2006-04-11 21:28:21
Message-ID: 20060411212821.GW49405@pervasive.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Apr 08, 2006 at 03:04:40PM -0400, Tom Lane wrote:
> Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
> > My humble guess is that c) is also the reason why the ANSI didn't find a
> > ROWNUM desirable.
>
> I've never understood what the conceptual model is for Oracle's rownum.
> Where along the SQL operational pipeline (FROM / WHERE / GROUP BY /
> aggregate / compute output columns / ORDER BY) is it supposed to be
> computed? To be useful for the often-requested purpose of nicely
> labeling output with line numbers, it'd have to be assigned
> post-ORDER-BY, but then it doesn't make any sense at all to use it in
> WHERE, nor in sub-selects.

AFAIK rownum() is Oracle's solution to doing LIMIT ... OFFSET from
before those were ANSI. rownum() is applied as rows are leaving the
relevant node, which means you can't use rownum() in any part of a
SELECT statement other than the SELECT clause (you can't even use it in
a HAVING clause afaik, though I would think you should be able to). So,
if you want to actually do anything useful with rownum(), you have to
use it in a subquery and then operate at a higher level:

SELECT * FROM (SELECT rownum() AS row_number, * FROM table) z ORDER BY
row_number;

> A function implemented as per Michael's example would not give the
> results that I think people would expect for
>
> SELECT rownum(), * FROM foo ORDER BY whatever;
>
> unless the planner chances to do the ordering with an indexscan.
> If it does it with a sort step then the rownums will be computed before
> sorting :-(

I think you're right.

If people are that hot-to-trot about having Oracle compatable rownum()
in PostgreSQL, perhaps EnterpriseDB has some code they could share.
Though I think it'd be better to understand what people actually want
this info for. Personally I think having a rank function (or a complete
suite of analytic functions) would be far more useful.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461


From: "Nicolas Barbier" <nicolas(dot)barbier(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: How to implement oracle like rownum(function or seudocolumn)
Date: 2006-04-11 23:51:03
Message-ID: b0f3f5a10604111651i2b3f3747oe348d22d2d54f925@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2006/4/8, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:

> I've never understood what the conceptual model is for Oracle's rownum.
> Where along the SQL operational pipeline (FROM / WHERE / GROUP BY /
> aggregate / compute output columns / ORDER BY) is it supposed to be
> computed? To be useful for the often-requested purpose of nicely
> labeling output with line numbers, it'd have to be assigned
> post-ORDER-BY, but then it doesn't make any sense at all to use it in
> WHERE, nor in sub-selects.
>
> A function implemented as per Michael's example would not give the
> results that I think people would expect for
>
> SELECT rownum(), * FROM foo ORDER BY whatever;
>
> unless the planner chances to do the ordering with an indexscan.
> If it does it with a sort step then the rownums will be computed before
> sorting :-(

I don't know about Oracle or ROW_NUM, but SQL apparently defines
ROW_NUMBER() OVER (..) (see
<url:http://en.wikipedia.org/wiki/Select_(SQL)#ROW_NUMBER.28.29_window_function>)

This gives a number for each output row, according to some ordering
(in SQL, one cannot do ORDER BY in a subquery AFAIK). If used in a
subquery, one can then of course use the resulting column in the WHERE
clause of the outer query:

SELECT * FROM (
SELECT
ROW_NUMBER() OVER (ORDER BY key ASC) AS rownumber,
columns
FROM tablename
) AS foo
WHERE rownumber <= 10

(example stolen from the Wikipedia article linked above).

--
Nicolas Barbier
http://www.gnu.org/philosophy/no-word-attachments.html