Strange RETURN NEXT behaviour in Postgres 8.0

Lists: pgsql-hackers
From: "Sergey E(dot) Koposov" <math(at)sai(dot)msu(dot)ru>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Strange RETURN NEXT behaviour in Postgres 8.0
Date: 2005-02-12 17:10:46
Message-ID: Pine.LNX.4.44.0502121929520.29807-100000@lnfm1.sai.msu.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi All

I have a quite strange problem with RETURN NEXT statement.
I have a big table with 500 millions of rows running on Postgres 8.0.
Table "public.usno"
Column | Type | Modifiers
--------+--------+-----------
ra | real |
dec | real |
bmag | real |
rmag | real |
ipix | bigint |
errbox | box |
Indexes:
"box_ind" rtree (errbox)
"ipix_ind" btree (ipix)
"radec_ind" btree (ra, "dec")

I actually wrote some procedures in PL/SQL using dynamical queries,
and once I obtained the following error.
ERROR: wrong record type supplied in RETURN NEXT
CONTEXT: PL/pgSQL function "yyy" line 8 at return next

To solve the problem, I used just the following simple PL/SQL functions, and
a query "select * from yyy()"

CREATE OR REPLACE FUNCTION xxx(refcursor) RETURNS refcursor AS '
DECLARE query varchar;
BEGIN
query = ''SELECT * FROM usno'';
OPEN $1 FOR EXECUTE query;
RETURN $1;
END;
' LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION yyy() RETURNS SETOF usno AS '
DECLARE rec record;
DECLARE cur refcursor;
BEGIN
cur=xxx(''curs_name'');
LOOP
FETCH cur into rec;
RETURN NEXT rec;
EXIT WHEN NOT FOUND;
END LOOP;
RETURN;
END;
' LANGUAGE plpgsql;

I was quite surprised by this errors, because I have tried the same
functions on rather same (but smaller table) on Postgres 7.4.6 on my laptop
without any problems.

For debugging purposes, I just have created by hand on Postgres 8.0 machine
the small table "q3c" with just two rows, but same structure as usno table.
Table "public.q3c"
Column | Type | Modifiers
--------+--------+-----------
ra | real |
dec | real |
bmag | real |
rmag | real |
ipix | bigint |
errbox | box |
Indexes:
"box_ind1" rtree (errbox)
"ipix_ind1" btree (ipix)
"radec_ind1" btree (ra, "dec")

And, after replacing "usno"->"q3c" in the xxx() and yyy(), the query
"select * from yyy()" worked without problems!!!

So, how can it be, that my PL/SQL functions works fine on one(smaller)
table, but fails on another(bigger) table.

Thanks in advance for any ideas.
Sergey
PS
I have tried my code replacing the declaration
"rec record;" by "rec TABLE_NAME%ROWTYPE", and it worked for both (big and
small table), but I don't understand, why it doesn't work with the type
"record".

------------------------------------------------------------
Sergey E. Koposov
Sternberg Astronomical Institute, Moscow University (Russia)
Max-Planck Institute for Astronomy (Germany)
Internet: math(at)sai(dot)msu(dot)ru, http://lnfm1.sai.msu.su/~math/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Sergey E(dot) Koposov" <math(at)sai(dot)msu(dot)ru>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Strange RETURN NEXT behaviour in Postgres 8.0
Date: 2005-02-16 18:54:28
Message-ID: 26214.1108580068@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Sergey E. Koposov" <math(at)sai(dot)msu(dot)ru> writes:
> LOOP
> FETCH cur into rec;
> RETURN NEXT rec;
> EXIT WHEN NOT FOUND;
> END LOOP;
> RETURN;

Don't you think you should have the EXIT *above* the RETURN NEXT?
I would expect this to emit a bogus row of nulls after the last row
returned by the cursor. (At least that's what I get with current
sources. Pre-8.0 it might return the last row twice.)

Running it on a 500-million-row table would quite possibly run out of
memory or disk space, too, because RETURN NEXT accumulates all the
results before the function is actually exited.

regards, tom lane


From: "Sergey E(dot) Koposov" <math(at)sai(dot)msu(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Strange RETURN NEXT behaviour in Postgres 8.0
Date: 2005-02-16 20:28:45
Message-ID: Pine.LNX.4.44.0502162252060.25847-100000@lnfm1.sai.msu.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> "Sergey E. Koposov" <math(at)sai(dot)msu(dot)ru> writes:
> > LOOP
> > FETCH cur into rec;
> > RETURN NEXT rec;
> > EXIT WHEN NOT FOUND;
> > END LOOP;
> > RETURN;
>
> Don't you think you should have the EXIT *above* the RETURN NEXT?
> I would expect this to emit a bogus row of nulls after the last row
> returned by the cursor. (At least that's what I get with current
> sources. Pre-8.0 it might return the last row twice.)

Yes, surely EXIT should be written before RETURN NEXT, it was my error,
(thanks, but I've found that error by myself, after posting my message)
But that small bug does not affect the original problem.

> Running it on a 500-million-row table would quite possibly run out of
> memory or disk space, too, because RETURN NEXT accumulates all the
> results before the function is actually exited.

Yes, that's right, but I did not waited until the whole table was loaded in
the function. The error, which is the subject of current thread occured
just immediately after "select * from yyy()", so surely was not caused by
memory overfilling.

Concerning to the exact form of my functions (using cursors, but still
collecting all the data in the memory). As I understand this is the only one
way (or just the simplest way ???)
to execute fully dynamic queries returned by C function in PL/SQL.
For the real functions which I use, instead of

query = ''SELECT * FROM usno'';

I have

query = my_C_function(some_args);

(see full code in my first message)

------------------------------------------------------------
Sergey E. Koposov
Sternberg Astronomical Institute, Moscow University (Russia)
Max-Planck Institute for Astronomy (Germany)
Internet: math(at)sai(dot)msu(dot)ru, http://lnfm1.sai.msu.su/~math/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Sergey E(dot) Koposov" <math(at)sai(dot)msu(dot)ru>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Strange RETURN NEXT behaviour in Postgres 8.0
Date: 2005-02-16 20:32:47
Message-ID: 27975.1108585967@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Sergey E. Koposov" <math(at)sai(dot)msu(dot)ru> writes:
> Concerning to the exact form of my functions (using cursors, but still
> collecting all the data in the memory). As I understand this is the only one
> way (or just the simplest way ???)
> to execute fully dynamic queries returned by C function in PL/SQL.
> For the real functions which I use, instead of

> query = ''SELECT * FROM usno'';

> I have

> query = my_C_function(some_args);

Oh? I'd make a small side bet that the underlying error is in your C
function --- possibly it's tromping on some data structure and the
damage doesn't have an effect till later. If you can demonstrate the
problem without using any custom C functions then I'd be interested to
see a test case.

regards, tom lane


From: "Sergey E(dot) Koposov" <math(at)sai(dot)msu(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Strange RETURN NEXT behaviour in Postgres 8.0
Date: 2005-02-16 20:53:15
Message-ID: Pine.LNX.4.44.0502162334280.25847-100000@lnfm1.sai.msu.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> > For the real functions which I use, instead of
>
> > query = ''SELECT * FROM usno'';
>
> > I have
>
> > query = my_C_function(some_args);
>
> Oh? I'd make a small side bet that the underlying error is in your C
> function --- possibly it's tromping on some data structure and the
> damage doesn't have an effect till later. If you can demonstrate the
> problem without using any custom C functions then I'd be interested to
> see a test case.

I want to clarify, that I have a problem even without my C functions!!

And show the full exact(but long) test case, which I performed just now
specially.

I begin from table usno with 500 millions records

wsdb=# \d usno
Table "public.usno"
Column | Type | Modifiers
--------+--------+-----------
ra | real |
dec | real |
bmag | real |
rmag | real |
ipix | bigint |
errbox | box |
Indexes:
"box_ind" rtree (errbox)
"ipix_ind" btree (ipix)
"radec_ind" btree (ra, "dec")

The declaration of the functions:

CREATE OR REPLACE FUNCTION xxx(refcursor) RETURNS refcursor AS '
DECLARE query varchar;
BEGIN
query = ''SELECT * FROM usno'';
OPEN $1 FOR EXECUTE query;
RETURN $1;
END;
' LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION yyy() RETURNS SETOF usno AS '
DECLARE rec record;
DECLARE cur refcursor;
BEGIN
cur=xxx(''curs_name'');
LOOP
FETCH cur into rec;
EXIT WHEN NOT FOUND;
RETURN NEXT rec;
END LOOP;
RETURN;
END;
' LANGUAGE plpgsql;

wsdb=# \i q3c.sql
CREATE FUNCTION
CREATE FUNCTION
wsdb=# select * from yyy();
ERROR: wrong record type supplied in RETURN NEXT
CONTEXT: PL/pgSQL function "yyy" line 10 at return next

We see the error

#############################################

Now with q3c table instead of unso

wsdb=# \d q3c
Table "public.q3c"
Column | Type | Modifiers
--------+--------+-----------
ra | real |
dec | real |
bmag | real |
rmag | real |
ipix | bigint |
errbox | box |
Indexes:
"box_ind1" rtree (errbox)
"ipix_ind1" btree (ipix)
"radec_ind1" btree (ra, "dec")

That table is not empty but filled by random numbers

wsdb=# select * from q3c;
ra | dec | bmag | rmag | ipix | errbox
----+-----+------+------+------+-------------
3 | 3 | 4 | 5 | 55 | (5,6),(3,4)
4 | 5 | 6 | 5 | 33 | (3,4),(1,2)
(2 rows)

Now the changed functions (notice, the only difference is
replacing all occurencies of "usno" to "q3c")

CREATE OR REPLACE FUNCTION xxx(refcursor) RETURNS refcursor AS '
DECLARE query varchar;
BEGIN
query = ''SELECT * FROM q3c'';
OPEN $1 FOR EXECUTE query;
RETURN $1;
END;
' LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION yyy() RETURNS SETOF q3c AS '
DECLARE rec record;
DECLARE cur refcursor;
BEGIN
cur=xxx(''curs_name'');
LOOP
FETCH cur into rec;
EXIT WHEN NOT FOUND;
RETURN NEXT rec;
END LOOP;
RETURN;
END;
' LANGUAGE plpgsql;

wsdb=# drop FUNCTION yyy();
DROP FUNCTION
wsdb=# \i q3c.sql
CREATE FUNCTION
CREATE FUNCTION
wsdb=# select * from yyy();
ra | dec | bmag | rmag | ipix | errbox
----+-----+------+------+------+-------------
3 | 3 | 4 | 5 | 55 | (5,6),(3,4)
4 | 5 | 6 | 5 | 33 | (3,4),(1,2)
(2 rows)

We don't see the error. But the only change was the change from one big
table to a smaller one with the precisely same structure.

###########################################


From: Richard Huxton <dev(at)archonet(dot)com>
To: "Sergey E(dot) Koposov" <math(at)sai(dot)msu(dot)ru>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Strange RETURN NEXT behaviour in Postgres 8.0
Date: 2005-02-16 21:22:20
Message-ID: 4213B98C.7040908@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Sergey E. Koposov wrote:
>>>For the real functions which I use, instead of
>>
>>>query = ''SELECT * FROM usno'';
>>
>>>I have
>>
>>>query = my_C_function(some_args);
>>
>>Oh? I'd make a small side bet that the underlying error is in your C
>>function --- possibly it's tromping on some data structure and the
>>damage doesn't have an effect till later. If you can demonstrate the
>>problem without using any custom C functions then I'd be interested to
>>see a test case.
>
>
> I want to clarify, that I have a problem even without my C functions!!
>
> And show the full exact(but long) test case, which I performed just now
> specially.
>
> I begin from table usno with 500 millions records
>
> wsdb=# \d usno
> Table "public.usno"
> Column | Type | Modifiers
> --------+--------+-----------
> ra | real |
> dec | real |
> bmag | real |
> rmag | real |
> ipix | bigint |
> errbox | box |
> Indexes:
> "box_ind" rtree (errbox)
> "ipix_ind" btree (ipix)
> "radec_ind" btree (ra, "dec")

This is just a shot in the dark, but I don't suppose you've dropped or
modified any columns in "usno" have you?

I seem to remember some subtle problems with dropped columns and plpgsql
functions - could be one of those still left. It'd look like tablesize
was the problem because of course no-one's got time to test with 500
million test rows.
--
Richard Huxton
Archonet Ltd


From: "Sergey E(dot) Koposov" <math(at)sai(dot)msu(dot)ru>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Strange RETURN NEXT behaviour in Postgres 8.0
Date: 2005-02-16 21:28:41
Message-ID: Pine.LNX.4.44.0502170023550.14831-100000@lnfm1.sai.msu.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 16 Feb 2005, Richard Huxton wrote:
> This is just a shot in the dark, but I don't suppose you've dropped or
> modified any columns in "usno" have you?
>
> I seem to remember some subtle problems with dropped columns and plpgsql
> functions - could be one of those still left. It'd look like tablesize
> was the problem because of course no-one's got time to test with 500
> million test rows.

1) I have static tables. I dont modify them!
2) My test table is q3c (with 2 rows), and the table with 500 millions of
rows is not test table, it is the table with data :-)

> --
> Richard Huxton
> Archonet Ltd
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: "Sergey E(dot) Koposov" <math(at)sai(dot)msu(dot)ru>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Strange RETURN NEXT behaviour in Postgres 8.0
Date: 2005-02-17 00:38:42
Message-ID: 1149.1108600722@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Richard Huxton <dev(at)archonet(dot)com> writes:
> I seem to remember some subtle problems with dropped columns and plpgsql
> functions - could be one of those still left.

For instance:

regression=# create table usno (ra real, dec real, bmag real, rmag real,ipix int8);
CREATE TABLE
regression=# [ create Sergey's functions ]
regression=# insert into usno values(1,2,3,4);
INSERT 1132435 1
regression=# select * from yyy();
ra | dec | bmag | rmag | ipix
----+-----+------+------+------
1 | 2 | 3 | 4 |
(1 row)

regression=# alter table usno add column errbox box;
ALTER TABLE
regression=# select * from yyy();
ra | dec | bmag | rmag | ipix | errbox
----+-----+------+------+------+--------
1 | 2 | 3 | 4 | |
(1 row)

regression=# alter table usno drop column errbox;
ALTER TABLE
regression=# select * from yyy();
ERROR: wrong record type supplied in RETURN NEXT
CONTEXT: PL/pgSQL function "yyy" line 8 at return next
regression=#

It looks like the code that handles returning a RECORD variable doesn't
cope with dropped columns in the function result rowtype.

(If you instead declare rec as usno%rowtype, you get a different set
of misbehaviors after adding/dropping columns, so that code path isn't
perfect either :-()

regards, tom lane


From: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Richard Huxton <dev(at)archonet(dot)com>, "Sergey E(dot) Koposov" <math(at)sai(dot)msu(dot)ru>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Strange RETURN NEXT behaviour in Postgres 8.0
Date: 2005-02-17 09:15:58
Message-ID: 421460CE.60001@familyhealth.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> It looks like the code that handles returning a RECORD variable doesn't
> cope with dropped columns in the function result rowtype.
>
> (If you instead declare rec as usno%rowtype, you get a different set
> of misbehaviors after adding/dropping columns, so that code path isn't
> perfect either :-()

Isn't it amazing, Tom, that that column dropping code that we did up for
7.3 is STILL causing bugs :D

Chris


From: "Sergey E(dot) Koposov" <math(at)sai(dot)msu(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Richard Huxton <dev(at)archonet(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Strange RETURN NEXT behaviour in Postgres 8.0
Date: 2005-02-17 20:57:53
Message-ID: Pine.LNX.4.44.0502171203140.7439-100000@lnfm1.sai.msu.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 16 Feb 2005, Tom Lane wrote:

> Richard Huxton <dev(at)archonet(dot)com> writes:
> > I seem to remember some subtle problems with dropped columns and plpgsql
> > functions - could be one of those still left.
>
> It looks like the code that handles returning a RECORD variable doesn't
> cope with dropped columns in the function result rowtype.
>
> (If you instead declare rec as usno%rowtype, you get a different set
> of misbehaviors after adding/dropping columns, so that code path isn't
> perfect either :-()

Finally I want to clarify, that after copying my "usno" table into another,
the problems have disappeared.

So I had experienced just exacty the bug with dropped columns.

So, is there a chance that this bug will be fixed in some 8.X postgres ?

Sergey