Re: Null row vs. row of nulls in plpgsql

Lists: pgsql-hackers
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Cc: "Oleg Serov" <serovov(at)gmail(dot)com>
Subject: Null row vs. row of nulls in plpgsql
Date: 2008-09-27 18:56:36
Message-ID: 16356.1222541796@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I looked a bit at the bug report here:
http://archives.postgresql.org/pgsql-bugs/2008-09/msg00164.php

ISTM that the fundamental problem is that plpgsql doesn't distinguish
properly between a null row value (eg, "null::somerowtype") and a
row of null values (eg, "row(null,null,...)::somerowtype"). When that
code was designed, our main SQL engine was pretty fuzzy about the
difference too, but now there is a clear semantic distinction.

For plpgsql's RECORD variables this doesn't seem hard to fix: just
take out the code in exec_move_row() that manufactures a row of nulls
when the input is null, and maybe make a few small adjustments
elsewhere. For ROW variables there's a bigger problem, because those
are represented by a list of per-field variables, which doesn't
immediately offer any way to represent overall nullness. I think it
could be dealt with by adding an explicit "the row as a whole is null"
flag to struct PLpgSQL_row. I haven't tried to code it though, so I'm
not sure if there are gotchas or unreasonably large code changes needed
to make it happen.

I thought for a little bit about whether we couldn't get rid of ROW
variables entirely, or at least make them work more like RECORD variables
by storing a HeapTuple instead of a list of per-field variables. But
I soon found out that the reason to have them is to be able to describe
the assignment target of SQL statements that assign to multiple scalar
variables, eg "SELECT ... INTO x,y,z".

Comments?

regards, tom lane


From: "Oleg Serov" <serovov(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Fwd: Null row vs. row of nulls in plpgsql
Date: 2008-09-27 19:59:50
Message-ID: cec7c6df0809271259t3c17c520m489131b9194d1e0d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

---------- Forwarded message ----------
From: Oleg Serov <serovov(at)gmail(dot)com>
Date: 2008/9/27
Subject: Re: Null row vs. row of nulls in plpgsql
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>

I'm newbie, but i think that adding bool flag to PLpgSQL_row isnull will
handle the problem(like in PLpgSQL_var);

2008/9/27 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>

I looked a bit at the bug report here:
> http://archives.postgresql.org/pgsql-bugs/2008-09/msg00164.php
>
> ISTM that the fundamental problem is that plpgsql doesn't distinguish
> properly between a null row value (eg, "null::somerowtype") and a
> row of null values (eg, "row(null,null,...)::somerowtype"). When that
> code was designed, our main SQL engine was pretty fuzzy about the
> difference too, but now there is a clear semantic distinction.
>
> For plpgsql's RECORD variables this doesn't seem hard to fix: just
> take out the code in exec_move_row() that manufactures a row of nulls
> when the input is null, and maybe make a few small adjustments
> elsewhere. For ROW variables there's a bigger problem, because those
> are represented by a list of per-field variables, which doesn't
> immediately offer any way to represent overall nullness. I think it
> could be dealt with by adding an explicit "the row as a whole is null"
> flag to struct PLpgSQL_row. I haven't tried to code it though, so I'm
> not sure if there are gotchas or unreasonably large code changes needed
> to make it happen.
>
> I thought for a little bit about whether we couldn't get rid of ROW
> variables entirely, or at least make them work more like RECORD variables
> by storing a HeapTuple instead of a list of per-field variables. But
> I soon found out that the reason to have them is to be able to describe
> the assignment target of SQL statements that assign to multiple scalar
> variables, eg "SELECT ... INTO x,y,z".
>
> Comments?
>
> regards, tom lane
>


From: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org, Oleg Serov <serovov(at)gmail(dot)com>
Subject: Re: Null row vs. row of nulls in plpgsql
Date: 2008-09-27 20:42:54
Message-ID: 1222548174.6993.23.camel@huvostro
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, 2008-09-27 at 14:56 -0400, Tom Lane wrote:
> I looked a bit at the bug report here:
> http://archives.postgresql.org/pgsql-bugs/2008-09/msg00164.php
>
> ISTM that the fundamental problem is that plpgsql doesn't distinguish
> properly between a null row value (eg, "null::somerowtype") and a
> row of null values (eg, "row(null,null,...)::somerowtype"). When that
> code was designed, our main SQL engine was pretty fuzzy about the
> difference too, but now there is a clear semantic distinction.
>
> For plpgsql's RECORD variables this doesn't seem hard to fix: just
> take out the code in exec_move_row() that manufactures a row of nulls
> when the input is null, and maybe make a few small adjustments
> elsewhere. For ROW variables there's a bigger problem, because those
> are represented by a list of per-field variables, which doesn't
> immediately offer any way to represent overall nullness. I think it
> could be dealt with by adding an explicit "the row as a whole is null"
> flag to struct PLpgSQL_row. I haven't tried to code it though, so I'm
> not sure if there are gotchas or unreasonably large code changes needed
> to make it happen.
>
> I thought for a little bit about whether we couldn't get rid of ROW
> variables entirely, or at least make them work more like RECORD variables
> by storing a HeapTuple instead of a list of per-field variables. But
> I soon found out that the reason to have them is to be able to describe
> the assignment target of SQL statements that assign to multiple scalar
> variables, eg "SELECT ... INTO x,y,z".

How hard would it be to have a RECORD that has pointers to those
multiple scalar variables ?

Referring again to my favorite ordinary programming language python, you
can have a very elegant way of assigning a "record" (a tuple in
pythonese) to a set of variables and vice versa

>>> rec = 1,2,3
>>> rec
(1, 2, 3)
>>> a,b,c = rec
>>> a
1
>>> c
3
>>> c,b,a
(3, 2, 1)

In other words, tuples are more or less automatically composed and
decomposed on demand.

I have not yet looked how hard the implementation of this would be for
postgreSQL, but at least the concept should be applicable.

----------------
Hannu


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Oleg Serov <serovov(at)gmail(dot)com>
Subject: Re: Null row vs. row of nulls in plpgsql
Date: 2008-09-27 21:00:49
Message-ID: 19114.1222549249@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hannu Krosing <hannu(at)2ndQuadrant(dot)com> writes:
> Referring again to my favorite ordinary programming language python, you
> can have a very elegant way of assigning a "record" (a tuple in
> pythonese) to a set of variables and vice versa

There are already perfectly good ways to do that in plpgsql, viz
rowvar := row(x,y,z);
select rowvar.* into x,y,z;

regards, tom lane


From: Greg Stark <greg(dot)stark(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Oleg Serov <serovov(at)gmail(dot)com>
Subject: Re: Null row vs. row of nulls in plpgsql
Date: 2008-09-28 01:03:06
Message-ID: ADBBAACB-0C14-40E8-8534-92450B00C199@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Iirc the reason for this fuzziness came from the SQL spec definition
of IS NULL for rows. As long as you maintain that level of spec-
compliance I don't think there are any other important constraints on
pg behaviour.

greg

--sorry for the top posting but the phone makes it hard to do anything
else.

On 27 Sep 2008, at 09:56 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I looked a bit at the bug report here:
> http://archives.postgresql.org/pgsql-bugs/2008-09/msg00164.php
>
> ISTM that the fundamental problem is that plpgsql doesn't distinguish
> properly between a null row value (eg, "null::somerowtype") and a
> row of null values (eg, "row(null,null,...)::somerowtype"). When that
> code was designed, our main SQL engine was pretty fuzzy about the
> difference too, but now there is a clear semantic distinction.
>
> For plpgsql's RECORD variables this doesn't seem hard to fix: just
> take out the code in exec_move_row() that manufactures a row of nulls
> when the input is null, and maybe make a few small adjustments
> elsewhere. For ROW variables there's a bigger problem, because those
> are represented by a list of per-field variables, which doesn't
> immediately offer any way to represent overall nullness. I think it
> could be dealt with by adding an explicit "the row as a whole is null"
> flag to struct PLpgSQL_row. I haven't tried to code it though, so I'm
> not sure if there are gotchas or unreasonably large code changes
> needed
> to make it happen.
>
> I thought for a little bit about whether we couldn't get rid of ROW
> variables entirely, or at least make them work more like RECORD
> variables
> by storing a HeapTuple instead of a list of per-field variables. But
> I soon found out that the reason to have them is to be able to
> describe
> the assignment target of SQL statements that assign to multiple scalar
> variables, eg "SELECT ... INTO x,y,z".
>
> Comments?
>
> regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


From: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>
To: Greg Stark <greg(dot)stark(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Oleg Serov <serovov(at)gmail(dot)com>
Subject: Re: Null row vs. row of nulls in plpgsql
Date: 2008-09-28 11:45:03
Message-ID: 1222602303.7044.6.camel@huvostro
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, 2008-09-28 at 04:03 +0300, Greg Stark wrote:
> Iirc the reason for this fuzziness came from the SQL spec definition
> of IS NULL for rows. As long as you maintain that level of spec-
> compliance I don't think there are any other important constraints on
> pg behaviour.

What does SQL spec say about recursive IS NULL for rows ?

Should we check that IS NULL is true for each row element, or must they
actually be NULL's ?

hannu=# select row(null, null) is NULL;
?column?
----------
t
(1 row)

hannu=# select row(null, row(null, null)) is NULL;
?column?
----------
f
(1 row)

--------------
Hannu


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <greg(dot)stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Oleg Serov <serovov(at)gmail(dot)com>
Subject: Re: Null row vs. row of nulls in plpgsql
Date: 2008-09-29 00:27:55
Message-ID: 22365.1222648075@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark <greg(dot)stark(at)enterprisedb(dot)com> writes:
> On 27 Sep 2008, at 09:56 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> ISTM that the fundamental problem is that plpgsql doesn't distinguish
>> properly between a null row value (eg, "null::somerowtype") and a
>> row of null values (eg, "row(null,null,...)::somerowtype"). When that
>> code was designed, our main SQL engine was pretty fuzzy about the
>> difference too, but now there is a clear semantic distinction.

> Iirc the reason for this fuzziness came from the SQL spec definition
> of IS NULL for rows. As long as you maintain that level of spec-
> compliance I don't think there are any other important constraints on
> pg behaviour.

I started to poke into this and found out that it was a bit subtler than
I thought. It'd be possible to associate a "rowisnull" state value
with a row variable, but the problem is that plpgsql treats the row
fields as independent variables that can be accessed without touching
the row. In particular you can assign null or nonnull values to
individual fields. So consider

-- presumably, this'll set rowisnull to TRUE:
rowvar := NULL;
-- this had better cause rowisnull to become FALSE:
rowvar.field1 := 42;
-- does this cause it to become TRUE again?
rowvar.field1 := NULL;

There are a bunch of implementation problems with making any such
behavior happen, since the row field variables don't currently "know"
that they are members of a row, and indeed it's possible for the same
variable to be a member of more than one row. But the core issue is
that this interaction seems to fuzz the distinction between "row is
null" and "all the row's elements are null". In particular, if you
think that rowisnull should be TRUE after the above sequence, then
I think you are saying they are the same thing. So maybe the spec
authors are smarter than we are.

Thoughts? What would a consistent behavior look like?

regards, tom lane


From: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Greg Stark" <greg(dot)stark(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org, "Oleg Serov" <serovov(at)gmail(dot)com>
Subject: Re: Null row vs. row of nulls in plpgsql
Date: 2008-09-29 07:55:34
Message-ID: 162867790809290055j4419110bs1c53197038ad34f8@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2008/9/29 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Greg Stark <greg(dot)stark(at)enterprisedb(dot)com> writes:
>> On 27 Sep 2008, at 09:56 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> ISTM that the fundamental problem is that plpgsql doesn't distinguish
>>> properly between a null row value (eg, "null::somerowtype") and a
>>> row of null values (eg, "row(null,null,...)::somerowtype"). When that
>>> code was designed, our main SQL engine was pretty fuzzy about the
>>> difference too, but now there is a clear semantic distinction.
>
>> Iirc the reason for this fuzziness came from the SQL spec definition
>> of IS NULL for rows. As long as you maintain that level of spec-
>> compliance I don't think there are any other important constraints on
>> pg behaviour.
>
> I started to poke into this and found out that it was a bit subtler than
> I thought. It'd be possible to associate a "rowisnull" state value
> with a row variable, but the problem is that plpgsql treats the row
> fields as independent variables that can be accessed without touching
> the row. In particular you can assign null or nonnull values to
> individual fields. So consider
>
> -- presumably, this'll set rowisnull to TRUE:
> rowvar := NULL;
> -- this had better cause rowisnull to become FALSE:
> rowvar.field1 := 42;
> -- does this cause it to become TRUE again?
> rowvar.field1 := NULL;

this sequence is wrong. in SQL rowvar has same behave as pointer. When
you would to fill rowvar you should to call constructor first.

some like
rowvar := NULL; -- null value
rowvar := constructor(null);
rowvar := constructor();
rowvar.field = 42;

regards
Pavel Stehule

>
> There are a bunch of implementation problems with making any such
> behavior happen, since the row field variables don't currently "know"
> that they are members of a row, and indeed it's possible for the same
> variable to be a member of more than one row. But the core issue is
> that this interaction seems to fuzz the distinction between "row is
> null" and "all the row's elements are null". In particular, if you
> think that rowisnull should be TRUE after the above sequence, then
> I think you are saying they are the same thing. So maybe the spec
> authors are smarter than we are.
>
> Thoughts? What would a consistent behavior look like?
>
> regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Oleg Serov <serovov(at)gmail(dot)com>
Subject: Re: Null row vs. row of nulls in plpgsql
Date: 2009-01-08 03:16:35
Message-ID: 200901080316.n083GZQ13749@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


I assume this is a TODO, right?

---------------------------------------------------------------------------

Tom Lane wrote:
> I looked a bit at the bug report here:
> http://archives.postgresql.org/pgsql-bugs/2008-09/msg00164.php
>
> ISTM that the fundamental problem is that plpgsql doesn't distinguish
> properly between a null row value (eg, "null::somerowtype") and a
> row of null values (eg, "row(null,null,...)::somerowtype"). When that
> code was designed, our main SQL engine was pretty fuzzy about the
> difference too, but now there is a clear semantic distinction.
>
> For plpgsql's RECORD variables this doesn't seem hard to fix: just
> take out the code in exec_move_row() that manufactures a row of nulls
> when the input is null, and maybe make a few small adjustments
> elsewhere. For ROW variables there's a bigger problem, because those
> are represented by a list of per-field variables, which doesn't
> immediately offer any way to represent overall nullness. I think it
> could be dealt with by adding an explicit "the row as a whole is null"
> flag to struct PLpgSQL_row. I haven't tried to code it though, so I'm
> not sure if there are gotchas or unreasonably large code changes needed
> to make it happen.
>
> I thought for a little bit about whether we couldn't get rid of ROW
> variables entirely, or at least make them work more like RECORD variables
> by storing a HeapTuple instead of a list of per-field variables. But
> I soon found out that the reason to have them is to be able to describe
> the assignment target of SQL statements that assign to multiple scalar
> variables, eg "SELECT ... INTO x,y,z".
>
> Comments?
>
> regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Oleg Serov <serovov(at)gmail(dot)com>
Subject: Re: Null row vs. row of nulls in plpgsql
Date: 2009-01-08 03:38:51
Message-ID: 5554.1231385931@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> I assume this is a TODO, right?

Yah.

regards, tom lane