Assigning NULL to a record variable

Lists: pgsql-hackers
From: "Kevin Grittner" <kgrittn(at)mail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Assigning NULL to a record variable
Date: 2012-09-20 21:05:19
Message-ID: 20120920210519.241290@gmx.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Related to bug #6123, Wisconsin Courts are now using triggers with
the workaround to be safe with the patch put forward by Tom, even
though they are still running with the earlier patch proposal by me
(which tolerates an UPDATE or DELETE of a row being deleted).  The
general structure of the BEFORE DELETE triggers with cascading logic
was like this:

DECLARE
 return_value parenttable;
BEGIN
 return_value := OLD;

 DELETE FROM childtable1
   WHERE <child of parent>;
 IF FOUND THEN
   return_value := NULL;
 END IF;

 DELETE FROM childtablen
   WHERE <child of parent>;
 IF FOUND THEN
   return_value := NULL;
 END IF;

 IF return_value IS NULL THEN
   DELETE FROM parent
     WHERE <primary key matches OLD>;
 END IF;

 RETURN return_value;
END;

This did not work for cases where the AFTER DELETE trigger performed
an action which was not idempotent because, while return_value was
NULL enough to enter that last IF clause, it was not NULL enough to
prevent the DELETE attempt and fire subsequent triggers.  The
assignment of NULL to a variable with a record type doesn't assign a
"simple" NULL, but a record with NULL in each element.  It seems
like a POLA violation that:

 return_value := NULL;
 RETURN return_value;

behaves differently than:

 RETURN NULL;

We've fixed the afflicted trigger functions by adding a RETURN NULL;
inside that last IF block, but:

- Is this behavior required by standard?
- If not, do we want to keep this behavior?
- If we keep this behavior, should the triggering operation be
  suppressed when (NOT return_value IS NOT NULL) instead of when
  (return_value IS NOT DISTINCT FROM NULL)?

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <kgrittn(at)mail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Assigning NULL to a record variable
Date: 2012-09-20 21:31:08
Message-ID: 15727.1348176668@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Kevin Grittner" <kgrittn(at)mail(dot)com> writes:
> ... This did not work for cases where the AFTER DELETE trigger performed
> an action which was not idempotent because, while return_value was
> NULL enough to enter that last IF clause, it was not NULL enough to
> prevent the DELETE attempt and fire subsequent triggers. The
> assignment of NULL to a variable with a record type doesn't assign a
> "simple" NULL, but a record with NULL in each element.

I believe that this is forced by plpgsql's implementation. IIRC, a
declared variable of a named composite type (not RECORD) is implemented
as a "row" structure, meaning it actually consists of a separate plpgsql
variable for each column. So there's no physical way for it to represent
a "simple NULL" composite value.

I've suggested in the past that we might want to go over to treating
such variables more like RECORD, ie the representation is always a
HeapTuple. I'm not sure what the performance tradeoffs would be ---
some things would get faster and others slower, probably, since field
access would be more work but conversion to/from HeapTuple would get far
cheaper.

> - If we keep this behavior, should the triggering operation be
> suppressed when (NOT return_value IS NOT NULL) instead of when
> (return_value IS NOT DISTINCT FROM NULL)?

Can't do that, because it would break the perfectly-legitimate case
where the trigger is trying to process a row of all nulls.

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kevin Grittner <kgrittn(at)mail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Assigning NULL to a record variable
Date: 2012-09-20 21:39:15
Message-ID: CAFj8pRDuecDJNkRgyMmrtN0yFhqSg1OqnK1QwdDbEFN2fY1aVQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2012/9/20 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> "Kevin Grittner" <kgrittn(at)mail(dot)com> writes:
>> ... This did not work for cases where the AFTER DELETE trigger performed
>> an action which was not idempotent because, while return_value was
>> NULL enough to enter that last IF clause, it was not NULL enough to
>> prevent the DELETE attempt and fire subsequent triggers. The
>> assignment of NULL to a variable with a record type doesn't assign a
>> "simple" NULL, but a record with NULL in each element.
>
> I believe that this is forced by plpgsql's implementation. IIRC, a
> declared variable of a named composite type (not RECORD) is implemented
> as a "row" structure, meaning it actually consists of a separate plpgsql
> variable for each column. So there's no physical way for it to represent
> a "simple NULL" composite value.
>
> I've suggested in the past that we might want to go over to treating
> such variables more like RECORD, ie the representation is always a
> HeapTuple.

I had same idea when I worked on SQL/PSM - but there is significant
difference in performance (probably less in real tasks)

I'm not sure what the performance tradeoffs would be ---
> some things would get faster and others slower, probably, since field
> access would be more work but conversion to/from HeapTuple would get far
> cheaper.

when fields are fix length, then field's update is significantly
faster then when RECORD is used

>
>> - If we keep this behavior, should the triggering operation be
>> suppressed when (NOT return_value IS NOT NULL) instead of when
>> (return_value IS NOT DISTINCT FROM NULL)?
>
> Can't do that, because it would break the perfectly-legitimate case
> where the trigger is trying to process a row of all nulls.
>
> regards, tom lane
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Kevin Grittner <kgrittn(at)mail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Assigning NULL to a record variable
Date: 2012-09-20 21:55:19
Message-ID: 16302.1348178119@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> 2012/9/20 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>> I'm not sure what the performance tradeoffs would be ---
>> some things would get faster and others slower, probably, since field
>> access would be more work but conversion to/from HeapTuple would get far
>> cheaper.

> when fields are fix length, then field's update is significantly
> faster then when RECORD is used

[ shrug... ] Maybe not if we put a little effort into it. It would be
interesting to consider using a TupleTableSlot instead of a bare
HeapTuple for instance. In any case you're ignoring the point that
other things will get faster.

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kevin Grittner <kgrittn(at)mail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Assigning NULL to a record variable
Date: 2012-09-21 04:20:41
Message-ID: CAFj8pRD_ZfmRm-FqCmGBLhBOM7ztsB4Dz39a6r-5_ePLv53L6w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2012/9/20 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> 2012/9/20 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>>> I'm not sure what the performance tradeoffs would be ---
>>> some things would get faster and others slower, probably, since field
>>> access would be more work but conversion to/from HeapTuple would get far
>>> cheaper.
>
>> when fields are fix length, then field's update is significantly
>> faster then when RECORD is used
>
> [ shrug... ] Maybe not if we put a little effort into it. It would be
> interesting to consider using a TupleTableSlot instead of a bare
> HeapTuple for instance. In any case you're ignoring the point that
> other things will get faster.

I didn't try to optimize this. But these tests showed important impact.

postgres=# create table foo(a int, b int, c int);
CREATE TABLE
postgres=# insert into foo select 1,2,3 from generate_series(1,100000);
INSERT 0 100000
postgres=# select count(*) from foo;
count
--------
100000
(1 row)

postgres=# do $$ declare r record; begin for r in select * from foo
loop perform r.a + r.b + r.c; end loop; end; $$ language plpgsql;
DO
Time: 712.404 ms

postgres=# do $$ declare r foo; begin for r in select * from foo loop
perform r.a + r.b + r.c; end loop; end; $$ language plpgsql;
DO
Time: 1617.847 ms

In this case, record is faster - it was used in PL/PSM0 too

postgres=# do $$ declare r record; begin for r in select * from foo
loop r.a := r.b + r.c; end loop; end; $$ language plpgsql;
DO
Time: 170.701 ms

postgres=# do $$ declare r foo; begin for r in select * from foo loop
r.a := r.b + r.c; end loop; end; $$ language plpgsql;
DO
Time: 96.467 ms

or

postgres=# do $$ declare r record; t int; begin for r in select * from
foo loop t := r.b + r.c; end loop; end; $$ language plpgsql;
DO
Time: 127.760 ms

postgres=# do $$ declare r foo; t int; begin for r in select * from
foo loop t := r.b + r.c; end loop; end; $$ language plpgsql;
DO
Time: 87.003 ms

typed composite variable is significantly faster in simple queries (expressions)

But I invite any reduction in this area

regards, Pavel

>
> regards, tom lane