Re: plpgsql.consistent_into

Lists: pgsql-hackers
From: Marko Tiikkaja <marko(at)joh(dot)to>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: plpgsql.consistent_into
Date: 2014-01-12 05:51:04
Message-ID: 52D22D48.2090704@joh.to
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greetings fellow elephants,

I would humbly like to submit for your consideration my proposal for
alleviating pain caused by one of the most annoying footguns in
PL/PgSQL: the behaviour of SELECT .. INTO when the query returns more
than one row. Some of you might know that no exception is raised in
this case (as opposed to INSERT/UPDATE/DELETE .. INTO, all of them
yielding TOO_MANY_ROWS), which can hide subtle bugs in queries if during
testing the query always returns only one row or the "correct" one
happens to be picked up every time. Additionally, the row_count() after
execution is always going to be either 0 or 1, so even if you want to
explicitly guard against potentially broken queries, you can't do so!

So I added the following compile-time option:

set plpgsql.consistent_into to true;

create or replace function footest() returns void as $$
declare
x int;
begin
-- too many rows
select 1 from foo into x;
end$$ language plpgsql;

select footest();
ERROR: query returned more than one row

It defaults to false to preserve full backwards compatibility. Also
turning it on makes the executor try and find two rows, so it might have
an effect on performance as well. The patch, as currently written, also
changes the behaviour of EXECUTE .. INTO, but I don't feel strongly
about whether that should be affected as well or not.

Regards,
Marko Tiikkaja

Attachment Content-Type Size
consistent_into_v1.patch text/plain 12.4 KB

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Marko Tiikkaja <marko(at)joh(dot)to>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-12 06:47:03
Message-ID: CAFj8pRABAfxwxvJ9f4sv57VSFsToDhyzkm=ikEeKr2WF2eAoZw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

2014/1/12 Marko Tiikkaja <marko(at)joh(dot)to>

> Greetings fellow elephants,
>
> I would humbly like to submit for your consideration my proposal for
> alleviating pain caused by one of the most annoying footguns in PL/PgSQL:
> the behaviour of SELECT .. INTO when the query returns more than one row.
> Some of you might know that no exception is raised in this case (as
> opposed to INSERT/UPDATE/DELETE .. INTO, all of them yielding
> TOO_MANY_ROWS), which can hide subtle bugs in queries if during testing the
> query always returns only one row or the "correct" one happens to be picked
> up every time. Additionally, the row_count() after execution is always
> going to be either 0 or 1, so even if you want to explicitly guard against
> potentially broken queries, you can't do so!
>

It is not bad and, sure, - it is very useful and important

but - it is a redundant to INTO STRICT clause. When you use it, then you
change a INTO behaviour. Is not better to ensure STRICT option than hidden
redefining INTO?

Option INTO (without STRICT clause) is not safe and we should to disallow.
I see a three states (not only two)

a) disallow INTO without STRICT (as preferred for new code)
b) implicit check after every INTO without STRICT
c) without check

these modes should be: "strict_required", "strict_default", "strict_legacy"

> So I added the following compile-time option:
>
>
> set plpgsql.consistent_into to true;
>

This name is not best (there is not clean with it a into should be
consistent)

Is question, if this functionality should be enabled by GUC to be used for
legacy code (as protection against some kind of hidden bugs)

This topic is interesting idea for me - some checks can be pushed to
plpgsql_check (as errors or warnings) too.

Generally I like proposed functionality, just I am not sure, so hidden
redefining INTO clause (to INTO STRICT) is what we want. We can do it (but
explicitly). I don't know any situation where INTO without STRICT is valid.
Introduction of STRICT option was wrong idea - and now is not way to back.

Regards

Pavel

>
> create or replace function footest() returns void as $$
> declare
> x int;
> begin
> -- too many rows
> select 1 from foo into x;
> end$$ language plpgsql;
>
> select footest();
> ERROR: query returned more than one row
>
> It defaults to false to preserve full backwards compatibility. Also
> turning it on makes the executor try and find two rows, so it might have an
> effect on performance as well. The patch, as currently written, also
> changes the behaviour of EXECUTE .. INTO, but I don't feel strongly about
> whether that should be affected as well or not.
>
>
> Regards,
> Marko Tiikkaja
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>


From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-12 13:02:06
Message-ID: 52D2924E.7060100@joh.to
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/12/14, 7:47 AM, Pavel Stehule wrote:
> 2014/1/12 Marko Tiikkaja <marko(at)joh(dot)to>
>
>> Greetings fellow elephants,
>>
>> I would humbly like to submit for your consideration my proposal for
>> alleviating pain caused by one of the most annoying footguns in PL/PgSQL:
>> the behaviour of SELECT .. INTO when the query returns more than one row.
>> Some of you might know that no exception is raised in this case (as
>> opposed to INSERT/UPDATE/DELETE .. INTO, all of them yielding
>> TOO_MANY_ROWS), which can hide subtle bugs in queries if during testing the
>> query always returns only one row or the "correct" one happens to be picked
>> up every time. Additionally, the row_count() after execution is always
>> going to be either 0 or 1, so even if you want to explicitly guard against
>> potentially broken queries, you can't do so!
>>
>
> It is not bad and, sure, - it is very useful and important
>
> but - it is a redundant to INTO STRICT clause. When you use it, then you
> change a INTO behaviour. Is not better to ensure STRICT option than hidden
> redefining INTO?

That only works if the query should never return 0 rows either. If you
want to allow for missing rows, STRICT is out of the question.

> Option INTO (without STRICT clause) is not safe and we should to disallow.
> I see a three states (not only two)
>
> a) disallow INTO without STRICT (as preferred for new code)
> b) implicit check after every INTO without STRICT
> c) without check
>
> these modes should be: "strict_required", "strict_default", "strict_legacy"

I can't get excited about this. Mostly because it doesn't solve the
problem I'm having.

It is important to be able to execute queries with INTO which might not
return a row. That's what FOUND is for.

>> So I added the following compile-time option:
>>
>>
>> set plpgsql.consistent_into to true;
>>
>
> This name is not best (there is not clean with it a into should be
> consistent)

I agree, but I had to pick something. One of the three hard problems in
CS..

> Is question, if this functionality should be enabled by GUC to be used for
> legacy code (as protection against some kind of hidden bugs)
>
> This topic is interesting idea for me - some checks can be pushed to
> plpgsql_check (as errors or warnings) too.
>
> Generally I like proposed functionality, just I am not sure, so hidden
> redefining INTO clause (to INTO STRICT) is what we want. We can do it (but
> explicitly). I don't know any situation where INTO without STRICT is valid.
> Introduction of STRICT option was wrong idea - and now is not way to back.

Note that this is different from implicitly STRICTifying every INTO,
like I said above.

Regards,
Marko Tiikkaja


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Marko Tiikkaja <marko(at)joh(dot)to>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-12 17:20:47
Message-ID: CAFj8pRAE7XKzGx6z4CQdZwxjvbVzN0oAaKgBR4VXHXzRXGFYTA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2014/1/12 Marko Tiikkaja <marko(at)joh(dot)to>

> On 1/12/14, 7:47 AM, Pavel Stehule wrote:
>
>> 2014/1/12 Marko Tiikkaja <marko(at)joh(dot)to>
>>
>> Greetings fellow elephants,
>>>
>>> I would humbly like to submit for your consideration my proposal for
>>> alleviating pain caused by one of the most annoying footguns in PL/PgSQL:
>>> the behaviour of SELECT .. INTO when the query returns more than one row.
>>> Some of you might know that no exception is raised in this case (as
>>> opposed to INSERT/UPDATE/DELETE .. INTO, all of them yielding
>>> TOO_MANY_ROWS), which can hide subtle bugs in queries if during testing
>>> the
>>> query always returns only one row or the "correct" one happens to be
>>> picked
>>> up every time. Additionally, the row_count() after execution is always
>>> going to be either 0 or 1, so even if you want to explicitly guard
>>> against
>>> potentially broken queries, you can't do so!
>>>
>>>
>> It is not bad and, sure, - it is very useful and important
>>
>> but - it is a redundant to INTO STRICT clause. When you use it, then you
>> change a INTO behaviour. Is not better to ensure STRICT option than hidden
>> redefining INTO?
>>
>
> That only works if the query should never return 0 rows either. If you
> want to allow for missing rows, STRICT is out of the question.

hmm - you have true.

try to find better name.

Other questions is using a GUC for legacy code. I am for this checked mode
be default (or can be simply activated for new code)

Regards

Pavel

>
>
> Option INTO (without STRICT clause) is not safe and we should to disallow.
>> I see a three states (not only two)
>>
>> a) disallow INTO without STRICT (as preferred for new code)
>> b) implicit check after every INTO without STRICT
>> c) without check
>>
>> these modes should be: "strict_required", "strict_default",
>> "strict_legacy"
>>
>
> I can't get excited about this. Mostly because it doesn't solve the
> problem I'm having.
>
> It is important to be able to execute queries with INTO which might not
> return a row. That's what FOUND is for.
>
>
> So I added the following compile-time option:
>>>
>>>
>>> set plpgsql.consistent_into to true;
>>>
>>>
>> This name is not best (there is not clean with it a into should be
>> consistent)
>>
>
> I agree, but I had to pick something. One of the three hard problems in
> CS..
>
>
> Is question, if this functionality should be enabled by GUC to be used for
>> legacy code (as protection against some kind of hidden bugs)
>>
>> This topic is interesting idea for me - some checks can be pushed to
>> plpgsql_check (as errors or warnings) too.
>>
>> Generally I like proposed functionality, just I am not sure, so hidden
>> redefining INTO clause (to INTO STRICT) is what we want. We can do it (but
>> explicitly). I don't know any situation where INTO without STRICT is
>> valid.
>> Introduction of STRICT option was wrong idea - and now is not way to back.
>>
>
> Note that this is different from implicitly STRICTifying every INTO, like
> I said above.
>
>
> Regards,
> Marko Tiikkaja
>


From: Florian Pflug <fgp(at)phlo(dot)org>
To: Marko Tiikkaja <marko(at)joh(dot)to>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-12 21:19:55
Message-ID: 2FA76496-0530-4EFC-B1DF-53078CE614A6@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jan12, 2014, at 06:51 , Marko Tiikkaja <marko(at)joh(dot)to> wrote:
> I would humbly like to submit for your consideration my proposal for alleviating pain caused by one of the most annoying footguns in PL/PgSQL: the behaviour of SELECT .. INTO when the query returns more than one row. Some of you might know that no exception is raised in this case (as opposed to INSERT/UPDATE/DELETE .. INTO, all of them yielding TOO_MANY_ROWS), which can hide subtle bugs in queries if during testing the query always returns only one row or the "correct" one happens to be picked up every time. Additionally, the row_count() after execution is always going to be either 0 or 1, so even if you want to explicitly guard against potentially broken queries, you can't do so!
>
> So I added the following compile-time option:
>
> set plpgsql.consistent_into to true;

I don't think a GUC is the best way to handle this. Handling
this via a per-function setting similar to #variable_conflict would
IMHO be better.So a function containing

#into_surplus_rows error

would complain whereas

#into_surplus_rows ignore_for_select

would leave the behaviour unchanged.

best regards,
Florian Pflug


From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-12 21:31:24
Message-ID: 52D309AC.8000609@joh.to
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/12/14, 10:19 PM, Florian Pflug wrote:
> On Jan12, 2014, at 06:51 , Marko Tiikkaja <marko(at)joh(dot)to> wrote:
>>
>> set plpgsql.consistent_into to true;
>
> I don't think a GUC is the best way to handle this. Handling
> this via a per-function setting similar to #variable_conflict would
> IMHO be better.

This is exactly what the patch does. A global default in the form of a
GUC, and then a per-function option for overriding that default.

Regards,
Marko Tiikkaja


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Marko Tiikkaja <marko(at)joh(dot)to>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-12 21:37:34
Message-ID: CAFj8pRDjPWGdLgqLvEsRuMb6Xe7F4PKNOExZ8yNyMV_uQj3DFw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2014/1/12 Florian Pflug <fgp(at)phlo(dot)org>

> On Jan12, 2014, at 06:51 , Marko Tiikkaja <marko(at)joh(dot)to> wrote:
> > I would humbly like to submit for your consideration my proposal for
> alleviating pain caused by one of the most annoying footguns in PL/PgSQL:
> the behaviour of SELECT .. INTO when the query returns more than one row.
> Some of you might know that no exception is raised in this case (as
> opposed to INSERT/UPDATE/DELETE .. INTO, all of them yielding
> TOO_MANY_ROWS), which can hide subtle bugs in queries if during testing the
> query always returns only one row or the "correct" one happens to be picked
> up every time. Additionally, the row_count() after execution is always
> going to be either 0 or 1, so even if you want to explicitly guard against
> potentially broken queries, you can't do so!
> >
> > So I added the following compile-time option:
> >
> > set plpgsql.consistent_into to true;
>
> I don't think a GUC is the best way to handle this. Handling
> this via a per-function setting similar to #variable_conflict would
> IMHO be better.So a function containing
>
> #into_surplus_rows error
>
> would complain whereas
>
> #into_surplus_rows ignore_for_select
>
> would leave the behaviour unchanged.
>

There is GUC for variable_conflict already too. In this case I would to
enable this functionality everywhere (it is tool how to simply eliminate
some kind of strange bugs) so it needs a GUC.

We have GUC for plpgsql.variable_conflict three years and I don't know
about any problem.

Regards

Pavel

>
> best regards,
> Florian Pflug
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


From: Florian Pflug <fgp(at)phlo(dot)org>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Marko Tiikkaja <marko(at)joh(dot)to>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-12 22:44:19
Message-ID: 2A2714CB-5A14-4ED2-92CD-CAE4E4DA3198@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jan12, 2014, at 22:37 , Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> There is GUC for variable_conflict already too. In this case I would to
> enable this functionality everywhere (it is tool how to simply eliminate
> some kind of strange bugs) so it needs a GUC.
>
> We have GUC for plpgsql.variable_conflict three years and I don't know
> about any problem.

I must say I hate behaviour-changing GUCs with quite some passion. IMHO
they tend to cause bugs, not avoid them, in the long run. The pattern
usually is

1) Code gets written, depends on some particular set of settings
to work correctly

2) Code gets reused, with little further testing since it's supposed
to be battle-proven anyway. Settings get dropped.

3) Code blows up for those corner-cases where the setting actually
matter. Debugging is hell, because you effectively have to go
over the code line-by-line and check if it might be affected by
some GUC or another.

Only a few days ago I spent more than an hour tracking down a bug
which, as it turned out, was caused by a regex which subtly changed its
meaning depending on whether standard_conforming_strings is on or off.

Some GUCs are unavoidable - standard_conforming_strings, for example
probably still was a good idea, since the alternative would have been
to stick with the historical, non-standard behaviour forever.

But in this case, my feeling is that the trouble such a GUC may cause
out-weights the potential benefits. I'm all for having a directive like
#consistent_into (though I feel that the name could convey the
meaning better). If we *really* think that this ought to be the default
from 9.4 onward, then we should

*) Change it to always complain, except if the function explictly
specifies "#consistent_into on" or whatever.

*) Have pg_dump add that to all plpgsql functions if the server
version is < 9.4 or whatever major release this ends up in

That's all just my opinion of course.

best regards,
Florian Pflug


From: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
To: Florian Pflug <fgp(at)phlo(dot)org>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Marko Tiikkaja <marko(at)joh(dot)to>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-13 00:54:45
Message-ID: 52D33955.4050604@archidevsys.co.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 13/01/14 11:44, Florian Pflug wrote:
> On Jan12, 2014, at 22:37 , Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> There is GUC for variable_conflict already too. In this case I would to
>> enable this functionality everywhere (it is tool how to simply eliminate
>> some kind of strange bugs) so it needs a GUC.
>>
>> We have GUC for plpgsql.variable_conflict three years and I don't know
>> about any problem.
> I must say I hate behaviour-changing GUCs with quite some passion. IMHO
> they tend to cause bugs, not avoid them, in the long run. The pattern
> usually is
>
> 1) Code gets written, depends on some particular set of settings
> to work correctly
>
> 2) Code gets reused, with little further testing since it's supposed
> to be battle-proven anyway. Settings get dropped.
>
> 3) Code blows up for those corner-cases where the setting actually
> matter. Debugging is hell, because you effectively have to go
> over the code line-by-line and check if it might be affected by
> some GUC or another.
>
> Only a few days ago I spent more than an hour tracking down a bug
> which, as it turned out, was caused by a regex which subtly changed its
> meaning depending on whether standard_conforming_strings is on or off.
>
> Some GUCs are unavoidable - standard_conforming_strings, for example
> probably still was a good idea, since the alternative would have been
> to stick with the historical, non-standard behaviour forever.
>
> But in this case, my feeling is that the trouble such a GUC may cause
> out-weights the potential benefits. I'm all for having a directive like
> #consistent_into (though I feel that the name could convey the
> meaning better). If we *really* think that this ought to be the default
> from 9.4 onward, then we should
>
> *) Change it to always complain, except if the function explictly
> specifies "#consistent_into on" or whatever.
>
> *) Have pg_dump add that to all plpgsql functions if the server
> version is < 9.4 or whatever major release this ends up in
>
> That's all just my opinion of course.
>
> best regards,
> Florian Pflug
>
>
>
Possibly there should be a warning put out, whenever someone uses some
behaviour that requires a GUC set to a non-default value?

Cheers,
Gavin


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Marko Tiikkaja <marko(at)joh(dot)to>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-13 05:06:42
Message-ID: CAFj8pRDB_WUpWkG85eDQ1UEFFUuEOi7JhBLdpu2TEQT=j4DfAA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2014/1/12 Florian Pflug <fgp(at)phlo(dot)org>

> On Jan12, 2014, at 22:37 , Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> > There is GUC for variable_conflict already too. In this case I would to
> > enable this functionality everywhere (it is tool how to simply eliminate
> > some kind of strange bugs) so it needs a GUC.
> >
> > We have GUC for plpgsql.variable_conflict three years and I don't know
> > about any problem.
>
> I must say I hate behaviour-changing GUCs with quite some passion. IMHO
> they tend to cause bugs, not avoid them, in the long run. The pattern
> usually is
>
> 1) Code gets written, depends on some particular set of settings
> to work correctly
>
> 2) Code gets reused, with little further testing since it's supposed
> to be battle-proven anyway. Settings get dropped.
>
> 3) Code blows up for those corner-cases where the setting actually
> matter. Debugging is hell, because you effectively have to go
> over the code line-by-line and check if it might be affected by
> some GUC or another.
>
> Only a few days ago I spent more than an hour tracking down a bug
> which, as it turned out, was caused by a regex which subtly changed its
> meaning depending on whether standard_conforming_strings is on or off.
>
> Some GUCs are unavoidable - standard_conforming_strings, for example
> probably still was a good idea, since the alternative would have been
> to stick with the historical, non-standard behaviour forever.
>
> But in this case, my feeling is that the trouble such a GUC may cause
> out-weights the potential benefits. I'm all for having a directive like
> #consistent_into (though I feel that the name could convey the
> meaning better). If we *really* think that this ought to be the default
> from 9.4 onward, then we should
>
> *) Change it to always complain, except if the function explictly
> specifies "#consistent_into on" or whatever.
>
> *) Have pg_dump add that to all plpgsql functions if the server
> version is < 9.4 or whatever major release this ends up in
>

I disagree - automatic code injection can is strange. Is not problem inject
code from simple DO statement, but I dislike append lines to source code
without any specific user request.

Maybe this problem with GUC can be solved in 9.4. Some specific GUC can be
ported with database.

Pavel

>
> That's all just my opinion of course.
>
> best regards,
> Florian Pflug
>
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
Cc: Florian Pflug <fgp(at)phlo(dot)org>, Marko Tiikkaja <marko(at)joh(dot)to>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-13 07:27:35
Message-ID: CAFj8pRCQ34VZXg-doeoDTx6FEo4qMNoERobu_KeN4KLhVexm0Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2014/1/13 Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>

> On 13/01/14 11:44, Florian Pflug wrote:
>
>> On Jan12, 2014, at 22:37 , Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>
>>> There is GUC for variable_conflict already too. In this case I would to
>>> enable this functionality everywhere (it is tool how to simply eliminate
>>> some kind of strange bugs) so it needs a GUC.
>>>
>>> We have GUC for plpgsql.variable_conflict three years and I don't know
>>> about any problem.
>>>
>> I must say I hate behaviour-changing GUCs with quite some passion. IMHO
>> they tend to cause bugs, not avoid them, in the long run. The pattern
>> usually is
>>
>> 1) Code gets written, depends on some particular set of settings
>> to work correctly
>>
>> 2) Code gets reused, with little further testing since it's supposed
>> to be battle-proven anyway. Settings get dropped.
>>
>> 3) Code blows up for those corner-cases where the setting actually
>> matter. Debugging is hell, because you effectively have to go
>> over the code line-by-line and check if it might be affected by
>> some GUC or another.
>>
>> Only a few days ago I spent more than an hour tracking down a bug
>> which, as it turned out, was caused by a regex which subtly changed its
>> meaning depending on whether standard_conforming_strings is on or off.
>>
>> Some GUCs are unavoidable - standard_conforming_strings, for example
>> probably still was a good idea, since the alternative would have been
>> to stick with the historical, non-standard behaviour forever.
>>
>> But in this case, my feeling is that the trouble such a GUC may cause
>> out-weights the potential benefits. I'm all for having a directive like
>> #consistent_into (though I feel that the name could convey the
>> meaning better). If we *really* think that this ought to be the default
>> from 9.4 onward, then we should
>>
>> *) Change it to always complain, except if the function explictly
>> specifies "#consistent_into on" or whatever.
>>
>> *) Have pg_dump add that to all plpgsql functions if the server
>> version is < 9.4 or whatever major release this ends up in
>>
>> That's all just my opinion of course.
>>
>> best regards,
>> Florian Pflug
>>
>>
>>
>> Possibly there should be a warning put out, whenever someone uses some
> behaviour that requires a GUC set to a non-default value?
>

It is a good idea. I though about it. A worry about GUC are legitimate, but
we are most static and sometimes we try to design bigger creatures, than we
try to solve.

I am thinking so dump can contain a serialized GUC values, and can raises
warnings when GUC are different (not only different from default).

Similar problems are with different FROM_COLAPS_LIMIT, JOIN_COLAPS_LIMIT,
WORK_MEM, ...

Regards

Pavel

>
>
> Cheers,
> Gavin
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Marko Tiikkaja <marko(at)joh(dot)to>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-13 07:44:24
Message-ID: CAFj8pRAmhU-3Ffh4qS0Xou59V4a5xsojNejt+baASHS7eF3+Pg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2014/1/12 Florian Pflug <fgp(at)phlo(dot)org>

> On Jan12, 2014, at 22:37 , Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> > There is GUC for variable_conflict already too. In this case I would to
> > enable this functionality everywhere (it is tool how to simply eliminate
> > some kind of strange bugs) so it needs a GUC.
> >
> > We have GUC for plpgsql.variable_conflict three years and I don't know
> > about any problem.
>
> I must say I hate behaviour-changing GUCs with quite some passion. IMHO
> they tend to cause bugs, not avoid them, in the long run. The pattern
> usually is
>
> 1) Code gets written, depends on some particular set of settings
> to work correctly
>
> 2) Code gets reused, with little further testing since it's supposed
> to be battle-proven anyway. Settings get dropped.
>
> 3) Code blows up for those corner-cases where the setting actually
> matter. Debugging is hell, because you effectively have to go
> over the code line-by-line and check if it might be affected by
> some GUC or another.
>
> Only a few days ago I spent more than an hour tracking down a bug
> which, as it turned out, was caused by a regex which subtly changed its
> meaning depending on whether standard_conforming_strings is on or off.
>
> Some GUCs are unavoidable - standard_conforming_strings, for example
> probably still was a good idea, since the alternative would have been
> to stick with the historical, non-standard behaviour forever.
>
> But in this case, my feeling is that the trouble such a GUC may cause
> out-weights the potential benefits. I'm all for having a directive like
> #consistent_into (though I feel that the name could convey the
> meaning better). If we *really* think that this ought to be the default
> from 9.4 onward, then we should
>
> *) Change it to always complain, except if the function explictly
> specifies "#consistent_into on" or whatever.
>
> *) Have pg_dump add that to all plpgsql functions if the server
> version is < 9.4 or whatever major release this ends up in
>
> That's all just my opinion of course.
>

I am thinking so GUC and plpgsql option can live together. If you like to
accent a some behave, then you can use a plpgsql option. On second hand, I
would to use a some functionality, that is safe, but I don't would to dirty
source code by using repeated options. But I have to check (and calculate
with risk) a GUC settings.

One idea: required GUC? Can be nice a possibility to ensure some GUC
setting, and restore ensure these values or raises warning.

Back to main topic. Required and described feature doesn't change a behave
of INTO clause. I can enable or disable this functionality and well written
code should to work without change (and problems). When check is disabled,
then execution is just less safe. So in this case, a impact of GUC is
significantly less than by you described issues. Does know anybody a use
case where this check should be disabled?

Probably we have a different experience about GUC. I had a problem with
standard_conforming_strings and bytea format some years ago. Now I prepare
document about required setting. But I can see (from my experience from
Czech area) more often problems related to effective_cache_size or
from_collapse_limit and similar GUC. These parameters are behind knowledge
(and visibility) typical user.

Best regards

Pavel

>
> best regards,
> Florian Pflug
>
>


From: Jim Nasby <jim(at)nasby(dot)net>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>
Cc: Marko Tiikkaja <marko(at)joh(dot)to>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-13 21:49:45
Message-ID: 52D45F79.5060107@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/13/14, 1:44 AM, Pavel Stehule wrote:
>
>
>
> 2014/1/12 Florian Pflug <fgp(at)phlo(dot)org <mailto:fgp(at)phlo(dot)org>>
>
> On Jan12, 2014, at 22:37 , Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com <mailto:pavel(dot)stehule(at)gmail(dot)com>> wrote:
> > There is GUC for variable_conflict already too. In this case I would to
> > enable this functionality everywhere (it is tool how to simply eliminate
> > some kind of strange bugs) so it needs a GUC.
> >
> > We have GUC for plpgsql.variable_conflict three years and I don't know
> > about any problem.
>
> I must say I hate behaviour-changing GUCs with quite some passion. IMHO
> they tend to cause bugs, not avoid them, in the long run. The pattern
> usually is
>
> 1) Code gets written, depends on some particular set of settings
> to work correctly
>
> 2) Code gets reused, with little further testing since it's supposed
> to be battle-proven anyway. Settings get dropped.
>
> 3) Code blows up for those corner-cases where the setting actually
> matter. Debugging is hell, because you effectively have to go
> over the code line-by-line and check if it might be affected by
> some GUC or another.
>
> Only a few days ago I spent more than an hour tracking down a bug
> which, as it turned out, was caused by a regex which subtly changed its
> meaning depending on whether standard_conforming_strings is on or off.
>
> Some GUCs are unavoidable - standard_conforming_strings, for example
> probably still was a good idea, since the alternative would have been
> to stick with the historical, non-standard behaviour forever.
>
> But in this case, my feeling is that the trouble such a GUC may cause
> out-weights the potential benefits. I'm all for having a directive like
> #consistent_into (though I feel that the name could convey the
> meaning better). If we *really* think that this ought to be the default
> from 9.4 onward, then we should
>
> *) Change it to always complain, except if the function explictly
> specifies "#consistent_into on" or whatever.
>
> *) Have pg_dump add that to all plpgsql functions if the server
> version is < 9.4 or whatever major release this ends up in
>
> That's all just my opinion of course.
>
>
> I am thinking so GUC and plpgsql option can live together. If you like to accent a some behave, then you can use a plpgsql option. On second hand, I would to use a some functionality, that is safe, but I don't would to dirty source code by using repeated options. But I have to check (and calculate with risk) a GUC settings.
>
> One idea: required GUC? Can be nice a possibility to ensure some GUC setting, and restore ensure these values or raises warning.
>
> Back to main topic. Required and described feature doesn't change a behave of INTO clause. I can enable or disable this functionality and well written code should to work without change (and problems). When check is disabled, then execution is just less safe. So in this case, a impact of GUC is significantly less than by you described issues. Does know anybody a use case where this check should be disabled?
>
> Probably we have a different experience about GUC. I had a problem with standard_conforming_strings and bytea format some years ago. Now I prepare document about required setting. But I can see (from my experience from Czech area) more often problems related to effective_cache_size or from_collapse_limit and similar GUC. These parameters are behind knowledge (and visibility) typical user.

ISTM that in this case, it should be safe to make the new default behavior STRICT; if you forget to set the GUC to disable than you'll get an error that points directly at the problem, at which point you'll go "Oh, yeah... I forgot to set X..."

Outside of the GUC, I believe the default should definitely be STRICT. If your app is relying on non-strict then you need to be made aware of that. We should be able to provide a DO block that will change this setting for every function you've got if someone isn't happy with STRICT mode.
--
Jim C. Nasby, Data Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net


From: Florian Pflug <fgp(at)phlo(dot)org>
To: Jim Nasby <jim(at)nasby(dot)net>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-13 23:41:09
Message-ID: D101ADAF-4C70-4FFF-9E35-17B5D4E8370E@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jan13, 2014, at 22:49 , Jim Nasby <jim(at)nasby(dot)net> wrote:
> ISTM that in this case, it should be safe to make the new default behavior STRICT;
> if you forget to set the GUC to disable than you'll get an error that points directly
> at the problem, at which point you'll go "Oh, yeah... I forgot to set X..."

What do you mean by STRICT? STRICT (which we already support) complains if the
query doesn't return *exactly* one row. What Marko wants is to raise an error
for a plain SELECT ... INTO if more than one row is returned, but to still
convert zero rows to NULL.

> Outside of the GUC, I believe the default should definitely be STRICT. If your app is
> relying on non-strict then you need to be made aware of that. We should be able to
> provide a DO block that will change this setting for every function you've got if
> someone isn't happy with STRICT mode.

If you mean that we should change SELECT ... INTO to always behave as if STRICT
had been specified - why on earth would we want to do that? That would break
perfectly fine code for no good reason whatsoever.

In fact, after reading the documentation on SELECT ... INTO, I'm convinced the
the whole consistent_into thing is a bad idea. The documentation states clearly
that

For INSERT/UPDATE/DELETE with RETURNING, PL/pgSQL reports an error for more than
one returned row, even when STRICT is not specified. This is because there is no
option such as ORDER BY with which to determine which affected row should be
returned.

It therefor isn't an oversight that SELECT ... INTO allows multiple result rows
but INSERT/UPDATE/DELETE forbids them, it's been done that way on purpose and
for a reason. We shouldn't be second-guessing ourselves by changing that later -
not, at least, unless we have a *very* good reason for it. Which, AFAICS, we don't.

(And yeah, personally I'd prefer if we'd complain about multiple rows. But it's
IMHO just too late for that)

best regards,
Florian Pflug


From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Florian Pflug <fgp(at)phlo(dot)org>, Jim Nasby <jim(at)nasby(dot)net>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-13 23:52:50
Message-ID: 52D47C52.2070405@joh.to
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/14/14, 12:41 AM, Florian Pflug wrote:
> In fact, after reading the documentation on SELECT ... INTO, I'm convinced the
> the whole consistent_into thing is a bad idea. The documentation states clearly
> that
>
> For INSERT/UPDATE/DELETE with RETURNING, PL/pgSQL reports an error for more than
> one returned row, even when STRICT is not specified. This is because there is no
> option such as ORDER BY with which to determine which affected row should be
> returned.
>
> It therefor isn't an oversight that SELECT ... INTO allows multiple result rows
> but INSERT/UPDATE/DELETE forbids them, it's been done that way on purpose and
> for a reason.

Yeah, it does state that. But it's a BS reason. In addition to ORDER
BY, SELECT also has a LIMIT which you can use to get the "first row"
behaviour. There's no way to go to the more sane behaviour from what we
have right now.

When I've worked with PL/PgSQL, this has been a source of a few bugs
that would have been noticed during testing if the behaviour of INTO
wasn't as dangerous as it is right now. Yes, it breaks backwards
compatibility, but that's why there's a nice GUC. If we're not going to
scrap PL/PgSQL and start over again, we are going to have to do changes
like this to make the language better. Also I think that out of all the
things we could do to break backwards compatibility, this is closer to
"harmless" than "a pain in the butt".

Regards,
Marko Tiikkaja


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Florian Pflug <fgp(at)phlo(dot)org>, Jim Nasby <jim(at)nasby(dot)net>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-13 23:57:53
Message-ID: 52D47D81.2010404@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/13/2014 03:41 PM, Florian Pflug wrote:
> It therefor isn't an oversight that SELECT ... INTO allows multiple result rows
> but INSERT/UPDATE/DELETE forbids them, it's been done that way on purpose and
> for a reason. We shouldn't be second-guessing ourselves by changing that later -
> not, at least, unless we have a *very* good reason for it. Which, AFAICS, we don't.
>
> (And yeah, personally I'd prefer if we'd complain about multiple rows. But it's
> IMHO just too late for that)

I *really* don't want to go through all my old code to find places where
I used SELECT ... INTO just to pop off the first row, and ignored the
rest. I doubt anyone else does, either.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Florian Pflug <fgp(at)phlo(dot)org>
To: Marko Tiikkaja <marko(at)joh(dot)to>
Cc: Jim Nasby <jim(at)nasby(dot)net>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-14 00:16:44
Message-ID: EBBD010D-E874-467B-8344-FF49B4924EC5@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jan14, 2014, at 00:52 , Marko Tiikkaja <marko(at)joh(dot)to> wrote:
> When I've worked with PL/PgSQL, this has been a source of a few bugs that
> would have been noticed during testing if the behaviour of INTO wasn't as
> dangerous as it is right now.

The question is, how many bugs stemmed from wrong SQL queries, and what
percentage of those would have been caught by this? The way I see it, there
are thousands of ways to screw up a query, and having it return multiple
rows instead of one is just one of them.

> Yes, it breaks backwards compatibility, but that's why there's a nice GUC.

Which doesn't help, because the GUC isn't tied to the code. This *adds*
an error case, not remove one - now, instead of getting your code correct,
you *also* have to get the GUC correct. If you even *know* that such a GUC
exists.

> If we're not going to scrap PL/PgSQL and
> start over again, we are going to have to do changes like this to make the
> language better. Also I think that out of all the things we could do to
> break backwards compatibility, this is closer to "harmless" than "a pain
> in the butt".

I very strongly believe that languages don't get better by adding a thousand
little knobs which subtly change semantics. Look at the mess that is PHP -
we absolutely, certainly don't want to go there. The most important rule in
language design is in my opinion "stick with your choices". C, C++ and JAVA
all seem to follow this, and it's one of the reasons these languages are
popular for big projects, I think.

The way I see it, the only OK way to change existing behaviour is to have
the concept of a "language version", and force code to indicate the language
version it expects. The important thing is that the language version is an
attribute of code, not some global setting that you can change without ever
looking at the code it'd affect.

So if we really want to change this, I think we need to have a
LANGUAGE_VERSION attribute on functions. Each time a major postgres release
changes the behaviour of one of the procedural languages, we'd increment
that language's version, and enable the old behaviour for all functions
tagged with an older one.

best regards,
Florian Pflug


From: Jim Nasby <jim(at)nasby(dot)net>
To: Josh Berkus <josh(at)agliodbs(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-14 00:20:36
Message-ID: 52D482D4.2090902@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/13/14, 5:57 PM, Josh Berkus wrote:
> On 01/13/2014 03:41 PM, Florian Pflug wrote:
>> It therefor isn't an oversight that SELECT ... INTO allows multiple result rows
>> but INSERT/UPDATE/DELETE forbids them, it's been done that way on purpose and
>> for a reason. We shouldn't be second-guessing ourselves by changing that later -
>> not, at least, unless we have a *very* good reason for it. Which, AFAICS, we don't.
>>
>> (And yeah, personally I'd prefer if we'd complain about multiple rows. But it's
>> IMHO just too late for that)
>
> I *really* don't want to go through all my old code to find places where
> I used SELECT ... INTO just to pop off the first row, and ignored the
> rest. I doubt anyone else does, either.

Do you regularly have use cases where you actually want just one RANDOM row? I suspect the far more likely scenario is that people write code assuming they'll get only one row and they'll end up with extremely hard to trace bugs if that assumption is ever wrong.
--
Jim C. Nasby, Data Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net


From: Jim Nasby <jim(at)nasby(dot)net>
To: Florian Pflug <fgp(at)phlo(dot)org>, Marko Tiikkaja <marko(at)joh(dot)to>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-14 00:25:16
Message-ID: 52D483EC.2060001@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/13/14, 6:16 PM, Florian Pflug wrote:
> On Jan14, 2014, at 00:52 , Marko Tiikkaja <marko(at)joh(dot)to> wrote:
>> When I've worked with PL/PgSQL, this has been a source of a few bugs that
>> would have been noticed during testing if the behaviour of INTO wasn't as
>> dangerous as it is right now.
>
> The question is, how many bugs stemmed from wrong SQL queries, and what
> percentage of those would have been caught by this? The way I see it, there
> are thousands of ways to screw up a query, and having it return multiple
> rows instead of one is just one of them.

A query that's simply wrong is more likely to fail consistently. Non-strict use of INTO is going to fail in very subtle ways (unless you actually DO want just the first row, in which case you should explicitly use LIMIT 1).

>> If we're not going to scrap PL/PgSQL and
>> start over again, we are going to have to do changes like this to make the
>> language better. Also I think that out of all the things we could do to
>> break backwards compatibility, this is closer to "harmless" than "a pain
>> in the butt".
>
> I very strongly believe that languages don't get better by adding a thousand
> little knobs which subtly change semantics. Look at the mess that is PHP -
> we absolutely, certainly don't want to go there. The most important rule in
> language design is in my opinion "stick with your choices". C, C++ and JAVA
> all seem to follow this, and it's one of the reasons these languages are
> popular for big projects, I think.
>
> The way I see it, the only OK way to change existing behaviour is to have
> the concept of a "language version", and force code to indicate the language
> version it expects. The important thing is that the language version is an
> attribute of code, not some global setting that you can change without ever
> looking at the code it'd affect.
>
> So if we really want to change this, I think we need to have a
> LANGUAGE_VERSION attribute on functions. Each time a major postgres release
> changes the behaviour of one of the procedural languages, we'd increment
> that language's version, and enable the old behaviour for all functions
> tagged with an older one.

I like that idea. It allows us to fix past decisions that were ill considered without hosing all existing code.

BTW, have we always had support for STRICT, or was it added at some point? It's in 8.4, but I don't know how far back it goes.

And if we've always had it, why on earth didn't we make STRICT the default behavior?
--
Jim C. Nasby, Data Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net


From: Florian Pflug <fgp(at)phlo(dot)org>
To: Jim Nasby <jim(at)nasby(dot)net>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-14 00:36:49
Message-ID: 4B6AC8BD-FDE6-4944-B793-84A5A85F15E9@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

(Responding to both of your mails here)

On Jan14, 2014, at 01:20 , Jim Nasby <jim(at)nasby(dot)net> wrote:
> On 1/13/14, 5:57 PM, Josh Berkus wrote:
>> On 01/13/2014 03:41 PM, Florian Pflug wrote:
>>> It therefor isn't an oversight that SELECT ... INTO allows multiple result rows
>>> but INSERT/UPDATE/DELETE forbids them, it's been done that way on purpose and
>>> for a reason. We shouldn't be second-guessing ourselves by changing that later -
>>> not, at least, unless we have a *very* good reason for it. Which, AFAICS, we don't.
>>>
>>> (And yeah, personally I'd prefer if we'd complain about multiple rows. But it's
>>> IMHO just too late for that)
>>
>> I *really* don't want to go through all my old code to find places where
>> I used SELECT ... INTO just to pop off the first row, and ignored the
>> rest. I doubt anyone else does, either.
>
> Do you regularly have use cases where you actually want just one RANDOM row?
> I suspect the far more likely scenario is that people write code assuming they'll
> get only one row and they'll end up with extremely hard to trace bugs if that
> assumption is ever wrong.

One case that immediatly comes to mind is a JOIN which sometimes returns
multiple rows, and a projection clause that only uses one of the tables
involved in the join.

Another are queries including an ORDER BY - I don't think the patch makes an
exception for those, and even if it did, it probably wouldn't catch all
cases, like e.g. an SRF which returns the rows in a deterministic order.

Or maybe you're picking a row to process next, and don't really care about
the order in which you work through them.

>> The question is, how many bugs stemmed from wrong SQL queries, and what
>> percentage of those would have been caught by this? The way I see it, there
>> are thousands of ways to screw up a query, and having it return multiple
>> rows instead of one is just one of them.
>
> A query that's simply wrong is more likely to fail consistently. Non-strict
> use of INTO is going to fail in very subtle ways (unless you actually DO want
> just the first row, in which case you should explicitly use LIMIT 1).

How so? Say you expect "SELECT * FROM view WHERE c=<n>" to only ever return
one row. Then "SELECT sum(f) FROM table JOIN view ON table.c = view.c" is
just as subtly wrong as the first query is.

> And if we've always had it, why on earth didn't we make STRICT the default
> behavior?

Dunno, but AFAIK pl/pgsql mimics Oracle's PL/SQL, at least in some aspects,
so maybe this is one of the areas where we just do what oracle does.

best regards,
Florian Pflug


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Jim Nasby <jim(at)nasby(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-14 00:57:04
Message-ID: 24488.1389661024@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Florian Pflug <fgp(at)phlo(dot)org> writes:
> On Jan14, 2014, at 01:20 , Jim Nasby <jim(at)nasby(dot)net> wrote:
>> And if we've always had it, why on earth didn't we make STRICT the default
>> behavior?

> Dunno, but AFAIK pl/pgsql mimics Oracle's PL/SQL, at least in some aspects,
> so maybe this is one of the areas where we just do what oracle does.

STRICT is a relatively recent addition (8.2, looks like). One of the
reasons it got in was that the author didn't have any grandiose ideas
about making it the new default. I think this patch would have a lot
better chance if it was just adding another keyword as an alternative
to STRICT, and not hoping to impose the author's opinions on the rest
of the world. Whatever your opinion of the default behavior, the
fact that it's been that way for upwards of fifteen years without any
mass protests should give you pause about changing it.

regards, tom lane


From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Florian Pflug <fgp(at)phlo(dot)org>
Cc: Jim Nasby <jim(at)nasby(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-14 01:05:37
Message-ID: 52D48D61.2090101@joh.to
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/14/14, 1:57 AM, Tom Lane wrote:
> Whatever your opinion of the default behavior, the
> fact that it's been that way for upwards of fifteen years without any
> mass protests should give you pause about changing it.

For what it's worth, my patch does not change the default behaviour. I
don't think I've ever publicly said that it should.

Regards,
Marko Tiikkaja


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Jim Nasby <jim(at)nasby(dot)net>, Florian Pflug <fgp(at)phlo(dot)org>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-14 01:06:41
Message-ID: 52D48DA1.8060107@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/13/2014 04:20 PM, Jim Nasby wrote:
> On 1/13/14, 5:57 PM, Josh Berkus wrote:
>> I *really* don't want to go through all my old code to find places where
>> I used SELECT ... INTO just to pop off the first row, and ignored the
>> rest. I doubt anyone else does, either.
>
> Do you regularly have use cases where you actually want just one RANDOM
> row? I suspect the far more likely scenario is that people write code
> assuming they'll get only one row and they'll end up with extremely hard
> to trace bugs if that assumption is ever wrong.

Regularly? No. But I've seen it, especially as part of a "does this
query return any rows?" test. That's not the best way to test that, but
that doesn't stop a lot of people doing it.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Jim Nasby <jim(at)nasby(dot)net>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-14 01:08:58
Message-ID: 52D48E2A.8040300@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/13/14, 6:36 PM, Florian Pflug wrote:
> On Jan14, 2014, at 01:20 , Jim Nasby<jim(at)nasby(dot)net> wrote:
>> >On 1/13/14, 5:57 PM, Josh Berkus wrote:
>>> >>On 01/13/2014 03:41 PM, Florian Pflug wrote:
>>>> >>>It therefor isn't an oversight that SELECT ... INTO allows multiple result rows
>>>> >>>but INSERT/UPDATE/DELETE forbids them, it's been done that way on purpose and
>>>> >>>for a reason. We shouldn't be second-guessing ourselves by changing that later -
>>>> >>>not, at least, unless we have a*very* good reason for it. Which, AFAICS, we don't.
>>>> >>>
>>>> >>>(And yeah, personally I'd prefer if we'd complain about multiple rows. But it's
>>>> >>>IMHO just too late for that)
>>> >>
>>> >>I*really* don't want to go through all my old code to find places where
>>> >>I used SELECT ... INTO just to pop off the first row, and ignored the
>>> >>rest. I doubt anyone else does, either.
>> >
>> >Do you regularly have use cases where you actually want just one RANDOM row?
>> >I suspect the far more likely scenario is that people write code assuming they'll
>> >get only one row and they'll end up with extremely hard to trace bugs if that
>> >assumption is ever wrong.
> One case that immediatly comes to mind is a JOIN which sometimes returns
> multiple rows, and a projection clause that only uses one of the tables
> involved in the join.

Which is just bad coding IMHO.

> Another are queries including an ORDER BY - I don't think the patch makes an
> exception for those, and even if it did, it probably wouldn't catch all
> cases, like e.g. an SRF which returns the rows in a deterministic order.

Sure, but if you've gone to the trouble of putting the ORDER BY in, how hard is it to add LIMIT 1?

> Or maybe you're picking a row to process next, and don't really care about
> the order in which you work through them.

Again, how hard is LIMIT 1?

BTW, my issue here isn't with typing out " STRICT". I'd be fine if the way things worked was you had to specify INTO STRICT or INTO LOOSE (or whatever you want to call the opposite of strict).

My problem is that the default here is plain and simple a bad choice for default behavior. In light of Tom's research showing this was added in 8.2 I understand why we went that route, but I'd really like a way to change the default. Or even disallow it (IE: force the user to state which route they're going here).
--
Jim C. Nasby, Data Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net


From: Jim Nasby <jim(at)nasby(dot)net>
To: Josh Berkus <josh(at)agliodbs(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-14 01:10:49
Message-ID: 52D48E99.8000703@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/13/14, 7:06 PM, Josh Berkus wrote:
> On 01/13/2014 04:20 PM, Jim Nasby wrote:
>> On 1/13/14, 5:57 PM, Josh Berkus wrote:
>>> I *really* don't want to go through all my old code to find places where
>>> I used SELECT ... INTO just to pop off the first row, and ignored the
>>> rest. I doubt anyone else does, either.
>>
>> Do you regularly have use cases where you actually want just one RANDOM
>> row? I suspect the far more likely scenario is that people write code
>> assuming they'll get only one row and they'll end up with extremely hard
>> to trace bugs if that assumption is ever wrong.
>
> Regularly? No. But I've seen it, especially as part of a "does this
> query return any rows?" test. That's not the best way to test that, but
> that doesn't stop a lot of people doing it.

Right, and I certainly don't want to force anyone to rewrite all their code. But I'd certainly like a safer default so people don't mistakenly go the "multiple rows is OK" route without doing so very intentionally.
--
Jim C. Nasby, Data Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Jim Nasby <jim(at)nasby(dot)net>, Florian Pflug <fgp(at)phlo(dot)org>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-14 01:27:35
Message-ID: 52D49287.6020502@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/13/2014 05:10 PM, Jim Nasby wrote:
> On 1/13/14, 7:06 PM, Josh Berkus wrote:
>> Regularly? No. But I've seen it, especially as part of a "does this
>> query return any rows?" test. That's not the best way to test that, but
>> that doesn't stop a lot of people doing it.
>
> Right, and I certainly don't want to force anyone to rewrite all their
> code. But I'd certainly like a safer default so people don't mistakenly
> go the "multiple rows is OK" route without doing so very intentionally.

The problem is that if you change the default, you're creating an
unexpected barrier to upgrading. I just don't think that it's worth
doing so in order to meet some standard of code neatness, especially in
plpgsql, the unwanted bastard child of SQL and ADA.

For people who want to enable this in order to prevent stupid query bugs
from creeping into their plpgsql, that's great, let's have an easy
option to turn on. But it's hard enough to get people to upgrade as it
is. If we're going to add an upgrade landmine, it better be for
something really important.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Marti Raudsepp <marti(at)juffo(dot)org>
To: Marko Tiikkaja <marko(at)joh(dot)to>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-14 01:54:48
Message-ID: CABRT9RByLAAg77NGrtvGx39s7k=4HG11-yc_qxz5v=oDtUp_Tg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Jan 12, 2014 at 7:51 AM, Marko Tiikkaja <marko(at)joh(dot)to> wrote:
> the behaviour of SELECT .. INTO when the query returns more than one row.
> Some of you might know that no exception is raised in this case

Agreed. But I also agree with the rest of the thread about changing
current INTO behavior and introducing new GUC variables.

But PL/pgSQL already has an assignment syntax with the behavior you want:

DECLARE
foo int;
BEGIN
foo = generate_series(1,1); -- this is OK
foo = generate_series(1,2); -- fails
foo = 10 WHERE FALSE; -- sets foo to NULL
-- And you can actually do:
foo = some_col FROM some_table WHERE bar=10;
END;

So if we extend this syntax to support multiple columns, it should
satisfy the use cases you care about.

foo, bar = col1, col2 FROM some_table WHERE bar=10;

It's ugly without the explicit SELECT though. Perhaps make the SELECT optional:

foo, bar = SELECT col1, col2 FROM some_table WHERE bar=10;

I think that's more aesthetically pleasing than INTO and also looks
more familiar to other languages.

Plus, now you can copy-paste the query straight to an SQL shell
without another footgun involving creating new tables in your
database.

Regards,
Marti


From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Marti Raudsepp <marti(at)juffo(dot)org>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-14 02:30:13
Message-ID: 52D4A135.1000702@joh.to
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-01-14 02:54, Marti Raudsepp wrote:
> On Sun, Jan 12, 2014 at 7:51 AM, Marko Tiikkaja <marko(at)joh(dot)to> wrote:
>> the behaviour of SELECT .. INTO when the query returns more than one row.
>> Some of you might know that no exception is raised in this case
>
> Agreed. But I also agree with the rest of the thread about changing
> current INTO behavior and introducing new GUC variables.
>
> But PL/pgSQL already has an assignment syntax with the behavior you want:

According to the docs, that doesn't set FOUND which would make this a
pain to deal with..

Regards,
Marko Tiikkaja


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jim Nasby <jim(at)nasby(dot)net>
Cc: Florian Pflug <fgp(at)phlo(dot)org>, Marko Tiikkaja <marko(at)joh(dot)to>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-14 08:21:40
Message-ID: CAFj8pRA5+8kJC0B5Ev_ji9OdO=YY+GQ0Nw3YiEEfTjUv-nn=Yw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>> I am thinking so GUC and plpgsql option can live together. If you like to
>> accent a some behave, then you can use a plpgsql option. On second hand, I
>> would to use a some functionality, that is safe, but I don't would to dirty
>> source code by using repeated options. But I have to check (and calculate
>> with risk) a GUC settings.
>>
>> One idea: required GUC? Can be nice a possibility to ensure some GUC
>> setting, and restore ensure these values or raises warning.
>>
>> Back to main topic. Required and described feature doesn't change a
>> behave of INTO clause. I can enable or disable this functionality and well
>> written code should to work without change (and problems). When check is
>> disabled, then execution is just less safe. So in this case, a impact of
>> GUC is significantly less than by you described issues. Does know anybody a
>> use case where this check should be disabled?
>>
>> Probably we have a different experience about GUC. I had a problem with
>> standard_conforming_strings and bytea format some years ago. Now I prepare
>> document about required setting. But I can see (from my experience from
>> Czech area) more often problems related to effective_cache_size or
>> from_collapse_limit and similar GUC. These parameters are behind knowledge
>> (and visibility) typical user.
>>
>
> ISTM that in this case, it should be safe to make the new default behavior
> STRICT; if you forget to set the GUC to disable than you'll get an error
> that points directly at the problem, at which point you'll go "Oh, yeah...
> I forgot to set X..."
>
>
I disagree - STRICT can be too restrictive - and a combination SELECT INTO
and IF FOUND can be significantly faster - our exception handling is not
cheap.

Regards

Pavel

> Outside of the GUC, I believe the default should definitely be STRICT. If
> your app is relying on non-strict then you need to be made aware of that.
> We should be able to provide a DO block that will change this setting for
> every function you've got if someone isn't happy with STRICT mode.
> --
> Jim C. Nasby, Data Architect jim(at)nasby(dot)net
> 512.569.9461 (cell) http://jim.nasby.net
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Marko Tiikkaja <marko(at)joh(dot)to>, Jim Nasby <jim(at)nasby(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-14 09:16:58
Message-ID: CAFj8pRAmjTB7NxTrguVfCniX=nsGKr16gEHOn7WMVBSnQeVwAw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2014/1/14 Florian Pflug <fgp(at)phlo(dot)org>

> On Jan14, 2014, at 00:52 , Marko Tiikkaja <marko(at)joh(dot)to> wrote:
> > When I've worked with PL/PgSQL, this has been a source of a few bugs that
> > would have been noticed during testing if the behaviour of INTO wasn't as
> > dangerous as it is right now.
>
> The question is, how many bugs stemmed from wrong SQL queries, and what
> percentage of those would have been caught by this? The way I see it, there
> are thousands of ways to screw up a query, and having it return multiple
> rows instead of one is just one of them.
>
> > Yes, it breaks backwards compatibility, but that's why there's a nice
> GUC.
>
> Which doesn't help, because the GUC isn't tied to the code. This *adds*
> an error case, not remove one - now, instead of getting your code correct,
> you *also* have to get the GUC correct. If you even *know* that such a GUC
> exists.
>
> > If we're not going to scrap PL/PgSQL and
> > start over again, we are going to have to do changes like this to make
> the
> > language better. Also I think that out of all the things we could do to
> > break backwards compatibility, this is closer to "harmless" than "a pain
> > in the butt".
>
> I very strongly believe that languages don't get better by adding a
> thousand
> little knobs which subtly change semantics. Look at the mess that is PHP -
> we absolutely, certainly don't want to go there. The most important rule in
> language design is in my opinion "stick with your choices". C, C++ and JAVA
> all seem to follow this, and it's one of the reasons these languages are
> popular for big projects, I think.
>
> The way I see it, the only OK way to change existing behaviour is to have
> the concept of a "language version", and force code to indicate the
> language
> version it expects. The important thing is that the language version is an
> attribute of code, not some global setting that you can change without ever
> looking at the code it'd affect.
>
> So if we really want to change this, I think we need to have a
> LANGUAGE_VERSION attribute on functions. Each time a major postgres release
> changes the behaviour of one of the procedural languages, we'd increment
> that language's version, and enable the old behaviour for all functions
> tagged with an older one.
>

I dislike this proposal

too enterprise, too complex, too bad - after ten years we can have a ten
language versions and it helps nothing.

return back to topic

a) there is agreement so we like this functionality as plpgsql option

b) there is no agreement so we would to see this functionality as default
(or in near future)

@b is a topic, that we should not to solve and it is back again and again.
Sometimes we have success, sometimes not. Temporal GUC is not enough
solution for some people.

So my result - @a can be implement, @b not

and we can continue in other thread about SELECT INTO redesign and about
possibilities that we have or have not. It is about personal perspective -
someone can thinking about missing check after INTO as about feature, other
one can see it as bug. My opinion - it is a bug with possible ambiguous
result and possible negative performance impacts. Probably we can precise a
doc about wrong and good usage SELECT INTO clause.

I am working on plpgsql_debug extension and I am thinking so I am able
(with small change in plpgsql executor) implement this check as extension.
So it can be available for advanced users (that will has a knowledge about
additional plpgsql extensions).

Regards

Pavel

regards

Pavel

>
> best regards,
> Florian Pflug
>
>
>
>
>
>
>


From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>
Cc: Jim Nasby <jim(at)nasby(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-14 10:51:56
Message-ID: 52D516CC.6030706@joh.to
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/14/14 10:16 AM, Pavel Stehule wrote:
> 2014/1/14 Florian Pflug <fgp(at)phlo(dot)org>
>
>> So if we really want to change this, I think we need to have a
>> LANGUAGE_VERSION attribute on functions. Each time a major postgres release
>> changes the behaviour of one of the procedural languages, we'd increment
>> that language's version, and enable the old behaviour for all functions
>> tagged with an older one.
>>
>
> I dislike this proposal
>
> too enterprise, too complex, too bad - after ten years we can have a ten
> language versions and it helps nothing.

At this point I'm almost tempted to agree with Florian -- I'm really
hoping we could change PL/PgSQL for the better over the next 10 years,
but given the backwards compatibility requirements we have, this seems
like an absolute nightmare. Not saying we need a version number we can
keep bumping every release, but something similar.

> return back to topic
>
> a) there is agreement so we like this functionality as plpgsql option
>
> b) there is no agreement so we would to see this functionality as default
> (or in near future)
>
> @b is a topic, that we should not to solve and it is back again and again.
> Sometimes we have success, sometimes not. Temporal GUC is not enough
> solution for some people.
>
> So my result - @a can be implement, @b not

FWIW, I would like to have this behaviour even if it's not the default
(that was my original proposal anyway). It could help catch a lot of
bugs in testing, and for important code it could be even turned on in
production (perhaps on a per-function basis). Maybe even globally, idk.

> I am working on plpgsql_debug extension and I am thinking so I am able
> (with small change in plpgsql executor) implement this check as extension.
> So it can be available for advanced users (that will has a knowledge about
> additional plpgsql extensions).

While this sounds cool, running a modified postgres locally seems a lot
easier. I already have to do that for PL/PgSQL development because of
the reasons I outlined in the thread "PL/PgSQL, RAISE and error context".

Regards,
Marko Tiikkaja


From: Marti Raudsepp <marti(at)juffo(dot)org>
To: Marko Tiikkaja <marko(at)joh(dot)to>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-14 11:28:56
Message-ID: CABRT9RCP=HWLmiAR6xhb1h-gOPZudihnkcLKfbVNyzupM1rpYQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I've always hated INTO in procedures since it makes the code harder to
follow and has very different behavior on the SQL level, in addition
to the multi-row problem you bring up. If we can make assignment
syntax more versatile and eventually replace INTO, then that solves
multiple problems in the language without breaking backwards
compatibility.

On Tue, Jan 14, 2014 at 4:30 AM, Marko Tiikkaja <marko(at)joh(dot)to> wrote:
> On 2014-01-14 02:54, Marti Raudsepp wrote:
>> But PL/pgSQL already has an assignment syntax with the behavior you want:
>
> According to the docs, that doesn't set FOUND which would make this a pain
> to deal with..

Right you are. If we can extend the syntax then we could make it such
that "= SELECT" sets FOUND and other diagnostics, and a simple
assignment doesn't. Which makes sense IMO:

a = 10; -- simple assignments really shouldn't affect FOUND

With explicit SELECT, clearly the intent is to perform a query:
a = SELECT foo FROM table;
And this could also work:
a = INSERT INTO table (foo) VALUES (10) RETURNING foo_id;

AFAICT the fact that this works is more of an accident and should be
discouraged. We can leave it as is for compatibility's sake:
a = foo FROM table;

Now, another question is whether it's possible to make the syntax
work. Is this an assignment from the result of a subquery, or is it a
query by itself?
a = (SELECT foo FROM table);

Does anyone consider this proposal workable?

Regards,
Marti


From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Marti Raudsepp <marti(at)juffo(dot)org>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-14 11:46:39
Message-ID: 52D5239F.6030409@joh.to
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/14/14 12:28 PM, Marti Raudsepp wrote:
> I've always hated INTO in procedures since it makes the code harder to
> follow and has very different behavior on the SQL level, in addition
> to the multi-row problem you bring up. If we can make assignment
> syntax more versatile and eventually replace INTO, then that solves
> multiple problems in the language without breaking backwards
> compatibility.

I don't personally have a problem with INTO other than the behaviour
that started this thread. But I'm willing to consider other options.

> On Tue, Jan 14, 2014 at 4:30 AM, Marko Tiikkaja <marko(at)joh(dot)to> wrote:
>> On 2014-01-14 02:54, Marti Raudsepp wrote:
>>> But PL/pgSQL already has an assignment syntax with the behavior you want:
>>
>> According to the docs, that doesn't set FOUND which would make this a pain
>> to deal with..
>
> Right you are. If we can extend the syntax then we could make it such
> that "= SELECT" sets FOUND and other diagnostics, and a simple
> assignment doesn't. Which makes sense IMO:
>
> a = 10; -- simple assignments really shouldn't affect FOUND

With you so far.

> With explicit SELECT, clearly the intent is to perform a query:
> a = SELECT foo FROM table;
> And this could also work:
> a = INSERT INTO table (foo) VALUES (10) RETURNING foo_id;

I'm not sure that would work with the grammar. Basically what PL/PgSQL
does right now is for a statement like:

a = 1;

It parses the "a =" part itself, and then just reads until the next
unquoted semicolon without actually looking at it, and slams a "SELECT "
in front of it. With this approach we'd have to look into the query and
try and guess what it does. That might be possible, but I don't like
the idea.

> AFAICT the fact that this works is more of an accident and should be
> discouraged. We can leave it as is for compatibility's sake:
> a = foo FROM table;

I've always considered that ugly (IIRC it's still undocumented as well),
and would encourage people not to do that.

> Now, another question is whether it's possible to make the syntax
> work. Is this an assignment from the result of a subquery, or is it a
> query by itself?
> a = (SELECT foo FROM table);

That looks like a scalar subquery, which is wrong because they can't
return more than one column (nor can they be INSERT etc., obviously).

How about:

(a) = SELECT 1;
(a, b) = SELECT 1, 2;
(a, b) = INSERT INTO foo RETURNING col1, col2;

Same semantics: TOO_MANY_ROWS on rows > 1, sets FOUND and row_count.
AFAICT this can be parsed unambiguously, too, and we don't need to look
at the query string because this is new syntax.

Regards,
Marko Tiikkaja


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Marko Tiikkaja <marko(at)joh(dot)to>
Cc: Marti Raudsepp <marti(at)juffo(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-14 12:28:34
Message-ID: CAFj8pRAbcf3gsXt_xN34b27sobZUuTG6ZYNuU5-1mUCT3yGmuQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2014/1/14 Marko Tiikkaja <marko(at)joh(dot)to>

> On 1/14/14 12:28 PM, Marti Raudsepp wrote:
>
>> I've always hated INTO in procedures since it makes the code harder to
>> follow and has very different behavior on the SQL level, in addition
>> to the multi-row problem you bring up. If we can make assignment
>> syntax more versatile and eventually replace INTO, then that solves
>> multiple problems in the language without breaking backwards
>> compatibility.
>>
>
> I don't personally have a problem with INTO other than the behaviour that
> started this thread. But I'm willing to consider other options.
>
>
> On Tue, Jan 14, 2014 at 4:30 AM, Marko Tiikkaja <marko(at)joh(dot)to> wrote:
>>
>>> On 2014-01-14 02:54, Marti Raudsepp wrote:
>>>
>>>> But PL/pgSQL already has an assignment syntax with the behavior you
>>>> want:
>>>>
>>>
>>> According to the docs, that doesn't set FOUND which would make this a
>>> pain
>>> to deal with..
>>>
>>
>> Right you are. If we can extend the syntax then we could make it such
>> that "= SELECT" sets FOUND and other diagnostics, and a simple
>> assignment doesn't. Which makes sense IMO:
>>
>> a = 10; -- simple assignments really shouldn't affect FOUND
>>
>
> With you so far.
>
>
> With explicit SELECT, clearly the intent is to perform a query:
>> a = SELECT foo FROM table;
>> And this could also work:
>> a = INSERT INTO table (foo) VALUES (10) RETURNING foo_id;
>>
>
> I'm not sure that would work with the grammar. Basically what PL/PgSQL
> does right now is for a statement like:
>
> a = 1;
>
> It parses the "a =" part itself, and then just reads until the next
> unquoted semicolon without actually looking at it, and slams a "SELECT " in
> front of it. With this approach we'd have to look into the query and try
> and guess what it does. That might be possible, but I don't like the idea.
>
>
> AFAICT the fact that this works is more of an accident and should be
>> discouraged. We can leave it as is for compatibility's sake:
>> a = foo FROM table;
>>
>
> I've always considered that ugly (IIRC it's still undocumented as well),
> and would encourage people not to do that.
>
>
> Now, another question is whether it's possible to make the syntax
>> work. Is this an assignment from the result of a subquery, or is it a
>> query by itself?
>> a = (SELECT foo FROM table);
>>
>
only this form is allowed in SQL/PSM - and it has some logic - you can
assign result of subquery (should be one row only) to variable.

>
> That looks like a scalar subquery, which is wrong because they can't
> return more than one column (nor can they be INSERT etc., obviously).
>
> How about:
>
> (a) = SELECT 1;
> (a, b) = SELECT 1, 2;
> (a, b) = INSERT INTO foo RETURNING col1, col2;
>
>
I prefer subquery only syntax - a := (some) or (a,b,c) = (some a,b,c) with
possible enhancing for statements with RETURNING

a advance is compatibility with DB2 (SQL/PSM) syntax - and this code is
written now - it is done in my sql/psm implementation

Regards

Pavel

> Same semantics: TOO_MANY_ROWS on rows > 1, sets FOUND and row_count.
> AFAICT this can be parsed unambiguously, too, and we don't need to look at
> the query string because this is new syntax.
>
>
> Regards,
> Marko Tiikkaja
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Marti Raudsepp <marti(at)juffo(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-14 15:31:13
Message-ID: 52D55841.1030607@joh.to
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/14/14 1:28 PM, Pavel Stehule wrote:
> I prefer subquery only syntax - a := (some) or (a,b,c) = (some a,b,c) with
> possible enhancing for statements with RETURNING
>
> a advance is compatibility with DB2 (SQL/PSM) syntax - and this code is
> written now - it is done in my sql/psm implementation

Are you saying that DB2 supports the multi-variable assignment? I
couldn't find any reference suggesting that, can you point me to one?

Regards,
Marko Tiikkaja


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Marko Tiikkaja <marko(at)joh(dot)to>
Cc: Marti Raudsepp <marti(at)juffo(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-14 15:39:33
Message-ID: CAFj8pRCQVFyNxJ9NRePPRLQoFxLCWrc=MHJDgJ-2kbJZXcvbHA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

2014/1/14 Marko Tiikkaja <marko(at)joh(dot)to>

> On 1/14/14 1:28 PM, Pavel Stehule wrote:
>
>> I prefer subquery only syntax - a := (some) or (a,b,c) = (some a,b,c) with
>> possible enhancing for statements with RETURNING
>>
>> a advance is compatibility with DB2 (SQL/PSM) syntax - and this code is
>> written now - it is done in my sql/psm implementation
>>
>
> Are you saying that DB2 supports the multi-variable assignment? I
> couldn't find any reference suggesting that, can you point me to one?
>

I was wrong - it is different syntax - it is SQL/PSM form - db2 supports
SET var=val, var=val, ...

http://postgres.cz/wiki/Basic_statements_SQL/PSM_samples

Regards

Pavel

>
> Regards,
> Marko Tiikkaja
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Tiikkaja <marko(at)joh(dot)to>
Cc: Marti Raudsepp <marti(at)juffo(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-14 17:15:08
Message-ID: 19226.1389719708@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Marko Tiikkaja <marko(at)joh(dot)to> writes:
> On 1/14/14 12:28 PM, Marti Raudsepp wrote:
>> Now, another question is whether it's possible to make the syntax
>> work. Is this an assignment from the result of a subquery, or is it a
>> query by itself?
>> a = (SELECT foo FROM table);

> That looks like a scalar subquery, which is wrong because they can't
> return more than one column (nor can they be INSERT etc., obviously).

Yeah, it's a scalar subquery, which means that plpgsql already assigns
a non-error meaning to this syntax.

> How about:

> (a) = SELECT 1;
> (a, b) = SELECT 1, 2;
> (a, b) = INSERT INTO foo RETURNING col1, col2;

> Same semantics: TOO_MANY_ROWS on rows > 1, sets FOUND and row_count.
> AFAICT this can be parsed unambiguously, too, and we don't need to look
> at the query string because this is new syntax.

The idea of inventing new syntax along this line seems like a positive
direction to pursue. Since assignment already rejects multiple rows
from the source expression, this wouldn't be weirdly inconsistent.

It might be worth thinking about the <multiple column assignment> UPDATE
syntax that's in recent versions of the SQL standard:

UPDATE targettab SET (a, b, c) = row-valued-expression [ , ... ] [ WHERE ... ]

We don't actually implement this in PG yet, except for trivial cases, but
it will certainly happen eventually. I think your sketch above deviates
unnecessarily from what the standard says for UPDATE. In particular
I think it'd be better to write things like

(a, b) = ROW(1, 2);
(a, b, c) = (SELECT x, y, z FROM foo WHERE id = 42);

which would exactly match what you'd write in a multiple-assignment
UPDATE, and it has the same rejects-multiple-rows semantics too.

Also note that the trivial cases we do already implement in UPDATE
look like

UPDATE targettab SET (a, b, c) = (1, 2, 3) [ WHERE ... ]

that is, we allow a row constructor where the optional keyword ROW has
been omitted. I think people would expect to be able to write this in
plpgsql:

(a, b) = (1, 2);

Now, this doesn't provide any guidance for INSERT/UPDATE/DELETE RETURNING,
but frankly I don't feel any need to invent new syntax for those, since
RETURNING INTO already works the way you want.

I'm not too sure what it'd take to make this work. Right now,

SELECT (SELECT x, y FROM foo WHERE id = 42);

would generate "ERROR: subquery must return only one column", but
I think it's mostly a historical artifact that it does that rather than
returning a composite value (of an anonymous record type). If we were
willing to make that change then it seems like it'd be pretty
straightforward to teach plpgsql to handle

(a, b, ...) = row-valued-expression

where there wouldn't actually be any need to parse the RHS any differently
from the way plpgsql parses an assignment RHS right now. Which would be
a good thing IMO. If we don't generalize the behavior of scalar
subqueries then plpgsql would have to jump through a lot of hoops to
support the subselect case.

regards, tom lane


From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Marti Raudsepp <marti(at)juffo(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-14 17:51:09
Message-ID: 52D5790D.3070804@joh.to
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 1/14/14, 6:15 PM, Tom Lane wrote:
> Marko Tiikkaja <marko(at)joh(dot)to> writes:
>> How about:
>
>> (a) = SELECT 1;
>> (a, b) = SELECT 1, 2;
>> (a, b) = INSERT INTO foo RETURNING col1, col2;
>
>> Same semantics: TOO_MANY_ROWS on rows > 1, sets FOUND and row_count.
>> AFAICT this can be parsed unambiguously, too, and we don't need to look
>> at the query string because this is new syntax.
>
> The idea of inventing new syntax along this line seems like a positive
> direction to pursue. Since assignment already rejects multiple rows
> from the source expression, this wouldn't be weirdly inconsistent.
>
> It might be worth thinking about the <multiple column assignment> UPDATE
> syntax that's in recent versions of the SQL standard:
>
> UPDATE targettab SET (a, b, c) = row-valued-expression [ , ... ] [ WHERE ... ]
>
> We don't actually implement this in PG yet, except for trivial cases, but
> it will certainly happen eventually. I think your sketch above deviates
> unnecessarily from what the standard says for UPDATE. In particular
> I think it'd be better to write things like
>
> (a, b) = ROW(1, 2);
> (a, b, c) = (SELECT x, y, z FROM foo WHERE id = 42);
>
> which would exactly match what you'd write in a multiple-assignment
> UPDATE, and it has the same rejects-multiple-rows semantics too.

Hmm. That's a fair point I did not consider.

> Also note that the trivial cases we do already implement in UPDATE
> look like
>
> UPDATE targettab SET (a, b, c) = (1, 2, 3) [ WHERE ... ]
>
> that is, we allow a row constructor where the optional keyword ROW has
> been omitted. I think people would expect to be able to write this in
> plpgsql:
>
> (a, b) = (1, 2);
>
> Now, this doesn't provide any guidance for INSERT/UPDATE/DELETE RETURNING,
> but frankly I don't feel any need to invent new syntax for those, since
> RETURNING INTO already works the way you want.

Yeah, I don't feel strongly about having to support them with this
syntax. The inconsistency is a bit ugly, but it's not the end of the world.

> I'm not too sure what it'd take to make this work. Right now,
>
> SELECT (SELECT x, y FROM foo WHERE id = 42);
>
> would generate "ERROR: subquery must return only one column", but
> I think it's mostly a historical artifact that it does that rather than
> returning a composite value (of an anonymous record type). If we were
> willing to make that change then it seems like it'd be pretty
> straightforward to teach plpgsql to handle
>
> (a, b, ...) = row-valued-expression
>
> where there wouldn't actually be any need to parse the RHS any differently
> from the way plpgsql parses an assignment RHS right now. Which would be
> a good thing IMO. If we don't generalize the behavior of scalar
> subqueries then plpgsql would have to jump through a lot of hoops to
> support the subselect case.

You can already do the equivalent of (a,b,c) = (1,2,3) with SELECT ..
INTO. Would you oppose to starting the work on this by only supporting
the subquery syntax, with the implementation being similar to how we
currently handle SELECT .. INTO? If we could also not flat out reject
including that into 9.4, I would be quite happy.

Regards,
Marko Tiikkaja


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Tiikkaja <marko(at)joh(dot)to>
Cc: Marti Raudsepp <marti(at)juffo(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-14 17:56:01
Message-ID: 20306.1389722161@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Marko Tiikkaja <marko(at)joh(dot)to> writes:
> On 1/14/14, 6:15 PM, Tom Lane wrote:
>> I'm not too sure what it'd take to make this work. Right now,
>>
>> SELECT (SELECT x, y FROM foo WHERE id = 42);
>>
>> would generate "ERROR: subquery must return only one column", but
>> I think it's mostly a historical artifact that it does that rather than
>> returning a composite value (of an anonymous record type). If we were
>> willing to make that change then it seems like it'd be pretty
>> straightforward to teach plpgsql to handle
>>
>> (a, b, ...) = row-valued-expression
>>
>> where there wouldn't actually be any need to parse the RHS any differently
>> from the way plpgsql parses an assignment RHS right now. Which would be
>> a good thing IMO. If we don't generalize the behavior of scalar
>> subqueries then plpgsql would have to jump through a lot of hoops to
>> support the subselect case.

> You can already do the equivalent of (a,b,c) = (1,2,3) with SELECT ..
> INTO. Would you oppose to starting the work on this by only supporting
> the subquery syntax, with the implementation being similar to how we
> currently handle SELECT .. INTO?

You can try if you want, but I suspect it will result in writing a lot of
basically throwaway code, ie, not something that would be a precursor
of code that could support the generic-row-valued-expression case.

regards, tom lane


From: Jim Nasby <jim(at)nasby(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Marko Tiikkaja <marko(at)joh(dot)to>
Cc: Marti Raudsepp <marti(at)juffo(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-15 06:23:25
Message-ID: 52D6295D.6000104@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/14/14, 11:15 AM, Tom Lane wrote:
>> How about:
>> > (a) = SELECT 1;
>> > (a, b) = SELECT 1, 2;
>> > (a, b) = INSERT INTO foo RETURNING col1, col2;
>> >Same semantics: TOO_MANY_ROWS on rows > 1, sets FOUND and row_count.
>> >AFAICT this can be parsed unambiguously, too, and we don't need to look
>> >at the query string because this is new syntax.
> The idea of inventing new syntax along this line seems like a positive
> direction to pursue. Since assignment already rejects multiple rows
> from the source expression, this wouldn't be weirdly inconsistent.

Do we actually support = right now? We already support

v_field := field FROM table ... ;

and I think it's a bad idea to have different meaning for = and :=.

> I'm not too sure what it'd take to make this work. Right now,
>
> SELECT (SELECT x, y FROM foo WHERE id = 42);
>
> would generate "ERROR: subquery must return only one column", but
> I think it's mostly a historical artifact that it does that rather than
> returning a composite value (of an anonymous record type). If we were
> willing to make that change then it seems like it'd be pretty
> straightforward to teach plpgsql to handle
>
> (a, b, ...) = row-valued-expression
>
> where there wouldn't actually be any need to parse the RHS any differently
> from the way plpgsql parses an assignment RHS right now. Which would be
> a good thing IMO. If we don't generalize the behavior of scalar
> subqueries then plpgsql would have to jump through a lot of hoops to
> support the subselect case.

I have no idea if this is related or not, but I would REALLY like for this to work (doesn't in 8.4, AFAIK not in 9.1 either...)

CREATE FUNCTION f(int) RETURNS text STABLE LANGUAGE sql AS ( SELECT field FROM table WHERE table_id = $1 );
SELECT f(blah_id) FROM ...

to be equivalent to

SELECT ( SELECT field FROM table WHERE table_id = blah_id ) FROM ...

That would make it very easy to do a lot of code simplification with no performance loss.
--
Jim C. Nasby, Data Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jim Nasby <jim(at)nasby(dot)net>
Cc: Marko Tiikkaja <marko(at)joh(dot)to>, Marti Raudsepp <marti(at)juffo(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-15 06:35:46
Message-ID: 13490.1389767746@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jim Nasby <jim(at)nasby(dot)net> writes:
> Do we actually support = right now? We already support
> v_field := field FROM table ... ;
> and I think it's a bad idea to have different meaning for = and :=.

That ship sailed a *very* long time ago. See other thread about
documenting rather than ignoring this more-or-less-aboriginal
behavior of plpgsql.

> I have no idea if this is related or not, but I would REALLY like for this to work (doesn't in 8.4, AFAIK not in 9.1 either...)

Hm ... too tired to be sure, but I think the issue about inlining a
function of this kind has to do with whether you get the same answers
in corner cases such as subselect fetching no rows.

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jim Nasby <jim(at)nasby(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Marko Tiikkaja <marko(at)joh(dot)to>, Marti Raudsepp <marti(at)juffo(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-15 10:11:55
Message-ID: CAFj8pRDij_Pq=5Qmpb3MLD7BJgmyznU4OHhgj+oYCKB9Mkqx0g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2014/1/15 Jim Nasby <jim(at)nasby(dot)net>

> On 1/14/14, 11:15 AM, Tom Lane wrote:
>
>> How about:
>>> > (a) = SELECT 1;
>>> > (a, b) = SELECT 1, 2;
>>> > (a, b) = INSERT INTO foo RETURNING col1, col2;
>>> >Same semantics: TOO_MANY_ROWS on rows > 1, sets FOUND and row_count.
>>> >AFAICT this can be parsed unambiguously, too, and we don't need to look
>>> >at the query string because this is new syntax.
>>>
>> The idea of inventing new syntax along this line seems like a positive
>> direction to pursue. Since assignment already rejects multiple rows
>> from the source expression, this wouldn't be weirdly inconsistent.
>>
>
> Do we actually support = right now? We already support
>
> v_field := field FROM table ... ;
>
> and I think it's a bad idea to have different meaning for = and :=.

It is probably second the most ugly thing on plpgsql. I don't know about
other crippled embedded SQL statement in any stored procedure language.

Regards

Pavel

>
>
> I'm not too sure what it'd take to make this work. Right now,
>>
>> SELECT (SELECT x, y FROM foo WHERE id = 42);
>>
>> would generate "ERROR: subquery must return only one column", but
>> I think it's mostly a historical artifact that it does that rather than
>> returning a composite value (of an anonymous record type). If we were
>> willing to make that change then it seems like it'd be pretty
>> straightforward to teach plpgsql to handle
>>
>> (a, b, ...) = row-valued-expression
>>
>> where there wouldn't actually be any need to parse the RHS any differently
>> from the way plpgsql parses an assignment RHS right now. Which would be
>> a good thing IMO. If we don't generalize the behavior of scalar
>> subqueries then plpgsql would have to jump through a lot of hoops to
>> support the subselect case.
>>
>
> I have no idea if this is related or not, but I would REALLY like for this
> to work (doesn't in 8.4, AFAIK not in 9.1 either...)
>
> CREATE FUNCTION f(int) RETURNS text STABLE LANGUAGE sql AS ( SELECT field
> FROM table WHERE table_id = $1 );
> SELECT f(blah_id) FROM ...
>
> to be equivalent to
>
> SELECT ( SELECT field FROM table WHERE table_id = blah_id ) FROM ...
>
> That would make it very easy to do a lot of code simplification with no
> performance loss.
>
> --
> Jim C. Nasby, Data Architect jim(at)nasby(dot)net
> 512.569.9461 (cell) http://jim.nasby.net
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


From: Marti Raudsepp <marti(at)juffo(dot)org>
To: Jim Nasby <jim(at)nasby(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Marko Tiikkaja <marko(at)joh(dot)to>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-17 13:50:31
Message-ID: CABRT9RBkT2kAYaw1bQV9bs0Eb5L7usfQKEd33T33trPEuLkXYA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jan 15, 2014 at 8:23 AM, Jim Nasby <jim(at)nasby(dot)net> wrote:
> Do we actually support = right now? We already support
>
> v_field := field FROM table ... ;
>
> and I think it's a bad idea to have different meaning for = and :=.

That was already discussed before. Yes, we support both = and := and
they have exactly the same behavior; I agree, we should keep them
equivalent.

Regards,
Marti


From: Jim Nasby <jim(at)nasby(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Marko Tiikkaja <marko(at)joh(dot)to>, Marti Raudsepp <marti(at)juffo(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-22 19:35:36
Message-ID: 52E01D88.5060702@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/15/14, 12:35 AM, Tom Lane wrote:
> Jim Nasby <jim(at)nasby(dot)net> writes:
>> Do we actually support = right now? We already support
>> v_field := field FROM table ... ;
>> and I think it's a bad idea to have different meaning for = and :=.
>
> That ship sailed a *very* long time ago. See other thread about
> documenting rather than ignoring this more-or-less-aboriginal
> behavior of plpgsql.

Yeah, I had no idea that was even supported...

>> I have no idea if this is related or not, but I would REALLY like for this to work (doesn't in 8.4, AFAIK not in 9.1 either...)
>
> Hm ... too tired to be sure, but I think the issue about inlining a
> function of this kind has to do with whether you get the same answers
> in corner cases such as subselect fetching no rows.

There was some discussion about this a few years ago and I think that was essentially the issue.

What I think would work is essentially a macro that would dump the function definition right into the query and then let the planner deal with it. So

SELECT blah, ( SELECT status_code FROM status_code WHERE status_code_id = blah_status_code_id ) FROM blah;

can become simply

SELECT blah, status_code__get_text( blah_status_code_id ) FROM blah;

but have it translate to the same raw SQL, same as views.
--
Jim C. Nasby, Data Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net


From: Marko Tiikkaja <marko(at)joh(dot)to>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-07-29 22:00:51
Message-ID: 53D81993.9060804@joh.to
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/14/14, 6:15 PM, Tom Lane wrote:
> We don't actually implement this in PG yet, except for trivial cases, but
> it will certainly happen eventually. I think your sketch above deviates
> unnecessarily from what the standard says for UPDATE. In particular
> I think it'd be better to write things like
>
> (a, b) = ROW(1, 2);
> (a, b, c) = (SELECT x, y, z FROM foo WHERE id = 42);
>
> which would exactly match what you'd write in a multiple-assignment
> UPDATE, and it has the same rejects-multiple-rows semantics too.

Just in case someone's interested: I won't be working on this for 9.5.
If someone feels like picking this patch up, be my guest.

.marko