Re: plpgsql_check_function - rebase for 9.3

Lists: pgsql-hackers
From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: plpgsql_check_function - rebase for 9.3
Date: 2012-06-26 08:19:07
Message-ID: CAFj8pRAYVTQYCL8_NF_hDQjc0m+JBvbwR6E_ZJ0SJfkKQ9m2kA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

I am sending lightly refreshed patch for checking plpgsql functions..

I checked different implementation, but without success: a) enhancing
of SPI to some fake mode can has negative impact on application, and
patch was not clear, b) generic plpgsql walker doesn't save lines too.

I invite any ideas how to improve this patch

Regards

Pavel

Attachment Content-Type Size
plpgsql_check_function-2012-06-26.diff application/octet-stream 84.7 KB

From: Selena Deckelmann <selena(at)chesnok(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2012-10-07 00:07:01
Message-ID: CAN1EF+zH8FUzs6FqnbA3fGJAbBRaOb5HSKAp5nKcD1KL0jHVbQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi!

On Tue, Jun 26, 2012 at 1:19 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:

> I am sending lightly refreshed patch for checking plpgsql functions..
>
> I checked different implementation, but without success: a) enhancing
> of SPI to some fake mode can has negative impact on application, and
> patch was not clear, b) generic plpgsql walker doesn't save lines too.
>
> I invite any ideas how to improve this patch

I reviewed this and did a clean up for bitrot and a little whitespace.
In particular, it needed to learn a little about event triggers.

This patch is a follow on from an earlier review thread I found:
http://archives.postgresql.org/message-id/D960CB61B694CF459DCFB4B0128514C2072DF447@exadv11.host.magwien.gv.at

I dug through that thread a bit, and I believe issues raised by
Laurenz, Petr and Alvaro were resolved by Pavel over time.

All tests pass, and after a read-through, the code seems fine.

This also represents my inaugural use of pg_bsd_indent. I ran it on
pl_check.c - which made things mostly better. Happy to try and fix it
up more if someone can explain to me what (if anything) I did
incorrectly when using it.

-selena

--
http://chesnok.com

Attachment Content-Type Size
plpgsql_check_function-20121006.patch application/octet-stream 97.2 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Selena Deckelmann <selena(at)chesnok(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2012-10-07 15:25:50
Message-ID: 11878.1349623550@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Selena Deckelmann <selena(at)chesnok(dot)com> writes:
> This also represents my inaugural use of pg_bsd_indent. I ran it on
> pl_check.c - which made things mostly better. Happy to try and fix it
> up more if someone can explain to me what (if anything) I did
> incorrectly when using it.

It looks like you didn't give it a typedef list? Fetch the file from
http://buildfarm.postgresql.org/cgi-bin/typedefs.pl
and pass it with --typedefs=filename. If you're dealing with something
that adds new typedef names, you can add them to the list file manually.

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Selena Deckelmann <selena(at)chesnok(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2012-10-07 15:39:15
Message-ID: 5071A223.6030209@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 10/07/2012 11:25 AM, Tom Lane wrote:
> Selena Deckelmann <selena(at)chesnok(dot)com> writes:
>> This also represents my inaugural use of pg_bsd_indent. I ran it on
>> pl_check.c - which made things mostly better. Happy to try and fix it
>> up more if someone can explain to me what (if anything) I did
>> incorrectly when using it.
> It looks like you didn't give it a typedef list? Fetch the file from
> http://buildfarm.postgresql.org/cgi-bin/typedefs.pl
> and pass it with --typedefs=filename. If you're dealing with something
> that adds new typedef names, you can add them to the list file manually.
>
>

You's not supposed to be calling pg_bsd_indent directly at all. You's
supposed to be calling pgindent - it will call pg_bsd_indent as needed
and adjust the results. See src/tools/pgindent/README.

cheers

andrew


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Selena Deckelmann <selena(at)chesnok(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2012-11-27 19:45:01
Message-ID: CAFj8pRDN1Ae7mFzdWZMAETDWT4FWHE-AX5=6sLROzn5ZSDBQyQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

I am sending a updated version - now it is prepared for event triggers
and it is little bit more robust

I run pgindent, but I have no experience with it, so I am not sure about success

Regards

Pavel Stehule

2012/10/7 Selena Deckelmann <selena(at)chesnok(dot)com>:
> Hi!
>
> On Tue, Jun 26, 2012 at 1:19 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>
>> I am sending lightly refreshed patch for checking plpgsql functions..
>>
>> I checked different implementation, but without success: a) enhancing
>> of SPI to some fake mode can has negative impact on application, and
>> patch was not clear, b) generic plpgsql walker doesn't save lines too.
>>
>> I invite any ideas how to improve this patch
>
> I reviewed this and did a clean up for bitrot and a little whitespace.
> In particular, it needed to learn a little about event triggers.
>
> This patch is a follow on from an earlier review thread I found:
> http://archives.postgresql.org/message-id/D960CB61B694CF459DCFB4B0128514C2072DF447@exadv11.host.magwien.gv.at
>
> I dug through that thread a bit, and I believe issues raised by
> Laurenz, Petr and Alvaro were resolved by Pavel over time.
>
> All tests pass, and after a read-through, the code seems fine.
>
> This also represents my inaugural use of pg_bsd_indent. I ran it on
> pl_check.c - which made things mostly better. Happy to try and fix it
> up more if someone can explain to me what (if anything) I did
> incorrectly when using it.
>
> -selena
>
> --
> http://chesnok.com

Attachment Content-Type Size
plpgsql_check_function_20121127_01.patch.gz application/x-gzip 20.1 KB

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Selena Deckelmann <selena(at)chesnok(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2012-11-28 10:06:33
Message-ID: CAFj8pRBODPwuELVuOwgiS3ZS3b5M4W+QMJ5RieZzHv40WSsD-A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

a some updated version

* possibility to raise (and filter) performance warnings - detects IO castings
* detects assign composite value to scalar variable

Regards

Pavel Stehule

2012/11/27 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
> Hello
>
> I am sending a updated version - now it is prepared for event triggers
> and it is little bit more robust
>
> I run pgindent, but I have no experience with it, so I am not sure about success
>
> Regards
>
> Pavel Stehule
>
>
> 2012/10/7 Selena Deckelmann <selena(at)chesnok(dot)com>:
>> Hi!
>>
>> On Tue, Jun 26, 2012 at 1:19 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>
>>> I am sending lightly refreshed patch for checking plpgsql functions..
>>>
>>> I checked different implementation, but without success: a) enhancing
>>> of SPI to some fake mode can has negative impact on application, and
>>> patch was not clear, b) generic plpgsql walker doesn't save lines too.
>>>
>>> I invite any ideas how to improve this patch
>>
>> I reviewed this and did a clean up for bitrot and a little whitespace.
>> In particular, it needed to learn a little about event triggers.
>>
>> This patch is a follow on from an earlier review thread I found:
>> http://archives.postgresql.org/message-id/D960CB61B694CF459DCFB4B0128514C2072DF447@exadv11.host.magwien.gv.at
>>
>> I dug through that thread a bit, and I believe issues raised by
>> Laurenz, Petr and Alvaro were resolved by Pavel over time.
>>
>> All tests pass, and after a read-through, the code seems fine.
>>
>> This also represents my inaugural use of pg_bsd_indent. I ran it on
>> pl_check.c - which made things mostly better. Happy to try and fix it
>> up more if someone can explain to me what (if anything) I did
>> incorrectly when using it.
>>
>> -selena
>>
>> --
>> http://chesnok.com

Attachment Content-Type Size
plpgsql_check_function_20121128_01.patch.gz application/x-gzip 20.4 KB

From: "Petr Jelinek" <pjmodos(at)pjmodos(dot)net>
To: "'Pavel Stehule'" <pavel(dot)stehule(at)gmail(dot)com>, "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-01-26 16:52:31
Message-ID: 000c01cdfbe5$88d07da0$9a7178e0$@pjmodos.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> -----Original Message-----
> From: pgsql-hackers-owner(at)postgresql(dot)org [mailto:pgsql-hackers-
> owner(at)postgresql(dot)org] On Behalf Of Pavel Stehule
> Sent: 28 November 2012 11:07
> To: Selena Deckelmann
> Cc: PostgreSQL Hackers
> Subject: Re: [HACKERS] plpgsql_check_function - rebase for 9.3
>
> Hello
>
> a some updated version
>
> * possibility to raise (and filter) performance warnings - detects IO castings
> * detects assign composite value to scalar variable
>

Hello,

I was wondering if maybe this could be split to 2 separate patches, one would be all the changes to the existing plpgsql code - rename delete_function to plpgsql_delete_function and extern plpgsql_delete_function, copy_plpgsql_datum, plpgsql_estate_setup, plpgsql_destroy_econtext and the other patch would be the actual checker.

Reasoning for this is that the first patch (exporting some of plpgsql internals) should be very safe to commit and would be useful by itself even if the checker does not get in 9.3 since it would enable users to write lints/checkers/analysers for plpgsql as standalone extensions (for my use case this is actually way more useful than having the checker as part of core).

Regards
Petr


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Petr Jelinek" <pjmodos(at)pjmodos(dot)net>
Cc: "'Pavel Stehule'" <pavel(dot)stehule(at)gmail(dot)com>, "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-01-26 17:16:18
Message-ID: 20747.1359220578@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Petr Jelinek" <pjmodos(at)pjmodos(dot)net> writes:
> I was wondering if maybe this could be split to 2 separate patches, one would be all the changes to the existing plpgsql code - rename delete_function to plpgsql_delete_function and extern plpgsql_delete_function, copy_plpgsql_datum, plpgsql_estate_setup, plpgsql_destroy_econtext and the other patch would be the actual checker.

> Reasoning for this is that the first patch (exporting some of plpgsql internals) should be very safe to commit and would be useful by itself even if the checker does not get in 9.3 since it would enable users to write lints/checkers/analysers for plpgsql as standalone extensions (for my use case this is actually way more useful than having the checker as part of core).

What exactly do you have in mind there? The way we load extensions,
they can't (AFAIK) see each other's defined symbols, so you couldn't
really make an independent extension that would call functions in
plpgsql. This is not really open for debate, either, as changing that
would risk creating symbol collisions between modules that never had to
worry about polluting global namespace before.

I would note also that we absolutely, positively do not guarantee not
to change plpgsql's private data structures in minor releases.

regards, tom lane


From: "Petr Jelinek" <pjmodos(at)pjmodos(dot)net>
To: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "'Pavel Stehule'" <pavel(dot)stehule(at)gmail(dot)com>, "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-01-26 18:24:01
Message-ID: 001401cdfbf2$5102fc60$f308f520$@pjmodos.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> -----Original Message-----
> From: pgsql-hackers-owner(at)postgresql(dot)org [mailto:pgsql-hackers-
> owner(at)postgresql(dot)org] On Behalf Of Tom Lane
> Sent: 26 January 2013 18:16
> Subject: Re: [HACKERS] plpgsql_check_function - rebase for 9.3
>
> "Petr Jelinek" <pjmodos(at)pjmodos(dot)net> writes:
> > I was wondering if maybe this could be split to 2 separate patches, one
> would be all the changes to the existing plpgsql code - rename
> delete_function to plpgsql_delete_function and extern
> plpgsql_delete_function, copy_plpgsql_datum, plpgsql_estate_setup,
> plpgsql_destroy_econtext and the other patch would be the actual checker.
>
> > Reasoning for this is that the first patch (exporting some of plpgsql
> internals) should be very safe to commit and would be useful by itself
even if
> the checker does not get in 9.3 since it would enable users to write
> lints/checkers/analysers for plpgsql as standalone extensions (for my use
> case this is actually way more useful than having the checker as part of
core).
>
> What exactly do you have in mind there? The way we load extensions, they
> can't (AFAIK) see each other's defined symbols, so you couldn't really
make
> an independent extension that would call functions in plpgsql. This is
not
> really open for debate, either, as changing that would risk creating
symbol
> collisions between modules that never had to worry about polluting global
> namespace before.

I can call functions that are exported by plpgsql.so just fine from
different extension now, I just have to preload the plpgsql.so (via LOAD or
guc) first, so I don't see what is the problem here.

> I would note also that we absolutely, positively do not guarantee not to
> change plpgsql's private data structures in minor releases.

That didn't stop people from from writing plpgsql extensions before, don't
see why it would now, the problem is that they have to use hooks now and
those require the function to be executed, making those plpgsql interfaces
external would help with setting up things so that extension can work with
function without function being executed or duplicating a lot of plpgsql
code. As an example of all of this you can see
https://github.com/okbob/plpgsql_lint which is the original module Pavel
wrote before writing this patch.

The other thing is this is something the patch in current form does anyway
so I don't see the harm.

Regards
Petr


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Petr Jelinek <pjmodos(at)pjmodos(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-01-26 18:38:41
Message-ID: CAFj8pRDteF7C-u-zy3Y2FqqoCFVfYxdMTwsOvVK0NrOKMRfjBQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013/1/26 Petr Jelinek <pjmodos(at)pjmodos(dot)net>:
>> -----Original Message-----
>> From: pgsql-hackers-owner(at)postgresql(dot)org [mailto:pgsql-hackers-
>> owner(at)postgresql(dot)org] On Behalf Of Tom Lane
>> Sent: 26 January 2013 18:16
>> Subject: Re: [HACKERS] plpgsql_check_function - rebase for 9.3
>>
>> "Petr Jelinek" <pjmodos(at)pjmodos(dot)net> writes:
>> > I was wondering if maybe this could be split to 2 separate patches, one
>> would be all the changes to the existing plpgsql code - rename
>> delete_function to plpgsql_delete_function and extern
>> plpgsql_delete_function, copy_plpgsql_datum, plpgsql_estate_setup,
>> plpgsql_destroy_econtext and the other patch would be the actual checker.
>>
>> > Reasoning for this is that the first patch (exporting some of plpgsql
>> internals) should be very safe to commit and would be useful by itself
> even if
>> the checker does not get in 9.3 since it would enable users to write
>> lints/checkers/analysers for plpgsql as standalone extensions (for my use
>> case this is actually way more useful than having the checker as part of
> core).

A significant improvement in this are can be placing plpgsql.h to
other header files. Now plpgsql extensions have to use private copy of
this file - what is really ugly

Pavel

>>
>> What exactly do you have in mind there? The way we load extensions, they
>> can't (AFAIK) see each other's defined symbols, so you couldn't really
> make
>> an independent extension that would call functions in plpgsql. This is
> not
>> really open for debate, either, as changing that would risk creating
> symbol
>> collisions between modules that never had to worry about polluting global
>> namespace before.
>
> I can call functions that are exported by plpgsql.so just fine from
> different extension now, I just have to preload the plpgsql.so (via LOAD or
> guc) first, so I don't see what is the problem here.
>
>> I would note also that we absolutely, positively do not guarantee not to
>> change plpgsql's private data structures in minor releases.
>
> That didn't stop people from from writing plpgsql extensions before, don't
> see why it would now, the problem is that they have to use hooks now and
> those require the function to be executed, making those plpgsql interfaces
> external would help with setting up things so that extension can work with
> function without function being executed or duplicating a lot of plpgsql
> code. As an example of all of this you can see
> https://github.com/okbob/plpgsql_lint which is the original module Pavel
> wrote before writing this patch.
>
> The other thing is this is something the patch in current form does anyway
> so I don't see the harm.
>
> Regards
> Petr
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Petr Jelinek" <pjmodos(at)pjmodos(dot)net>
Cc: "'Pavel Stehule'" <pavel(dot)stehule(at)gmail(dot)com>, "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-01-26 18:53:43
Message-ID: 22835.1359226423@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Petr Jelinek" <pjmodos(at)pjmodos(dot)net> writes:
>> What exactly do you have in mind there? The way we load extensions, they
>> can't (AFAIK) see each other's defined symbols, so you couldn't really make
>> an independent extension that would call functions in plpgsql. This is not
>> really open for debate, either, as changing that would risk creating symbol
>> collisions between modules that never had to worry about polluting global
>> namespace before.

> I can call functions that are exported by plpgsql.so just fine from
> different extension now, I just have to preload the plpgsql.so (via LOAD or
> guc) first, so I don't see what is the problem here.

[ pokes around... ] Hm, it appears that that does work on Linux,
because for some reason we're specifying RTLD_GLOBAL to dlopen().
TBH that seems like a truly horrid idea that we should reconsider.
Aside from the danger of unexpected symbol collisions between
independent loadable modules, I seriously doubt that it works like
that on every platform we support --- so I'd be very strongly against
accepting any code that depends on this working.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Petr Jelinek" <pjmodos(at)pjmodos(dot)net>
Cc: "'Pavel Stehule'" <pavel(dot)stehule(at)gmail(dot)com>, "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-01-26 19:12:19
Message-ID: 23204.1359227539@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I wrote:
> [ pokes around... ] Hm, it appears that that does work on Linux,
> because for some reason we're specifying RTLD_GLOBAL to dlopen().
> TBH that seems like a truly horrid idea that we should reconsider.

A bit of research in the archives revealed that we're using it because
back in 2001, the lame hack that then passed for a shared-library
version of python required it:
http://www.postgresql.org/message-id/Pine.LNX.4.30.0105121914200.757-100000@peter.localdomain

There was subsequent discussion of removing it, because reportedly now
(a) that's no longer the case, and (b) we need to get rid of it to allow
plpython2 and plpython3 to coexist in one session. See for instance:
http://www.postgresql.org/message-id/1277506674.5356.27.camel@vanquo.pezone.net

Nothing's been done about that yet, but I think that assuming that we'll
be using RTLD_GLOBAL forever would be foolish.

regards, tom lane


From: "Petr Jelinek" <pjmodos(at)pjmodos(dot)net>
To: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "'Pavel Stehule'" <pavel(dot)stehule(at)gmail(dot)com>, "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-01-26 23:41:40
Message-ID: 001601cdfc1e$b1348500$139d8f00$@pjmodos.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> -----Original Message-----
> From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
> Sent: 26 January 2013 20:12
> Subject: Re: [HACKERS] plpgsql_check_function - rebase for 9.3
>
> I wrote:
> > [ pokes around... ] Hm, it appears that that does work on Linux,
> > because for some reason we're specifying RTLD_GLOBAL to dlopen().
> > TBH that seems like a truly horrid idea that we should reconsider.
>
> A bit of research in the archives revealed that we're using it because
back in
> 2001, the lame hack that then passed for a shared-library version of
python
> required it:
> http://www.postgresql.org/message-id/Pine.LNX.4.30.0105121914200.757-
> 100000(at)peter(dot)localdomain
>
> There was subsequent discussion of removing it, because reportedly now
> (a) that's no longer the case, and (b) we need to get rid of it to allow
> plpython2 and plpython3 to coexist in one session. See for instance:
> http://www.postgresql.org/message-
> id/1277506674(dot)5356(dot)27(dot)camel(at)vanquo(dot)pezone(dot)net
>
> Nothing's been done about that yet, but I think that assuming that we'll
be
> using RTLD_GLOBAL forever would be foolish.
>

Ok then it was a bad idea after all.

Regards
Petr


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Petr Jelinek <pjmodos(at)pjmodos(dot)net>, 'Pavel Stehule' <pavel(dot)stehule(at)gmail(dot)com>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-01-28 15:11:42
Message-ID: 5106952E.4080802@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/26/13 1:53 PM, Tom Lane wrote:
> [ pokes around... ] Hm, it appears that that does work on Linux,
> because for some reason we're specifying RTLD_GLOBAL to dlopen().
> TBH that seems like a truly horrid idea that we should reconsider.
> Aside from the danger of unexpected symbol collisions between
> independent loadable modules, I seriously doubt that it works like
> that on every platform we support --- so I'd be very strongly against
> accepting any code that depends on this working.

Well, that would kill a lot of potentially useful features, including
the transforms feature I've been working on and any kind of hook or
debugger or profiler on an existing module. (How do plpgsql plugins
work?) We also couldn't transparently move functionality out of the
postgres binary into a module.

I see the concern about symbol collisions. But you can normally work
around that by prefixing exported symbols.


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Petr Jelinek <pjmodos(at)pjmodos(dot)net>, 'Pavel Stehule' <pavel(dot)stehule(at)gmail(dot)com>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-01-28 15:38:48
Message-ID: 51069B88.8000505@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/28/2013 10:11 AM, Peter Eisentraut wrote:
> On 1/26/13 1:53 PM, Tom Lane wrote:
>> [ pokes around... ] Hm, it appears that that does work on Linux,
>> because for some reason we're specifying RTLD_GLOBAL to dlopen().
>> TBH that seems like a truly horrid idea that we should reconsider.
>> Aside from the danger of unexpected symbol collisions between
>> independent loadable modules, I seriously doubt that it works like
>> that on every platform we support --- so I'd be very strongly against
>> accepting any code that depends on this working.
> Well, that would kill a lot of potentially useful features, including
> the transforms feature I've been working on and any kind of hook or
> debugger or profiler on an existing module. (How do plpgsql plugins
> work?) We also couldn't transparently move functionality out of the
> postgres binary into a module.
>
> I see the concern about symbol collisions. But you can normally work
> around that by prefixing exported symbols.
>

Yeah, I was just writing something a couple of days ago that leveraged
stuff in an extension, so it looks like this is wanted functionality. In
general we want to be able to layer addon modules, ISTM.

cheers

andrew


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Petr Jelinek <pjmodos(at)pjmodos(dot)net>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-03-03 14:53:41
Message-ID: CAFj8pRBHgk5f8ZjvN3c1VdfJ-z8_xWUf6jzF5g84hEos67FUBw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

I though about any possibility how to reduce a size of this patch.

I see a one solution - if we would not support a detection of multiple
errors - it stops on first error, we can push this code to compilation
stage.

a plpgsql_validator can be overloaded by function

plpgsql_validator(oid, reloid, level)

reloid - is requested by triggers - it is related class oid
level - it is level of checking

0 - same as current implementation - check only syntax errors
1 - stop on first object error - no table, no field, no expected data type
2 - stop on first performance issue - implicit casting identification
(should be removed or moved to next releases)

This proposal reduces functionality proposed for
plpgsql_check_function - but basic functionality is there.

It has not a impact on performance and it allow check all paths in
compile time.

It will not change behave of current CREATE OR REPLACE function -
there is not a problem with back compatibility

It is good for you?

I can have this modification to end of this week.

Regards

Pavel

2012/11/28 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
> Hello
>
> a some updated version
>
> * possibility to raise (and filter) performance warnings - detects IO castings
> * detects assign composite value to scalar variable
>
> Regards
>
> Pavel Stehule
>
> 2012/11/27 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
>> Hello
>>
>> I am sending a updated version - now it is prepared for event triggers
>> and it is little bit more robust
>>
>> I run pgindent, but I have no experience with it, so I am not sure about success
>>
>> Regards
>>
>> Pavel Stehule
>>
>>
>> 2012/10/7 Selena Deckelmann <selena(at)chesnok(dot)com>:
>>> Hi!
>>>
>>> On Tue, Jun 26, 2012 at 1:19 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>>
>>>> I am sending lightly refreshed patch for checking plpgsql functions..
>>>>
>>>> I checked different implementation, but without success: a) enhancing
>>>> of SPI to some fake mode can has negative impact on application, and
>>>> patch was not clear, b) generic plpgsql walker doesn't save lines too.
>>>>
>>>> I invite any ideas how to improve this patch
>>>
>>> I reviewed this and did a clean up for bitrot and a little whitespace.
>>> In particular, it needed to learn a little about event triggers.
>>>
>>> This patch is a follow on from an earlier review thread I found:
>>> http://archives.postgresql.org/message-id/D960CB61B694CF459DCFB4B0128514C2072DF447@exadv11.host.magwien.gv.at
>>>
>>> I dug through that thread a bit, and I believe issues raised by
>>> Laurenz, Petr and Alvaro were resolved by Pavel over time.
>>>
>>> All tests pass, and after a read-through, the code seems fine.
>>>
>>> This also represents my inaugural use of pg_bsd_indent. I ran it on
>>> pl_check.c - which made things mostly better. Happy to try and fix it
>>> up more if someone can explain to me what (if anything) I did
>>> incorrectly when using it.
>>>
>>> -selena
>>>
>>> --
>>> http://chesnok.com


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-03-03 21:17:09
Message-ID: 5133BDD5.9080802@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Folks,

Where are we with this patch? I'm a bit unclear from the discussion on
whether it passes muster or not. Things seem to have petered out.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-03-26 03:14:25
Message-ID: 27661.1364267665@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> Where are we with this patch? I'm a bit unclear from the discussion on
> whether it passes muster or not. Things seem to have petered out.

I took another look at this patch tonight. I think the thing that is
bothering everybody (including Pavel) is that as submitted, pl_check.c
involves a huge amount of duplication of knowledge and code from
pl_exec.c, and to a lesser extent pl_comp.c. It certainly looks like a
maintenance disaster in the making. It doesn't bother me so much that
pl_check.c knows about each syntactic structure in plpgsql: there are
already four or five places you have to touch when adding new syntax.
Rather, it's that there's so much duplication of knowledge about
semantic details, which is largely expressed by copied-and-pasted code
from pl_exec.c. It seems like a safe bet that we'll sometimes miss the
need to fix pl_check.c when we fix something in pl_exec.c. There are
annoying duplications from pl_comp.c as well, eg knowledge of exactly
which magic variables are defined in trigger functions.

Having said all that, it's not clear that we can really do any better.
The only obvious alternative is pushing support for a checking mode
directly into pl_exec.c, which would obfuscate and slow down that code
to an unacceptable degree if Pavel's results at
http://www.postgresql.org/message-id/CAFj8pRAKuJmVjPjzfSryE7+uB8jF8Wtz5rkxK-0ykXme-k81kA@mail.gmail.com
are any indication. (In that message, Pavel proposes shoveling the
problem into the compile code instead, but that seems unlikely to work
IMO: the main problem in pl_check.c as it stands is duplication of code
from pl_exec.c not pl_comp.c. So I think that path could only lead to
duplicating the same code into pl_comp.c.)

So question 1 is do we want to accept that this is the implementation
pathway we're going to settle for, or are we going to hold out for a
better one? I'd be the first in line to hold out if I had a clue how
to move towards a better implementation, but I have none. Are we
prepared to reject this type of feature entirely because of the
code-duplication problem? That doesn't seem attractive either.

But, even granting that this implementation approach is acceptable,
the patch does not seem close to being committable as-is: there's
a lot of fit-and-finish work yet to be done. To make my point I will
just quote from one of the regression test additions:

create or replace function f1()
returns void as $$
declare a1 int; a2 int;
begin
select 10,20 into a1;
end;
$$ language plpgsql;
-- raise warning
select plpgsql_check_function('f1()');
plpgsql_check_function
-------------------------------------------------------------------------
warning:00000:4:SQL statement:too many attributies for target variables
Detail: There are less target variables than output columns in query.
Hint: Check target variables in SELECT INTO statement
(3 rows)

Do we like this output format? I don't. The unlabeled, undocumented
fields crammed into a single line with colon separators are neither
readable nor useful. If we actually need these fields, why aren't we
splitting the output into multiple columns? (I'm also wondering why
the patch bothers with an option to emit this same info in XML. Surely
there is vanishingly small use-case for mechanical examination of this
output.)

This example also shows that the user-visible messages have had neither
editing from a native speaker of English, nor even attention from a
spell checker. I don't fault Pavel for that (his English is far better
than my Czech) but this patch has been passed by at least one reviewer
who is a native speaker. Personally I'd also say that neither the
Detail nor Hint messages in this example are worth the electrons they
take up, as they add nothing useful to the basic error message. I'd be
embarrassed to be presenting output like this to end users of Postgres.

(The code comments are in even worse shape than the user-visible
messages, by the by, where they exist at all.)

A somewhat related point is that it doesn't look like any thought
has been taken about localized message output.

This stuff is certainly all fixable if we agree on the basic
implementation approach; and I can't really fault Pavel for not
worrying much about such details while the implementation approach
hasn't been agreed to. But I'd judge that there's a week or more's
worth of work here to get to a patch I'd be happy about committing,
even without any change in the basic approach. That's not time I'm
willing to put in personally right now.

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-03-26 19:19:31
Message-ID: CAFj8pRC6+Ri_HfkiWX=NF9cW3nU3O1ktiPkwnVp+WmbCOn7qJA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello all

2013/3/26 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Josh Berkus <josh(at)agliodbs(dot)com> writes:
>> Where are we with this patch? I'm a bit unclear from the discussion on
>> whether it passes muster or not. Things seem to have petered out.
>
> I took another look at this patch tonight. I think the thing that is
> bothering everybody (including Pavel) is that as submitted, pl_check.c
> involves a huge amount of duplication of knowledge and code from
> pl_exec.c, and to a lesser extent pl_comp.c. It certainly looks like a
> maintenance disaster in the making. It doesn't bother me so much that
> pl_check.c knows about each syntactic structure in plpgsql: there are
> already four or five places you have to touch when adding new syntax.
> Rather, it's that there's so much duplication of knowledge about
> semantic details, which is largely expressed by copied-and-pasted code
> from pl_exec.c. It seems like a safe bet that we'll sometimes miss the
> need to fix pl_check.c when we fix something in pl_exec.c. There are
> annoying duplications from pl_comp.c as well, eg knowledge of exactly
> which magic variables are defined in trigger functions.
>
> Having said all that, it's not clear that we can really do any better.
> The only obvious alternative is pushing support for a checking mode
> directly into pl_exec.c, which would obfuscate and slow down that code
> to an unacceptable degree if Pavel's results at
> http://www.postgresql.org/message-id/CAFj8pRAKuJmVjPjzfSryE7+uB8jF8Wtz5rkxK-0ykXme-k81kA@mail.gmail.com
> are any indication. (In that message, Pavel proposes shoveling the
> problem into the compile code instead, but that seems unlikely to work
> IMO: the main problem in pl_check.c as it stands is duplication of code
> from pl_exec.c not pl_comp.c. So I think that path could only lead to
> duplicating the same code into pl_comp.c.)
>
> So question 1 is do we want to accept that this is the implementation
> pathway we're going to settle for, or are we going to hold out for a
> better one? I'd be the first in line to hold out if I had a clue how
> to move towards a better implementation, but I have none. Are we
> prepared to reject this type of feature entirely because of the
> code-duplication problem? That doesn't seem attractive either.

I wrote lot of versions and proposed code is redundant, but most
simple and clean.

I am really against to pushing check to pl_exec, because it
significantly decrease code readability and increase some bottle neck
in CPU extensive tests. More, there are too less place for some future
feature - performance advising, more verbose error messages, etc

In PL/pgPSM I used a little bit different architecture - necessary for
PSM and maybe better for PL/pgSQL too. It is two stage compiler -
parsing to AST, and AST compilation. It simplify gram.y and processing
order depends on AST deep iteration and not on bizon rules. It can
have a impact on speed of very large procedures probably - I don't see
different disadvantages. With this architecture I was able do lot of
controls to compile stage without problems.

Most complexity in current code is related to detecting record types
from expressions without expression evaluation. Maybe this code can be
in core or pl_compile.c. Code for Describe (F) message is not too
reusable. It is necessary for

DECLARE r RECORD;
FOR r IN SELECT ...
LOOP
RAISE NOTICE '>>%<<', r.xx;
END LOOP;

>
> But, even granting that this implementation approach is acceptable,
> the patch does not seem close to being committable as-is: there's
> a lot of fit-and-finish work yet to be done. To make my point I will
> just quote from one of the regression test additions:
>
> create or replace function f1()
> returns void as $$
> declare a1 int; a2 int;
> begin
> select 10,20 into a1;
> end;
> $$ language plpgsql;
> -- raise warning
> select plpgsql_check_function('f1()');
> plpgsql_check_function
> -------------------------------------------------------------------------
> warning:00000:4:SQL statement:too many attributies for target variables
> Detail: There are less target variables than output columns in query.
> Hint: Check target variables in SELECT INTO statement
> (3 rows)
>
> Do we like this output format? I don't. The unlabeled, undocumented
> fields crammed into a single line with colon separators are neither
> readable nor useful. If we actually need these fields, why aren't we
> splitting the output into multiple columns? (I'm also wondering why
> the patch bothers with an option to emit this same info in XML. Surely
> there is vanishingly small use-case for mechanical examination of this
> output.)

This format can be reduced, redesigned, changed. It is designed like
gcc output and optimized for using from psql console.

I tested table output - in original CHECK statement implementation,
but it is not too friendly for showing in monitor - it is just too
wide. There are similar arguments like using tabular output for
EXPLAIN, although there are higher complexity and nested structures
(in EXPLAIN). Personally, I can live with tabular output, it is not
user friendly, but it can be used for machine processing simply, and
any advanced user can write some transform functions to any output
format.

>
> This example also shows that the user-visible messages have had neither
> editing from a native speaker of English, nor even attention from a
> spell checker. I don't fault Pavel for that (his English is far better
> than my Czech) but this patch has been passed by at least one reviewer
> who is a native speaker. Personally I'd also say that neither the
> Detail nor Hint messages in this example are worth the electrons they
> take up, as they add nothing useful to the basic error message. I'd be
> embarrassed to be presenting output like this to end users of Postgres.
>
> (The code comments are in even worse shape than the user-visible
> messages, by the by, where they exist at all.)
>
> A somewhat related point is that it doesn't look like any thought
> has been taken about localized message output.
>
> This stuff is certainly all fixable if we agree on the basic
> implementation approach; and I can't really fault Pavel for not
> worrying much about such details while the implementation approach
> hasn't been agreed to. But I'd judge that there's a week or more's
> worth of work here to get to a patch I'd be happy about committing,
> even without any change in the basic approach. That's not time I'm
> willing to put in personally right now.

Yes. A details should be precised after final decision. I am hope so
this code doesn't contains significant bug - I have not any reported
issue related to this functionality - it is based on plpgsql_lint -
and this code is used in production by some very large companies. Of
course, there are not too large community - it should be manually
compiled.

Regards

Pavel

>
> regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-03-27 22:25:35
Message-ID: CAFj8pRAqpAEWqL465xkTmwe-DaZ8+7N-oEQ=uv_JjoP21FJmrQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

I redesigned output from plpgsql_check_function. Now, it returns table
everytime.
Litlle bit code simplification.

postgres=# \sf fx
CREATE OR REPLACE FUNCTION public.fx(integer)
RETURNS integer
LANGUAGE plpgsql
AS $function$
BEGIN
RETURN (SELECT a FROM t1 WHERE b < $1);
END;
$function$
postgres=# select * from plpgsql_check_function('fx(int)');
-[ RECORD 1 ]--------------------------------------
functionid | fx
lineno | 3
statement | RETURN
sqlstate | 42703
message | column "b" does not exist
detail | [null]
hint | [null]
level | error
position | 32
query | SELECT (SELECT a FROM t1 WHERE b < $1)
context | [null]

Regards

Pavel

2013/3/26 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Josh Berkus <josh(at)agliodbs(dot)com> writes:
>> Where are we with this patch? I'm a bit unclear from the discussion on
>> whether it passes muster or not. Things seem to have petered out.
>
> I took another look at this patch tonight. I think the thing that is
> bothering everybody (including Pavel) is that as submitted, pl_check.c
> involves a huge amount of duplication of knowledge and code from
> pl_exec.c, and to a lesser extent pl_comp.c. It certainly looks like a
> maintenance disaster in the making. It doesn't bother me so much that
> pl_check.c knows about each syntactic structure in plpgsql: there are
> already four or five places you have to touch when adding new syntax.
> Rather, it's that there's so much duplication of knowledge about
> semantic details, which is largely expressed by copied-and-pasted code
> from pl_exec.c. It seems like a safe bet that we'll sometimes miss the
> need to fix pl_check.c when we fix something in pl_exec.c. There are
> annoying duplications from pl_comp.c as well, eg knowledge of exactly
> which magic variables are defined in trigger functions.
>
> Having said all that, it's not clear that we can really do any better.
> The only obvious alternative is pushing support for a checking mode
> directly into pl_exec.c, which would obfuscate and slow down that code
> to an unacceptable degree if Pavel's results at
> http://www.postgresql.org/message-id/CAFj8pRAKuJmVjPjzfSryE7+uB8jF8Wtz5rkxK-0ykXme-k81kA@mail.gmail.com
> are any indication. (In that message, Pavel proposes shoveling the
> problem into the compile code instead, but that seems unlikely to work
> IMO: the main problem in pl_check.c as it stands is duplication of code
> from pl_exec.c not pl_comp.c. So I think that path could only lead to
> duplicating the same code into pl_comp.c.)
>
> So question 1 is do we want to accept that this is the implementation
> pathway we're going to settle for, or are we going to hold out for a
> better one? I'd be the first in line to hold out if I had a clue how
> to move towards a better implementation, but I have none. Are we
> prepared to reject this type of feature entirely because of the
> code-duplication problem? That doesn't seem attractive either.
>
> But, even granting that this implementation approach is acceptable,
> the patch does not seem close to being committable as-is: there's
> a lot of fit-and-finish work yet to be done. To make my point I will
> just quote from one of the regression test additions:
>
> create or replace function f1()
> returns void as $$
> declare a1 int; a2 int;
> begin
> select 10,20 into a1;
> end;
> $$ language plpgsql;
> -- raise warning
> select plpgsql_check_function('f1()');
> plpgsql_check_function
> -------------------------------------------------------------------------
> warning:00000:4:SQL statement:too many attributies for target variables
> Detail: There are less target variables than output columns in query.
> Hint: Check target variables in SELECT INTO statement
> (3 rows)
>
> Do we like this output format? I don't. The unlabeled, undocumented
> fields crammed into a single line with colon separators are neither
> readable nor useful. If we actually need these fields, why aren't we
> splitting the output into multiple columns? (I'm also wondering why
> the patch bothers with an option to emit this same info in XML. Surely
> there is vanishingly small use-case for mechanical examination of this
> output.)
>
> This example also shows that the user-visible messages have had neither
> editing from a native speaker of English, nor even attention from a
> spell checker. I don't fault Pavel for that (his English is far better
> than my Czech) but this patch has been passed by at least one reviewer
> who is a native speaker. Personally I'd also say that neither the
> Detail nor Hint messages in this example are worth the electrons they
> take up, as they add nothing useful to the basic error message. I'd be
> embarrassed to be presenting output like this to end users of Postgres.
>
> (The code comments are in even worse shape than the user-visible
> messages, by the by, where they exist at all.)
>
> A somewhat related point is that it doesn't look like any thought
> has been taken about localized message output.
>
> This stuff is certainly all fixable if we agree on the basic
> implementation approach; and I can't really fault Pavel for not
> worrying much about such details while the implementation approach
> hasn't been agreed to. But I'd judge that there's a week or more's
> worth of work here to get to a patch I'd be happy about committing,
> even without any change in the basic approach. That's not time I'm
> willing to put in personally right now.
>
> regards, tom lane

Attachment Content-Type Size
plpgsql_check_function_20130327.patch.gz application/x-gzip 19.8 KB

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-08-22 00:36:06
Message-ID: 1377131766.11715.1.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2013-03-27 at 23:25 +0100, Pavel Stehule wrote:
> I redesigned output from plpgsql_check_function. Now, it returns table
> everytime.
> Litlle bit code simplification.

This patch is in the 2013-09 commitfest but needs a rebase.


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-08-22 06:02:19
Message-ID: CAFj8pRALRqASmqVW+D_4xLQMV8DGG-31os-mJc-X-eCXO55J=Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

rebased

Regards

Pavel

2013/8/22 Peter Eisentraut <peter_e(at)gmx(dot)net>

> On Wed, 2013-03-27 at 23:25 +0100, Pavel Stehule wrote:
> > I redesigned output from plpgsql_check_function. Now, it returns table
> > everytime.
> > Litlle bit code simplification.
>
> This patch is in the 2013-09 commitfest but needs a rebase.
>
>
>

Attachment Content-Type Size
plpgsql_check_function_20130822.patch.gz application/x-gzip 19.6 KB

From: Steve Singer <steve(at)ssinger(dot)info>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-12-07 20:18:06
Message-ID: BLU0-SMTP87239B8E85BC2E8534B426DCD10@phx.gbl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 08/22/2013 02:02 AM, Pavel Stehule wrote:
> rebased
>
> Regards
>
> Pavel
>
This patch again needs a rebase, it no longer applies cleanly.
plpgsql_estate_setup now takes a shared estate parameter and the call in
pl_check needs that. I passed NULL in and things seem to work.

I think this is another example where are processes aren't working as
we'd like.

The last time this patch got a review was in March when Tom said
http://www.postgresql.org/message-id/27661.1364267665@sss.pgh.pa.us

Shortly after Pavel responded with a new patch that changes the output
format.
http://www.postgresql.org/message-id/CAFj8pRAqpAEWqL465xkTmwe-DaZ8+7N-oEQ=uv_JjoP21FJmrQ@mail.gmail.com

I don't much has progress in the following 9 months on this patch.

In Tom's review the issue of code duplication and asks:
"do we want to accept that this is the implementation pathway we're
going to settle for, or are we going to hold out for a better one? I'd
be the first in line to hold out if I had a clue how to move towards a
better implementation, but I have none."

This question is still outstanding.

Here are my comments on this version of the patch:

This version of the patch gives output as a table with the structure:

functionid | lineno | statement | sqlstate | message | detail | hint | level | position | query | context
------------+--------+-----------+----------+---------+--------+------+-------+----------+-------+---------

I like this much better than the previous format.

I think the patch needs more documentation.

The extent of the documentation is

<title>Checking of embedded SQL</title>
+ <para>
+ The SQL statements inside <application>PL/pgSQL</> functions are
+ checked by validator for semantic errors. These errors
+ can be found by <function>plpgsql_check_function</function>:

I a don't think that this adequately describes the function. It isn't clear to me what we mean by 'embedded' SQL.
and 'semantic errors'.

I think this would read better as

<sect2 id="plpgsql-check-function">
<title>Checking SQL Inside Functions</title>
<para>
The SQL Statements inside of <application>PL/pgsql</> functions can be
checked by a validation function for semantic errors. You can check
<application>PL/pgsl</application> functions by passing the name of the
function to <function>plpgsql_check_function</function>.
</para>
<para>
The <function>plpgsql_check_function</function> will check the SQL
inside of a <application>PL/pgsql</application> function for certain
types of errors and warnings.
</para>

but that needs to be expanded on.

I was expecting something like:

create function x() returns void as $$ declare a int4; begin a='f'; end $$ language plpgsql;

to return an error but it doesn't

but

create temp table x(a int4);
create or replace function x() returns void as $$ declare a int4; begin insert into x values('h'); end $$ language plpgsql;

does give an error when I pass it to the validator. Is this the intended behavior of the patch? If so we need to explain why the first function doesn't generate an error.

I feel we need to document what each of the columns in the output means. What is the difference between statement, query and context?

Some random comments on the messages you return:

Line 816:

if (is_expression && tupdesc->natts != 1)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("qw",
query->query,
tupdesc->natts)));

Should this be "query \"%s\" returned %d columns but only 1 is allowed" or something similar?

Line 837

else
/* XXX: should be a warning? */
ereport(ERROR,
(errmsg("cannot to identify real type for record type variable")));

In what situation does this come up? Should it be a warning or error?
Do we mean "the real type for record variable" ?

Line 1000
ereport(ERROR,
+ (errcode(ERRCODE_DATATYPE_MISMATCH),
+ errmsg("function does not return
composite type, is not possible to identify composite type")));

Should this be "function does not return a composite type, it is not
possible to identify the composite type" ?

Line 1109:

appendStringInfo(&str, "parameter \"%s\" is overlapped",
+ refname);

gives warnings like:

select message,detail FROM plpgsql_check_function('b(int)');
message | detail
-----------------------------+--------------------------------------------
parameter "a" is overlapped | Local variable overlap function parameter.

How about instead
"parameter "a" is duplicated" | Local variable is named the same as the
function parameter

Line 1278:

+ checker_error(cstate,
+ 0, 0,
+ "cannot determinate a result of
dynamic SQL",
+ "Cannot to contine in check.",
+ "Don't use dynamic SQL and record type together,
when you would check function.",
+ "warning",
+ 0, NULL, NULL);

How about
"cannot determine the result of dynamic SQL" , "Cannot continue
validating the function", "Do not use plpgsql_check_function on
functions with dynamic SQL"
Also this limitation should be explained in the documentation.

I also thing we need to distinguish between warnings generated because
of problems in the function versus warnings generated because of
limitations in the validator. This implies that there is maybe
something wrong with my function but there is nothing wrong with using
dynamic SQL in functions this is just telling users about a runtime
warning of the validator itself.

Same thing around line 1427

I have not done an in-depth read of the code.

I'm sending this out this patch at least gets some review. I don't
think that I will have a lot more time in the next week to do a more
thorough review or follow-up review

If we aren't happy with the overal approach of this patch then we need
to tell Pavel.

My vote would be to try to get the patch (documentation, error messages,
'XXX' items, etc) into a better state so it can eventually be committed

Steve

>
> 2013/8/22 Peter Eisentraut <peter_e(at)gmx(dot)net <mailto:peter_e(at)gmx(dot)net>>
>
> On Wed, 2013-03-27 at 23:25 +0100, Pavel Stehule wrote:
> > I redesigned output from plpgsql_check_function. Now, it returns
> table
> > everytime.
> > Litlle bit code simplification.
>
> This patch is in the 2013-09 commitfest but needs a rebase.
>
>
>
>
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Steve Singer <steve(at)ssinger(dot)info>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-12-07 21:57:47
Message-ID: CAFj8pRCq1ymh1z=NTbK_pnye=oqcQ=xcueX=jyttBgOF=Squ4Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

thank you for review

2013/12/7 Steve Singer <steve(at)ssinger(dot)info>

> On 08/22/2013 02:02 AM, Pavel Stehule wrote:
>
>> rebased
>>
>> Regards
>>
>> Pavel
>>
>> This patch again needs a rebase, it no longer applies cleanly.
> plpgsql_estate_setup now takes a shared estate parameter and the call in
> pl_check needs that. I passed NULL in and things seem to work.
>
>
>
> I think this is another example where are processes aren't working as we'd
> like.
>
> The last time this patch got a review was in March when Tom said
> http://www.postgresql.org/message-id/27661.1364267665@sss.pgh.pa.us
>
> Shortly after Pavel responded with a new patch that changes the output
> format. http://www.postgresql.org/message-id/
> CAFj8pRAqpAEWqL465xkTmwe-DaZ8+7N-oEQ=uv_JjoP21FJmrQ(at)mail(dot)gmail(dot)com
>
> I don't much has progress in the following 9 months on this patch.
>
> In Tom's review the issue of code duplication and asks:
>
> "do we want to accept that this is the implementation pathway we're going
> to settle for, or are we going to hold out for a better one? I'd be the
> first in line to hold out if I had a clue how to move towards a better
> implementation, but I have none."
>

yes, I am waiting still to a) have some better idea, or b) have
significantly more free time for larger plpgsql refactoring. Nothing of it
happens :(

> This question is still outstanding.
>
> Here are my comments on this version of the patch:
>
>
>
> This version of the patch gives output as a table with the structure:
>
> functionid | lineno | statement | sqlstate | message | detail | hint |
> level | position | query | context
> ------------+--------+-----------+----------+---------+-----
> ---+------+-------+----------+-------+---------
>
> I like this much better than the previous format.
>
> I think the patch needs more documentation.
>
> The extent of the documentation is
>
> <title>Checking of embedded SQL</title>
> + <para>
> + The SQL statements inside <application>PL/pgSQL</> functions are
> + checked by validator for semantic errors. These errors
> + can be found by <function>plpgsql_check_function</function>:
>
>
>
> I a don't think that this adequately describes the function. It isn't
> clear to me what we mean by 'embedded' SQL.
> and 'semantic errors'.
>
>
> I think this would read better as
>
> <sect2 id="plpgsql-check-function">
> <title>Checking SQL Inside Functions</title>
> <para>
> The SQL Statements inside of <application>PL/pgsql</> functions can be
> checked by a validation function for semantic errors. You can check
> <application>PL/pgsl</application> functions by passing the name of
> the
> function to <function>plpgsql_check_function</function>.
> </para>
> <para>
> The <function>plpgsql_check_function</function> will check the SQL
> inside of a <application>PL/pgsql</application> function for certain
> types of errors and warnings.
> </para>
>
> but that needs to be expanded on.
>
> I was expecting something like:
>
> create function x() returns void as $$ declare a int4; begin a='f'; end $$
> language plpgsql;
>
> to return an error but it doesn't
>

This is expected. PL/pgSQL use a late casting - so a := '10'; is
acceptable. And in this moment plpgsql check doesn't evaluate constant and
doesn't try to cast to target type. But this check can be implemented
sometime. In this moment, we have no simple way how to identify constant
expression in plpgsql. So any constants are expressions from plpgsql
perspective.

>
> but
>
> create temp table x(a int4);
> create or replace function x() returns void as $$ declare a int4; begin
> insert into x values('h'); end $$ language plpgsql;
>

it is expected too. SQL doesn't use late casting - casting is controlled by
available casts and constants are evaluated early - due possible
optimization.

>
> does give an error when I pass it to the validator. Is this the intended
> behavior of the patch? If so we need to explain why the first function
> doesn't generate an error.
>
>
> I feel we need to document what each of the columns in the output means.
> What is the difference between statement, query and context?
>
> Some random comments on the messages you return:
>
> Line 816:
>
> if (is_expression && tupdesc->natts != 1)
> ereport(ERROR,
> (errcode(ERRCODE_SYNTAX_ERROR),
> errmsg("qw",
> query->query,
> tupdesc->natts)));
>
> Should this be "query \"%s\" returned %d columns but only 1 is allowed"
> or something similar?
>
> Line 837
>
> else
> /* XXX: should be a warning? */
> ereport(ERROR,
> (errmsg("cannot to identify real
> type for record type variable")));
>
> In what situation does this come up? Should it be a warning or error?
> Do we mean "the real type for record variable" ?
>

a most difficult part of plpgsql check function is about transformation
dynamic types to know row types. It is necessary for checking usable
functions and operators.

typical pattern is:

declare r record;
begin
for r in select a, b from some_tab
loop
raise notice '%', extract(day from r.a);
end loop;

and we should to detect type of r.a. Sometimes we cannot to do it due using
dynamic SQL - plpgsql check doesn't try to evaluate expressions (as
protection against side efects).

>
> Line 1000
> ereport(ERROR,
> + (errcode(ERRCODE_DATATYPE_MISMATCH),
> + errmsg("function does not return
> composite type, is not possible to identify composite type")));
>
> Should this be "function does not return a composite type, it is not
> possible to identify the composite type" ?
>
> Line 1109:
>
> appendStringInfo(&str, "parameter \"%s\" is overlapped",
> + refname);
>
> gives warnings like:
>
> select message,detail FROM plpgsql_check_function('b(int)');
> message | detail
> -----------------------------+--------------------------------------------
> parameter "a" is overlapped | Local variable overlap function parameter.
>
>
> How about instead
> "parameter "a" is duplicated" | Local variable is named the same as the
> function parameter
>

I have no idea about good sentence, but "duplicate" probably is not best.
Any variable can be locally overlapped. Probably any overlapping should be
signalized - it is a bad design always.

>
>
> Line 1278:
>
> + checker_error(cstate,
> + 0, 0,
> + "cannot determinate a result of dynamic
> SQL",
> + "Cannot to contine in check.",
> + "Don't use dynamic SQL and record type together,
> when you would check function.",
> + "warning",
> + 0, NULL, NULL);
>
> How about
> "cannot determine the result of dynamic SQL" , "Cannot continue validating
> the function", "Do not use plpgsql_check_function on functions with dynamic
> SQL"
> Also this limitation should be explained in the documentation.
>
> I also thing we need to distinguish between warnings generated because of
> problems in the function versus warnings generated because of limitations
> in the validator. This implies that there is maybe something wrong with
> my function but there is nothing wrong with using dynamic SQL in functions
> this is just telling users about a runtime warning of the validator itself.
>
> Same thing around line 1427
>
>
> I have not done an in-depth read of the code.
>
> I'm sending this out this patch at least gets some review. I don't think
> that I will have a lot more time in the next week to do a more thorough
> review or follow-up review
>
>
> If we aren't happy with the overal approach of this patch then we need to
> tell Pavel.
>
> My vote would be to try to get the patch (documentation, error messages,
> 'XXX' items, etc) into a better state so it can eventually be committed
>
>
Thank you

Regards

Pavel

>
> Steve
>
>
>> 2013/8/22 Peter Eisentraut <peter_e(at)gmx(dot)net <mailto:peter_e(at)gmx(dot)net>>
>>
>>
>> On Wed, 2013-03-27 at 23:25 +0100, Pavel Stehule wrote:
>> > I redesigned output from plpgsql_check_function. Now, it returns
>> table
>> > everytime.
>> > Litlle bit code simplification.
>>
>> This patch is in the 2013-09 commitfest but needs a rebase.
>>
>>
>>
>>
>>
>>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Steve Singer <steve(at)ssinger(dot)info>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-12-08 08:45:01
Message-ID: CAFj8pRCJZ236dqHqo8HZ2t9wMAcKsPuH4APZePrW2BwyV9MQOg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

There is still valid a variant to move check function to contrib for fist
moment.

Regards

Pavel

2013/12/7 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>

> Hello
>
> thank you for review
>
>
> 2013/12/7 Steve Singer <steve(at)ssinger(dot)info>
>
>> On 08/22/2013 02:02 AM, Pavel Stehule wrote:
>>
>>> rebased
>>>
>>> Regards
>>>
>>> Pavel
>>>
>>> This patch again needs a rebase, it no longer applies cleanly.
>> plpgsql_estate_setup now takes a shared estate parameter and the call in
>> pl_check needs that. I passed NULL in and things seem to work.
>>
>>
>>
>> I think this is another example where are processes aren't working as
>> we'd like.
>>
>> The last time this patch got a review was in March when Tom said
>> http://www.postgresql.org/message-id/27661.1364267665@sss.pgh.pa.us
>>
>> Shortly after Pavel responded with a new patch that changes the output
>> format. http://www.postgresql.org/message-id/
>> CAFj8pRAqpAEWqL465xkTmwe-DaZ8+7N-oEQ=uv_JjoP21FJmrQ(at)mail(dot)gmail(dot)com
>>
>> I don't much has progress in the following 9 months on this patch.
>>
>> In Tom's review the issue of code duplication and asks:
>>
>> "do we want to accept that this is the implementation pathway we're going
>> to settle for, or are we going to hold out for a better one? I'd be the
>> first in line to hold out if I had a clue how to move towards a better
>> implementation, but I have none."
>>
>
> yes, I am waiting still to a) have some better idea, or b) have
> significantly more free time for larger plpgsql refactoring. Nothing of it
> happens :(
>
>
>> This question is still outstanding.
>>
>> Here are my comments on this version of the patch:
>>
>>
>>
>> This version of the patch gives output as a table with the structure:
>>
>> functionid | lineno | statement | sqlstate | message | detail | hint |
>> level | position | query | context
>> ------------+--------+-----------+----------+---------+-----
>> ---+------+-------+----------+-------+---------
>>
>> I like this much better than the previous format.
>>
>> I think the patch needs more documentation.
>>
>> The extent of the documentation is
>>
>> <title>Checking of embedded SQL</title>
>> + <para>
>> + The SQL statements inside <application>PL/pgSQL</> functions are
>> + checked by validator for semantic errors. These errors
>> + can be found by <function>plpgsql_check_function</function>:
>>
>>
>>
>> I a don't think that this adequately describes the function. It isn't
>> clear to me what we mean by 'embedded' SQL.
>> and 'semantic errors'.
>>
>>
>> I think this would read better as
>>
>> <sect2 id="plpgsql-check-function">
>> <title>Checking SQL Inside Functions</title>
>> <para>
>> The SQL Statements inside of <application>PL/pgsql</> functions can
>> be
>> checked by a validation function for semantic errors. You can check
>> <application>PL/pgsl</application> functions by passing the name of
>> the
>> function to <function>plpgsql_check_function</function>.
>> </para>
>> <para>
>> The <function>plpgsql_check_function</function> will check the SQL
>> inside of a <application>PL/pgsql</application> function for certain
>> types of errors and warnings.
>> </para>
>>
>> but that needs to be expanded on.
>>
>> I was expecting something like:
>>
>> create function x() returns void as $$ declare a int4; begin a='f'; end
>> $$ language plpgsql;
>>
>> to return an error but it doesn't
>>
>
> This is expected. PL/pgSQL use a late casting - so a := '10'; is
> acceptable. And in this moment plpgsql check doesn't evaluate constant and
> doesn't try to cast to target type. But this check can be implemented
> sometime. In this moment, we have no simple way how to identify constant
> expression in plpgsql. So any constants are expressions from plpgsql
> perspective.
>
>
>>
>> but
>>
>> create temp table x(a int4);
>> create or replace function x() returns void as $$ declare a int4; begin
>> insert into x values('h'); end $$ language plpgsql;
>>
>
> it is expected too. SQL doesn't use late casting - casting is controlled
> by available casts and constants are evaluated early - due possible
> optimization.
>
>
>>
>> does give an error when I pass it to the validator. Is this the
>> intended behavior of the patch? If so we need to explain why the first
>> function doesn't generate an error.
>>
>>
>> I feel we need to document what each of the columns in the output means.
>> What is the difference between statement, query and context?
>>
>> Some random comments on the messages you return:
>>
>> Line 816:
>>
>> if (is_expression && tupdesc->natts != 1)
>> ereport(ERROR,
>> (errcode(ERRCODE_SYNTAX_ERROR),
>> errmsg("qw",
>> query->query,
>> tupdesc->natts)));
>>
>> Should this be "query \"%s\" returned %d columns but only 1 is allowed"
>> or something similar?
>>
>> Line 837
>>
>> else
>> /* XXX: should be a warning? */
>> ereport(ERROR,
>> (errmsg("cannot to identify real
>> type for record type variable")));
>>
>> In what situation does this come up? Should it be a warning or error?
>> Do we mean "the real type for record variable" ?
>>
>
> a most difficult part of plpgsql check function is about transformation
> dynamic types to know row types. It is necessary for checking usable
> functions and operators.
>
> typical pattern is:
>
> declare r record;
> begin
> for r in select a, b from some_tab
> loop
> raise notice '%', extract(day from r.a);
> end loop;
>
> and we should to detect type of r.a. Sometimes we cannot to do it due
> using dynamic SQL - plpgsql check doesn't try to evaluate expressions (as
> protection against side efects).
>
>
>
>
>>
>> Line 1000
>> ereport(ERROR,
>> + (errcode(ERRCODE_DATATYPE_MISMATCH),
>> + errmsg("function does not return
>> composite type, is not possible to identify composite type")));
>>
>> Should this be "function does not return a composite type, it is not
>> possible to identify the composite type" ?
>>
>> Line 1109:
>>
>> appendStringInfo(&str, "parameter \"%s\" is overlapped",
>> + refname);
>>
>> gives warnings like:
>>
>> select message,detail FROM plpgsql_check_function('b(int)');
>> message | detail
>> -----------------------------+------------------------------
>> --------------
>> parameter "a" is overlapped | Local variable overlap function parameter.
>>
>>
>> How about instead
>> "parameter "a" is duplicated" | Local variable is named the same as the
>> function parameter
>>
>
> I have no idea about good sentence, but "duplicate" probably is not best.
> Any variable can be locally overlapped. Probably any overlapping should be
> signalized - it is a bad design always.
>
>
>>
>>
>> Line 1278:
>>
>> + checker_error(cstate,
>> + 0, 0,
>> + "cannot determinate a result of dynamic
>> SQL",
>> + "Cannot to contine in check.",
>> + "Don't use dynamic SQL and record type together,
>> when you would check function.",
>> + "warning",
>> + 0, NULL, NULL);
>>
>> How about
>> "cannot determine the result of dynamic SQL" , "Cannot continue
>> validating the function", "Do not use plpgsql_check_function on functions
>> with dynamic SQL"
>> Also this limitation should be explained in the documentation.
>>
>> I also thing we need to distinguish between warnings generated because of
>> problems in the function versus warnings generated because of limitations
>> in the validator. This implies that there is maybe something wrong with
>> my function but there is nothing wrong with using dynamic SQL in functions
>> this is just telling users about a runtime warning of the validator itself.
>>
>> Same thing around line 1427
>>
>>
>> I have not done an in-depth read of the code.
>>
>> I'm sending this out this patch at least gets some review. I don't think
>> that I will have a lot more time in the next week to do a more thorough
>> review or follow-up review
>>
>>
>> If we aren't happy with the overal approach of this patch then we need to
>> tell Pavel.
>>
>> My vote would be to try to get the patch (documentation, error messages,
>> 'XXX' items, etc) into a better state so it can eventually be committed
>>
>>
> Thank you
>
> Regards
>
> Pavel
>
>
>
>>
>> Steve
>>
>>
>>> 2013/8/22 Peter Eisentraut <peter_e(at)gmx(dot)net <mailto:peter_e(at)gmx(dot)net>>
>>>
>>>
>>> On Wed, 2013-03-27 at 23:25 +0100, Pavel Stehule wrote:
>>> > I redesigned output from plpgsql_check_function. Now, it returns
>>> table
>>> > everytime.
>>> > Litlle bit code simplification.
>>>
>>> This patch is in the 2013-09 commitfest but needs a rebase.
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>>
>
>


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Steve Singer <steve(at)ssinger(dot)info>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-12-08 16:12:56
Message-ID: 1386519176.31519.9.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

In my opinion, the idea of having a separate lint checker for a language
is antiquated. If there are problems, they should be diagnosed at
compile time or run time. You can add options about warning levels or
strictness if there are concerns about which diagnostics are
appropriate.
>
>
>
>
>
>
>


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Steve Singer <steve(at)ssinger(dot)info>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-12-08 16:15:36
Message-ID: 1386519336.31519.11.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, 2013-12-08 at 09:45 +0100, Pavel Stehule wrote:
> There is still valid a variant to move check function to contrib for
> fist moment.

If we are not happy with the code or the interface, then moving it to
contrib is not a solution. We are still committed to main things in
contrib indefinitely.


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Steve Singer <steve(at)ssinger(dot)info>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-12-08 17:01:40
Message-ID: CAFj8pRBBUF2zowRRiTUbcSCVKHaspL=+=-pBg8P2B-UXH7cjkQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013/12/8 Peter Eisentraut <peter_e(at)gmx(dot)net>

> In my opinion, the idea of having a separate lint checker for a language
> is antiquated. If there are problems, they should be diagnosed at
> compile time or run time. You can add options about warning levels or
> strictness if there are concerns about which diagnostics are
> appropriate.
>

There are two points, that should be solved

a) introduction a dependency to other object in schema - now CREATE
FUNCTION is fully independent on others

b) slow start - if we check all paths on start, then start can be slower -
and some functions should not work due dependency on temporary tables.

I am thinking about possible marking a function by #option (we have same
idea)

some like

#option check_on_first_start
#option check_on_create
#option check_newer

But still I have no idea, how to push check without possible slowdown
execution with code duplication

Pavel

> >
> >
> >
> >
> >
> >
> >
>
>


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Steve Singer <steve(at)ssinger(dot)info>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-12-09 03:23:06
Message-ID: CAA4eK1LdVcNdTbkKnJvdWTP1B+V=K4Y1R9qakgXCm68z3vRirA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Dec 8, 2013 at 10:31 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>
>
>
> 2013/12/8 Peter Eisentraut <peter_e(at)gmx(dot)net>
>>
>> In my opinion, the idea of having a separate lint checker for a language
>> is antiquated. If there are problems, they should be diagnosed at
>> compile time or run time. You can add options about warning levels or
>> strictness if there are concerns about which diagnostics are
>> appropriate.
>
>
> There are two points, that should be solved
>
> a) introduction a dependency to other object in schema - now CREATE FUNCTION
> is fully independent on others
>
> b) slow start - if we check all paths on start, then start can be slower -
> and some functions should not work due dependency on temporary tables.
>
> I am thinking about possible marking a function by #option (we have same
> idea)
>
> some like
>
> #option check_on_first_start
> #option check_on_create
> #option check_newer

what exactly check_newer means, does it mean whenever a function is
replaced (changed)?

> But still I have no idea, how to push check without possible slowdown
> execution with code duplication

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Steve Singer <steve(at)ssinger(dot)info>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-12-09 05:24:51
Message-ID: CAFj8pRA09FsRLA=wSkmM0MwVBs5B5f6dLTLgYKCFzfA-1850zw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013/12/9 Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>

> On Sun, Dec 8, 2013 at 10:31 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
> wrote:
> >
> >
> >
> > 2013/12/8 Peter Eisentraut <peter_e(at)gmx(dot)net>
> >>
> >> In my opinion, the idea of having a separate lint checker for a language
> >> is antiquated. If there are problems, they should be diagnosed at
> >> compile time or run time. You can add options about warning levels or
> >> strictness if there are concerns about which diagnostics are
> >> appropriate.
> >
> >
> > There are two points, that should be solved
> >
> > a) introduction a dependency to other object in schema - now CREATE
> FUNCTION
> > is fully independent on others
> >
> > b) slow start - if we check all paths on start, then start can be slower
> -
> > and some functions should not work due dependency on temporary tables.
> >
> > I am thinking about possible marking a function by #option (we have same
> > idea)
> >
> > some like
> >
> > #option check_on_first_start
> > #option check_on_create
> > #option check_newer
>
> what exactly check_newer means, does it mean whenever a function is
> replaced (changed)?
>
>
no, it means, so request for check will be ignored ever - some functions
cannot be deeply checked due using dynamic SQL or dynamic created data
types - temporary tables created in functions.

Regards

Pavel

> > But still I have no idea, how to push check without possible slowdown
> > execution with code duplication
>
>
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>


From: Jim Nasby <jim(at)nasby(dot)net>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Steve Singer <steve(at)ssinger(dot)info>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-12-09 18:51:01
Message-ID: 52A61115.6040305@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/8/13 11:24 PM, Pavel Stehule wrote:
> > #option check_on_first_start
> > #option check_on_create
> > #option check_newer
>
> what exactly check_newer means, does it mean whenever a function is
> replaced (changed)?
>
>
> no, it means, so request for check will be ignored ever - some functions cannot be deeply checked due using dynamic SQL or dynamic created data types - temporary tables created in functions.

So presumably it would be check_never, not check_newer... :) BTW, it's not terribly hard to work around the temp table issue; you just need to create the expected table in the session when you create the function. But even in this case, I think it would still be good to check what we can, like at least basic plpgsql syntax.

Do we really need first_start? ISTM that if you're dependent on run state then you're basically out of luck.
--
Jim C. Nasby, Data Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jim Nasby <jim(at)nasby(dot)net>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Steve Singer <steve(at)ssinger(dot)info>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-12-09 19:08:32
Message-ID: CAFj8pRBM0Uat90CQAnWLDX7SjABAurUppp8fRri-QQN2OAnXoQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013/12/9 Jim Nasby <jim(at)nasby(dot)net>

> On 12/8/13 11:24 PM, Pavel Stehule wrote:
>
>> > #option check_on_first_start
>> > #option check_on_create
>> > #option check_newer
>>
>> what exactly check_newer means, does it mean whenever a function is
>> replaced (changed)?
>>
>>
>> no, it means, so request for check will be ignored ever - some functions
>> cannot be deeply checked due using dynamic SQL or dynamic created data
>> types - temporary tables created in functions.
>>
>
> So presumably it would be check_never, not check_newer... :) BTW, it's not
> terribly hard to work around the temp table issue; you just need to create
> the expected table in the session when you create the function. But even in
> this case, I think it would still be good to check what we can, like at
> least basic plpgsql syntax.
>

I sorry.

You cannot to create temporary table - this check should not have any side
effect - and creating temporary table can run some event trigger.

But there should be some hints for check like annotations or some similar.
Or you can minimize a area where check will be disabled.

>
> Do we really need first_start? ISTM that if you're dependent on run state
> then you're basically out of luck.
>

I afraid so checking on creation time is not enough for plpgsql.

and I have a very good experience with check on start from plpgsql_lint
usage when I wrote a regression tests.

A "first start" doesn't create dependency on state - but just more
preciously define a time, when checking will be done. Probably a option
check_create_and_start can be useful.

Regards

Pavel

> --
> Jim C. Nasby, Data Architect jim(at)nasby(dot)net
> 512.569.9461 (cell) http://jim.nasby.net
>


From: Jim Nasby <jim(at)nasby(dot)net>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Steve Singer <steve(at)ssinger(dot)info>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-12-09 21:14:31
Message-ID: 52A632B7.6030804@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/9/13 1:08 PM, Pavel Stehule wrote:
> So presumably it would be check_never, not check_newer... :) BTW, it's not terribly hard to work around the temp table issue; you just need to create the expected table in the session when you create the function. But even in this case, I think it would still be good to check what we can, like at least basic plpgsql syntax.
>
>
> I sorry.
>
> You cannot to create temporary table - this check should not have any side effect - and creating temporary table can run some event trigger.
>
> But there should be some hints for check like annotations or some similar. Or you can minimize a area where check will be disabled.

Sorry, I meant that the user can work around it by creating the table. I didn't mean to imply that we would magically create a temp table to do the checking.
--
Jim C. Nasby, Data Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Steve Singer <steve(at)ssinger(dot)info>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-12-09 22:09:41
Message-ID: 52A63FA5.7060902@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/8/13, 12:01 PM, Pavel Stehule wrote:
> But still I have no idea, how to push check without possible slowdown
> execution with code duplication

Create a GUC parameter plpgsql.slow_checks or whatever (perhaps more
specific), which people can turn on when they run their test suites.

This doesn't really have to be all that much different from what we are
currently doing in C with scan-build and address sanitizer, for example.


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Steve Singer <steve(at)ssinger(dot)info>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-12-10 04:02:09
Message-ID: CAA4eK1KvPYLVRzcKmrjhKOO_LEju_KBuRj_q1hjCqcg4D6kVYg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Dec 9, 2013 at 10:54 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> 2013/12/9 Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
>>
>> >
>> > There are two points, that should be solved
>> >
>> > a) introduction a dependency to other object in schema - now CREATE
>> > FUNCTION
>> > is fully independent on others
>> >
>> > b) slow start - if we check all paths on start, then start can be slower
>> > -
>> > and some functions should not work due dependency on temporary tables.
>> >
>> > I am thinking about possible marking a function by #option (we have same
>> > idea)
>> >
>> > some like
>> >
>> > #option check_on_first_start
>> > #option check_on_create
>> > #option check_newer
>>
>> what exactly check_newer means, does it mean whenever a function is
>> replaced (changed)?
>>
>
> no, it means, so request for check will be ignored ever - some functions
> cannot be deeply checked due using dynamic SQL or dynamic created data types
> - temporary tables created in functions.

Thanks for clarification, the part of name 'newer' has created
confusion. I understand
that creating/identifying dependency in some of the cases will be
quite tricky, does other
similar languages for other databases does that for all cases (objects
in dynamic statements).

Is the main reason for identifying/creating dependency is to mark
function as invalid when
any dependent object is Dropped/Altered?

This thread is from quite some time, so please excuse me if I had
asked anything which has
been discussed previously.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Steve Singer <steve(at)ssinger(dot)info>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-12-10 06:30:41
Message-ID: CAFj8pRAL2s2EkuaW7L-jp24Zowovr9yfm_Dmc3JBBCwouzkETw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013/12/9 Peter Eisentraut <peter_e(at)gmx(dot)net>

> On 12/8/13, 12:01 PM, Pavel Stehule wrote:
> > But still I have no idea, how to push check without possible slowdown
> > execution with code duplication
>
> Create a GUC parameter plpgsql.slow_checks or whatever (perhaps more
> specific), which people can turn on when they run their test suites.
>
> This doesn't really have to be all that much different from what we are
> currently doing in C with scan-build and address sanitizer, for example.
>
>
A main issue is placing these tests on critical path.

You have to check it in every expression, in every internal switch

There are two main purposes

a) ensure a expression/query is valid (not only syntax valid)
b) search all expressions/queries - visit all possible paths in code.

so you should to place new switch everywhere in plpgsql executor, where is
entry to some path.

Second issue - these check decrease a readability of plpgsql statement
executor handlers. This code is relative very readable now. With new switch
is little bit (more) less clean.

I think so fact we use a two other large statement switch (printing, free
expressions) is natural, and it hard to write it better.

Regards

Pavel


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Steve Singer <steve(at)ssinger(dot)info>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-12-10 06:45:52
Message-ID: CAFj8pRBmeX3PKiEYH6+Xuzawpzem-TU-fPJPxv8+Tgwdd6fFnA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013/12/10 Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>

> On Mon, Dec 9, 2013 at 10:54 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
> wrote:
> > 2013/12/9 Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> >>
> >> >
> >> > There are two points, that should be solved
> >> >
> >> > a) introduction a dependency to other object in schema - now CREATE
> >> > FUNCTION
> >> > is fully independent on others
> >> >
> >> > b) slow start - if we check all paths on start, then start can be
> slower
> >> > -
> >> > and some functions should not work due dependency on temporary tables.
> >> >
> >> > I am thinking about possible marking a function by #option (we have
> same
> >> > idea)
> >> >
> >> > some like
> >> >
> >> > #option check_on_first_start
> >> > #option check_on_create
> >> > #option check_newer
> >>
> >> what exactly check_newer means, does it mean whenever a function is
> >> replaced (changed)?
> >>
> >
> > no, it means, so request for check will be ignored ever - some functions
> > cannot be deeply checked due using dynamic SQL or dynamic created data
> types
> > - temporary tables created in functions.
>
> Thanks for clarification, the part of name 'newer' has created
> confusion. I understand
> that creating/identifying dependency in some of the cases will be
> quite tricky, does other
> similar languages for other databases does that for all cases (objects
> in dynamic statements).
>

I am sorry

PL/pgSQL is really specific due invisible SQL base and mixing two
environments and languages in user visible area.

>
> Is the main reason for identifying/creating dependency is to mark
> function as invalid when
> any dependent object is Dropped/Altered?
>

Now, PG has no any tool for checking dependency between functions and other
objects. What has positive effects - we have very simply deploying, that
works in almost use cases very well - and works with our temporary tables
implementation. There is simple rule - depended object must living before
they are >>used in runtime<<. But checking should not be runtime event and
we would to decrease a false alarms. So we have to expect so almost all
necessary object are created - it is reason, why we decided don't do check
in create function statement time. We don't would to introduce new
dependency if it will be possible.

Regards

Pavel

> This thread is from quite some time, so please excuse me if I had
> asked anything which has
> been discussed previously.
>

> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Steve Singer <steve(at)ssinger(dot)info>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-12-10 20:45:04
Message-ID: CA+TgmobssQXEvvu3+_YkUdjqSRNrkrBXiQoeZiicnA7_ueEFyg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Dec 10, 2013 at 1:45 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> Now, PG has no any tool for checking dependency between functions and other
> objects. What has positive effects - we have very simply deploying, that
> works in almost use cases very well - and works with our temporary tables
> implementation. There is simple rule - depended object must living before
> they are >>used in runtime<<. But checking should not be runtime event and
> we would to decrease a false alarms. So we have to expect so almost all
> necessary object are created - it is reason, why we decided don't do check
> in create function statement time. We don't would to introduce new
> dependency if it will be possible.

This is a very good point. Annotating the function itself with
markers that cause it to be more strictly checked will create a
dump/reload problem that we won't enjoy solving. The decision to
check the function more strictly or not would need to be based on some
kind of session state that users could establish but dump restore
would not.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Steve Singer <steve(at)ssinger(dot)info>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-12-10 20:50:58
Message-ID: 9271.1386708658@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> This is a very good point. Annotating the function itself with
> markers that cause it to be more strictly checked will create a
> dump/reload problem that we won't enjoy solving. The decision to
> check the function more strictly or not would need to be based on some
> kind of session state that users could establish but dump restore
> would not.

One would hope that turning off check_function_bodies would be sufficient
to disable any added checking, though, so I don't see this being a problem
for pg_dump. But there might be other scenarios where an additional knob
would be useful.

regards, tom lane


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Steve Singer <steve(at)ssinger(dot)info>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-12-10 21:26:23
Message-ID: 52A786FF.9040409@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/10/2013 12:50 PM, Tom Lane wrote:
> One would hope that turning off check_function_bodies would be sufficient
> to disable any added checking, though, so I don't see this being a problem
> for pg_dump. But there might be other scenarios where an additional knob
> would be useful.

I can't think of one, offhand. And +1 for NOT adding a new knob.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Steve Singer <steve(at)ssinger(dot)info>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-12-11 04:28:38
Message-ID: CAA4eK1JiiEH-sF6JyOK5GFh_iZE0pQafmKJLxPy4=RNc3gxVEA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Dec 10, 2013 at 12:15 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> 2013/12/10 Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
>> On Mon, Dec 9, 2013 at 10:54 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
>> wrote:
>> > 2013/12/9 Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
>> >> > There are two points, that should be solved
>> >> >
>> >> > a) introduction a dependency to other object in schema - now CREATE
>> >> > FUNCTION
>> >> > is fully independent on others
>> >> >
>> >> > b) slow start - if we check all paths on start, then start can be
>> >> > slower
>>>>
>> >> > some like
>> >> >
>> >> > #option check_on_first_start
>> >> > #option check_on_create
>> >> > #option check_newer
>> >>
>> >> what exactly check_newer means, does it mean whenever a function is
>> >> replaced (changed)?
>> >>
>> >
>> > no, it means, so request for check will be ignored ever - some functions
>> > cannot be deeply checked due using dynamic SQL or dynamic created data
>> > types
>> > - temporary tables created in functions.
>>
>
>
> Now, PG has no any tool for checking dependency between functions and other
> objects. What has positive effects - we have very simply deploying, that
> works in almost use cases very well - and works with our temporary tables
> implementation. There is simple rule - depended object must living before
> they are >>used in runtime<<. But checking should not be runtime event and
> we would to decrease a false alarms. So we have to expect so almost all
> necessary object are created - it is reason, why we decided don't do check
> in create function statement time.

Isn't that already done for SQL function's (fmgr_sql_validator)?

postgres=# CREATE FUNCTION clean_emp() RETURNS void AS
postgres'# DELETE FROM emp
postgres'# WHERE salary < 0;
postgres'# ' LANGUAGE SQL;
ERROR: relation "emp" does not exist
LINE 2: DELETE FROM emp
^

I mean to say that the above rule stated by you ("There is simple rule
- depended object must living before
they are >>used in runtime<<") doesn't seem to be true for SQL functions.
So isn't it better to do same for plpgsql functions as well?

For doing at runtime (during first execution of function) are you
planing to add it as a extra step
such that if parameter check_on_first_start is set, then do it.

>We don't would to introduce new dependency if it will be possible.
In that case what exactly you mean to say in point a) ("introduction
a dependency to other object..") above in you mail.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Steve Singer <steve(at)ssinger(dot)info>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-12-11 05:10:38
Message-ID: 4048.1386738638@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
> On Tue, Dec 10, 2013 at 12:15 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> Now, PG has no any tool for checking dependency between functions and other
>> objects.

> Isn't that already done for SQL function's (fmgr_sql_validator)?

Pavel's point is that the only way to find out if the validator will fail
is to run it and see if it fails; and even if it does, how much will you
know about why? That's not particularly helpful for pg_dump, which needs
to understand dependencies in a fairly deep fashion. It not only needs to
figure out how to dump the database objects in a dependency-safe order,
but what to do to break dependency loops.

Right now I believe that pg_dump has a workable strategy for every
possible case of circular dependencies, because they are all caused by
secondary attributes of objects that can be split out and applied later,
for example applying a column default via ALTER TABLE ALTER COLUMN SET
DEFAULT rather than listing the default right in the CREATE TABLE command.

However, if function A depends on B and also vice-versa (mutual recursion
is not exactly an unheard-of technique), there is no way to load them both
if the function bodies are both checked at creation time.

I guess we could invent some equivalent of a forward declaration, but
that still leaves us short of understanding what the function body is
depending on.

regards, tom lane


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Steve Singer <steve(at)ssinger(dot)info>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-12-12 04:30:03
Message-ID: CAA4eK1LY0p_5uj3YLnPntbzi7vFHpKYUrOj8Es3LX2fuBYtQ_Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Dec 11, 2013 at 10:40 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
>> On Tue, Dec 10, 2013 at 12:15 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>> Now, PG has no any tool for checking dependency between functions and other
>>> objects.
>
>> Isn't that already done for SQL function's (fmgr_sql_validator)?
>
> Pavel's point is that the only way to find out if the validator will fail
> is to run it and see if it fails; and even if it does, how much will you
> know about why?

One of the important thing at time of function creation, users are
interested in knowing
is that if there are any objects (table/view/sequence ..) that are
used in function body
and are missing and the reason is I think they don't want such
things to come up during execution.

Similar thing happens for prepared statements in PostgreSQL, like
at time of parse message
only it checks both syntax errors and semantic check (which ensures
statement is meaningful,
for ex. whether objects and columns used in the statements exist)

Like we do checks other than syntax check at time of creation of
prepared statement, same
thing should be considered meaning full at time of function creation.

As you mentioned, there are checks (like dependency, mutual
recursion) which are difficult or not
feasible in current design to perform, but so will be the case for
them to execute during first execution
of function. So is it not better to do what is more feasible during
function creation rather than leaving
most of the things at execution phase?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Steve Singer <steve(at)ssinger(dot)info>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-12-12 07:33:30
Message-ID: CAFj8pRDWdeWfeM3pOmuVpbsW4JUZTV=uWen98at_BJGPxFY8OA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I dislike it. a too early check means a issues with temporary tables and
mainy new dependency between functions in complex projects. It is some what
we don't want.
Dne 12. 12. 2013 5:30 "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com> napsal(a):

> On Wed, Dec 11, 2013 at 10:40 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
> >> On Tue, Dec 10, 2013 at 12:15 PM, Pavel Stehule <
> pavel(dot)stehule(at)gmail(dot)com> wrote:
> >>> Now, PG has no any tool for checking dependency between functions and
> other
> >>> objects.
> >
> >> Isn't that already done for SQL function's (fmgr_sql_validator)?
> >
> > Pavel's point is that the only way to find out if the validator will fail
> > is to run it and see if it fails; and even if it does, how much will you
> > know about why?
>
> One of the important thing at time of function creation, users are
> interested in knowing
> is that if there are any objects (table/view/sequence ..) that are
> used in function body
> and are missing and the reason is I think they don't want such
> things to come up during execution.
>
> Similar thing happens for prepared statements in PostgreSQL, like
> at time of parse message
> only it checks both syntax errors and semantic check (which ensures
> statement is meaningful,
> for ex. whether objects and columns used in the statements exist)
>
> Like we do checks other than syntax check at time of creation of
> prepared statement, same
> thing should be considered meaning full at time of function creation.
>
> As you mentioned, there are checks (like dependency, mutual
> recursion) which are difficult or not
> feasible in current design to perform, but so will be the case for
> them to execute during first execution
> of function. So is it not better to do what is more feasible during
> function creation rather than leaving
> most of the things at execution phase?
>
>
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>