Re: [REVIEW] prepare plans of embedded sql on function start

From: Andy Colson <andy(at)squeakycode(dot)net>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, pavel(dot)stehule(at)gmail(dot)com
Subject: Re: [REVIEW] prepare plans of embedded sql on function start
Date: 2011-09-10 21:10:19
Message-ID: 4E6BD23B.3080308@squeakycode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Purpose
========
Better test coverage of functions. On first call of a function, all sql statements will be prepared, even those not directly called. Think:

create function test() returns void as $$
begin
if false then
select badcolumn from badtable;
end if;
end; $$ language plpgsql;

At first I thought this patch would check sql on create, but that would create a dependency, which would be bad.
Before, if you called this function, you'd get no error. With this patch, and with postgresql.conf settings enabled, you get:

select * from test();

ERROR: relation "badtable" does not exist
LINE 1: select badcolumn from badtable
^
QUERY: select badcolumn from badtable
CONTEXT: PL/pgSQL function "test" line 4 at SQL statement

The Patch
=========
Applied ok, compile and make check ran ok. It seems to add/edit regression tests, but no documentation.

I tried several different things I could think of, but it always found my bugs. Its disabled by default so wont cause unexpected changes. Its easy to enable, and to have individual functions exempt themselves.

Performance
===========
No penalty. At first I was concerned every function call would have overhead of extra preparing, but that is not the case. It's prepared on first call but not subsequent calls. But that made me worry that the prepare would exist too long and use old outdated stats, that as well is not a problem. I was able to setup a test where a bad index was chosen. I used two different psql sessions. In one I started a transaction, and selected from my function several times (and it was slow because it was using a bad index). In the other psql session I ran analyze on my table. Back in my first psql session, I just waited, running my function ever once and a while. Eventually it picked up the new stats and start running quick again.

Code Review
===========
I am not qualified

Problems
========
I like the idea of this patch. I think it will help catch more bugs in functions sooner. However, a function like:

create function test5() returns integer as $$
begin
create temp table junk(id integer);
insert into junk(id) values(100);
drop table temp;
return 1;
end;
$$ language plpgsql;

Will always throw an error because at prepare time, the temp junk table wont exist. This patch implements new syntax to disable the check:

create function test5() returns integer as $$
#prepare_plans on_demand
begin
...

Was it Tom Lane that said, "if we add new syntax, we have to support it forever"? As a helpful feature I can see people (myself included) enabling this system wide. So what happens to all the plpgsql on pgxn that this happens to break? Well it needs updated, no problem, but the fix will be to add "#prepare_plans on_demand" in the magic spot. That breaks it for prior versions of PG. Is this the syntax we want? What if we add more "compiler flags" in the future:

create function test5() returns integer as $$
#prepare_plans on_demand
#disable_xlog
#work_mem 10MB
begin
create temp table junk(id integer);
insert into junk(id) values(100);
drop table temp;
return 1;
end;
$$ language plpgsql;

I don't have an answer to that. Other sql implement via OPTION(...).

One option I'd thought about, was to extended ANALYZE to support functions. It would require no additional plpgsql syntax changes, no postgresql.conf settings. If you wanted to prepare (prepare for a testing purpose, not a performance purpose) all the sql inside your function, youd:

analyze test5();

I'd expect to get errors from that, because the junk table doesn't exist. I'd expect it, and just never analyze it.

Summary
=======
Its a tough one. I see benefit here. I can see myself using it. If I had to put my finger on it, I'm not 100% sold on the syntax. It is usable though, it does solve problems, so I'd use it. (I'm not 100% sure ANALYZE is better, either).

I'm going to leave this patch as "needs review", I think more eyes might be helpful.

-Andy

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2011-09-10 22:03:23 Thinking about inventing MemoryContextSetParent
Previous Message Peter Eisentraut 2011-09-10 21:03:15 Re: Alpha 1 for 9.2