Re: Plugins redux (was Re: [PATCHES] PL instrumentation

Lists: pgsql-hackerspgsql-patches
From: "korryd(at)enterprisedb(dot)com" <korryd(at)enterprisedb(dot)com>
To: pgsql-patches(at)postgresql(dot)org
Subject: PL instrumentation plugin support (i.e. PL/pgSQL debugger infrastructure)
Date: 2006-07-28 09:44:19
Message-ID: 1154079859.7099.28.camel@sakai.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

The attached patch adds support for loadable instrumentation plugins for
procedural languages (as discussed at the anniversary summit). It also
adds plugin support to the PL/pgSQL language handler.

We are using this plugin mechanism to load the PL/pgSQL debugger on
demand (the debugger is not part of this patch, the patch is just the
infrastructure that we need to load various instrumentation plugins).
We will also post some sample plugins (a PL/pgSQL profiler and a simple
tracer), probably at the edb-debugger pgFoundry project site so you can
see how to use the plugin mechanism.

A couple of notes: to use a plugin, you must define a custom GUC
variable that specifies the name of the shared-object that implements
the plugin. For example, to use the PL/pgSQL profiler (which is
implement in $libdir/plugin_profiler.so), you would add the following to
postgresql.conf

custom_variable_classes = 'plpgsql'

plpgsql.plugin = 'plugin_profiler'

The plpgsql.plugin variable is treated as an SUSET variable so you must
be a superuser to change plugins.

Also, we define a helper function (in fmgr.c) named load_pl_plugin()
that a language handler can use to load a plugin. The plugin requires
the name of the GUC variable (plplgsql.plugin, pljava.plugin,
plperl.pluhgN, etc.) and a pointer to a language-specific struct that
the plugin fills in with a set of function pointers (presumably, the
language handler will call those functions at appropriate points in
time).

-- Korry

Attachment Content-Type Size
plugin.patch text/x-patch 15.0 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgresql(dot)org, "korryd(at)enterprisedb(dot)com" <korryd(at)enterprisedb(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: PL instrumentation plugin support (i.e. PL/pgSQL debugger infrastructure)
Date: 2006-08-08 17:26:25
Message-ID: 24101.1155057985@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

"korryd(at)enterprisedb(dot)com" <korryd(at)enterprisedb(dot)com> writes:
> The attached patch adds support for loadable instrumentation plugins for
> procedural languages (as discussed at the anniversary summit). It also
> adds plugin support to the PL/pgSQL language handler.

In view of the other patch submitted to support init/fini functions for
shared libraries, I'm inclined to change this one to depend on that;
in particular it seems like we could eliminate the necessity for users
to specify the correct setup-function names. Thoughts?

regards, tom lane


From: "korryd(at)enterprisedb(dot)com" <korryd(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, pgsql-patches(at)postgresql(dot)org
Subject: Re: PL instrumentation plugin support (i.e. PL/pgSQL
Date: 2006-08-08 17:35:52
Message-ID: 1155058552.6960.29.camel@sakai.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

> In view of the other patch submitted to support init/fini functions for
> shared libraries, I'm inclined to change this one to depend on that;
> in particular it seems like we could eliminate the necessity for users
> to specify the correct setup-function names. Thoughts?

I think that would be great. Can you point me to the patch you're
referring to? I can convert my patch if you prefer.

-- Korry

--
Korry Douglas korryd(at)enterprisedb(dot)com
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: korryd(at)enterprisedb(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Plugins redux (was Re: [PATCHES] PL instrumentation plugin support)
Date: 2006-08-09 16:44:00
Message-ID: 27198.1155141840@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Yesterday I suggested that we should redesign the PL plugin patch:
http://archives.postgresql.org/pgsql-patches/2006-07/msg00291.php
in view of the recently-committed patch to add initialization/
finalization support for all dynamic libraries loaded into the backend:
http://archives.postgresql.org/pgsql-committers/2006-08/msg00176.php

That was just handwaving at the time, because I didn't have a concrete
proposal, but here is one.

Overview
--------

ISTM there are two separate bits of functionality that the core backend
needs to provide in support of PL plugins. The first is that we need a
"rendezvous" facility whereby two separately-loaded shared libraries can
connect and exchange data (a struct of function pointers for the immediate
plans for plpgsql, but in general it might be anything). The exchanged
data mustn't be limited to things known in advance to the core backend.
The facility should support loading of communicating libraries in either
order, and it should also allow for unloading of libraries. This seems
easy enough to design: the core backend should basically provide a
hashtable that maps names to "void *" pointer variables, and then two
libraries that want to communicate just agree on a name and the data
structure to be pointed to. (More details below.)

The other, probably more controversial bit of functionality is that there
needs to be a way to cause a backend to load a PL plugin shared library
without any cooperation from the connected client application. For
interactive use of a PL debugger it might be sufficient to tell people to
do "LOAD 'plpgsql_debugger'" before running their function-to-be-tested,
but if you're trying to debug some behavior that only manifests in a large
complicated application, it may be impractical to get the application to
issue such a command. We agreed while discussing this at the PG
Anniversary Conference that providing a GUC variable for this purpose is
probably sufficient, as there are multiple ways that such a variable can
be set on behalf of an application: ALTER USER SET, ALTER DATABASE SET,
"export PGOPTIONS=--var=val" before starting the app, etc. As submitted,
the plugin patch envisions defining a custom GUC variable for each PL and
having each PL add logic to force loading of any shared library mentioned
in its specific variable. I claim however that that is just duplicative,
and that a single backend-wide GUC variable should be sufficient.

Details
-------

I propose adding one function to dfmgr.c,

void ** find_rendezvous_variable(const char *varname)

Internally this function has a per-backend hashtable that associates names
with "void *" data fields. It merely looks up the given name and returns
a pointer to the data field; if the name is not already present then it's
added with a data field initialized to NULL. The hashtable entries
continue to exist for the life of the backend, and are never modified by
the core backend code.

A plugin such as a plpgsql debugger would do this in its _PG_init()
function:

static PLpgSQL_plugin my_plugin = { ... function addresses ... };

PLpgSQL_plugin **var_ptr;

var_ptr = (PLpgSQL_plugin **) find_rendezvous_variable("PLpgSQL_plugin");
*var_ptr = &my_plugin;

and this in its _PG_fini() function:

PLpgSQL_plugin **var_ptr;

var_ptr = (PLpgSQL_plugin **) find_rendezvous_variable("PLpgSQL_plugin");
*var_ptr = NULL;

Meanwhile, plpgsql itself would do this in its _PG_init() function:

static PLpgSQL_plugin **plugin_ptr = NULL;

plugin_ptr = (PLpgSQL_plugin **) find_rendezvous_variable("PLpgSQL_plugin");

and in the places where it wants to pass control to the plugin, it'd do

if (*plugin_ptr)
((*plugin_ptr)->function_field) (... args ...);

The principal difference between this proposal and the original patch
is that this way supports unloading of PL plugins; AFAICS, the original
patch would just crash if the plugin were unloaded and then use of plpgsql
continued. Also, this way doesn't depend on the user to correctly specify
an initialization function within the plugin: as long as you load the
right shared library, everything works.

As for forcing the library load to occur, I propose a new GUC variable
"backend_load_libraries" that is much like the postmaster's
preload_libraries, except that the requested library loads happen
at backend start time instead of in the postmaster. Then we need
write and document the code only once, and there are other possible
uses for it besides PL plugins.

We need to think about security issues in defining such a variable:
we don't want unprivileged users to be able to load arbitrary code
into the server. One solution to this is to make the variable SUSET,
ie, you must be superuser to set it. This would make it impossible
for people to debug their own functions without superuser privileges
... but IIRC actual use of the proposed plpgsql debugger requires
superuser privileges too, so this isn't necessarily a fatal objection.
Still, it'd be nice if the PGOPTIONS way of setting the variable could
be used without requiring that the client app itself run as superuser.
Plan B is to restrict which shared libraries can be loaded using the
variable --- there are different ways we could do that, but my suggestion
would be to insist that the shared library come from "$libdir/plugins/".
The DBA is then responsible for allowing only "safe" shared libraries
to be installed in that directory, and therefore we can allow unprivileged
users to set "backend_load_libraries". (We might as well also allow them
to execute LOAD for these libraries, so that plugins could be switched
intra-session without superuser privileges.)

BTW, is anyone up for renaming the existing "preload_libraries" variable
to "postmaster_load_libraries"? This would be more symmetrical with
"backend_load_libraries", and so perhaps clearer about which does what
when.

Comments?

regards, tom lane


From: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: remote query debugging was: Plugins redux
Date: 2006-08-09 17:28:07
Message-ID: 44DA1B27.1060500@pse-consulting.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> The other, probably more controversial bit of functionality is that there
> needs to be a way to cause a backend to load a PL plugin shared library
> without any cooperation from the connected client application. For
> interactive use of a PL debugger it might be sufficient to tell people to
> do "LOAD 'plpgsql_debugger'" before running their function-to-be-tested,
> but if you're trying to debug some behavior that only manifests in a large
> complicated application, it may be impractical to get the application to
> issue such a command.
A similar issue applies to plain SQL that's not touching any PL: In the
past, I encountered numerous situations where I'd have liked to take a
peek at the current application's queries (on MSSQL, this can be done
with SQL Profiler), but not have constant statement logging. IMHO it
would be a good idea if

- debugging could be turned on and off on-the-fly, e.g. to skip the app
startup phase. Thus I question the statement "GUC variable is sufficient"
- the mechnism would cover plain SQL too.

Regards,
Andreas


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: korryd(at)enterprisedb(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Plugins redux (was Re: [PATCHES] PL instrumentation
Date: 2006-08-09 18:17:38
Message-ID: 1155147458.12968.147.camel@dogma.v10.wvs
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Wed, 2006-08-09 at 12:44 -0400, Tom Lane wrote:
> A plugin such as a plpgsql debugger would do this in its _PG_init()
> function:
>
> static PLpgSQL_plugin my_plugin = { ... function addresses ... };
>
> PLpgSQL_plugin **var_ptr;
>
> var_ptr = (PLpgSQL_plugin **) find_rendezvous_variable("PLpgSQL_plugin");
> *var_ptr = &my_plugin;
>
> and this in its _PG_fini() function:
>
> PLpgSQL_plugin **var_ptr;
>
> var_ptr = (PLpgSQL_plugin **) find_rendezvous_variable("PLpgSQL_plugin");
> *var_ptr = NULL;
>
> Meanwhile, plpgsql itself would do this in its _PG_init() function:
>
> static PLpgSQL_plugin **plugin_ptr = NULL;
>
> plugin_ptr = (PLpgSQL_plugin **) find_rendezvous_variable("PLpgSQL_plugin");
>
> and in the places where it wants to pass control to the plugin, it'd do
>
> if (*plugin_ptr)
> ((*plugin_ptr)->function_field) (... args ...);
>

I know this is a trivial question, but is there some kind of lock that
would prevent the PG_fini for the plpgsql debugger from executing after
"if(*plugin_ptr)" and before "((*plugin_ptr)->function_field)(...)"?

Regards,
Jeff Davis


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: korryd(at)enterprisedb(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Plugins redux (was Re: [PATCHES] PL instrumentation plugin support)
Date: 2006-08-09 18:27:33
Message-ID: 28037.1155148053@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Jeff Davis <pgsql(at)j-davis(dot)com> writes:
> I know this is a trivial question, but is there some kind of lock that
> would prevent the PG_fini for the plpgsql debugger from executing after
> "if(*plugin_ptr)" and before "((*plugin_ptr)->function_field)(...)"?

Backends are not multi-threaded.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: remote query debugging was: Plugins redux
Date: 2006-08-09 20:04:49
Message-ID: 28808.1155153889@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Andreas Pflug <pgadmin(at)pse-consulting(dot)de> writes:
> Tom Lane wrote:
>> The other, probably more controversial bit of functionality is that there
>> needs to be a way to cause a backend to load a PL plugin shared library
>> without any cooperation from the connected client application.

> A similar issue applies to plain SQL that's not touching any PL: In the
> past, I encountered numerous situations where I'd have liked to take a
> peek at the current application's queries (on MSSQL, this can be done
> with SQL Profiler), but not have constant statement logging. IMHO it
> would be a good idea if

> - debugging could be turned on and off on-the-fly, e.g. to skip the app
> startup phase. Thus I question the statement "GUC variable is sufficient"
> - the mechnism would cover plain SQL too.

I'd turn that around: I think you are arguing for a way to change GUC
settings on-the-fly for a single existing session, without cooperation
from the client. You can do that now if you're desperate enough (change
postgresql.conf, SIGHUP just that backend, revert postgresql.conf
change) but I agree something easier would be nice. In any case the
problem needs a generic solution not something that only works for PL
plugin loading, so solving it within GUC seems right to me.

regards, tom lane


From: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: remote query debugging was: Plugins redux
Date: 2006-08-09 20:46:41
Message-ID: 44DA49B1.5030406@pse-consulting.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
>
> I'd turn that around: I think you are arguing for a way to change GUC
> settings on-the-fly for a single existing session, without cooperation
> from the client.

Ok, implemented that way would solve it (partially)
Something like pg_set_guc(pid int4, varname text, value text) would be
fine to set GUC on-the-fly. Could probably be signaled to the target
backend with SIGHUP, but how should the individual parameter be
transmitted, and eventually be retrieved? What about multiple parameters
to be set atomically?

A different aproach: A system table pg_guc, that holds current GUC
settings for each backend.
- on SIGHUP, the backend reload postgresql.conf as usual and writes guc
into pg_guc, unless a "config file override" flag is set.
- if pg_guc.config_override is set, guc are read from the table instead,
and the flag is reset.
- truncate pg_guc on postmaster start/restart

Regards,
Andreas

PS the non-solved part for me is still that log_statement logging would
still go to the standard log, in a less machine-readable way, mixed with
other backend's data and possibly truncated. But that's a different story.


From: "korryd(at)enterprisedb(dot)com" <korryd(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Plugins redux (was Re: [PATCHES] PL instrumentation plugin
Date: 2006-08-09 21:24:28
Message-ID: 1155158668.24313.64.camel@sakai.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

> ISTM there are two separate bits of functionality that the core backend
> needs to provide in support of PL plugins. The first is that we need a
> "rendezvous" facility whereby two separately-loaded shared libraries can
> connect and exchange data (a struct of function pointers for the immediate
> plans for plpgsql, but in general it might be anything). The exchanged
> data mustn't be limited to things known in advance to the core backend.
> The facility should support loading of communicating libraries in either
> order, and it should also allow for unloading of libraries. This seems
> easy enough to design: the core backend should basically provide a
> hashtable that maps names to "void *" pointer variables, and then two
> libraries that want to communicate just agree on a name and the data
> structure to be pointed to. (More details below.)

I like it so far...

> The other, probably more controversial bit of functionality is that there
> needs to be a way to cause a backend to load a PL plugin shared library
> without any cooperation from the connected client application. For
> interactive use of a PL debugger it might be sufficient to tell people to
> do "LOAD 'plpgsql_debugger'" before running their function-to-be-tested,
> but if you're trying to debug some behavior that only manifests in a large
> complicated application, it may be impractical to get the application to
> issue such a command.

Ok, but you should know that the PL/pgSQL debugger already handles this
without any cooperation from the client.

Loading the PL debugger means that your backend becomes debuggable, not
that it actually starts debugging.

If you want to debug a PL/pgSQL function invoked by some other process
(such as a Web server), you create a "global breakpoint". When a
debuggable process (that is, a process that has loaded the debugger
plugin) trips across the breakpoint, that's when it starts debugging.
(Note: a global breakpoint can identify a particular backend, by process
ID, or it can omit the process ID and trap the first debuggable backend
that trips across the breakpoint).

So, other than the (negligible) performance penalty, its safe to load
the debugger plugin into every backend. (If you've loaded the debugger,
the extra overhead is a single, lightweight-lock protected,
shared-memory hash lookup for each PL function invocation).

> We agreed while discussing this at the PG
> Anniversary Conference that providing a GUC variable for this purpose is
> probably sufficient, as there are multiple ways that such a variable can
> be set on behalf of an application: ALTER USER SET, ALTER DATABASE SET,
> "export PGOPTIONS=--var=val" before starting the app, etc. As submitted,
> the plugin patch envisions defining a custom GUC variable for each PL and
> having each PL add logic to force loading of any shared library mentioned
> in its specific variable. I claim however that that is just duplicative,
> and that a single backend-wide GUC variable should be sufficient.

Ok, so the rendevous variables replace the per-language GUC variables
(from my patch). In effect, a GUC variable is something that a user
would manage, a rendevous variable is something that two (or more)
pieces of code manage.

GUC variables are visible to the user, rendevous variables are visible
to dynamically loaded libraries.

Oh, and rendevous variables are insensitve to load order too.

Nice design.

> The principal difference between this proposal and the original patch
> is that this way supports unloading of PL plugins; AFAICS, the original
> patch would just crash if the plugin were unloaded and then use of plpgsql
> continued.

I don't think it could crash because there's no way to unload a plugin
(there is no UNLOAD statement, is there?).

Which reminds me, you haven't proposed a way to unload a shared-library.

> We need to think about security issues in defining such a variable:
> we don't want unprivileged users to be able to load arbitrary code
> into the server. One solution to this is to make the variable SUSET,
> ie, you must be superuser to set it. This would make it impossible
> for people to debug their own functions without superuser privileges
> ... but IIRC actual use of the proposed plpgsql debugger requires
> superuser privileges too, so this isn't necessarily a fatal objection.

Correct (the debugger currently requires superuser privileges). We'll
probably want to relax that later.

> Still, it'd be nice if the PGOPTIONS way of setting the variable could
> be used without requiring that the client app itself run as superuser.
> Plan B is to restrict which shared libraries can be loaded using the
> variable --- there are different ways we could do that, but my suggestion
> would be to insist that the shared library come from "$libdir/plugins/".
> The DBA is then responsible for allowing only "safe" shared libraries
> to be installed in that directory, and therefore we can allow unprivileged
> users to set "backend_load_libraries". (We might as well also allow them
> to execute LOAD for these libraries, so that plugins could be switched
> intra-session without superuser privileges.)

How about a combination of plan A and plan B? Make
backend_load_libraries a USERSET variable, but you can't *add* libraries
outside of $libdir/plugins/ unless you are a superuser.

> BTW, is anyone up for renaming the existing "preload_libraries" variable
> to "postmaster_load_libraries"? This would be more symmetrical with
> "backend_load_libraries", and so perhaps clearer about which does what

Makes sense to me, of course that breaks existing postgresql.conf files.

Do you want me to do any of this coding?

--
Korry Douglas korryd(at)enterprisedb(dot)com
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "korryd(at)enterprisedb(dot)com" <korryd(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Plugins redux (was Re: [PATCHES] PL instrumentation plugin
Date: 2006-08-09 22:04:49
Message-ID: 29822.1155161089@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

"korryd(at)enterprisedb(dot)com" <korryd(at)enterprisedb(dot)com> writes:
>> The other, probably more controversial bit of functionality is that there
>> needs to be a way to cause a backend to load a PL plugin shared library
>> without any cooperation from the connected client application.

> Ok, but you should know that the PL/pgSQL debugger already handles this
> without any cooperation from the client.
> Loading the PL debugger means that your backend becomes debuggable, not
> that it actually starts debugging.

Right, the question is how do you load it, assuming that you do not want
to enable it for every single session in your database.

> So, other than the (negligible) performance penalty, its safe to load
> the debugger plugin into every backend. (If you've loaded the debugger,
> the extra overhead is a single, lightweight-lock protected,
> shared-memory hash lookup for each PL function invocation).

I'm not sure you'll be able to convince everyone that the penalty is so
negligible --- any high-rate access to shared memory is potentially very
expensive, see nearby threads for examples. Even if this is affordable
for the debugger, what of more-invasive plugins such as the performance
monitor? I think a credible general-purpose plugin design has to
address the problem of enabling plugins on-the-fly.

> I don't think it could crash because there's no way to unload a plugin
> (there is no UNLOAD statement, is there?).

What we actually have at the moment is that you can LOAD a library
again, which causes an unload of the prior version and then loading the
new. I suppose this feature was intended to speed library development by
letting you recompile and then update into your existing backend session.
Not sure how many people are using it, but it's there ...

> Which reminds me, you haven't proposed a way to unload a shared-library.

This depends on the semantics we want to assign to
backend_shared_libraries --- I could imagine defining it as "if you
remove an entry from the value then we'll unload that library".

> How about a combination of plan A and plan B? Make
> backend_load_libraries a USERSET variable, but you can't *add* libraries
> outside of $libdir/plugins/ unless you are a superuser.

Not sure how easy that is (ie, can we track which part of the list came
from where), but if doable it'd be OK with me. We might have to split
it into two list variables to make it work, and I'm not sure it's worth
the complication.

>> BTW, is anyone up for renaming the existing "preload_libraries" variable
>> to "postmaster_load_libraries"? This would be more symmetrical with
>> "backend_load_libraries", and so perhaps clearer about which does what

> Makes sense to me, of course that breaks existing postgresql.conf files.

We already broke them by removing the init-function name...

> Do you want me to do any of this coding?

Up to you --- I can do it if you don't want to.

regards, tom lane


From: "korryd(at)enterprisedb(dot)com" <korryd(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Plugins redux (was Re: [PATCHES] PL instrumentation
Date: 2006-08-09 22:43:06
Message-ID: 1155163386.24313.79.camel@sakai.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

> I'm not sure you'll be able to convince everyone that the penalty is so
> negligible --- any high-rate access to shared memory is potentially very
> expensive, see nearby threads for examples. Even if this is affordable
> for the debugger, what of more-invasive plugins such as the performance
> monitor? I think a credible general-purpose plugin design has to
> address the problem of enabling plugins on-the-fly.

I have no problem with your solution - I think its a very nice design.

>
> > I don't think it could crash because there's no way to unload a plugin
> > (there is no UNLOAD statement, is there?).
>
> What we actually have at the moment is that you can LOAD a library
> again, which causes an unload of the prior version and then loading the
> new. I suppose this feature was intended to speed library development by
> letting you recompile and then update into your existing backend session.
> Not sure how many people are using it, but it's there ...

Right, but you still end up with a plugin loaded afterwards so no crash
(of course you could do something stupid like load a new plugin with the
same name that isn't really a plugin).

> > Which reminds me, you haven't proposed a way to unload a shared-library.
>
> This depends on the semantics we want to assign to
> backend_shared_libraries --- I could imagine defining it as "if you
> remove an entry from the value then we'll unload that library".

That's what I was thinking too.

> > How about a combination of plan A and plan B? Make
> > backend_load_libraries a USERSET variable, but you can't *add* libraries
> > outside of $libdir/plugins/ unless you are a superuser.
>
> Not sure how easy that is (ie, can we track which part of the list came
> from where), but if doable it'd be OK with me. We might have to split
> it into two list variables to make it work, and I'm not sure it's worth
> the complication.

The GUC assign hook would parse through backend_load_libraries...

for each library,
{
if ( the library is already loaded )
{
// it's not a new library, doesn't matter where it lives,
doesn't matter if we're a superuser

load( library)
}
else
{
// it's a new entry in backed_load_libraries

if( library lives in $libdir/plugins )
load( library )
else
{
if( is_superuser())
load( library )
else
throw an error
}
}
}

> We already broke them by removing the init-function name...

Right...

> > Do you want me to do any of this coding?
>
> Up to you --- I can do it if you don't want to.

I'll take a stab at it... thanks for your help so far.

-- Korry

--
Korry Douglas korryd(at)enterprisedb(dot)com
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "korryd(at)enterprisedb(dot)com" <korryd(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Plugins redux (was Re: [PATCHES] PL instrumentation
Date: 2006-08-09 23:05:46
Message-ID: 1903.1155164746@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

"korryd(at)enterprisedb(dot)com" <korryd(at)enterprisedb(dot)com> writes:
>>> (there is no UNLOAD statement, is there?).
>>
>> What we actually have at the moment is that you can LOAD a library
>> again, which causes an unload of the prior version and then loading the
>> new.

> Right, but you still end up with a plugin loaded afterwards so no crash
> (of course you could do something stupid like load a new plugin with the
> same name that isn't really a plugin).

That is only true given careful design and implementation of the
hooks. Right now for instance I think it's possible to crash the
backend by doing "LOAD 'plpgsql'" multiple times, because it hooks
into CallXactCallbacks and doesn't unhook. (Now that we have PG_fini
it should be possible to fix that...) Doesn't seem to crash
on the HPUX machine I just tried it on, but maybe HPUX is weird and
doesn't actually remove the old library from the address space?

Anyway I disagree with your proposal to let unprivileged users re-LOAD
random libraries. If they've not been modified to have clean unload
semantics this isn't safe.

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "korryd(at)enterprisedb(dot)com" <korryd(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Plugins redux (was Re: [PATCHES] PL instrumentation
Date: 2006-08-10 06:45:23
Message-ID: 1155192324.2368.116.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Wed, 2006-08-09 at 18:04 -0400, Tom Lane wrote:
> >> BTW, is anyone up for renaming the existing "preload_libraries"
> variable
> >> to "postmaster_load_libraries"? This would be more symmetrical
> with
> >> "backend_load_libraries", and so perhaps clearer about which does
> what
>
> > Makes sense to me, of course that breaks existing postgresql.conf
> files.
>
> We already broke them by removing the init-function name...

How about:

shared_load_libraries (rather than postmaster_load_libraries)
local_load_libraries (rather than backend_load_libraries)

which is more informative for people that don't know/care what a
postmaster or a backend is and what implications those phrases carry.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com


From: "korryd(at)enterprisedb(dot)com" <korryd(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Plugins redux (was Re: [PATCHES] PL instrumentation plugin
Date: 2006-08-10 15:55:19
Message-ID: 1155225319.24313.95.camel@sakai.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

> As for forcing the library load to occur, I propose a new GUC variable
> "backend_load_libraries" that is much like the postmaster's
> preload_libraries, except that the requested library loads happen
> at backend start time instead of in the postmaster. Then we need
> write and document the code only once, and there are other possible
> uses for it besides PL plugins.

Any thoughts about where to put the call to process_backend_libraries()
(the new function to handle backend_load_libraries)?

I'm thinking that it should go in PostgresMain(), just after (before?)
the call to BeginReportingGUCOptions() - by that time, we know whether
we are a superuser and we have processed all GUC options.

Also, should we create an on_proc_exit() handler that would unload all
dynamic libraries (specifically to call the _PG_fini() functions)?

-- Korry

--
Korry Douglas korryd(at)enterprisedb(dot)com
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "korryd(at)enterprisedb(dot)com" <korryd(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Plugins redux (was Re: [PATCHES] PL instrumentation plugin
Date: 2006-08-10 16:05:06
Message-ID: 16230.1155225906@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

"korryd(at)enterprisedb(dot)com" <korryd(at)enterprisedb(dot)com> writes:
> Any thoughts about where to put the call to process_backend_libraries()
> (the new function to handle backend_load_libraries)?
> I'm thinking that it should go in PostgresMain(), just after (before?)
> the call to BeginReportingGUCOptions() - by that time, we know whether
> we are a superuser and we have processed all GUC options.

Sounds fine.

> Also, should we create an on_proc_exit() handler that would unload all
> dynamic libraries (specifically to call the _PG_fini() functions)?

Yeah, I thought about that too, but I'm inclined not to do it;
it seems like just excess cycles. The process is quitting anyway,
so the only reason this would be useful is if the library thinks it's
going to update external or shared state during _PG_fini ... and on
the whole that sounds like a bad idea. Besides, if a library really
needs this it can add its own on_proc_exit handler.

regards, tom lane


From: "korryd(at)enterprisedb(dot)com" <korryd(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Plugins redux (was Re: [PATCHES] PL instrumentation
Date: 2006-08-10 23:03:10
Message-ID: 1155250990.24313.114.camel@sakai.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

> > Also, should we create an on_proc_exit() handler that would unload all
> > dynamic libraries (specifically to call the _PG_fini() functions)?
>
> Yeah, I thought about that too, but I'm inclined not to do it;
> it seems like just excess cycles. The process is quitting anyway,
> so the only reason this would be useful is if the library thinks it's
> going to update external or shared state during _PG_fini ... and on
> the whole that sounds like a bad idea. Besides, if a library really
> needs this it can add its own on_proc_exit handler.

It seems a little dangerous for a dynamic library to register an
on_proc_exit() handler. If we ever add support for unloading a dynamic
library, we'll have to add an unregister_on_proc_exit() too. Otherwise,
a dynamic library might register a function pointer with on_proc_exit()
and then leave a dangling pointer when it gets unloaded.

Given that, I assume you don't feel the need to unload old shared
libraries if the user (a superuser) removes an entry from
backend_load_libraries, right?

In fact, it looks _PG_fini() is only called if you *reload* a library,
unless I'm missing something somwhere.

--
Korry Douglas korryd(at)enterprisedb(dot)com
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "korryd(at)enterprisedb(dot)com" <korryd(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Plugins redux (was Re: [PATCHES] PL instrumentation
Date: 2006-08-11 00:17:00
Message-ID: 3022.1155255420@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

"korryd(at)enterprisedb(dot)com" <korryd(at)enterprisedb(dot)com> writes:
> It seems a little dangerous for a dynamic library to register an
> on_proc_exit() handler. If we ever add support for unloading a dynamic
> library, we'll have to add an unregister_on_proc_exit() too.

True, but there might be other uses for such a thing anyway. I can't
see any objection to adding that routine.

> In fact, it looks _PG_fini() is only called if you *reload* a library,
> unless I'm missing something somwhere.

Sure, because we don't currently have a routine for unloading a library
without immediately reloading it.

regards, tom lane