Re: [PATCH] Magic block for modules

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: pgsql-patches(at)postgresql(dot)org
Subject: [PATCH] Magic block for modules
Date: 2006-05-07 21:17:05
Message-ID: 20060507211705.GB3808@svana.org
Lists: pgsql-hackers pgsql-patches

This implements a proposal made last November:

http://archives.postgresql.org/pgsql-hackers/2005-11/msg00578.php

Basically, it tries to catch people loading modules which belong to the
wrong version, have had certain constants changed, or have architecture
mismatches. It's a bit more fine-grained than that, though; it currently
catches changes in any of the following:

PG_VERSION_NUM
CATALOG_VERSION_NO
the size of 8 basic C types
BLCKSZ
NAMEDATALEN
HAVE_INT64_TIMESTAMP
INDEX_MAX_KEYS
FUNC_MAX_ARGS
VARHDRSZ
MAXDIM
The compiler used (only brand, not version)

It may be overkill, but better safe than sorry. The only one I'm
ambivalent about is the first one. We don't require a recompile between
minor version changes, or do we?

All it requires is to include the header "pgmagic.h" and to put
somewhere in their source:

PG_MODULE_MAGIC

Currently, modules without a magic block are merely logged at LOG
level. This needs some discussion though.
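
As a rough illustration of the idea (not the code in magic2.diff), a magic
block can be sketched as a constant struct that the macro plants in the
module behind a well-known function name. Every name and constant below
(DemoMagicBlock, DEMO_MODULE_MAGIC, demo_magic_func, the DEMO_* values) is
hypothetical and only stands in for whatever pgmagic.h actually defines.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical build-time constants; a real build would take these
 * from the server's configuration headers. */
#define DEMO_VERSION_NUM   80200
#define DEMO_BLCKSZ        8192
#define DEMO_NAMEDATALEN   64
#define DEMO_FUNC_MAX_ARGS 100

/* Hypothetical layout: the compile-time settings the module was built
 * with, frozen into the shared library. */
typedef struct DemoMagicBlock
{
    int len;            /* sizeof(DemoMagicBlock), guards layout changes */
    int version_num;
    int blcksz;
    int namedatalen;
    int func_max_args;
} DemoMagicBlock;

/* Expands to a function with a fixed name that a loader could look up
 * by symbol and call to obtain the block. */
#define DEMO_MODULE_MAGIC \
const DemoMagicBlock *demo_magic_func(void) \
{ \
    static const DemoMagicBlock block = { \
        (int) sizeof(DemoMagicBlock), \
        DEMO_VERSION_NUM, \
        DEMO_BLCKSZ, \
        DEMO_NAMEDATALEN, \
        DEMO_FUNC_MAX_ARGS \
    }; \
    return &block; \
}

/* The one line a module author would write. */
DEMO_MODULE_MAGIC

/* Sanity check a loader might apply before reading any other field. */
int demo_magic_is_well_formed(const DemoMagicBlock *block)
{
    assert(block != NULL);
    return block->len == (int) sizeof(DemoMagicBlock);
}
```

Storing the struct's own size first means a loader can reject a block with
a different layout before trusting any of the later fields.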

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Attachment: magic2.diff (text/plain, 10.1 KB)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: [PATCH] Magic block for modules
Date: 2006-05-08 00:21:43
Message-ID: 15948.1147047703@sss.pgh.pa.us

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> This implements a proposal made last November:
> http://archives.postgresql.org/pgsql-hackers/2005-11/msg00578.php

Ah, good, I'd been meaning to do this.

> changes in any of the following:

> PG_VERSION_NUM
> CATALOG_VERSION_NO
> the size of 8 basic C types
> BLCKSZ
> NAMEDATALEN
> HAVE_INT64_TIMESTAMP
> INDEX_MAX_KEYS
> FUNC_MAX_ARGS
> VARHDRSZ
> MAXDIM
> The compiler used (only brand, not version)

That seems way overkill to me. FUNC_MAX_ARGS is good to check, but
most of those other things are noncritical for typical add-on modules.
In particular I strongly object to the check on compiler. Some of us do
use systems where gcc and vendor compilers are supposed to interoperate
... and aren't all those Windows compilers supposed to, too? AFAIK
it's considered the linker's job to prevent loading 32-bit code into
a 64-bit executable or vice versa, so I don't think we need to be
checking for common assumptions about sizeof(long).

> Currently, modules without a magic block are merely logged at LOG
> level. This needs some discussion though.

I'm pretty sure we had agreed that magic blocks should be required;
otherwise this check will accomplish little.
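
To make the difference between "logged" and "required" concrete, here is a
minimal sketch of a loader-side check in which a missing block is a distinct
failure rather than something to note and ignore. The struct, the enum and
demo_check_module are hypothetical names, not the server's real loader code,
and the block is reduced to three fields for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced magic block (all-int, so no padding surprises). */
typedef struct DemoMagicBlock
{
    int len;
    int major_version;
    int func_max_args;
} DemoMagicBlock;

typedef enum DemoLoadResult
{
    DEMO_LOAD_OK,
    DEMO_LOAD_MISSING_MAGIC,   /* library exports no magic block at all */
    DEMO_LOAD_MISMATCH         /* block present but built differently */
} DemoLoadResult;

/* 'module' is NULL when the symbol lookup for the magic function failed. */
DemoLoadResult demo_check_module(const DemoMagicBlock *server,
                                 const DemoMagicBlock *module)
{
    assert(server != NULL);

    if (module == NULL)
        return DEMO_LOAD_MISSING_MAGIC;

    /* Compare lengths first so a shorter block is never over-read. */
    if (module->len != server->len ||
        memcmp(module, server, sizeof(*server)) != 0)
        return DEMO_LOAD_MISMATCH;

    return DEMO_LOAD_OK;
}
```

Whether DEMO_LOAD_MISSING_MAGIC is reported at LOG, WARNING or ERROR is
exactly the policy question under discussion; the sketch only shows that the
loader can tell the two failure cases apart.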

regards, tom lane


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: [PATCH] Magic block for modules
Date: 2006-05-08 14:13:45
Message-ID: 20060508141345.GA19351@svana.org

On Sun, May 07, 2006 at 08:21:43PM -0400, Tom Lane wrote:
> > changes in any of the following:
>
> > PG_VERSION_NUM
> > CATALOG_VERSION_NO
> > the size of 8 basic C types
> > BLCKSZ
> > NAMEDATALEN
> > HAVE_INT64_TIMESTAMP
> > INDEX_MAX_KEYS
> > FUNC_MAX_ARGS
> > VARHDRSZ
> > MAXDIM
> > The compiler used (only brand, not version)
>
> That seems way overkill to me. FUNC_MAX_ARGS is good to check, but
> most of those other things are noncritical for typical add-on modules.

I was trying to find variables that when changed would make some things
corrupt. For example, a changed NAMEDATALEN will make any use of the
syscache a source of errors. A change in INDEX_MAX_KEYS will break the
GiST interface, etc. I wondered about letting module writers select
which parts are relevant to them, but that just seems like handing
people a footgun.
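
The NAMEDATALEN case can be illustrated with a small sketch. The two row
structs below are hypothetical stand-ins for a catalog row compiled with
different name lengths (64 and 32 are example values only); the point is
that the field after the name moves, so a module built with one setting
reads the wrong bytes from memory laid out with the other.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical row layout as the server built it (64-byte names). */
typedef struct DemoRow64
{
    char name[64];
    int  owner;
} DemoRow64;

/* The same row as a module built with 32-byte names believes it to be. */
typedef struct DemoRow32
{
    char name[32];
    int  owner;
} DemoRow32;

/* Reads 'owner' at the offset the 32-byte build expects, from a row the
 * server laid out with 64-byte names. The bytes fetched are actually the
 * middle of the name field. */
int demo_misread_owner(const DemoRow64 *server_row)
{
    int value;

    assert(server_row != NULL);
    memcpy(&value,
           (const char *) server_row + offsetof(DemoRow32, owner),
           sizeof(value));
    return value;
}
```

With a zero-padded name shorter than 32 bytes, the mismatched read returns 0
no matter what the real owner is, which is the kind of silent corruption the
check is meant to catch.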

> In particular I strongly object to the check on compiler. Some of us do
> use systems where gcc and vendor compilers are supposed to interoperate
> ... and aren't all those Windows compilers supposed to, too? AFAIK

Maybe that's the case now, but it didn't use to be. I seem to remember
people having difficulties because they compiled the server with MinGW
and the modules with VC++. I'll take it out though, it's not like it
costs anything.

> it's considered the linker's job to prevent loading 32-bit code into
> a 64-bit executable or vice versa, so I don't think we need to be
> checking for common assumptions about sizeof(long).

I know ELF headers contain some of this info, and Unix in general
doesn't try to allow different bit sizes in one binary. Windows used to
have (and maybe still has) a mechanism to allow 32-bit code to call
16-bit libraries. Do they allow the same for 64-bit libs?

> I'm pretty sure we had agreed that magic blocks should be required;
> otherwise this check will accomplish little.

Sure, I just didn't want to break every module in one weekend. I was
thinking of adding it with LOG level now, send a message on -announce
saying that at the beginning of the 8.2 freeze it will be an ERROR.
Give people time to react.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: [PATCH] Magic block for modules
Date: 2006-05-08 14:32:47
Message-ID: 26671.1147098767@sss.pgh.pa.us

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> On Sun, May 07, 2006 at 08:21:43PM -0400, Tom Lane wrote:
>> That seems way overkill to me. FUNC_MAX_ARGS is good to check, but
>> most of those other things are noncritical for typical add-on modules.

> I was trying to find variables that when changed would make some things
> corrupt. For example, a changed NAMEDATALEN will make any use of the
> syscache a source of errors. A change in INDEX_MAX_KEYS will break the
> GiST interface, etc.

By that rationale you'd have to record just about every #define in the
system headers. And it still wouldn't be bulletproof --- what of
custom-modified code with, say, extra fields inserted into some widely
used struct?

But you're missing the larger point, which is that in many cases this
would be breaking stuff without any need at all. The majority of
catversion bumps, for instance, are for things that don't affect the
typical add-on module. So checking for identical catversion won't
accomplish much except to force additional recompile churn on people
doing development against CVS HEAD. The original proposal was just
to check for major PG version match. I can see checking FUNC_MAX_ARGS
too, because that has a very direct impact on the ABI that every
external function sees, but I think the cost/benefit ratio rises pretty
darn steeply after that.

Another problem with an expansive list of stuff-to-check is where does
the add-on module find it out from? AFAICS your proposal would make for
a large laundry list of random headers that every add-on would now have
to #include. If it's not defined by postgres.h or fmgr.h (which are two
things that every backend addon is surely including already) then I'm
dubious about using it in the magic block.

> Sure, I just didn't want to break every module in one weekend. I was
> thinking of adding it with LOG level now, send a message on -announce
> saying that at the beginning of the 8.2 freeze it will be an ERROR.
> Give people time to react.

I think that will just mean that we'll break every module at the start
of 8.2 freeze ;-). Unless we forget to change it to error, which IMHO
is way too likely.

regards, tom lane


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: [PATCH] Magic block for modules
Date: 2006-05-08 14:48:44
Message-ID: 20060508144844.GB19351@svana.org

On Mon, May 08, 2006 at 10:32:47AM -0400, Tom Lane wrote:
> Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> > I was trying to find variables that when changed would make some things
> > corrupt. For example, a changed NAMEDATALEN will make any use of the
> > syscache a source of errors. A change in INDEX_MAX_KEYS will break the
> > GiST interface, etc.
>
> By that rationale you'd have to record just about every #define in the
> system headers. And it still wouldn't be bulletproof --- what of
> custom-modified code with, say, extra fields inserted into some widely
> used struct?

I can see that. That's why I specifically aimed at the ones defined in
pg_config_manual.h, ie, the ones marked "twiddle me".

> ... So checking for identical catversion won't
> accomplish much except to force additional recompile churn on people
> doing development against CVS HEAD. The original proposal was just
> to check for major PG version match.

Ok, I've taken out CATVERSION and cut PG version to just the major
version. I've also dropped the compiler and several others.
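
Cutting the check down to the major version can be sketched as below,
assuming the version number encodes 8.2.3 as 80203, so that dropping the
last two digits leaves the major version. The function names are
hypothetical.

```c
#include <assert.h>

/* Assumes a version number of the form 80203 for 8.2.3: the last two
 * digits are the minor release, the rest identifies the major version. */
int demo_major_version(int version_num)
{
    assert(version_num >= 0);
    return version_num / 100;
}

/* A module passes if it was built against any minor release of the
 * same major version as the server. */
int demo_same_major(int server_version_num, int module_version_num)
{
    return demo_major_version(server_version_num) ==
           demo_major_version(module_version_num);
}
```

Under that assumption, a module built against 8.2.0 still loads into 8.2.3,
while one built against 8.1.4 does not.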

> Another problem with an expansive list of stuff-to-check is where does
> the add-on module find it out from?

All these symbols are defined by including c.h only, which is included
by postgres.h, so this is not an issue. I obviously didn't include any
symbols that a module would need to add special includes for. The only
outlier was CATVERSION but we're dropping that test.

> I think that will just mean that we'll break every module at the start
> of 8.2 freeze ;-). Unless we forget to change it to error, which IMHO
> is way too likely.

Ok, one week then. Not everyone follows -patches and will be mighty
confused when a CVS update suddenly breaks everything.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [PATCHES] Magic block for modules
Date: 2006-05-30 22:20:33
Message-ID: 4582.1149027633@sss.pgh.pa.us

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> On Sun, May 07, 2006 at 08:21:43PM -0400, Tom Lane wrote:
>> I'm pretty sure we had agreed that magic blocks should be required;
>> otherwise this check will accomplish little.

> Sure, I just didn't want to break every module in one weekend. I was
> thinking of adding it with LOG level now, send a message on -announce
> saying that at the beginning of the 8.2 freeze it will be an ERROR.
> Give people time to react.

Now that the magic-block patch is in, we need to revisit this bit of the
discussion. I'm for making lack of a magic block an ERROR immediately.
I don't see the point of waiting; in fact, if we wait till freeze we'll
just make the breakage more concentrated. At the very least it ought
to be a WARNING immediately, because a LOG message is just not visible
enough.

Comments?

regards, tom lane


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [PATCHES] Magic block for modules
Date: 2006-05-31 06:44:16
Message-ID: 20060531064416.GA23169@svana.org

On Tue, May 30, 2006 at 06:20:33PM -0400, Tom Lane wrote:
> Now that the magic-block patch is in, we need to revisit this bit of the
> discussion. I'm for making lack of a magic block an ERROR immediately.
> I don't see the point of waiting; in fact, if we wait till freeze we'll
> just make the breakage more concentrated. At the very least it ought
> to be a WARNING immediately, because a LOG message is just not visible
> enough.

If you like I can send a patch that adds it to all of contrib and some
of the other places required so that make check passes...

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


From: "Marko Kreen" <markokr(at)gmail(dot)com>
To: "Martijn van Oosterhout" <kleptog(at)svana(dot)org>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: [PATCH] Magic block for modules
Date: 2006-05-31 10:08:41
Message-ID: e51f66da0605310308j74942a2arbabc485c1129a37f@mail.gmail.com

On 5/8/06, Martijn van Oosterhout <kleptog(at)svana(dot)org> wrote:
> This implements a proposal made last November:
>
> http://archives.postgresql.org/pgsql-hackers/2005-11/msg00578.php

> All it requires is to include the header "pgmagic.h" and to put
> somewhere in their source:
>
> PG_MODULE_MAGIC

Could you serve this as special docstring instead? Eg:

PG_MODULE(foomodule)

is mandatory, there you can do your magic, and optional:

PG_MODULE_DESC("Do foo")
PG_MODULE_AUTHOR("FooMan <baz(at)foo>")

This provides more motivation for module authors and also creates a
(visually) smooth path to provide automatic install, uninstall and registration:

PG_MODULE_INSTALL(inst_sql)
PG_MODULE_UNINSTALL(uninst_sql)

create module foo from '$libdir/foo';
drop module foo;

This seems like a worthwhile direction to move in, especially
as it requires a pretty small amount of changes.

--
marko


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: [PATCH] Magic block for modules
Date: 2006-05-31 11:04:11
Message-ID: 20060531110411.GF23169@svana.org

On Wed, May 31, 2006 at 01:08:41PM +0300, Marko Kreen wrote:
> On 5/8/06, Martijn van Oosterhout <kleptog(at)svana(dot)org> wrote:
> >All it requires is to include the header "pgmagic.h" and to put
> >somewhere in their source:
> >
> >PG_MODULE_MAGIC
>
> Could you serve this as special docstring instead? Eg:
>
> PG_MODULE(foomodule)
>
> is mandatory, there you can do your magic, and optional:

<snip>

I like it, but I'm not sure there's enough consensus for that. I've
suggested before including install info inside the modules themselves
but there doesn't seem to be much interest in that.

Apart from that there's issues with implementation. The Linux kernel
can do it easily because it knows it will be using ELF, thus can use
sections to store this info. Postgresql has to support many more types,
making things like this tricky (but not impossible).

Personally I'd like postgres to move to a system where external modules
can easily be installed, uninstalled and upgraded. However, I've not
seen the demand yet.

Have a nice day
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] Magic block for modules
Date: 2006-05-31 14:15:35
Message-ID: 12007.1149084935@sss.pgh.pa.us

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> If you like I can send a patch that adds it to all of contrib and some
> of the other places required so that make check passes...

Think I got them all already:
http://archives.postgresql.org/pgsql-committers/2006-05/msg00384.php
but if you see any I missed...

regards, tom lane


From: "Marko Kreen" <markokr(at)gmail(dot)com>
To: "Martijn van Oosterhout" <kleptog(at)svana(dot)org>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: [PATCH] Magic block for modules
Date: 2006-05-31 14:57:47
Message-ID: e51f66da0605310757u65470f69k3b68bbfdb8ba11b1@mail.gmail.com

On 5/31/06, Martijn van Oosterhout <kleptog(at)svana(dot)org> wrote:
> On Wed, May 31, 2006 at 01:08:41PM +0300, Marko Kreen wrote:
> > On 5/8/06, Martijn van Oosterhout <kleptog(at)svana(dot)org> wrote:
> > >All it requires is to include the header "pgmagic.h" and to put
> > >somewhere in their source:
> > >
> > >PG_MODULE_MAGIC
> >
> > Could you serve this as special docstring instead? Eg:
> >
> > PG_MODULE(foomodule)
> >
> > is mandatory, there you can do your magic, and optional:
>
> <snip>
>
> I like it, but I'm not sure there's enough consensus for that. I've
> suggested before including install info inside the modules themselves
> but there doesn't seem to be much interest in that.

I am not suggesting to try to go all the way, just to make sure that
your current patch fits into that direction.

> Apart from that there's issues with implementation. The Linux kernel
> can do it easily because it knows it will be using ELF, thus can use
> sections to store this info. Postgresql has to support many more types,
> making things like this tricky (but not impossible).

PostgreSQL already requires symbol loading functionality
for V1 function signatures, so per-module symbols won't be
much burden.

> Personally I'd like postgres to move to a system where external modules
> can easily be installed, uninstalled and upgraded. However, I've not
> seen the demand yet.

Demand happens only when users get used to such niceties on some
other databases. Considering that PostgreSQL is, extensibility-wise,
the most advanced database and anything we offer is the world's best,
there won't be any demand for years to come.

I rather think we should create that demand. Tasks like

- see what modules are installed in database.
- install module
- remove module

are rather clunky in the current setup. Making them easier would be a good thing.

Of course, it's easy to tell others to do things. I'll try to hack on that area
myself too. If not earlier, then maybe at the Summit Code Sprint at least.

--
marko


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Marko Kreen" <markokr(at)gmail(dot)com>
Cc: "Martijn van Oosterhout" <kleptog(at)svana(dot)org>, pgsql-patches(at)postgresql(dot)org
Subject: Re: [PATCH] Magic block for modules
Date: 2006-05-31 15:14:27
Message-ID: 12544.1149088467@sss.pgh.pa.us

"Marko Kreen" <markokr(at)gmail(dot)com> writes:
>>> Could you serve this as special docstring instead? Eg:
>>> PG_MODULE(foomodule)

I have no objection to that, and see no real implementation problem with
it: we just add a "const char *" field to the magic block. The other
stuff seems too blue-sky, and I'm not even sure that it's the right
direction to proceed in. Marko seems to be envisioning a future where
an extension module is this binary blob with install/deinstall/etc code
all hardwired into it. I don't like that a bit. I think the current
scheme with separate SQL scripts is a *good* thing, because it makes it
a lot easier for users to tweak the SQL definitions, eg, install the
functions into a non-default schema. Also, I don't have a problem
imagining extension modules that contain no C code, just PL functions
--- so the SQL script needs to be considered the primary piece of the
module, not the shared library.

Is it worth adding a module name to the magic block, or should we just
leave well enough alone? It's certainly not something foreseen as part
of the purpose of that block. In the absence of some fairly concrete
ideas what to do with it, I'm probably going to vote keep-it-simple.
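
For reference, the "const char *" variant could look roughly like the sketch
below. It is purely illustrative: DemoNamedMagic, DEMO_MODULE,
demo_named_magic_func and demo_module_label are hypothetical names, the
version and argument-count values are placeholders, and nothing here implies
the field was or should be added.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical magic block extended with an optional module name. */
typedef struct DemoNamedMagic
{
    int         len;
    int         major_version;
    int         func_max_args;
    const char *module_name;    /* NULL when the author supplied none */
} DemoNamedMagic;

/* DEMO_MODULE(foomodule) stringifies its argument into the block. */
#define DEMO_MODULE(name) \
const DemoNamedMagic *demo_named_magic_func(void) \
{ \
    static const DemoNamedMagic block = { \
        (int) sizeof(DemoNamedMagic), 802, 100, #name \
    }; \
    return &block; \
}

DEMO_MODULE(foomodule)

/* What a listing tool might print for a loaded module. */
const char *demo_module_label(const DemoNamedMagic *block)
{
    assert(block != NULL);
    return block->module_name != NULL ? block->module_name : "(unnamed)";
}
```

The name costs one pointer per module, but as noted above it only earns its
keep if something actually displays or uses it.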

regards, tom lane


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Marko Kreen <markokr(at)gmail(dot)com>, pgsql-patches(at)postgresql(dot)org
Subject: Re: [PATCH] Magic block for modules
Date: 2006-05-31 17:24:59
Message-ID: 20060531172459.GG23169@svana.org

On Wed, May 31, 2006 at 11:14:27AM -0400, Tom Lane wrote:
> Is it worth adding a module name to the magic block, or should we just
> leave well enough alone? It's certainly not something foreseen as part
> of the purpose of that block. In the absence of some fairly concrete
> ideas what to do with it, I'm probably going to vote keep-it-simple.

I actually considered it while writing the patch but decided against
it, given the general tendency against putting extra info into the
modules...

Personally I think it's a good idea, except: where is this info going
to be displayed or used?

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


From: "Marko Kreen" <markokr(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Martijn van Oosterhout" <kleptog(at)svana(dot)org>, pgsql-patches(at)postgresql(dot)org
Subject: Re: [PATCH] Magic block for modules
Date: 2006-05-31 18:35:23
Message-ID: e51f66da0605311135y5e22fd57q7de87d49524406b@mail.gmail.com

On 5/31/06, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Marko Kreen" <markokr(at)gmail(dot)com> writes:
> >>> Could you serve this as special docstring instead? Eg:
> >>> PG_MODULE(foomodule)
>
> I have no objection to that, and see no real implementation problem with
> it: we just add a "const char *" field to the magic block. The other
> stuff seems too blue-sky, and I'm not even sure that it's the right
> direction to proceed in.

It was not blue-sky, it was handwaving :)

> Marko seems to be envisioning a future where
> an extension module is this binary blob with install/deinstall/etc code
> all hardwired into it. I don't like that a bit. I think the current
> scheme with separate SQL scripts is a *good* thing, because it makes it
> a lot easier for users to tweak the SQL definitions, eg, install the
> functions into a non-default schema. Also, I don't have a problem
> imagining extension modules that contain no C code, just PL functions
> --- so the SQL script needs to be considered the primary piece of the
> module, not the shared library.

I'll later post a list of ideas that we can hopefully agree on
and discuss them further.

> Is it worth adding a module name to the magic block, or should we just
> leave well enough alone? It's certainly not something foreseen as part
> of the purpose of that block. In the absence of some fairly concrete
> ideas what to do with it, I'm probably going to vote keep-it-simple.

Yes, if we want to keep separate SQL for modules then
putting stuff into .so is pointless.

--
marko


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Marko Kreen <markokr(at)gmail(dot)com>, pgsql-patches(at)postgresql(dot)org
Subject: Re: [PATCH] Magic block for modules
Date: 2006-05-31 21:11:50
Message-ID: 20060531211150.GJ23169@svana.org

On Wed, May 31, 2006 at 11:14:27AM -0400, Tom Lane wrote:
<snip>
> ... The other
> stuff seems too blue-sky, and I'm not even sure that it's the right
> direction to proceed in. Marko seems to be envisioning a future where
> an extension module is this binary blob with install/deinstall/etc code
> all hardwired into it. I don't like that a bit. I think the current
> scheme with separate SQL scripts is a *good* thing, because it makes it
> a lot easier for users to tweak the SQL definitions, eg, install the
> functions into a non-default schema. Also, I don't have a problem
> imagining extension modules that contain no C code, just PL functions
> --- so the SQL script needs to be considered the primary piece of the
> module, not the shared library.

While you do have a good point about non-binary modules, our module
handling needs some help IMHO. For example, the current hack for CREATE
LANGUAGE to fix things caused by old pg_dumps. I think that's the
totally wrong approach long term, I think the pg_dump shouldn't be
including the CREATE LANGUAGE statement at all, but should be saying
something like "INSTALL plpgsql" and pg_restore works out what is
needed for that module.

The above requires getting a few bits straight:

1. When given the name of an external module, you need to be able to
find the SQL commands needed to make it work.

2. You need to be able to tell if something is installed already or
not.

3. You need to be able to uninstall it again. Why do we rely on
hand-written uninstall scripts when we have a perfectly functional
dependency mechanism that can adequately track what was added and remove
it again on demand?

With these in place, upgrades across versions of postgres could become
a lot easier. People using tsearch2 now would get only "INSTALL
tsearch2" in their dumps and when they upgrade to 8.2 they get the new
definitions for tsearch using GIN. No old definitions to confuse people
or the database. (Note: I'm not sure if tsearch would be compatible at
the query level, but that's not relevant to the point I'm making).

We could get straight into discussions of mechanism, but it would be
nice to know if people think the above is a worthwhile idea.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Marko Kreen <markokr(at)gmail(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Generalized concept of modules
Date: 2006-05-31 21:33:44
Message-ID: 22380.1149111224@sss.pgh.pa.us

[ moving this thread to -hackers ]

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> While you do have a good point about non-binary modules, our module
> handling needs some help IMHO. For example, the current hack for CREATE
> LANGUAGE to fix things caused by old pg_dumps. I think that's the
> totally wrong approach long term, I think the pg_dump shouldn't be
> including the CREATE LANGUAGE statement at all, but should be saying
> something like "INSTALL plpgsql" and pg_restore works out what is
> needed for that module.

There's a lot to be said for this, but I keep having the nagging
feeling that people are equating "module" with "shared library", which
seems far from sufficiently general. I'd like to see "module" mean
"an arbitrary collection of SQL objects". So I think the raw definition
sought by your "INSTALL" would always be a SQL script, and any shared
libs that might come along with that are secondary. The idea of using
pg_depend to manage UNINSTALL is an excellent one.

> 1. When given the name of an external module, you need to be able to
> find the SQL commands needed to make it work.

No problem, the name is the name of a SQL script file stored in a specific
installation directory.

> 2. You need to be able to tell if something is installed already or
> not.

pg_module system catalog. You'd need this anyway since there has to be
some representation of the "module object" in the catalogs for its
component objects to have pg_depend dependencies on.

> With these in place, upgrades across versions of postgres could become
> a lot easier. People using tsearch2 now would get only "INSTALL
> tsearch2" in their dumps and when they upgrade to 8.2 they get the new
> definitions for tsearch using GIN. No old definitions to confuse people
> or the database. (Note: I'm not sure if tsearch would be compatible at
> the query level, but that's not relevant to the point I'm making).

Let's see, I guess pg_dump would have to be taught to ignore any objects
that it can see are directly dependent on a module object. What about
indirect dependencies though? The exact semantics don't seem clear to me.

Also, this seems to be getting into territory that Oracle has already
trod --- someone should study exactly what they do for PL/SQL modules
and whether we want to be compatible or not. Perhaps there's even
something in SQL2003 about it?

regards, tom lane


From: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
To: pgsql-patches(at)postgresql(dot)org, Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Marko Kreen <markokr(at)gmail(dot)com>
Subject: Re: [PATCH] Magic block for modules
Date: 2006-06-01 11:46:56
Message-ID: 200606010746.56867.xzilla@users.sourceforge.net

On Wednesday 31 May 2006 13:24, Martijn van Oosterhout wrote:
> On Wed, May 31, 2006 at 11:14:27AM -0400, Tom Lane wrote:
> > Is it worth adding a module name to the magic block, or should we just
> > leave well enough alone? It's certainly not something foreseen as part
> > of the purpose of that block. In the absence of some fairly concrete
> > ideas what to do with it, I'm probably going to vote keep-it-simple.
>
> I actually considered it while writing the patch but decided against
> it, given the general tendency against putting extra info into the
> modules...
>
> Personally I think it's a good idea, except: where is this info going
> to be displayed or used?
>

Marko's suggestion on producing a list of installed modules comes to mind, and
I suspect tools like pgadmin or ppa will want to be able to show this
information.

--
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Marko Kreen <markokr(at)gmail(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Generalized concept of modules
Date: 2006-06-01 20:45:39
Message-ID: 20060601204539.GE12689@svana.org

On Wed, May 31, 2006 at 05:33:44PM -0400, Tom Lane wrote:
> Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> > While you do have a good point about non-binary modules, our module
> > handling needs some help IMHO. For example, the current hack for CREATE
> > LANGUAGE to fix things caused by old pg_dumps. I think that's the
> > totally wrong approach long term, I think the pg_dump shouldn't be
> > including the CREATE LANGUAGE statement at all, but should be saying
> > something like "INSTALL plpgsql" and pg_restore works out what is
> > needed for that module.
>
> There's a lot to be said for this, but I keep having the nagging
> feeling that people are equating "module" with "shared library", which
> seems far from sufficiently general. I'd like to see "module" mean
> "an arbitrary collection of SQL objects".

I agree that module is often used interchangeably with shared library.
We need to handle the other case too. It would be a lot easier if we
had an example of an SQL only module, since contrib doesn't appear to
have one (at first glance anyway).

> So I think the raw definition
> sought by your "INSTALL" would always be a SQL script, and any shared
> libs that might come along with that are secondary. The idea of using
> pg_depend to manage UNINSTALL is an excellent one.

Well, in that case I'd like to give some concrete suggestions:

1. The $libdir in future may be used to find SQL scripts as well as
shared libraries. They'll have different extensions so no possibility
of conflict.

2. Create something like "BEGIN MODULE xxx" which starts a transaction
and marks any objects created within it as owned by module "xxx". I
think it should be tied to a transaction level to avoid half installed
things, but maybe people would prefer it to work more like schemas.

> pg_module system catalog. You'd need this anyway since there has to be
> some representation of the "module object" in the catalogs for its
> component objects to have pg_depend dependencies on.

Ack. "Owned by" in the above sense means that the object depends on the
module. You could do it the other way round (module depends on object)
but that makes it harder to change things manually. DROP MODULE would
work easier too.

> Let's see, I guess pg_dump would have to be taught to ignore any objects
> that it can see are directly dependent on a module object. What about
> indirect dependencies though? The exact semantics don't seem clear to me.

At a base level, you could definitely drop the functions. Dropping types
is harder because columns might be using them. Normally we use CASCADE
to specify that.
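
A minimal sketch of the "object depends on module" idea and of how a DROP
MODULE might honour CASCADE. It is only an illustration: the drop_module
function, the "module:foo" naming and the dictionary standing in for
pg_depend are all assumptions, not anything taken from the patch.

```python
def drop_module(deps, module, cascade=False):
    """Return the set of objects a hypothetical DROP MODULE would remove.

    deps maps each object name to the set of things it depends on; it is a
    stand-in for pg_depend. Objects that depend directly on the module are
    always dropped. Anything else that depends on those objects (for
    example a column using a module type) is only dropped with cascade.
    """
    owned = {obj for obj, targets in deps.items() if module in targets}
    to_drop = set(owned)
    frontier = set(owned)
    while frontier:
        # Objects outside the drop set that depend on something just dropped.
        dependents = {
            obj for obj, targets in deps.items() if targets & frontier
        } - to_drop
        if dependents and not cascade:
            raise RuntimeError(
                "other objects depend on the module: " + ", ".join(sorted(dependents))
            )
        to_drop |= dependents
        frontier = dependents
    return to_drop


if __name__ == "__main__":
    deps = {
        "foo_type": {"module:foo"},
        "foo_fn": {"module:foo"},
        "accounts.col": {"foo_type"},  # a user column using the module's type
    }
    print(sorted(drop_module(deps, "module:foo", cascade=True)))
```

Without cascade=True the same call raises, which mirrors the point that
dropping types is harder than dropping functions.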

> Also, this seems to be getting into territory that Oracle has already
> trod --- someone should study exactly what they do for PL/SQL modules
> and whether we want to be compatible or not. Perhaps there's even
> something in SQL2003 about it?

Probably a good idea...

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Marko Kreen <markokr(at)gmail(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Generalized concept of modules
Date: 2006-06-01 21:21:03
Message-ID: 21517.1149196863@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> Well, in that case I'd like to give some concrete suggestions:

> 1. The $libdir in future may be used to find SQL scripts as well as
> shared libraries. They'll have different extensions so no possibility
> of conflict.

No, it needs to be a separate directory, and the reason is that SQL
scripts are architecture-independent and belong under the share/
directory not the lib/ directory. This is a minor point, but the
packagers will scream at us if we get it wrong.

> 2. Create something like "BEGIN MODULE xxx" which starts a transaction
> and marks any objects created within it as owned by module "xxx". I
> think it should be tied to a transaction level to avoid half installed
> things, but maybe people would prefer it to work more like schemas.

I think I'd sooner keep it decoupled from transactions, which means you
need both a BEGIN MODULE xxx and an END MODULE xxx. Also, a module
developer might want to go back and add more stuff to an existing
module. So there should be separate commands:
CREATE MODULE xxx;
BEGIN MODULE xxx;
... anything created here belongs to the module
END MODULE xxx;
An alternative possibility is to make it work like schemas: you set
a GUC variable to indicate that created objects should be associated
with that module. This might be the best choice since it avoids having
two very different ways of doing very similar things.
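
To make the two options concrete, here is a rough sketch of the bookkeeping
involved. The ModuleSession class and its method names are hypothetical;
current_module stands in for the GUC variable, and begin_module/end_module
are just one way of setting and clearing it.

```python
class ModuleSession:
    """Toy model of a session that tags new objects with a module."""

    def __init__(self):
        self.modules = set()        # modules created with CREATE MODULE
        self.current_module = None  # stand-in for the GUC variable
        self.owner = {}             # object name -> owning module or None

    def create_module(self, name):
        if name in self.modules:
            raise ValueError(f"module {name} already exists")
        self.modules.add(name)

    def begin_module(self, name):
        # Decoupled from transactions: this only changes session state.
        if name not in self.modules:
            raise LookupError(f"module {name} does not exist")
        self.current_module = name

    def end_module(self, name):
        if self.current_module != name:
            raise RuntimeError(f"module {name} is not the open module")
        self.current_module = None

    def create_object(self, name):
        # Anything created while a module is open belongs to that module.
        self.owner[name] = self.current_module


if __name__ == "__main__":
    s = ModuleSession()
    s.create_module("xxx")
    s.begin_module("xxx")
    s.create_object("xxx_func")
    s.end_module("xxx")
    s.create_object("user_table")
    print(s.owner)
```

Because CREATE MODULE is separate from BEGIN MODULE, a developer can reopen
an existing module later and add more objects to it.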

(Come to think of it, what is the relation between modules and schemas
anyway? Should a module's objects be required to be in schemas also
owned by the module? If not, what happens when you try to INSTALL a
module whose objects lived in a schema you don't have? This gets back
to the fact that we don't have a very nice answer for installing
existing contrib modules into user-chosen schemas.)

>> Let's see, I guess pg_dump would have to be taught to ignore any objects
>> that it can see are directly dependent on a module object. What about
>> indirect dependencies though? The exact semantics don't seem clear to me.

> At a base level, you could definitely drop the functions. Dropping types
> is harder because columns might be using them. Normally we use CASCADE
> to specify that.

I think we're talking at cross purposes. It sounds like you're thinking
about "how do I remove one object in a module?" AFAICS you just drop it.
What I was wondering was what is pg_dump's behavior. ISTM we want two
modes:

* normal behavior: dump module objects as "INSTALL foo" commands. Do
not dump any objects that are owned by a module; assume they will be
created by the INSTALL. Use the dependency chains to make sure that
INSTALL commands are ordered properly relative to everything else.

* "dump module foo": dump the module object as a CREATE MODULE command,
and then dump creation commands for all the objects that are owned by
it. Ignore all else. This is an easy way to generate an updated module
definition script.
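
The two modes could be sketched roughly as follows. This is an illustration
only: DbObject, dump_normal and dump_module are hypothetical names, the
emitted CREATE lines are placeholders rather than real DDL, and the sketch
simply emits all INSTALL commands first instead of ordering them through
the dependency chains.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DbObject:
    name: str
    kind: str                     # e.g. "function", "type", "table"
    module: Optional[str] = None  # owning module, None if standalone


def dump_normal(objects: List[DbObject], modules: List[str]) -> List[str]:
    """Normal mode: one INSTALL per module, skip module-owned objects."""
    lines = [f"INSTALL {m};" for m in modules]
    lines += [
        f"CREATE {o.kind.upper()} {o.name};" for o in objects if o.module is None
    ]
    return lines


def dump_module(objects: List[DbObject], module: str) -> List[str]:
    """'dump module foo' mode: the module itself plus only what it owns."""
    lines = [f"CREATE MODULE {module};"]
    lines += [
        f"CREATE {o.kind.upper()} {o.name};" for o in objects if o.module == module
    ]
    return lines


if __name__ == "__main__":
    objs = [DbObject("foo_add", "function", "foo"), DbObject("accounts", "table")]
    print("\n".join(dump_normal(objs, ["foo"])))
    print("\n".join(dump_module(objs, "foo")))
```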

Something that's not at all clear to me is object ownership. Should
objects belonging to a module be marked as being owned by the person
executing INSTALL, or should module dump/install try to preserve
original ownership? I think in most scenarios trying to preserve the
original ownership is wrong, since commonly INSTALL will be used to
transfer objects to new databases where the original owner might not
exist at all. But it's a bit non-orthogonal.

Also, what privileges are needed to execute either CREATE MODULE or
INSTALL? Conservative design would say superuser-only, but I can't put
my finger on any specific reason for that, at least if none of the
contained objects need superuser privs to create.

regards, tom lane


From: Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>
To: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
Cc: pgsql-patches(at)postgresql(dot)org, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Marko Kreen <markokr(at)gmail(dot)com>
Subject: Re: [PATCH] Magic block for modules
Date: 2006-06-02 01:38:54
Message-ID: 447F96AE.4020007@calorieking.com
Lists: pgsql-hackers pgsql-patches

> Marko's suggestion on producing a list of installed modules comes to mind, and
> I suspect tools like pgadmin or ppa will want to be able to show this
> information.

My request for phpPgAdmin is to somehow be able to check if the .so file
for a module is present.

For instance, I'd like to 'enable slony support' if the slony shared
library is present. PPA's slony support automatically executes the .sql
files, so all I need to know is if the .so is there.

Chris


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>
Cc: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-patches(at)postgresql(dot)org, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Marko Kreen <markokr(at)gmail(dot)com>
Subject: Re: [PATCH] Magic block for modules
Date: 2006-06-02 02:27:39
Message-ID: 23323.1149215259@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com> writes:
> My request for phpPgAdmin is to somehow be able to check if the .so file
> for a module is present.

> For instance, I'd like to 'enable slony support' if the slony shared
> library is present. PPA's slony support automatically executes the .sql
> files, so all I need to know is if the .so is there.

I really think this is backwards: you should be looking for the .sql
files. Every module will have a .sql file, not every one will need a
.so file. See followup thread in -hackers where we're trying to hash
out design details.

regards, tom lane


From: Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-patches(at)postgresql(dot)org, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Marko Kreen <markokr(at)gmail(dot)com>
Subject: Re: [PATCH] Magic block for modules
Date: 2006-06-02 02:39:20
Message-ID: 447FA4D8.7030303@calorieking.com
Lists: pgsql-hackers pgsql-patches

>> For instance, I'd like to 'enable slony support' if the slony shared
>> library is present. PPA's slony support automatically executes the .sql
>> files, so all I need to know is if the .so is there.
>
> I really think this is backwards: you should be looking for the .sql
> files. Every module will have a .sql file, not every one will need a
> .so file. See followup thread in -hackers where we're trying to hash
> out design details.

Not in this case.

Basically Slony has the concept of installing a node into a server. You
can have multiple ones of them - different schemas. So, I'd like to be
able to detect that the .so is there, and then offer an "install node"
feature where WE execute the SQL on their behalf, with all the
complicated string substitutions already done.

The trick is that Slony currently requires you to use a command line
tool to execute these scripts for you.

At the moment, people have to indicate in our config file that Slony is
available, and also point us to where the Slony SQL scripts are located.
We do the rest.

It's not too important, but it's just an idea.

Chris


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>
Cc: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-patches(at)postgresql(dot)org, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Marko Kreen <markokr(at)gmail(dot)com>
Subject: Re: [PATCH] Magic block for modules
Date: 2006-06-02 02:49:30
Message-ID: 23504.1149216570@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com> writes:
>> I really think this is backwards: you should be looking for the .sql
>> files. Every module will have a .sql file, not every one will need a
>> .so file. See followup thread in -hackers where we're trying to hash
>> out design details.

> Not in this case.

> Basically Slony has the concept of installing a node into a server. You
> can have multiple ones of them - different schemas. So, I'd like to be
> able to detect that the .so is there, and then offer an "install node"
> feature where WE execute the SQL on their behalf, with all the
> complicated string substitutions already done.

No, Slony is going to have to adapt to modules, not vice versa. We are
*not* designing the module feature on the assumption that every module
has some C functions at its core. That would be a shameful restriction
of the potential applications.

It might be that some way to parameterize the SQL scripts would be handy
(the question about which schema to install into comes to mind) ... but
that doesn't justify making a .so file the central part of the module
concept.
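
As a rough illustration of what parameterizing a module's SQL script by
schema could look like: the @SCHEMA@ placeholder, render_script and
quote_ident below are all hypothetical, and no existing tool's actual
substitution syntax is implied.

```python
def quote_ident(name: str) -> str:
    """Quote an SQL identifier by doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def render_script(script: str, schema: str) -> str:
    """Fill the hypothetical @SCHEMA@ placeholder with a quoted schema name."""
    return script.replace("@SCHEMA@", quote_ident(schema))


if __name__ == "__main__":
    template = (
        "CREATE SCHEMA @SCHEMA@;\n"
        "CREATE TABLE @SCHEMA@.nodes (id int);"
    )
    print(render_script(template, "_node1"))
```

Plain string replacement is used here so that other characters in the
script, such as dollar quoting in function bodies, are left untouched.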

But again, this is the wrong list. Please contribute to the
"Generalized concept of modules" thread in -hackers.

regards, tom lane


From: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
To: Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Marko Kreen <markokr(at)gmail(dot)com>
Subject: Re: [PATCH] Magic block for modules
Date: 2006-06-02 02:55:03
Message-ID: 200606012255.03798.xzilla@users.sourceforge.net
Lists: pgsql-hackers pgsql-patches

On Thursday 01 June 2006 21:38, Christopher Kings-Lynne wrote:
> > Marko's suggestion on producing a list of installed modules comes to
> > mind, and I suspect tools like pgadmin or ppa will want to be able to
> > show this information.
>
> My request for phpPgAdmin is to somehow be able to check if the .so file
> for a module is present.
>
> For instance, I'd like to 'enable slony support' if the slony shared
> library is present. PPA's slony support automatically executes the .sql
> files, so all I need to know is if the .so is there.
>

While I agree with the above (having that for tsearch2 would be nice too) I
think we ought to keep in mind the idea of sql based modules. Nothing jumps
to mind here ppa wise, but I could see an application looking to see if
mysqlcompat was installed before running if it had a good way to do so.

--
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL


From: PFC <lists(at)peufeu(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Martijn van Oosterhout" <kleptog(at)svana(dot)org>
Cc: "Marko Kreen" <markokr(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Generalized concept of modules
Date: 2006-06-03 08:06:03
Message-ID: op.taj8sdifcigqcu@apollo13
Lists: pgsql-hackers pgsql-patches


Think about version API compatibility.

Suppose you have a working database on server A which uses module foo
version 1.
Some time passes, you buy another server B and install postgres on it.
Meanwhile the module foo has evolved into version 2 which is cooler, but
has some minor API incompatibilities.
You dump the database on server A and reload it on server B. pg_dump
issues an INSTALL MODULE which installs foo version 2 on the new server.
Due to the "minor API incompatibilities", your database breaks.

It's really cool not to pollute the dumps (and the global namespace...)
with all the module functions, however implementing module functionality
can be tricky.

So don't forget about versions and possible incompatibilities; also,
versions mean you might need an UPGRADE MODULE which does more than
uninstall + reinstall. Suppose a module has created some tables for its
use; these shouldn't be dumped when upgrading to a new version. However,
maybe the new version will want to add a column...
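
A small sketch of the kind of check this implies, assuming the dump records
the module version it was taken with. The "major.minor" numbering, the rule
that a different major number means an incompatible API, and the
check_install function are all assumptions made for illustration.

```python
def parse_version(text: str) -> tuple:
    """Turn '1.2' into (1, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def check_install(dumped: str, available: str) -> str:
    """Decide what a restore should do for one module.

    Returns "install" when the versions match, "upgrade" when the new
    server has a newer version with the same major number, and
    "incompatible" otherwise (different major number, or an older version).
    """
    old, new = parse_version(dumped), parse_version(available)
    if old == new:
        return "install"
    if old[0] == new[0] and new > old:
        return "upgrade"
    return "incompatible"


if __name__ == "__main__":
    # Server A used foo 1.0; server B only offers foo 2.0.
    print(check_install("1.0", "2.0"))
```

In the server A / server B scenario above, the restore would stop and
report the mismatch instead of silently installing version 2.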

Think gentoo portage, for instance.
This excellent package system is a lot more evolved than the module
system needs to be, but having a look at the feature list would be a good
inspiration maybe.