REVIEW: Extensions support for pg_dump

Lists: pgsql-hackers
From: Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Subject: REVIEW: Extensions support for pg_dump
Date: 2011-01-17 15:41:25
Message-ID: 4D346325.2090706@thl.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I used the patch from CommitFest application and applied the following
commit to fix a known issue:

http://git.postgresql.org/gitweb?p=postgresql-extension.git;a=commitdiff;h=d4991d35283ae0ceeb7f9e4203cf6a9dfb5d128d

Is the patch in context diff format?
Yes.

Does the patch apply cleanly?
No:
patching file src/include/commands/defrem.h
Hunk #2 succeeded at 107 with fuzz 2 (offset 27 lines).
Hunk #3 FAILED at 105.
1 out of 5 hunks FAILED -- saving rejects to file
src/include/commands/defrem.h.rej

I have used the git head of

http://git.postgresql.org/gitweb?p=postgresql-extension.git branch extensions

to do the rest of reviewing. There is one compiler warning:

gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith

-Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing
-fwrapv -g -I../../../src/interfaces/libpq -I../../../src/include -D_GNU_
SOURCE -c -o pg_dump.o pg_dump.c
pg_dump.c: In function ‘getTables’:
pg_dump.c:3748: warning: too many arguments for format

And, make check gives me the following errors:
test sanity_check ... FAILED
test rules ... FAILED

Does it include reasonable tests, necessary doc patches, etc?

There is enough documentation, but I think the documentation needs some
polishing. I am not a native English speaker, so it might be it is my
English that is failing. The data is there, but the representation might
be a little off.

I didn't spot any tests. This could be that I don't know what to look for...

Usability review:

The patch implements a way to create extensions. While the patch is
labeled extensions support for pg_dump, it actually implements more. It
implements a new way to package and install extension, and changes
contrib extensions to use the new way.

I do think we want these features, and that we do not have those
already. As far as I understand, there is nothing in the standard
regarding this feature.

I wonder if the structure of having all the extensions in share/contrib/
is a good idea. It might be better if one could separate the contrib
extensions in one place, and user created extensions in another place.
This could make it easy to see what user created extensions is installed
in (or installable to) the cluster. I think this might be helpful to
DBAs upgrading PostgreSQL.

Overall, the system seems intuitive to use. It is relatively easy to
create extension using this feature, and loading contrib extensions is
really easy. I haven't tested how easy it is to create C-language
extensions using this. If I am not mistaken it just adds the compile &
install the C-functions before installing the extension.

Feature test:

The packaging feature works as advertised, expect for the following bugs
and inconsistencies.

When using the plain CREATE EXTENSION foobar command without schema
qualifier, the extension is created in schema public (by name) without
regard to the current search path. This is inconsistent with create
table, and if the public schema is renamed, the command gives error:

ERROR: schema "public" does not exist

This makes the name "public" have a special meaning, and I suspect that
is not wanted.

When issuing CREATE EXTENSION foo SCHEMA bar, and the extension foo is
not relocatable and it's control file uses the SET search_path TO
@extschema@, the search_path is set to bar for the session. That is, it
is not changed to the original search_path after the command has completed.

When trying to load extensions which contain identical signatured
functions, if the loading will succeed is dependent on the search path.
If there is a conflicting function in search path (first extension
loaded to schema public), then the second extension load will fail. But
if the order is reversed (first extension loaded to schema foo which is
not in search path), then the second extension can be loaded to the
public schema.

While it is not possible to drop functions in extensions, it is possible
to rename a function, and also to CREATE OR REPLACE a function in an
extension. After renaming or CORing a function, it is possible to drop
the function. I also wonder if alter function ... set parameter should
be allowed?

There is no validation for the extension names in share/contrib/. It is
possible to have extensions control files named ".control",
".name.control", "name''.control" etc. While it is stupid to name them
so, it might be better to give an explicit warning / error in these
cases. It is of course possible to use these extensions.

If there is no comment for a given extension in the .control file, then
\dx will not show the extension installed even if it is installed.

I was able to make it crash:

postgres=# alter extension foo.bar set schema baz;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

Log contains this:
TRAP: FailedAssertion("!(list_length(name) == 1)", File: "extension.c",
Line: 1163)

It doesn't matter if the schema foo exists or not, and if the extension
is there or not.

I haven't done anything that could be called review on the code level,
but I have quickly glanced over the patch. Some things that caught my eye:

In file /src/backend/commands/extension.c:
+ * The extension control file format is the most simple name = value, we
+ * don't need anything more there. The SQL file to execute commands from is
+ * hardcoded to `pg_config --sharedir`/<extension>.install.sql.

This seems to be outdated.
In file src/bin/pg_dump/pg_dump.c:

! /*
! * So we want the namespaces, but we want to filter out any
! * namespace created by an extension's script. That's unless the
! * user went over his head and created objects into the extension's
! * schema: we now want the schema not to be filtered out to avoid:
! *
! * pg_dump: schema with OID 77869 does not exist
! */

I wonder what that last line is doing there...

I haven't had time to review the pg_dump part of the patch yet, will do
that next (tomorrow). I hope it is OK to post a partial review...

- Anssi


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: REVIEW: Extensions support for pg_dump
Date: 2011-01-17 15:57:16
Message-ID: 1295279637-sup-9642@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Excerpts from Anssi Kääriäinen's message of lun ene 17 12:41:25 -0300 2011:

> While it is not possible to drop functions in extensions, it is possible
> to rename a function, and also to CREATE OR REPLACE a function in an
> extension. After renaming or CORing a function, it is possible to drop
> the function.

Hmm, this seems a serious problem. I imagine that what's going on is
that the function cannot be dropped because the extension depends on it;
but after the rename, the dependencies of the function are dropped and
recreated, but the dependency that relates it to the extension is
forgotten.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: REVIEW: Extensions support for pg_dump
Date: 2011-01-17 16:02:51
Message-ID: AANLkTim_UNX3WbpToNLspo_jYLux6Sea-SH-gWoZF3-S@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jan 17, 2011 at 10:41 AM, Anssi Kääriäinen
<anssi(dot)kaariainen(at)thl(dot)fi> wrote:
> I haven't had time to review the pg_dump part of the patch yet, will do that
> next (tomorrow). I hope it is OK to post a partial review...

It is, and this is a very good and detailed review!

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Subject: Re: REVIEW: Extensions support for pg_dump
Date: 2011-01-17 16:53:39
Message-ID: 87lj2ja2uk.fsf@hi-media-techno.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Thanks for your review!

Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi> writes:
> Does the patch apply cleanly?
> No:

That was some bitrot, has been fixed, thanks you for working from the
git repository meanwhile.

> pg_dump.c:3748: warning: too many arguments for format

Fixed in v25 already sent this morning.

> And, make check gives me the following errors:
> test sanity_check ... FAILED
> test rules ... FAILED

Fixed too. Was due to git conflict solving where it adds a new line in
the tests and keep the embedded rowcount the same. I think.

> Does it include reasonable tests, necessary doc patches, etc?
>
> There is enough documentation, but I think the documentation needs some
> polishing. I am not a native English speaker, so it might be it is my
> English that is failing. The data is there, but the representation might be
> a little off.

We already made lots of improvements there thanks to David Wheeler
reviews in the previous commitfest (which shows up already), and I'll be
happy to continue improving as much as I can. If all it needs is
english native review, I guess that'll be part of the usual marathon
Bruce runs every year there?

> I didn't spot any tests. This could be that I don't know what to look for...

make -C contrib installcheck will exercise CREATE EXTENSION for each
contrib module.

> Usability review:
>
> The patch implements a way to create extensions. While the patch is labeled
> extensions support for pg_dump, it actually implements more. It implements a
> new way to package and install extension, and changes contrib extensions to
> use the new way.

Well, all there's in the patch is infrastructure to be able to issue
those single lines in your dump :)

> I do think we want these features, and that we do not have those already. As
> far as I understand, there is nothing in the standard regarding this
> feature.
>
> I wonder if the structure of having all the extensions in share/contrib/ is
> a good idea. It might be better if one could separate the contrib extensions
> in one place, and user created extensions in another place. This could make
> it easy to see what user created extensions is installed in (or installable
> to) the cluster. I think this might be helpful to DBAs upgrading PostgreSQL.

It's always been this way and I though it wouldn't be in this patch
scope to re-organise things. Also I think we should include the UPGRADE
needs when we discuss file system level layout.

> Overall, the system seems intuitive to use. It is relatively easy to create
> extension using this feature, and loading contrib extensions is really
> easy. I haven't tested how easy it is to create C-language extensions using
> this. If I am not mistaken it just adds the compile & install the
> C-functions before installing the extension.

It's using PGXS which existed before, all you have to do that's new is
care about the Makefile EXTENSION variable and the control file. Even
when doing C coded extension work.

> When using the plain CREATE EXTENSION foobar command without schema
> qualifier, the extension is created in schema public (by name) without
> regard to the current search path. This is inconsistent with create table,
> and if the public schema is renamed, the command gives error:
>
> ERROR: schema "public" does not exist
>
> This makes the name "public" have a special meaning, and I suspect that is
> not wanted.

Fixed in git, thanks for reporting!

~:5490=# set client_min_messages to debug1;
SET
~:5490=# set search_path to utils;
SET
~:5490=# create extension unaccent;
DEBUG: parse_extension_control_file(unaccent, '/home/dfontaine/pgsql/exts/share/contrib/unaccent.control')
DEBUG: CreateExtensionStmt: with user data, schema = 'utils', encoding = ''
DEBUG: Installing extension 'unaccent' from '/home/dfontaine/pgsql/exts/share/contrib/unaccent.sql', in schema utils, with user data
CREATE EXTENSION
~:5490=# \dx
List of extensions
Schema | Name | Version | Description
--------+---------------+----------+-------------------------------------------------
utils | citext | 9.1devel | case-insensitive character string type
utils | hstore | 9.1devel | storing sets of key/value pairs
utils | int_aggregate | 9.1devel | integer aggregator and an enumerator (obsolete)
utils | lo | 9.1devel | managing Large Objects
utils | unaccent | 9.1devel | text search dictionary that removes accents
(5 rows)

> When issuing CREATE EXTENSION foo SCHEMA bar, and the extension foo is not
> relocatable and it's control file uses the SET search_path TO @extschema@,
> the search_path is set to bar for the session. That is, it is not changed to
> the original search_path after the command has completed.

It used to work this way with \i, obviously. Should the extension patch
care about that and how? Do we want to RESET search_path or to manually
manage it like we do for the client_min_messages and log_min_messages?

> When trying to load extensions which contain identical signatured functions,
> if the loading will succeed is dependent on the search path. If there is a
> conflicting function in search path (first extension loaded to schema
> public), then the second extension load will fail. But if the order is
> reversed (first extension loaded to schema foo which is not in search path),
> then the second extension can be loaded to the public schema.

Well I think that's an expected limitation here. In the future we might
want to add suport for inter-extension dependencies and conflicts, but
we're certainly not there yet.

> While it is not possible to drop functions in extensions, it is possible to
> rename a function, and also to CREATE OR REPLACE a function in an
> extension. After renaming or CORing a function, it is possible to drop the
> function. I also wonder if alter function ... set parameter should be
> allowed?

Well there's no specific restrictions in what you can put in an
extension's SQL file. I see that as a feature… think upgrade scripts, too.

> There is no validation for the extension names in share/contrib/. It is
> possible to have extensions control files named ".control", ".name.control",
> "name''.control" etc. While it is stupid to name them so, it might be better
> to give an explicit warning / error in these cases. It is of course possible
> to use these extensions.

I don't have a strong opinion here, will wait for some votes.

> If there is no comment for a given extension in the .control file, then \dx
> will not show the extension installed even if it is installed.

Fixed in the git repository.

> I was able to make it crash:
>
> postgres=# alter extension foo.bar set schema baz;

Fixed:

~:5490=# alter extension foo.bar set schema baz;
ERROR: syntax error at or near "."
LINE 1: alter extension foo.bar set schema baz;
^

~:5490=# alter extension bar set schema baz;
ERROR: extension "bar" does not exist

> I haven't done anything that could be called review on the code level, but I
> have quickly glanced over the patch. Some things that caught my eye:
>
> In file /src/backend/commands/extension.c:
> + * The extension control file format is the most simple name = value, we
> + * don't need anything more there. The SQL file to execute commands from is
> + * hardcoded to `pg_config --sharedir`/<extension>.install.sql.
>
> This seems to be outdated.

Yes it is. Updated, but now that is covered extensively in the
documentation, maybe it could just be removed from the file here?

> In file src/bin/pg_dump/pg_dump.c:
>
> ! /*
> ! * So we want the namespaces, but we want to filter out any
> ! * namespace created by an extension's script. That's unless the
> ! * user went over his head and created objects into the extension's
> ! * schema: we now want the schema not to be filtered out to avoid:
> ! *
> ! * pg_dump: schema with OID 77869 does not exist
> ! */
>
> I wonder what that last line is doing there...

That's the error message you can easily get when you want to have
something more simple than the provided SQL…

> I haven't had time to review the pg_dump part of the patch yet, will do that
> next (tomorrow). I hope it is OK to post a partial review...

From here, it's very good! Will you continue from the git repository,
where the fixes are available, or do you want a v26 already?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: REVIEW: Extensions support for pg_dump
Date: 2011-01-17 17:08:54
Message-ID: 877he3a255.fsf@hi-media-techno.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:

> Excerpts from Anssi Kääriäinen's message of lun ene 17 12:41:25 -0300 2011:
>
>> While it is not possible to drop functions in extensions, it is possible
>> to rename a function, and also to CREATE OR REPLACE a function in an
>> extension. After renaming or CORing a function, it is possible to drop
>> the function.
>
> Hmm, this seems a serious problem. I imagine that what's going on is
> that the function cannot be dropped because the extension depends on it;
> but after the rename, the dependencies of the function are dropped and
> recreated, but the dependency that relates it to the extension is
> forgotten.

Well I'm not seeing that here:

~:5490=# drop function utils.lo_manage_d();
ERROR: cannot drop function utils.lo_manage_d() because extension lo requires it
HINT: You can drop extension lo instead.

src/backend/commands/functioncmds.c

/* rename */
namestrcpy(&(procForm->proname), newname);
simple_heap_update(rel, &tup->t_self, tup);
CatalogUpdateIndexes(rel, tup);

But here:

src/backend/catalog/pg_proc.c

/*
* Create dependencies for the new function. If we are updating an
* existing function, first delete any existing pg_depend entries.
* (However, since we are not changing ownership or permissions, the
* shared dependencies do *not* need to change, and we leave them alone.)
*/
if (is_update)
deleteDependencyRecordsFor(ProcedureRelationId, retval);

[ ... adding all dependencies back ... ]

/* dependency on extension */
if (create_extension)
{
recordDependencyOn(&myself, &CreateExtensionAddress, DEPENDENCY_INTERNAL);
}

Will investigate some more later.
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Kääriäinen Anssi <anssi(dot)kaariainen(at)thl(dot)fi>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: REVIEW: Extensions support for pg_dump
Date: 2011-01-17 17:58:48
Message-ID: BC19EF15D84DC143A22D6A8F2590F0A76515EE97CD@EXMAIL.stakes.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Well I'm not seeing that here

I am not at work at the moment and I don't have the possibility to compile PostgreSQL on this computer, so the example here is from memory.

The issue I saw was this: assume you have an extension foo, containing one function, test().

CREATE EXTENSION foo;
DROP FUNCTION test();
-- restricted due to dependency

ALTER FUNCTION test() RENAME TO test2;
DROP FUNCTION test2();
-- not restricted!

The same can be done using CREATE OR REPLACE.

I hope this is not an error on my part. It is possible because I had a lot of schemas and my search_path might have been wrong...

- Anssi
PS: Using web email client, I hope this comes out in somewhat sane format.


From: Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: REVIEW: Extensions support for pg_dump
Date: 2011-01-18 08:37:51
Message-ID: 4D35515F.5050708@thl.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/17/2011 07:58 PM, Kääriäinen Anssi wrote:
> The issue I saw was this: assume you have an extension foo, containing one function, test().
>
> CREATE EXTENSION foo;
> DROP FUNCTION test();
> -- restricted due to dependency
>
> ALTER FUNCTION test() RENAME TO test2;
> DROP FUNCTION test2();
> -- not restricted!
>
> The same can be done using CREATE OR REPLACE.
>
> I hope this is not an error on my part. It is possible because I had a lot of schemas and my search_path might have been wrong...
The rename is an error on my part, sorry for that. Renaming can be done,
but dropping is not possible even after rename. But a function in a
package can be CREATE OR REPLACEd, and after that the function can be
dropped. COR should be restricted in the same way as DROP is. I will
check if this is still a problem with the latest patch.

Another problem is that you can ALTER FUNCTION test() SET SCHEMA =
something_else, and you can alter the functions search_path, which could
be a problem for non-relocatable extensions. Even if the schema is
changed, dropping extension / altering extension will work as expected.
The function is just in different schema than the extension. But, both
of these IMO fall in the category "don't do that".

- Anssi


From: Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: REVIEW: Extensions support for pg_dump
Date: 2011-01-18 09:09:27
Message-ID: 4D3558C7.1080704@thl.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/17/2011 06:53 PM, Dimitri Fontaine wrote:

>> Usability review:
>>
>> The patch implements a way to create extensions. While the patch is labeled
>> extensions support for pg_dump, it actually implements more. It implements a
>> new way to package and install extension, and changes contrib extensions to
>> use the new way.
> Well, all there's in the patch is infrastructure to be able to issue
> those single lines in your dump :)
Is this supposed to be used mainly by contrib and PGXN extensions? When
I saw the documentation, I immediately thought that this is a nice way
to package my application's stored procedures. If this is not one of the
intended usages, it should be documented. I can see that this could be
problematic when updating PostgreSQL and when recovering from backups.

When recovering from backup, you need to have the locally created
extension available. But it might be that the extension files are lost
when the system went down in flames. Now, the backup is unusable
(right?) until extension files can be recovered from source control or
where ever they might be stored. This is why I suggested having multiple
locations for the extensions. It would be easy to backup locally created
extensions if those were in a single directory. All in all, I have a
nervous feeling that somebody someday will have an unusable dump because
they used this feature, but do not have the extension files available...

Also, this part of documentation:

The goal of using extensions is so that <application>pg_dump</> knows
not to dump all the object definitions into your backups, but rather
issue a single <xref linkend="SQL-CREATEEXTENSION"> command.

From that, it is not entirely clear to me why this is actually wanted
in PostgreSQL. I suppose this will make dump/restore easier to use. But
from that paragraph, I get the feeling the only advantage is that your
dump will be smaller. And I still have a feeling this implements more.
Not that it is a bad thing at all.
> It used to work this way with \i, obviously. Should the extension patch
> care about that and how? Do we want to RESET search_path or to manually
> manage it like we do for the client_min_messages and log_min_messages?
It was unintuitive to me to have search_path changed by SQL command as a
side effect. When using \i, not so much.
>> When trying to load extensions which contain identical signatured functions,
>> if the loading will succeed is dependent on the search path. If there is a
>> conflicting function in search path (first extension loaded to schema
>> public), then the second extension load will fail. But if the order is
>> reversed (first extension loaded to schema foo which is not in search path),
>> then the second extension can be loaded to the public schema.
> Well I think that's an expected limitation here. In the future we might
> want to add suport for inter-extension dependencies and conflicts, but
> we're certainly not there yet.
OK with this as is. It is just a bit surprising that you can create the
extensions in one order but not in another.
>> There is no validation for the extension names in share/contrib/. It is
>> possible to have extensions control files named ".control", ".name.control",
>> "name''.control" etc. While it is stupid to name them so, it might be better
>> to give an explicit warning / error in these cases. It is of course possible
>> to use these extensions.
> I don't have a strong opinion here, will wait for some votes.
Yeah, this is not a big problem in practice. Just don't name them like
that. And if you do, you will find out soon enough that you made a
mistake. By the way, in my comment above "It is of course possible to
use these extensions." -> "It is of course NOT possible ...".
>> I haven't had time to review the pg_dump part of the patch yet, will do that
>> next (tomorrow). I hope it is OK to post a partial review...
> From here, it's very good! Will you continue from the git repository,
> where the fixes are available, or do you want a v26 already?
It is easy for me to continue from the Git repo. I will next continue
doing the pg_dump part of the review. I hope I have time to complete
that today.

- Anssi


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi>
Cc: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: REVIEW: Extensions support for pg_dump
Date: 2011-01-18 09:42:16
Message-ID: 87d3nuzgxz.fsf@hi-media-techno.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi> writes:
> The rename is an error on my part, sorry for that. Renaming can be done, but
> dropping is not possible even after rename.

Ok, that matches my tests and reading fo the code.

> But a function in a package can
> be CREATE OR REPLACEd, and after that the function can be dropped. COR
> should be restricted in the same way as DROP is. I will check if this is
> still a problem with the latest patch.

There, I've been missing exactly what Alvaro has been telling us about
here, that CREATE OR REPLACE on pg_proc will drop all known dependencies
then recreate them, using deleteDependencyRecordsFor(). As Tom
explained this week, DEPENDENCY_INTERNAL are working in the reverse way
than usual, which means that the function/extension dependency were lost
here.

I've fixed the case by having the code remember the function's extension
if any, and restore it along with the other dependencies. Here's my
test case that now passes:

~:5490=# \dx+ lo
Objects in extension "lo"
Object Class | Object OID | Object Description
--------------+------------+---------------------------------
pg_proc | 33082 | function utils.lo_manage()
pg_proc | 33081 | function utils.lo_oid(utils.lo)
pg_type | 33080 | type utils.lo
(3 rows)

~:5490=# CREATE OR REPLACE FUNCTION utils.lo_manage()
RETURNS pg_catalog.trigger
AS '$libdir/lo'
LANGUAGE C;
CREATE FUNCTION
~:5490=# \dx+ lo
Objects in extension "lo"
Object Class | Object OID | Object Description
--------------+------------+---------------------------------
pg_proc | 33082 | function utils.lo_manage()
pg_proc | 33081 | function utils.lo_oid(utils.lo)
pg_type | 33080 | type utils.lo
(3 rows)

(before the fix it would have missed the utils.lo_manage() function in
this second listing).

The fix is in my git repository and in the attached v26 patch,
containing also the fixes from yesterday. Thanks again for your
complete testing!

> Another problem is that you can ALTER FUNCTION test() SET SCHEMA =
> something_else, and you can alter the functions search_path, which could be
> a problem for non-relocatable extensions. Even if the schema is changed,
> dropping extension / altering extension will work as expected. The function
> is just in different schema than the extension. But, both of these IMO fall
> in the category "don't do that".

Agreed. And on the other hand, I can image cases where as a work around
you'll be glad that you still have "full power" on the extension's objects.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

Attachment Content-Type Size
extension.v26.patch.gz application/octet-stream 58.5 KB

From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi>
Cc: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: REVIEW: Extensions support for pg_dump
Date: 2011-01-18 10:01:55
Message-ID: 87bp3ey1gs.fsf@hi-media-techno.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi> writes:
> Is this supposed to be used mainly by contrib and PGXN extensions? When I
> saw the documentation, I immediately thought that this is a nice way to
> package my application's stored procedures. If this is not one of the
> intended usages, it should be documented. I can see that this could be
> problematic when updating PostgreSQL and when recovering from backups.

Sure, private application's stored procedure are meant to be fully
supported by the extension's facility.

> When recovering from backup, you need to have the locally created extension
> available. But it might be that the extension files are lost when the system
> went down in flames. Now, the backup is unusable (right?) until extension
> files can be recovered from source control or where ever they might be
> stored. This is why I suggested having multiple locations for the
> extensions. It would be easy to backup locally created extensions if those
> were in a single directory. All in all, I have a nervous feeling that
> somebody someday will have an unusable dump because they used this feature,
> but do not have the extension files available...

Well, as said in the documentation, extensions are to be used for
objects you are *not* maintaining in your database, but elsewhere.
Typically you are maintaining your stored procedure code in some VCS,
and you have some "packaging" (cat *.sql > my-ext.sql in the Makefile
would be the simpler to imagine).

So yes if you tell PostgreSQL that your procedures are managed elsewhere
so that their code won't be part of your dumps, and then fail to manage
them anywhere else, you're hosed.

My viewpoint here is that when you want to use extensions, you want to
package them for your OS of choice (mine is debian, and I've been
working on easing things on this side too with pg_buildext to be found
in the postgresql-server-dev-all package). If you're an occasional user
just wanting to use new shining facilities… well, think twice…

> Also, this part of documentation:
>
> The goal of using extensions is so that <application>pg_dump</> knows
> not to dump all the object definitions into your backups, but rather
> issue a single <xref linkend="SQL-CREATEEXTENSION"> command.

So maybe we want to extend this little sentence to add the warnings
around it, that if you're not packaging your extension's delivery to
your servers, you're likely shooting yourself in the foot?

> From that, it is not entirely clear to me why this is actually wanted in
> PostgreSQL. I suppose this will make dump/restore easier to use. But from
> that paragraph, I get the feeling the only advantage is that your dump will
> be smaller. And I still have a feeling this implements more. Not that it is
> a bad thing at all.

Well try to upgrade from 8.4 to 9.0 with some "extensions" installed in
there and used in your tables. Pick any contrib, such as hstore or
ltree or cube, or some external code, such as ip4r or prefix or such.
Then compare to upgrade with the extension facility, and tell me what's
best :)

Hint: the dump contains the extension's script, but does not contain the
shared object file. If you're upgrading the OS and the contribs, as
you often do when upgrading major versions, you're hosed. You would
think that pg_upgrade alleviate the concerns here, but you still have
to upgrade manually the extension's .so.

All in all, those extensions (contribs, ip4r, etc) are *not*
maintained in your database and pretending they are by putting their
scripts in your dumps is only building problems. This patch aims to
offer a solution here.

>> It used to work this way with \i, obviously. Should the extension patch
>> care about that and how? Do we want to RESET search_path or to manually
>> manage it like we do for the client_min_messages and log_min_messages?
> It was unintuitive to me to have search_path changed by SQL command as a
> side effect. When using \i, not so much.

Agreed. Will code the manual management way (that is already used for
log settings) later today, unless told to see RESET and how to do that
at the statement level rather than the transaction one.

> It is easy for me to continue from the Git repo. I will next continue doing
> the pg_dump part of the review. I hope I have time to complete that today.

Excellent, will try to continue following your pace :)

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: REVIEW: Extensions support for pg_dump
Date: 2011-01-18 10:11:08
Message-ID: 4D35673C.1080900@thl.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/18/2011 11:42 AM, Dimitri Fontaine wrote:
> I've fixed the case by having the code remember the function's extension
> if any, and restore it along with the other dependencies.
The only question here is should CREATE OR REPLACE be allowed. I just
realized this could present a new problem. If I am not mistaken, when
loading from dump, you suddenly get the extension's version back, not
the one you defined in CREATE OR REPLACE. If this is the case, this
should NOT be allowed. And by the same reasoning, ALTER FUNCTION
[anything] should not be allowed either. Or at least then the
function/(or any object for that matter) should be restored somehow from
the backup, not from the extension files.

I still haven't had the time to start pg_dump reviewing, so I haven't
verified if this is really a problem. But I suspect so...

- Anssi


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi>
Cc: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: REVIEW: Extensions support for pg_dump
Date: 2011-01-18 10:49:03
Message-ID: 87zkqywkps.fsf@hi-media-techno.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi> writes:
> The only question here is should CREATE OR REPLACE be allowed. I just

Yes. Think ALTER EXTENSION UPGRADE, the next patch in the series
(already proposed for this CF too).

> realized this could present a new problem. If I am not mistaken, when
> loading from dump, you suddenly get the extension's version back, not the
> one you defined in CREATE OR REPLACE. If this is the case, this should NOT
> be allowed. And by the same reasoning, ALTER FUNCTION [anything] should not
> be allowed either. Or at least then the function/(or any object for that
> matter) should be restored somehow from the backup, not from the extension
> files.

Well ideally those will get into extension's upgrade scripts, not be
typed interactively by superusers. But I don't think we should limit
the capability of superusers to quickly fix a packaging mistake…

> I still haven't had the time to start pg_dump reviewing, so I haven't
> verified if this is really a problem. But I suspect so...

Both a problem when badly used and a good thing to have sometime, as in
the upgrade scripts :)
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: REVIEW: Extensions support for pg_dump
Date: 2011-01-18 10:49:35
Message-ID: 4D35703F.9090404@thl.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/18/2011 12:11 PM, Anssi Kääriäinen wrote:
> The only question here is should CREATE OR REPLACE be allowed. I just
> realized this could present a new problem. If I am not mistaken, when
> loading from dump, you suddenly get the extension's version back, not
> the one you defined in CREATE OR REPLACE. If this is the case, this
> should NOT be allowed. And by the same reasoning, ALTER FUNCTION
> [anything] should not be allowed either. Or at least then the
> function/(or any object for that matter) should be restored somehow from
> the backup, not from the extension files.
>
> I still haven't had the time to start pg_dump reviewing, so I haven't
> verified if this is really a problem. But I suspect so...
Ok, verified at least for CREATE OR REPLACE, ALTER FUNCTION RENAME and
ALTER FUNCTION SET search_path. You will get the extensions version back
when restoring from plain sql dump, not the CORed function, rename is
lost and same for search_path. I suspect this is a problem for any
object type supported in extensions. But unfortunately I do not have
time to verify that.

One more problem with pg_dump. If you have CREATE EXTENSION in you
extensions .sql file, you will have problems restoring. I know it is
stupid to do so, but still CREATE EXTENSION inside CREATE EXTENSION
should be disallowed, as it is possible you find out too late that this
is stupid thing to do. Also, the functions created in the "recursive"
CREATE EXTENSION will be dumped, and the dependencies are not created
correctly.

Unfortunately I have used up all the time I have for reviewing this
patch. I can arrange some more time, maybe late this week, maybe a bit
later. So, I do not have time to do the pg_dump part review in full
detail right now. Still, I hope the work I have done is helpful.

Should I write up a post that contains all the current outstanding
issues in one post, or is it enough to just link this thread in the
CommitFest application?

- Anssi


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi>
Cc: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: REVIEW: Extensions support for pg_dump
Date: 2011-01-18 11:03:44
Message-ID: 877he2wk1b.fsf@hi-media-techno.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi> writes:
> Ok, verified at least for CREATE OR REPLACE, ALTER FUNCTION RENAME and ALTER
> FUNCTION SET search_path. You will get the extensions version back when
> restoring from plain sql dump, not the CORed function, rename is lost and
> same for search_path. I suspect this is a problem for any object type
> supported in extensions. But unfortunately I do not have time to verify
> that.

Yes it's the same, if you edit an extension's object directly the edited
version is not in the dump. There's provision to have those objects in
the dump but the function to make it so is currently restricted to only
be called from the extension's script.

pg_extension_flag_dump() changes the dependency type between an
extension's and one of its object, given by oid. The default behavior
of an extension is to skip objects in pg_dump, but in some cases
that's not what you want to achieve, as an extension author. If your
extension provides user data (in a table, for example), then flag this
table so that it's part of the backups, like so:

SELECT pg_extension_flag_dump('schema.tablename'::regclass);

Maybe we should open this function so that it's usable even outside of
the extension's script, but I'm not keen on doing so.

Again, editing the extension's objects out of the scripts is still
limited to superusers and not the only way to shoot yourself in the
foot…

> One more problem with pg_dump. If you have CREATE EXTENSION in you
> extensions .sql file, you will have problems restoring. I know it is stupid
> to do so, but still CREATE EXTENSION inside CREATE EXTENSION should be
> disallowed, as it is possible you find out too late that this is stupid
> thing to do. Also, the functions created in the "recursive" CREATE EXTENSION
> will be dumped, and the dependencies are not created correctly.

That will be handled later, it's called inter-extension dependencies.
We said we won't address that yet…

> Unfortunately I have used up all the time I have for reviewing this patch. I
> can arrange some more time, maybe late this week, maybe a bit later. So, I
> do not have time to do the pg_dump part review in full detail right
> now. Still, I hope the work I have done is helpful.

Very much so, thanks a lot for the time you already spent on it!

> Should I write up a post that contains all the current outstanding issues in
> one post, or is it enough to just link this thread in the CommitFest
> application?

I'd appreciate a list of yet-to-fix items. What I have is the
search_path issue where CREATE EXTENSION foo; can leave it changed for
the current session, I intend to fix that later today.

Other than that, I have no further already agreed on code fix to make.
What's your list?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: REVIEW: Extensions support for pg_dump
Date: 2011-01-18 12:29:42
Message-ID: 4D3587B6.3030508@thl.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/18/2011 01:03 PM, Dimitri Fontaine wrote:
> I'd appreciate a list of yet-to-fix items. What I have is the
> search_path issue where CREATE EXTENSION foo; can leave it changed for
> the current session, I intend to fix that later today.
>
> Other than that, I have no further already agreed on code fix to make.
> What's your list?
There is only documentation fixes, and I am not sure if even those are
agreed to be necessary. It might be good if the documentation contained:

- A warning that you need to have the files for your extensions
readily available to be able to restore from a dump. This might be
obvious, but better safe than sorry...
- A warning that you will be restored to the extension's version if
you ALTER or CREATE OR REPLACE a function.

From the current documentation, it is maybe too easy to miss these
risks. I am seeing this from non-experienced user's angle, and thus see
these as potential foot guns.

Other than that, I don't think there is anything more. I am a little
nervous of restoring to extension's version of a function when the
function has been CREATE OR REPLACEd, but that might be just me over
thinking this. Also, from the previous posts, there is just the control
file naming issue, and the issue of load order if two extensions contain
similarly named and signatured functions. But these were agreed to be
issues not needing any further work.

Now, I need help what to do next. Should I leave the status as Needs
Review as the pg_dump part is almost completely non-reviewed? And then
attach this thread as a comment? Or as a review?

- Anssi


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: REVIEW: Extensions support for pg_dump
Date: 2011-01-18 12:29:45
Message-ID: 1295353728-sup-5169@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Excerpts from Dimitri Fontaine's message of mar ene 18 07:01:55 -0300 2011:
> Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi> writes:

> >> It used to work this way with \i, obviously. Should the extension patch
> >> care about that and how? Do we want to RESET search_path or to manually
> >> manage it like we do for the client_min_messages and log_min_messages?
> > It was unintuitive to me to have search_path changed by SQL command as a
> > side effect. When using \i, not so much.

If the CREATE EXTENSION stuff all works in a transaction, perhaps it
would be easier if you used SET LOCAL. At the end of the transaction it
would reset to the original value automatically.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi>
Cc: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: REVIEW: Extensions support for pg_dump
Date: 2011-01-18 13:25:26
Message-ID: 878vyiuywp.fsf@hi-media-techno.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi> writes:

> On 01/18/2011 01:03 PM, Dimitri Fontaine wrote:
>> I'd appreciate a list of yet-to-fix items. What I have is the
>> search_path issue where CREATE EXTENSION foo; can leave it changed for
>> the current session, I intend to fix that later today.

After some reading of backend/catalog/namespace.c and some testing, my
take would be to have extension authors use the following form when that
is necessary:

SET LOCAL search_path TO public;

Now, none of the contribs use that form (they all are relocatable except
for adminpack which forces its objects to get installed in pg_catalog),
so I've only updated the documentation.

> There is only documentation fixes, and I am not sure if even those are
> agreed to be necessary. It might be good if the documentation contained:
>
> - A warning that you need to have the files for your extensions readily
> available to be able to restore from a dump. This might be obvious, but
> better safe than sorry...

Added a § about that in the docs.

> - A warning that you will be restored to the extension's version if you
> ALTER or CREATE OR REPLACE a function.

Well I don't want to encourage people to manually edit any extension
objects directly, so I've added a note about not doing that.

> Other than that, I don't think there is anything more. I am a little nervous
> of restoring to extension's version of a function when the function has been
> CREATE OR REPLACEd, but that might be just me over thinking this. Also, from

Well with the next patch in the series, ALTER EXTENSION UPGRADE, you
will have a way to edit extension's objects and have the new version
installed next time, but there's no magic bullet here, it means work to
do by the extension's author (preparing upgrade scripts and new
version's install-from-scratch scripts).

> the previous posts, there is just the control file naming issue, and the
> issue of load order if two extensions contain similarly named and signatured
> functions. But these were agreed to be issues not needing any further work.

Yes, extension dependencies and conflicts are not meant to be covered yet.

> Now, I need help what to do next. Should I leave the status as Needs Review
> as the pg_dump part is almost completely non-reviewed? And then attach this
> thread as a comment? Or as a review?

My guess would be to leave it as Needs Review and add this thread as a
review.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support