Re: Command Triggers

From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Command Triggers
Date: 2011-11-08 17:47:13
Message-ID: m2pqh2mrq6.fsf@2ndQuadrant.fr
Lists: pgsql-hackers

Hi,

As proposed by Jan Wieck on hackers and discussed in Ottawa at the
Clusters Hackers Meeting, the most (only?) workable way to ever have DDL
triggers is to have "command triggers" instead.

Rather than talking about catalogs and the like, it's all about firing a
registered user-defined function before or after processing the utility
command, or instead of processing it. This naturally involves a new
catalog (pg_cmdtrigger) and some new subcommands of CREATE and DROP
TRIGGER, and some support code for calling the function at the right
time.

The right place to hook this code in seems to be ProcessUtility(), which
is a central place for almost all the commands that we handle. An
exception to that rule would be SELECT INTO and CREATE TABLE AS SELECT,
and my proposal would be to add specific call sites to the functions
I've provided in the attached patch rather than try to contort them into
being good citizens as utility commands.

The ProcessUtility() integration currently looks like this:

const char *commandTag = CreateCommandTag(parsetree);
if (ExecBeforeOrInsteadOfCommandTriggers(parsetree, commandTag) > 0)
return;

... current code ...

ExecAfterCommandTriggers(parsetree, commandTag);

So I guess adding some call sites is manageable.

For now, the patch contains a proof-of-concept implementation with minimal
documentation and regression tests, in order to get agreement on the
principles that we already discussed.

The other part of the command trigger facility is what information we
should give to the user-defined functions, and in which form. I've
settled on always passing 4 text arguments: the command string, the
parse tree as a nodeToString() representation (Jan believes that this is
the easiest form we can provide for code consumption, and I tend to
agree), the schema name (or NULL if the targeted object of the command is
not qualified), and the object name (or NULL if that does not apply).

CREATE FUNCTION cmdtrigger_notice
(
IN cmd_string text,
IN cmd_nodestring text,
IN schemaname text,
IN relname text
)
RETURNS void
LANGUAGE plpgsql
AS $$
BEGIN
RAISE NOTICE 'cmd_string: %', cmd_string;
END;
$$;

CREATE TRIGGER cmdtrigger_notice
AFTER COMMAND CREATE TABLE
EXECUTE PROCEDURE cmdtrigger_notice();

The v1 patch attached contains implementation for some commands only.
We need to add rewriting facilities for those commands we want to
support in the proposed model, because of multi-query support in the
protocol and dynamic queries run via EXECUTE, for example (though I admit
I have not looked at the EXECUTE implementation yet).

So each supported command has a cost, and the ones I claim to support in
the grammar in the patch do not have complete support yet: first, I'm
missing some outfuncs and readfuncs (but I believe we can complete those
using a script on the source code), so the cmd_nodestring argument
is currently always NULL. Second, I didn't complete the rewriting of
some more complex commands such as CREATE TABLE and ALTER TABLE.

Note that we can't reuse much of ruleutils.c because it's written
to work from what we store in the catalogs rather than from a parse
tree.

So, any comment on the approach before I complete the rewriting of the
commands, the out/read funcs, and add more commands to the trigger
support code?

https://github.com/dimitri/postgres/compare/master...command_triggers

$ git diff --stat master..
doc/src/sgml/ref/create_trigger.sgml | 97 +++++-
doc/src/sgml/ref/drop_trigger.sgml | 19 +-
src/backend/catalog/Makefile | 2 +-
src/backend/catalog/dependency.c | 47 ++-
src/backend/catalog/objectaddress.c | 8 +
src/backend/commands/Makefile | 4 +-
src/backend/commands/cmdtrigger.c | 580 ++++++++++++++++++++++++++
src/backend/nodes/copyfuncs.c | 32 ++
src/backend/nodes/equalfuncs.c | 28 ++
src/backend/nodes/outfuncs.c | 405 ++++++++++++++++++
src/backend/parser/gram.y | 70 +++-
src/backend/tcop/utility.c | 44 ++
src/backend/utils/adt/ruleutils.c | 613 +++++++++++++++++++++++++++-
src/include/catalog/dependency.h | 1 +
src/include/catalog/indexing.h | 5 +
src/include/catalog/pg_cmdtrigger.h | 59 +++
src/include/commands/cmdtrigger.h | 43 ++
src/include/commands/defrem.h | 14 +
src/include/nodes/nodes.h | 2 +
src/include/nodes/parsenodes.h | 28 ++
src/include/parser/kwlist.h | 1 +
src/test/regress/expected/sanity_check.out | 3 +-
src/test/regress/expected/triggers.out | 35 ++
src/test/regress/sql/triggers.sql | 35 ++
24 files changed, 2157 insertions(+), 18 deletions(-)

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

Attachment Content-Type Size
command-trigger.v1.patch text/x-patch 78.1 KB

From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-11-26 19:20:58
Message-ID: m2zkfi7krp.fsf@2ndQuadrant.fr
Lists: pgsql-hackers

Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr> writes:
> As proposed by Jan Wieck on hackers and discussed in Ottawa at the
> Clusters Hackers Meeting, the most (only?) workable way to ever have DDL
> triggers is to have "command triggers" instead.
[...]
> The v1 patch attached contains implementation for some commands only.

Here's a v2 which is only about cleaning up the merge from master. The
recent DROP rework conflicted badly with the command triggers patch,
quite obviously. I didn't reimplement DROP COMMAND TRIGGER in terms of
the new facilities yet, though.

> So, any comment on the approach before I complete the rewriting of the
> commands, the out/read funcs, and add more commands to the trigger
> support code?
>
> https://github.com/dimitri/postgres/compare/master...command_triggers

FWIW (scheduling mainly), I don't think I'm going to spend more time on
this work until I get some comments, because all that remains to be done
is about building on top of what I've already been doing here.

Well, I'd like to implement command triggers on replicas too, but barring
psql support and a bunch of commands not being there yet, that's about it
as far as features go.

Oh, and making expression rewriting from the parse tree (default, check)
actually work (rather than segfault) is still a TODO too.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

Attachment Content-Type Size
command-trigger.v2.patch text/x-patch 93.6 KB

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: Command Triggers
Date: 2011-12-01 16:32:10
Message-ID: 201112011732.11126.andres@anarazel.de
Lists: pgsql-hackers

Hi Dimitri, Hi all,

On Tuesday, November 08, 2011 06:47:13 PM Dimitri Fontaine wrote:
> As proposed by Jan Wieck on hackers and discussed in Ottawa at the
> Clusters Hackers Meeting, the most (only?) workable way to ever have DDL
> triggers is to have "command triggers" instead.
> Rather than talking about catalogs and the like, it's all about firing a
> registered user-defined function before or after processing the utility
> command, or instead of processing it. This naturally involves a new
> catalog (pg_cmdtrigger) and some new subcommands of CREATE and DROP
> TRIGGER, and some support code for calling the function at the right
> time.
Great ;)

FWIW I think that's the way forward as well.

The patch obviously isn't meant to be committable yet, so I am not going to
do a very detailed code-level review. Attached is a patch with low-level,
targeted comments and other things I noticed while reading it.

At this stage the important things are high-level, so I will try to
concentrate on those:

== the trigger function ==
I don't like the set of arguments passed to the trigger functions very much.
To cite an example of yours, those currently are:
* cmd_string text
* cmd_nodestring text
* schemaname text
* relname text

I think at least a
* command_tag text
is missing. Sure, you can disambiguate by creating a function for every
command type, but that seems pointlessly complex for many applications. And
adding it should be trivial as it's already tracked ;)
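
To illustrate (just a sketch; the exact signature and argument order are
whatever the patch ends up defining), a single function could then dispatch
on the tag instead of needing one function per command type:

CREATE FUNCTION cmdtrigger_audit
(
  IN command_tag text,
  IN cmd_string text,
  IN cmd_nodestring text,
  IN schemaname text,
  IN objectname text
)
RETURNS void
LANGUAGE plpgsql
AS $$
BEGIN
  -- branch on the proposed command_tag argument
  IF command_tag IN ('CREATE TABLE', 'ALTER TABLE', 'DROP TABLE') THEN
    RAISE NOTICE 'table DDL (%): %', command_tag, cmd_string;
  END IF;
END;
$$;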

Currently the examples refer to the object name as relname, which I dislike
as that seems too restricted to one use case. The code is already doing it
correctly (as objectname) though ;)

Why is there explicit documentation saying that the arguments of said
trigger function are not checked? That seems a bit strange to me.

schemaname currently is mightily unusable because whether it's sent or not
depends on whether the table was created with a schema name specified or not.
That doesn't seem to be a good approach.
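
For instance, with the behaviour described above the trigger function
would see:

CREATE TABLE foo (i int);          -- schemaname argument is NULL
CREATE TABLE public.foo (i int);   -- schemaname argument is 'public'

even though both statements create exactly the same table.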

That brings me to the next point:

== normalization of statements ==
Currently the normalization is a bit confusing. E.g. for:

CREATE SCHEMA barf;
SET search_path = barf;
CREATE TYPE bart AS ENUM ('a', 'b');
CREATE TABLE bar(a int, b bigint, c serial, d text, e varchar, "F" text, g bart, h int4);

one gets

CREATE TABLE bar (a pg_catalog.int4,b pg_catalog.int8,c serial,d text,e pg_catalog.varchar,F text,g bart,h int4);

Which points out two problems:
* inconsistent schema prefixing
* no quoting

Imo the output should fully schema-qualify everything, including operators.
Yes, that's rather ugly, but anything else seems too likely to lead to
problems.

As an alternative it would be possible to pass the current search path, but
that doesn't seem to be the best approach to me.

Currently CHECK() constraints are not decodable, but inside those the same
quoting/qualifying rules should apply.

Then there is nice stuff like CREATE TABLE .... (LIKE ...);
What should that return in the future? I vote for showing it as the plain
CREATE TABLE where all columns are specified.
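
To make that concrete with a hypothetical two-column table in schema barf,
the two renderings under discussion would roughly be:

CREATE TABLE barf.t2 (LIKE barf.t);                                   -- as typed
CREATE TABLE "barf"."t2" ("a" pg_catalog.int4, "b" pg_catalog.text);  -- expanded, qualified, quoted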

That touches another related topic. Dim's work adding utility command
support to nodeToString and friends will possibly make the code somewhat
command-trigger specific. Is there any other usage we envision?

== grammar ==

* multiple actions:
I think it would be sensible to allow multiple actions on which to trigger to
be specified, just as you can for normal triggers. I would also like an ALL
option.

* options:
Currently the grammar allows options to be passed to command triggers. Do we
want to keep that? If yes, with the same arcane set of datatypes as in data
triggers?
If yes it needs to be wired up.

* schema qualification:
An option to add triggers only to a specific schema would be rather neat for
many scenarios.
I am not totally sure if it's worth the hassle of defining what happens in
the edge cases, e.g. moving an object from one schema into another, though.

== locking ==

Currently there is no locking strategy at all, which means, e.g., that there
is no protection against two sessions changing things concurrently.

Was that just left out - which is ok - or did you miss it?

I think it would be ok to just always grab row-level locks there.

== permissions ==

Command triggers should only be allowed for the database owner.

== recursion ==
In contrast to data triggers, command triggers cannot change what is done.
They can either prevent it from running or not, which imo is fine.
The question I have is whether it would be a good idea to make it easy to
prevent recursion... especially if the triggers won't be per schema.

== calling the trigger ==

Imo the current call site in ProcessUtility isn't that good. I think it would
make more sense to move it into standard_ProcessUtility. It should be *after*
the check_xact_readonly check in there, in my opinion.

I also don't think it's a good idea to run the
ExecBeforeOrInsteadOfCommandTriggers stuff there, because that's also the
path that COMMIT/BEGIN and other statements take. Those are pretty "fast path".

I suggest making two switches in standard_ProcessUtility: one for the
non-command-trigger stuff, which returns on success, and one after that; the
triggers would only be checked after the first switch.

I wonder whether it's the right choice to run triggers on untransformed input.
I.e. CREATE TABLE ... (LIKE ...) seems to only make sense to me after
transformation.

Also, you're repeatedly transforming the parse tree into a string. It would
be better to do that only once instead of once per trigger...

Ok, that's the first round of high-level stuff...

Cool Work!

Some lower-level annotations:

* many functions should be static but are not. Especially in cmdtrigger.c
* Either tgenable's datatype or its readfunc is wrong (watch the compiler ;))
* ruleutils.c:
* generic routine for IF EXISTS, CASCADE, ...

Greetings,

Andres

Attachment Content-Type Size
0001-Fix-dependency-recording-for-command-triggers.patch text/x-patch 1.4 KB
0002-Fix-file-headers-of-new-cmdtrigger.c-file.patch text/x-patch 1.1 KB
0003-Add-some-review-comments.patch text/x-patch 7.6 KB

From: Andres Freund <andres(at)anarazel(dot)de>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-01 18:12:18
Message-ID: 201112011912.18635.andres@anarazel.de
Lists: pgsql-hackers

Hi,

On Tuesday, November 08, 2011 06:47:13 PM Dimitri Fontaine wrote:
> exception to that rule would be SELECT INTO and CREATE TABLE AS SELECT,
> and my proposal would be to add specific call sites to the functions
> I've provided in the attached patch rather than try to contort them into
> being a good citizen as a utility command.
I would very much prefer to make them utility commands and rip out the
executor support for that.
It doesn't look that complicated and would allow us to get rid of the
partially duplicated DefineRelation and related stuff from the executor,
where it's violating my aesthetic feelings ;)

Is there something making that especially hard that I overlooked? The
infrastructure for doing so seems to be there.

Andres


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-01 18:21:25
Message-ID: 8901.1322763685@sss.pgh.pa.us
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> On Tuesday, November 08, 2011 06:47:13 PM Dimitri Fontaine wrote:
>> exception to that rule would be SELECT INTO and CREATE TABLE AS SELECT,
>> and my proposal would be to add specific call sites to the functions
>> I've provided in the attached patch rather than try to contort them into
>> being a good citizen as a utility command.

> I would very much prefer to make them utility commands and rip out the
> executor support for that.
> Is there something making that especially hard that I overlooked? The
> infrastructure for doing so seems to be there.

Well, I think the main problem is going to be shunting the query down
the right parsing track (SELECT versus utility-statement) early enough.
Making this work cleanly would be a bigger deal than I think you're
thinking.

But yeah, it would be nicer if the executor didn't have to worry about
this. +1 if you're willing to take on the work.

regards, tom lane


From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-01 19:18:41
Message-ID: 201112012018.41522.andres@anarazel.de
Lists: pgsql-hackers

On Thursday, December 01, 2011 07:21:25 PM Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On Tuesday, November 08, 2011 06:47:13 PM Dimitri Fontaine wrote:
> >> exception to that rule would be SELECT INTO and CREATE TABLE AS SELECT,
> >> and my proposal would be to add specific call sites to the functions
> >> I've provided in the attached patch rather than try to contort them into
> >> being a good citizen as a utility command.
> >
> > I would very much prefer to make them utility commands and rip out the
> > executor support for that.
> > Is there something making that especially hard that I overlooked? The
> > infrastructure for doing so seems to be there.
>
> Well, I think the main problem is going to be shunting the query down
> the right parsing track (SELECT versus utility-statement) early enough.
> Making this work cleanly would be a bigger deal than I think you're
> thinking.
Yes. I forgot how ugly SELECT INTO is...

> But yeah, it would be nicer if the executor didn't have to worry about
> this. +1 if you're willing to take on the work.
I won't promise anything but I will play around with the grammar and see if I
find a nice enough way.

Andres


From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-01 20:57:40
Message-ID: 201112012157.40912.andres@anarazel.de
Lists: pgsql-hackers

On Thursday, December 01, 2011 07:21:25 PM Tom Lane wrote:
> Well, I think the main problem is going to be shunting the query down
> the right parsing track (SELECT versus utility-statement) early enough.
> Making this work cleanly would be a bigger deal than I think you're
> thinking.
Obviously that depends on the definition of clean...

Changing the grammar to make that explicit seems to involve a bit too many
changes at first glance. The cheap way out seems to be to make the decision
in analyze.c:transformQuery.
Would that be an acceptable way forward?

Andres


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-01 21:02:18
Message-ID: CA+U5nM+8KSKx+VfdRGj3qmDSBaxRzir4RnzTqOFuQo6-3KBHqQ@mail.gmail.com
Lists: pgsql-hackers

On Sat, Nov 26, 2011 at 7:20 PM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:

> FWIW (scheduling mainly), I don't think I'm going to spend more time on
> this work until I get some comments, because all that remains to be done
> is about building on top of what I've already been doing here.

+1 overall

One of the main use cases for me is the ability to
* prevent all commands
* prevent extension commands - to maintain a stable execution environment

Those are especially important because in 9.2 DDL commands will cause
additional locking overheads, so preventing DDL will be essential to
keeping performance stable in high txn rate databases.

So I'd like to see a few more triggers that work out of the box for
those cases (perhaps wrapped by a function?). It would also allow a
more useful man page example of how to use this.

On the patch: the header comment for cmdtrigger.c needs updating.
Comments for DropCmdTrigger etc. look like you forgot to update them
after cut and paste.

Comments say "For any given command tag, you can have either Before
and After triggers, or
Instead Of triggers, not both.", which seems like a great thing to
put in the docs.

Looks good.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-02 02:09:55
Message-ID: 19167.1322791795@sss.pgh.pa.us
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> On Thursday, December 01, 2011 07:21:25 PM Tom Lane wrote:
>> Making this work cleanly would be a bigger deal than I think you're
>> thinking.

> Obviously that depends on the definition of clean...

> Changing the grammar to make that explicit seems to involve a bit too many
> changes on a first glance. The cheap way out seems to be to make the decision
> in analyze.c:transformQuery.
> Would that be an acceptable way forward?

ITYM transformStmt, but yeah, somewhere around there is probably reasonable.
The way I'm imagining this would work is that IntoClause disappears from
Query altogether: analyze.c would build a utility statement
CreateTableAs, pull the IntoClause out of the SelectStmt structure and
put it into the utility statement, and then proceed much as we do for
ExplainStmt (and for the same reasons).

regards, tom lane


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: Command Triggers
Date: 2011-12-02 21:01:47
Message-ID: 201112022201.47609.andres@anarazel.de
Lists: pgsql-hackers

Hi,

Hm. I just noticed a relatively big hole in the patch: the handling of
deletion of dependent objects is currently nonexistent, because those
deletions don't go through ProcessUtility...

I'm currently thinking about the best way to implement that nicely. For other
commands the original command string is passed - that obviously won't work in
this case...

Andres


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: Command Triggers
Date: 2011-12-02 22:09:47
Message-ID: 201112022309.47363.andres@anarazel.de
Lists: pgsql-hackers

Hi all,

There is also the point of how permission checks on the actual commands (as
opposed to checks on modifying the command triggers themselves) and such are
handled:

BEFORE and INSTEAD OF triggers will currently be called regardless of whether
the user is actually allowed to perform said action (which is inconsistent
with data triggers) and independent of whether the object they concern exists.

I wonder if anybody considers that a problem?

Andres


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: Command Triggers
Date: 2011-12-02 23:04:57
Message-ID: m262hylgme.fsf@2ndQuadrant.fr
Lists: pgsql-hackers

Hi,

First things first: thank you Andres for a great review, I do appreciate
it. Please find attached a newer version of the patch. The github
repository is also updated.

Andres Freund <andres(at)anarazel(dot)de> writes:
> I think at least a
> * command_tag text

Added.

> Why is there explicit documentation of not checking the arguments of said
> trigger function? That seems to be a bit strange to me.

The code is searching for a function with the given name and 5 text
arguments, whatever you say in the command. That could be changed later
on if there's a need.

> schemaname currently is mightily unusable because whether its sent or not
> depends wether the table was created with a schemaname specified or not. That
> doesn't seem to be a good approach.

I'm not sure about that; we're spitting out what the user entered.

> Imo the output should fully schema qualify anything including operators. Yes,
> thats rather ugly but anything else seems to be too likely to lead to
> problems.

Will look into qualifying names.

> As an alternative it would be possible to pass the current search path but
> that doesn't seem to the best approach to me;

The trigger runs with the same search_path settings as the command, right?

> Then there is nice stuff like CREATE TABLE .... (LIKE ...);
> What shall that return in future? I vote for showing it as the plain CREATE
> TABLE where all columns are specified.

I don't think so. Think about the main use cases for this patch:
auditing and replication. In both cases you want to deal with what the
user said, not what the system understood.

> I think it would sensible to allow multiple actions on which to trigger to be
> specified just as you can for normal triggers. I also would like an ALL option

Both are now implemented.

When dealing with an ANY command trigger, your procedure can get fired
on a command for which we don't have internal support for back-parsing
etc., so the command tag is the only argument that is not null. I think
that's the way to go here, as I don't think we need to implement full support
for command triggers on ALTER OPERATOR FAMILY from the get-go.

> Currently the grammar allows options to be passed to command triggers. Do we
> want to keep that? If yes, with the same arcane set of datatypes as in data
> triggers?
> If yes it needs to be wired up.

In fact that's not the case: you always get called with the same 5
arguments, all nullable now that we have ANY COMMAND support.

> * schema qualification:
> An option to add triggers only to a specific schema would be rather neat for
> many scenarios.
> I am not totally sure if its worth the hassle of defining what happens in the
> edge cases of e.g. moving from one into another schema though.

I had written down support for a WHEN clause in command triggers, but
the exact design has yet to be worked out. I'd like to have this
facility but I'd like it even more if that could happen in a later
patch.

> Currently there is no locking strategy at all. Which e.g. means that there is
> no protection against two sessions changing stuff concurrently or such.
>
> Was that just left out - which is ok - or did you miss that?

I left out the locking for now. I tried to fix some of it in the new
patch though, but I will need to spend more time on that down the road.

> Command triggers should only be allowed for the database owner.

Yes, that needs to happen soon, along with pg_dump and psql support.

> I contrast to data triggers command triggers cannot change what is done. They
> can either prevent it from running or not. Which imo is fine.
> The question I have is whether it would be a good idea to make it easy to
> prevent recursion.... Especially if the triggers wont be per schema.

You already have

ALTER TRIGGER foo ON COMMAND create table DISABLE;

that you can use from the trigger procedure itself. Of course, locking
is an issue if you want to run commands in parallel with
recursive triggers attached. I think we might be able to skip solving
that problem in the first run.
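
For example (a sketch; I'm assuming here that a matching ENABLE form exists),
the trigger procedure could bracket the work that would otherwise recurse:

ALTER TRIGGER foo ON COMMAND CREATE TABLE DISABLE;
-- run the statements that would otherwise fire the trigger again
ALTER TRIGGER foo ON COMMAND CREATE TABLE ENABLE;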

> Imo the current callsite in ProcessUtility isn't that good. I think it would
> make more sense moving it into standard_ProcessUtility. It should be *after*
> the check_xact_readonly check in there in my opinion.

Makes sense, will fix in the next revision.

> I am also don't think its a good idea to run the
> ExecBeforeOrInsteadOfCommandTriggers stuff there because thats allso the path
> that COMMIT/BEGIN and other stuff take. Those are pretty "fast path".
>
> I suggest making two switches in standard_ProcessUtility, one for the non-
> command trigger stuff which returns on success and one after that. Only after
> the first triggers are checked.

Agreed.

> Also youre very repeatedly transforming the parstree into a string. It would
> be better doing that only once instead of every trigger...

It was only done once for the BEFORE and INSTEAD OF triggers (you can't have
both on the same command), and another time for any and all AFTER
triggers, meaning at most twice. It's now down to only once, at most.

> * many functions should be static but are not. Especially in cmdtrigger.c

Fixed.

> * Either tgenable's datatype or its readfunc is wrong (watch the compiler ;))
> * ruleutils.c:
> * generic routine for IF EXISTS, CASCADE, ...

That will have to wait until the next revision.

Thank you again for a great review. Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

Attachment Content-Type Size
command-trigger.v3.patch.gz application/octet-stream 22.7 KB

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: Command Triggers
Date: 2011-12-03 00:09:48
Message-ID: 1322870898-sup-2292@alvh.no-ip.org
Lists: pgsql-hackers


Excerpts from Andres Freund's message of Fri Dec 02 19:09:47 -0300 2011:
> Hi all,
>
> There is also the point about how permission checks on the actual commands (in
> comparison of modifying command triggers) and such are handled:
>
> BEFORE and INSTEAD will currently be called independently of the fact whether
> the user is actually allowed to do said action (which is inconsistent with
> data triggers) and indepentent of whether the object they concern exists.
>
> I wonder if anybody considers that a problem?

Hmm, we currently even have a patch (or is it already committed?) to
avoid locking objects before we know the user has permission on the
object. Getting to the point of calling the trigger would surely be
even worse.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: Command Triggers
Date: 2011-12-03 00:16:47
Message-ID: CA+TgmoYKjniV1QtM_tM8C80FeTR_Z=d4peM3qOTMxQ10HModPg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Dec 2, 2011 at 7:09 PM, Alvaro Herrera
<alvherre(at)commandprompt(dot)com> wrote:
> Hmm, we currently even have a patch (or is it already committed?) to
> avoid locking objects before we know the user has permission on the
> object.  Getting to the point of calling the trigger would surely be
> even worse.

I committed a first round of cleanup in that area, but unfortunately
there is a lot more to be done.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)anarazel(dot)de>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: Command Triggers
Date: 2011-12-03 00:26:22
Message-ID: 201112030126.22485.andres@anarazel.de
Lists: pgsql-hackers

On Saturday, December 03, 2011 01:09:48 AM Alvaro Herrera wrote:
> Excerpts from Andres Freund's message of vie dic 02 19:09:47 -0300 2011:
> > Hi all,
> >
> > There is also the point about how permission checks on the actual
> > commands (in comparison of modifying command triggers) and such are
> > handled:
> >
> > BEFORE and INSTEAD will currently be called independently of the fact
> > whether the user is actually allowed to do said action (which is
> > inconsistent with data triggers) and indepentent of whether the object
> > they concern exists.
> >
> > I wonder if anybody considers that a problem?
>
> Hmm, we currently even have a patch (or is it already committed?) to
> avoid locking objects before we know the user has permission on the
> object. Getting to the point of calling the trigger would surely be
> even worse.
Well, calling the trigger won't allow them to lock the object. It doesn't
even confirm the existence of the table.

Andres


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-03 13:23:48
Message-ID: m2mxb9kcuz.fsf@2ndQuadrant.fr
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> Those are especially important because in 9.2 DDL commands will cause
> additional locking overheads, so preventing DDL will be essential to
> keeping performance stable in high txn rate databases.

The patch now implements "any command" triggers, and you have the
command tag to branch on in the function if you need to.

CREATE TRIGGER noddl
INSTEAD OF ANY COMMAND
EXECUTE PROCEDURE cancel_any_ddl();
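
where cancel_any_ddl() could be as simple as this sketch (the argument order
shown here is an assumption):

CREATE FUNCTION cancel_any_ddl
  (cmd_tag text, cmd_string text, cmd_nodestring text,
   schemaname text, objectname text)
RETURNS void
LANGUAGE plpgsql
AS $$
BEGIN
  -- the INSTEAD OF trigger replaces the command; raising makes that visible
  RAISE EXCEPTION 'DDL is disabled on this node (%)', cmd_tag;
END;
$$;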

> So I'd like to see a few more triggers that work out of the box for
> those cases (perhaps wrapped by a function?). It would also allow a
> more useful man page example of how to use this.

We could solve that by providing an extension implementing ready-to-use
command triggers. One that allows you to easily switch the ability to run
commands on and off seems like a good candidate.

That said, with the current coding, you can't have both an INSTEAD OF
trigger on "any command" and a BEFORE or AFTER trigger on any given
command, so installing that extension would prevent any other use of
command triggers.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: Command Triggers
Date: 2011-12-03 18:17:14
Message-ID: 201112031917.14709.andres@anarazel.de
Lists: pgsql-hackers

On Saturday, December 03, 2011 12:04:57 AM Dimitri Fontaine wrote:
> Hi,
>
> First thing first: thank you Andres for a great review, I do appreciate
> it. Please find attached a newer version of the patch. The github
> repository is also updated.
>
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > I think at least a
> > * command_tag text
>
> Added.
>
> > Why is there explicit documentation of not checking the arguments of said
> > trigger function? That seems to be a bit strange to me.
> The code is searching for a function with the given name and 5 text
> arguments, whatever you say in the command. That could be changed later
> on if there's a need.
Forget it, I was too dumb to remember the code when I wrote the above ;)

> > schemaname currently is mightily unusable because whether its sent or not
> > depends wether the table was created with a schemaname specified or not.
> > That doesn't seem to be a good approach.
> I'm not sure about that, we're spiting out what the user entered.
No, we're not. We're writing out something that is semantically the same as
what the user entered. In several places NULL/NOT NULL and such get added.

> > As an alternative it would be possible to pass the current search path
> > but that doesn't seem to the best approach to me;
> The trigger runs with the same search_path settings as the command, right?
For many things it makes sense to specify the search path of a function upon
creation of the function. Even more so if you have security definer
functions...
Otherwise you can get into trouble via operators and such...

> > Then there is nice stuff like CREATE TABLE .... (LIKE ...);
> > What shall that return in future? I vote for showing it as the plain
> > CREATE TABLE where all columns are specified.
> I don't think so. Think about the main use cases for this patch,
> auditing and replication. Both cases you want to deal with what the
> user said, not what the system understood.
Is that so? In the replication case the origin table may very well look
different on the master compared to the standby, which is nearly impossible
to diagnose with the above behaviour.

> > I think it would sensible to allow multiple actions on which to trigger
> > to be specified just as you can for normal triggers. I also would like
> > an ALL option
> Both are now implemented.
Cool, will check.

> When dealing with an ANY command trigger, your procedure can get fired
> on a command for which we don't have internal support for back parsing
> etc, so you only get the command tag not null. I think that's the way to
> go here, as I don't want to think we need to implement full support for
> command triggers on ALTER OPERATOR FAMILY from the get go.
That's ok with me.

> > Currently the grammar allows options to be passed to command triggers. Do
> > we want to keep that? If yes, with the same arcane set of datatypes as
> > in data triggers?
> > If yes it needs to be wired up.
> In fact it's not the case, you always get called with the same 5
> arguments, all nullable now that we have ANY COMMAND support.
I think you misunderstood me. Check the following excerpt from gram.y:
CreateCmdTrigStmt:
CREATE TRIGGER name TriggerActionTime COMMAND trigger_command_list
EXECUTE PROCEDURE func_name '(' TriggerFuncArgs ')'

You have TriggerFuncArgs there but you ignore them.

> > * schema qualification:
> > An option to add triggers only to a specific schema would be rather neat
> > for many scenarios.
> > I am not totally sure if its worth the hassle of defining what happens in
> > the edge cases of e.g. moving from one into another schema though.
> I had written down support for a WHEN clause in command triggers, but
> the exact design has yet to be worked out. I'd like to have this
> facility but I'd like it even more if that could happen in a later
> patch.
Hm. I am not sure a generic WHEN clause is needed. In my opinion schema
qualification would be enough...
A later patch is fine imo.

> > I contrast to data triggers command triggers cannot change what is done.
> > They can either prevent it from running or not. Which imo is fine.
> > The question I have is whether it would be a good idea to make it easy to
> > prevent recursion.... Especially if the triggers wont be per schema.
>
> You already have
>
> ALTER TRIGGER foo ON COMMAND create table DISABLE;
>
> that you can use from the trigger procedure itself. Of course, locking
> is an issue if you want to go parallel on running commands with
> recursive triggers attached. I think we might be able to skip solving
> that problem in the first run.
Hm. Not yet convinced. I need to think about it a bit.

Have fun ;)

Andres


From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-03 21:57:49
Message-ID: 201112032257.49396.andres@anarazel.de
Lists: pgsql-hackers

On Friday, December 02, 2011 03:09:55 AM Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On Thursday, December 01, 2011 07:21:25 PM Tom Lane wrote:
> >> Making this work cleanly would be a bigger deal than I think you're
> >> thinking.
> >
> > Obviously that depends on the definition of clean...
> >
> > Changing the grammar to make that explicit seems to involve a bit too
> > many changes on a first glance. The cheap way out seems to be to make
> > the decision in analyze.c:transformQuery.
> > Would that be an acceptable way forward?
>
> ITYM transformStmt, but yeah, somewhere around there is probably
> reasonable. The way I'm imagining this would work is that IntoClause
> disappears from Query altogether: analyze.c would build a utility
> statement
> CreateTableAs, pull the IntoClause out of the SelectStmt structure and
> put it into the utility statement, and then proceed much as we do for
> ExplainStmt (and for the same reasons).
Ok. Because my book turned out to be boring I started looking at this. I
wonder if it wouldn't be better to do the check directly in raw_parser(). The
advantage would be that it would magically fix the issue of not logging
CTAS/select when using log_statement = ddl, and we don't need a snapshot or
anything, so that shouldn't be a problem.

Any arguments against doing so?

Andres


From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-04 01:09:08
Message-ID: 201112040209.09333.andres@anarazel.de
Lists: pgsql-hackers

On Friday, December 02, 2011 03:09:55 AM Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On Thursday, December 01, 2011 07:21:25 PM Tom Lane wrote:
> >> Making this work cleanly would be a bigger deal than I think you're
> >> thinking.
> >
> > Obviously that depends on the definition of clean...
> >
> > Changing the grammar to make that explicit seems to involve a bit too
> > many changes on a first glance. The cheap way out seems to be to make
> > the decision in analyze.c:transformQuery.
> > Would that be an acceptable way forward?
>
> ITYM transformStmt, but yeah, somewhere around there is probably
> reasonable. The way I'm imagining this would work is that IntoClause
> disappears from Query altogether: analyze.c would build a utility
> statement
> CreateTableAs, pull the IntoClause out of the SelectStmt structure and
> put it into the utility statement, and then proceed much as we do for
> ExplainStmt (and for the same reasons).
I have a patch which basically does all this minus some cleanup. I currently
do the differentiation for SELECT INTO (not CREATE TABLE AS, that's done in
gram.y) in raw_parser, but that's quick to move if others don't agree on that.

I have two questions now:

First, does anybody think it would be worth getting rid of the duplication
from OpenIntoRel (formerly from execMain.c) with regard to DefineRelation()?
I noticed that there already is some divergence between the two. E.g. CREATE
TABLE frak TABLESPACE pg_global AS SELECT 1; is possible while it would be
forbidden via a plain CREATE TABLE. (I will send a fix for this separately.)

Secondly, I am currently wondering whether it would be a good idea to use the
ModifyTable infrastructure for doing the insertion instead of a dedicated
DestReceiver that's only used for CTAS.
It looks a bit too complicated to do this without removing the bulk insert and
HEAP_INSERT_SKIP_WAL optimizations.

Andres


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-04 16:34:44
Message-ID: 23507.1323016484@sss.pgh.pa.us
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> I have two questions now:

> First, does anybody think it would be worth getting rid of the duplication
> from OpenIntoRel (formerly from execMain.c) in regard to DefineRelation()?

That's probably reasonable to do, since as you say it would remove the
opportunity for bugs-of-omission in the CTAS table creation step.
OTOH, if you find yourself having to make any significant changes to
DefineRelation, then maybe not.

> Secondly, I am currently wondering whether it would be a good idea to use the
> ModifyTable infrastructure for doing the insertion instead an own DestReceiver
> infrastructure thats only used for CTAS.

I think this is probably a bad idea; it will complicate matters and buy
little. There's not a lot of stuff needed for the actual data insertion
step, since we know the table can't have any defaults, constraints,
triggers, etc as yet.

regards, tom lane


From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-04 19:01:34
Message-ID: 201112042001.34822.andres@anarazel.de
Lists: pgsql-hackers

Hi,

Attached is a first version of the patch.

On Sunday, December 04, 2011 05:34:44 PM Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > I have two questions now:
> >
> > First, does anybody think it would be worth getting rid of the
> > duplication from OpenIntoRel (formerly from execMain.c) in regard to
> > DefineRelation()?
>
> That's probably reasonable to do, since as you say it would remove the
> opportunity for bugs-of-omission in the CTAS table creation step.
> OTOH, if you find yourself having to make any significant changes to
> DefineRelation, then maybe not.
Building a CreateStmt seems to work well enough so far.

The only problem I have found with that approach so far is this:

CREATE TABLE collate_test2 (
a int,
b text COLLATE "POSIX"
);

CREATE TABLE collate_test1 (
a int,
b text COLLATE "C" NOT NULL
);

CREATE TABLE test_u AS SELECT a, b FROM collate_test1
  UNION ALL SELECT a, b FROM collate_test2; -- fail

failed with:
ERROR: no collation was derived for column "b" with collatable type text
HINT: Use the COLLATE clause to set the collation explicitly.

"works" now.

I am currently setting ColumnDef.collOid of new columns to the attcollation
of the QueryDesc's column. Unfortunately they have a different meaning...

> > Secondly, I am currently wondering whether it would be a good idea to use
> > the ModifyTable infrastructure for doing the insertion instead an own
> > DestReceiver infrastructure thats only used for CTAS.
> I think this is probably a bad idea; it will complicate matters and buy
> little. There's not a lot of stuff needed for the actual data insertion
> step, since we know the table can't have any defaults, constraints,
> triggers, etc as yet.
I got to the same conclusion.

Remaining problems are:
* how to tell ExecContextForcesOids which oid we want
* implementing CREATE TABLE AS ... EXECUTE without duplicating ExecuteQuery
* the attcollation setting problems from above
* returning better error messages for using INTO in places where it's not allowed

Comments about the direction of the patch?

Andres

Attachment Content-Type Size
0001-Transform-CREATE-TABLE-AS-SELECT-INTO-into-a-utility.patch text/x-patch 62.7 KB

From: Ross Reedstrom <reedstrm(at)rice(dot)edu>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: Command Triggers
Date: 2011-12-05 15:29:53
Message-ID: 20111205152953.GA21009@rice.edu
Lists: pgsql-hackers

On Sat, Dec 03, 2011 at 01:26:22AM +0100, Andres Freund wrote:
> On Saturday, December 03, 2011 01:09:48 AM Alvaro Herrera wrote:
> > Excerpts from Andres Freund's message of vie dic 02 19:09:47 -0300 2011:
> > > Hi all,
> > >
> > > There is also the point about how permission checks on the actual
> > > commands (in comparison of modifying command triggers) and such are
> > > handled:
> > >
> > > BEFORE and INSTEAD will currently be called independently of the fact
> > > whether the user is actually allowed to do said action (which is
> > > inconsistent with data triggers) and indepentent of whether the object
> > > they concern exists.
> > >
> > > I wonder if anybody considers that a problem?
> >
> > Hmm, we currently even have a patch (or is it already committed?) to
> > avoid locking objects before we know the user has permission on the
> > object. Getting to the point of calling the trigger would surely be
> > even worse.
> Well, calling the trigger won't allow them to lock the object. It doesn't even
> confirm the existance of the table.
>
Didn't I see a discussion in passing about the possibility of using these
command triggers to implement some aspects of se-pgsql? In that case, you'd
need the above behavior.

Ross
--
Ross Reedstrom, Ph.D. reedstrm(at)rice(dot)edu
Systems Engineer & Admin, Research Scientist phone: 713-348-6166
Connexions http://cnx.org fax: 713-348-3665
Rice University MS-375, Houston, TX 77005
GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E F888 D3AE 810E 88F0 BEDE


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-05 21:15:20
Message-ID: m2ipluhg9j.fsf@2ndQuadrant.fr
Lists: pgsql-hackers

Hi,

Please find an update attached, v4, fixing most remaining items. Next
steps are better docs and support for more commands (along with finishing
the currently supported ones), and a review of the locking behavior.

If you want to just scroll over the patch to get an impression of what's
in there rather than try out the attachment, follow this URL:

https://github.com/dimitri/postgres/compare/master...command_triggers

Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr> writes:
> Will look into qualifying names.

I'm now qualifying relation names even if they have not been entered
with a namespace qualifier. What do you think? The other components
are left alone; I think the internal APIs for qualifying all kinds of
objects from the parse tree and current context are mostly missing.

>> As an alternative it would be possible to pass the current search path but
>> that doesn't seem to the best approach to me;
>
> The trigger runs with the same search_path settings as the command, right?

Maybe that's good enough for command triggers?

>> Command triggers should only be allowed for the database owner.
>
> Yes, that needs to happen soon, along with pg_dump and psql support.

All three are implemented in the attached new revision of the patch.

>> Imo the current callsite in ProcessUtility isn't that good. I think it would
>> make more sense moving it into standard_ProcessUtility. It should be *after*
>> the check_xact_readonly check in there in my opinion.
>
> Makes sense, will fix in the next revision.

Done too. It's better this way, thank you.

>> I am also don't think its a good idea to run the
>> ExecBeforeOrInsteadOfCommandTriggers stuff there because thats allso the path
>> that COMMIT/BEGIN and other stuff take. Those are pretty "fast path".
>>
>> I suggest making two switches in standard_ProcessUtility, one for the non-
>> command trigger stuff which returns on success and one after that. Only after
>> the first triggers are checked.
>
> Agreed.

That's about the way I've done it. Please note that doing it this way
means that a ProcessUtility_hook can decide whether or not the command
triggers are going to be fired, and that's true for BEFORE, AFTER
and INSTEAD OF command triggers. I think that's the way to go, though.

>> * Either tgenable's datatype or its readfunc is wrong (watch the compiler ;))
>> * ruleutils.c:
>> * generic routine for IF EXISTS, CASCADE, ...
>
> Will have to wait until next revision.

Fixed. Well, the generic routine would only be called twice and would
only share a rather short expression, so that will have to wait until I add
support for more commands.

There's a regression test gotcha: the triggers test runs in parallel with
the inheritance tests, which makes it impossible to predict whether the
trigger on the CREATE TABLE command will spit out a notice in the
inherit tests. I don't know how that is usually avoided, but I guess it
involves picking some other command example that doesn't conflict with the
parallel tests of that section?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

Attachment Content-Type Size
command-trigger.v4.patch.gz application/octet-stream 26.9 KB

From: Andres Freund <andres(at)anarazel(dot)de>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: Command Triggers
Date: 2011-12-11 03:26:01
Message-ID: 201112110426.01570.andres@anarazel.de
Lists: pgsql-hackers

Hi Peter,

On Sunday, December 04, 2011 08:01:34 PM Andres Freund wrote:
> On Sunday, December 04, 2011 05:34:44 PM Tom Lane wrote:
> > Andres Freund <andres(at)anarazel(dot)de> writes:
> > > I have two questions now:
> > >
> > > First, does anybody think it would be worth getting rid of the
> > > duplication from OpenIntoRel (formerly from execMain.c) in regard to
> > > DefineRelation()?
> >
> > That's probably reasonable to do, since as you say it would remove the
> > opportunity for bugs-of-omission in the CTAS table creation step.
> > OTOH, if you find yourself having to make any significant changes to
> > DefineRelation, then maybe not.
> Building a CreateStmt seems to work well enough so far.
> The only problem with that approach so far that I found is that:

> CREATE TABLE collate_test2
> (
> a int,
> b text COLLATE "POSIX"
> );
>
> CREATE TABLE collate_test1
> (
> a int,
> b text COLLATE "C" NOT NULL
> );
>
> CREATE TABLE test_u AS SELECT a, b FROM collate_test1 UNION ALL SELECT a,
> b FROM collate_test2; -- fail
>
> failed with:
> ERROR: no collation was derived for column "b" with collatable type text
> HINT: Use the COLLATE clause to set the collation explicitly.
> "works" now.
Could you explain why the above should fail? After all the UNION is valid
outside the CREATE TABLE and you can even sort on b.

That tidbit is the only thing I couldn't quickly solve since the last
submission...

Andres


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: Command Triggers
Date: 2011-12-11 18:55:39
Message-ID: m2pqfv9bv8.fsf@2ndQuadrant.fr
Lists: pgsql-hackers

Hi,

Andres Freund <andres(at)anarazel(dot)de> writes:
> Hm. I just noticed a relatively big hole in the patch: The handling of
> deletion of dependent objects currently is nonexistant because they don't go
> through ProcessUtility...

That's the reason why we're talking about “command triggers” rather than
“DDL triggers”. We don't intend to fire the triggers at each DDL
operation happening on the server, but for each command.

This restriction still allows us to provide a very useful feature when
checked against the main use cases we target here. Those are auditing,
replication (the replay will also CASCADE), and a generic enough
SUDO facility (because the trigger function can well be SECURITY
DEFINER).

We could also add a 'cascading bool' parameter to the trigger function
API and have it always false in 9.2, then choose what to fill the
other parameters with in a later release. The obvious risk would be to
later decide that we need a different API, in which case we wouldn't have
made a good move after all.

My current feeling and vote is thus to leave that alone and document the
restriction.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: Command Triggers
Date: 2011-12-11 19:09:36
Message-ID: 1323630576.24785.2.camel@vanquo.pezone.net
Lists: pgsql-hackers

On Sun, 2011-12-11 at 04:26 +0100, Andres Freund wrote:
> > Building a CreateStmt seems to work well enough so far.
> > The only problem with that approach so far that I found is that:
>
> > CREATE TABLE collate_test2
> > (
> > a int,
> > b text COLLATE "POSIX"
> > );
> >
> > CREATE TABLE collate_test1
> > (
> > a int,
> > b text COLLATE "C" NOT NULL
> > );
> >
> > CREATE TABLE test_u AS SELECT a, b FROM collate_test1 UNION ALL SELECT a,
> > b FROM collate_test2; -- fail
> >
> > failed with:
> > ERROR: no collation was derived for column "b" with collatable type text
> > HINT: Use the COLLATE clause to set the collation explicitly.
> > "works" now.
> Could you explain why the above should fail? After all the UNION is valid
> outside the CREATE TABLE and you can even sort on b.

That would be strange, because earlier in the test file there is also

SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test2 ORDER BY 2; -- fail

The union itself is valid, but because it combines two different
collations, the collation derivation for the column is "unknown", and so
it cannot be ordered. And we made the implementation decision to not
allow creating columns with unknown collation.
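
For instance, something along these lines (untested sketch) should go through, because an explicit COLLATE takes precedence over the conflicting implicit collations and gives the output column a well-defined collation:

CREATE TABLE test_u AS
  SELECT a, b COLLATE "C" AS b FROM collate_test1
  UNION ALL
  SELECT a, b FROM collate_test2;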


From: Andres Freund <andres(at)anarazel(dot)de>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: Command Triggers
Date: 2011-12-11 22:46:07
Message-ID: 201112112346.07611.andres@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sunday, December 11, 2011 08:09:36 PM Peter Eisentraut wrote:
> On sön, 2011-12-11 at 04:26 +0100, Andres Freund wrote:
> > > Building a CreateStmt seems to work well enough so far.
> > > The only problem with that approach so far that I found is that:
> > >
> > > CREATE TABLE collate_test2
> > > (
> > >
> > > a int,
> > >
> > > b text COLLATE "POSIX"
> > >
> > > );
> > >
> > > CREATE TABLE collate_test1
> > > (
> > >
> > > a int,
> > >
> > > b text COLLATE "C" NOT NULL
> > >
> > > );
> > >
> > > CREATE TABLE test_u AS SELECT a, b FROM collate_test1 UNION ALL SELECT
> > > a, b FROM collate_test2; -- fail
> > >
> > > failed with:
> > > ERROR: no collation was derived for column "b" with collatable type
> > > text HINT: Use the COLLATE clause to set the collation explicitly.
> > > "works" now.
> >
> > Could you explain why the above should fail? After all the UNION is valid
> > outside the CREATE TABLE and you can even sort on b.
>
> That would be strange, because earlier in the test file there is also
>
> SELECT a, b FROM collate_test1 UNION ALL SELECT a, b FROM collate_test2
> ORDER BY 2; -- fail
Yeah. I didn't realize that it only fails during execution if there actually are rows present (confusing behaviour, btw).

> The union itself is valid, but because it combines two different
> collations, the collation derivation for the column is "unknown", and so
> it cannot be ordered. And we made the implementation decision to not
> allow creating columns with unknown collation.
I copied that behaviour. It's a bit complicated by the fact that GetColumnDefCollation cannot be taught to accept an InvalidOid, so I had to duplicate the check in CreateTableAs...

I attached what is, from my side, the final version of the patch. I dislike two things about it:
* code duplication due to error handling. Before adding the error messages for the various illegal SELECT INTOs, the patch actually shrank the code size... If anybody has a good idea for avoiding the duplication of that loop around SelectStmt->ops, I would be happy.
* new executor flags to define whether oids should be returned

Andres

Attachment Content-Type Size
0001-Transform-CREATE-TABLE-AS-SELECT-INTO-into-a-utility.patch text/x-patch 75.7 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-12 16:32:45
Message-ID: CA+TgmoZ2WrhL1kFg=A86NDUb=vrjNZd2h3Wxp47=QadOaEFqpg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Dec 11, 2011 at 1:55 PM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
>> Hm. I just noticed a relatively big hole in the patch: The handling of
>> deletion of dependent objects currently is nonexistant because they don't go
>> through ProcessUtility...
>
> That's the reason why we're talking about “command triggers” rather than
> “DDL triggers”.  We don't intend to fire the triggers at each DDL
> operation happening on the server, but for each command.
>
> This restriction still allows us to provide a very useful feature when
> checked against the main use cases we target here. Those are auditing,
> and replication (the replay will also CASCADEs), and a generic enough
> SUDO facility (because the trigger function can well be SECURITY
> DEFINER).

I haven't yet thought about your specific proposal here in enough depth to have a fully-formed opinion, but I am a little nervous that this may
turn out to be one of those cases where the obvious API ends up
working less well than might have been supposed. For example,
consider the "auditing" use case, and suppose that the user wants to
log a message (to a table, or some other separate logging facility)
every time someone creates an index. In order to do that, they're
going to need to trap not only CREATE INDEX but also CREATE TABLE and
ALTER CONSTRAINT, and in the latter two cases they're then going to have to write some grotty logic to grovel through the statement
or its node tree in order to figure out what indexes are being
created. Also, if they want to log the name of the new index in cases
where the user doesn't specify it, they're going to have to duplicate
the backend logic which generates the index name, which is going to be
a pain in the ass and a recipe for future bugs (e.g. because we might
change the algorithm at some point, or the trigger might see a
different view of the world than the actual command, due to
intervening DDL).

This is something that has come up a lot in the context of sepgsql:
it's not really enough to know what the toplevel command is in some
cases, because whether or not the operation is allowed depends on
exactly what it's doing, which depends on things like which schema
gets used to create the table, and which tablespaces get used to
create the table and its indexes, and perhaps (given Peter's pending
work) what types are involved. You can deduce all of these things
from the command text, but it requires duplicating nontrivial amounts
of backend code, which is undesirable from a maintainability point of
view. I think the same issues are likely to arise in the context of
auditing, as in the example above, and even for replication: if, for
example, the index name that is chosen on the master happens to exist
on the standby with a different definition, replaying the actual
command text might succeed or fail depending on how the command is
phrased. Maybe it's OK for the slave to just choose a different name
for that index, but now a subsequent DROP INDEX command that gets
replayed across all servers might end up dropping logically unrelated
indexes on different machines.

Now, on the other hand, the conceptual simplicity of this approach is
very appealing, and I can certainly see people who would never
consider writing a ProcessUtility_hook using this type of facility.
However, I'm worried that it will be difficult to use as a foundation
for robust, general-purpose tools. Is that enough reason to reject
the whole approach? I'm not sure.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-12 16:45:54
Message-ID: 1323707987-sup-1264@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Excerpts from Robert Haas's message of Mon Dec 12 13:32:45 -0300 2011:
> On Sun, Dec 11, 2011 at 1:55 PM, Dimitri Fontaine
> <dimitri(at)2ndquadrant(dot)fr> wrote:
> > Andres Freund <andres(at)anarazel(dot)de> writes:
> >> Hm. I just noticed a relatively big hole in the patch: The handling of
> >> deletion of dependent objects currently is nonexistant because they don't go
> >> through ProcessUtility...
> >
> > That's the reason why we're talking about “command triggers” rather than
> > “DDL triggers”.  We don't intend to fire the triggers at each DDL
> > operation happening on the server, but for each command.
> >
> > This restriction still allows us to provide a very useful feature when
> > checked against the main use cases we target here. Those are auditing,
> > and replication (the replay will also CASCADEs), and a generic enough
> > SUDO facility (because the trigger function can well be SECURITY
> > DEFINER).
>
> I haven't yet thought about your specific proposal here in enough to
> have a fully-formed opinion, but I am a little nervous that this may
> turn out to be one of those cases where the obvious API ends up
> working less well than might have been supposed. For example,
> consider the "auditing" use case, and suppose that the user wants to
> log a message (to a table, or some other separate logging facility)
> every time someone creates an index. In order to do that, they're
> going to need to trap not only CREATE INDEX but also CREATE TABLE and
> ALTER CONSTRAINT,

Yeah. I remember mentioning the ability of CREATE SCHEMA to embed all sorts of object creation commands in a single top-level command, and
being handwaved away by Jan. "Nobody knows about that", I was told.
However, if this is to work in a robust way, these things should not be
ignored. In particular, firing triggers on each top-level command _is_
going to get some objects ignored; the people writing the triggers _are_
going to forget to handle CREATE SCHEMA properly most of the time.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-12 16:49:59
Message-ID: 201112121750.00031.andres@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Monday, December 12, 2011 05:32:45 PM Robert Haas wrote:
> On Sun, Dec 11, 2011 at 1:55 PM, Dimitri Fontaine
>
> <dimitri(at)2ndquadrant(dot)fr> wrote:
> > Andres Freund <andres(at)anarazel(dot)de> writes:
> >> Hm. I just noticed a relatively big hole in the patch: The handling of
> >> deletion of dependent objects currently is nonexistant because they
> >> don't go through ProcessUtility...
> >
> > That's the reason why we're talking about “command triggers” rather than
> > “DDL triggers”. We don't intend to fire the triggers at each DDL
> > operation happening on the server, but for each command.
> >
> > This restriction still allows us to provide a very useful feature when
> > checked against the main use cases we target here. Those are auditing,
> > and replication (the replay will also CASCADEs), and a generic enough
> > SUDO facility (because the trigger function can well be SECURITY
> > DEFINER).
>
> I haven't yet thought about your specific proposal here in enough to
> have a fully-formed opinion, but I am a little nervous that this may
> turn out to be one of those cases where the obvious API ends up
> working less well than might have been supposed.
What are your thoughts about a "not-obvious API"?

> For example,
> consider the "auditing" use case, and suppose that the user wants to
> log a message (to a table, or some other separate logging facility)
> every time someone creates an index. In order to do that, they're
> going to need to trap not only CREATE INDEX but also CREATE TABLE and
> ALTER CONSTRAINT, and in the latter two cases they're then going to
> have to then write some grotty logic to grovel through the statement
> or its node tree in order to figure out what indexes are being
> created. Also, if they want to log the name of the new index in cases
> where the user doesn't specify it, they're going to have to duplicate
> the backend logic which generates the index name, which is going to be
> a pain in the ass and a recipe for future bugs (e.g. because we might
> change the algorithm at some point, or the trigger might see a
> different view of the world than the actual command, due to
> intervening DDL).
I thought about using transformed statements before passing them to the triggers, to avoid exactly that issue. That would make stuff like CREATE TABLE blub (LIKE ...), constraints and such transparent. It would mean quite a bit of additional effort, though (because for several statements the transformation is not really separated out).

> This is something that has come up a lot in the context of sepgsql:
> it's not really enough to know what the toplevel command is in some
> cases, because whether or not the operation is allowed depends on
> exactly what it's doing, which depends on things like which schema
> gets used to create the table, and which tablespaces get used to
> create the table and its indexes, and perhaps (given Peter's pending
> work) what types are involved. You can deduce all of these things
> from the command text, but it requires duplicating nontrivial amounts
> of backend code, which is undesirable from a maintainability point of
> view. I think the same issues are likely to arise in the context of
> auditing, as in the example above, and even for replication: if, for
> example, the index name that is chosen on the master happens to exist
> on the standby with a different definition, replaying the actual
> command text might succeed or fail depending on how the command is
> phrased. Maybe it's OK for the slave to just choose a different name
> for that index, but now a subsequent DROP INDEX command that gets
> replayed across all servers might end up dropping logically unrelated
> indexes on different machines.
Well, you shouldn't grovel through the text - you do get passed the parsetree, which should make accessing that far easier. I can imagine somebody writing some shiny layer which lets you grovel through that with some XPath-like API.

> Now, on the other hand, the conceptual simplicity of this approach is
> very appealing, and I can certainly see people who would never
> consider writing a ProcessUtility_hook using this type of facility.
> However, I'm worried that it will be difficult to use as a foundation
> for robust, general-purpose tools. Is that enough reason to reject
> the whole approach? I'm not sure.
That's where I am a bit unsure as well. I think the basic approach is sound, but it might need some work in several areas (transformed statements, dependencies, ...). On the other hand: perfect is the enemy of good...

Having looked at it, I don't think dependency handling is that much effort. I haven't looked at separating out the transformation phase yet.

Andres


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-12 17:12:25
Message-ID: 877h21ww7a.fsf@hi-media-techno.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> Yeah. I remember mentioning the ability of CREATE SCHEMA to embed all
> sort of object creation commands in a single top-level command, and
> being handwaved away by Jan. "Nobody knows about that", I was told.

In fact the CREATE SCHEMA implementation fires ProcessUtility once per element; see src/backend/commands/schemacmds.c:122:

* Execute each command contained in the CREATE SCHEMA. Since the grammar
* allows only utility commands in CREATE SCHEMA, there is no need to pass
* them through parse_analyze() or the rewriter; we can just hand them
* straight to ProcessUtility.
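
For example, each sub-command of something like the following (made-up object names) is handed to ProcessUtility in turn, so each of them would fire its own command trigger:

CREATE SCHEMA audit
    CREATE TABLE log (id integer, msg text)
    CREATE VIEW recent_log AS SELECT * FROM log WHERE id > 100;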

> However, if this is to work in a robust way, these things should not be
> ignored. In particular, firing triggers on each top-level command _is_
> going to get some objects ignored; the people writing the triggers _are_
> going to forget to handle CREATE SCHEMA properly most of the time.

My current thinking is that the tool as proposed in the patch is already useful enough, and that we should cover its limitations in the documentation. We should also be fairly explicit about the nodeToString() output, which is where you find almost everything you need.

Down the road, we might either decide that what we've done covers enough needs (after all, we managed without the feature for years) or decide to have the internal command implementations (a create index run from inside a create table statement, say) go through ProcessUtility.

Also note that currently, anyone motivated enough to write a ProcessUtility hook (in C) faces the same problem: the hook will not get called for any event where the command trigger would not get called either.

So while I understand your concerns here, I think they are rather theoretical: with the existing patch we can already add support for CREATE TABLE, ALTER TABLE, and DROP TABLE to Slony, Londiste and Bucardo, and also implement CREATE EXTENSION whitelisting so that non-superusers can install selected C-coded extensions.
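
To make the whitelisting idea concrete, here is a rough, untested sketch (made-up table and function names, argument list per the patch's proposal, and assuming a BEFORE COMMAND variant of the proposed grammar):

CREATE TABLE extension_whitelist(extname text PRIMARY KEY);

CREATE FUNCTION check_extension_whitelist
(
  IN cmd_string text,
  IN cmd_nodestring text,
  IN schemaname text,
  IN objectname text
)
RETURNS void
LANGUAGE plpgsql SECURITY DEFINER
AS $$
BEGIN
  -- objectname is assumed to carry the extension name here
  IF NOT EXISTS (SELECT 1 FROM extension_whitelist
                  WHERE extname = objectname)
  THEN
    RAISE EXCEPTION 'extension "%" is not whitelisted', objectname;
  END IF;
END;
$$;

CREATE TRIGGER extension_whitelist_check
  BEFORE COMMAND CREATE EXTENSION
  EXECUTE PROCEDURE check_extension_whitelist();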

FWIW, those are the two main use cases I'm after.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-12 19:19:45
Message-ID: CA+TgmoYfO6Nq4LRRh6LCeT8ATJFTuuvQYrL3_L5=6nH7FCY=Rg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Dec 12, 2011 at 11:49 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> I haven't yet thought about your specific proposal here in enough to
>> have a fully-formed opinion, but I am a little nervous that this may
>> turn out to be one of those cases where the obvious API ends up
>> working less well than might have been supposed.
> What are your thoughts about a "not-obvious api"?

It seems to me (and it may seem differently to other people) that what
most people who want to trap DDL events really want to do is either
(a) get control at the permissions-checking stage so that they can
override the system's default access control policy, either to allow
something that would normally be denied (as in Dimitri's sudo example)
or deny something that would normally be allowed (as with sepgsql, or
the example in the documentation for Dimitri's patch showing how to
enforce relation naming conventions) or else (b) perform some
additional steps after completing the operation (e.g. audit it,
replicate it, apply a security label to it). Much of the
selinux-related work that KaiGai Kohei and I have been doing over the
last couple of years has revolved around where to put those hooks and
what information to pass to them.

For example, we added some post-creation hooks where the system
basically says "we just created a <object-type> and its OID is
<newly-assigned-oid>". I'm not going to present this as a model of
usability (because it isn't). However, I think the underlying model
is worth considering. It's not a command trigger, because it doesn't
care what command the user executed that led to the object getting
created. It's more like an event system - whenever a table gets
created (no matter how exactly that happens), you get a callback. I
think that's probably closer to what people want for the
additional-steps-after-completing-the-operation use case. Typically,
you're not really interested in what the user typed: you want to know
what the system decided to do about it.

The "get control at the permissions checking stage" use case is a bit
harder. We've added some hooks for that as well, but really only for
DML thus far, and I'm not convinced we've got the right design even
there. Part of the complexity here is that there are actually a lot
of different permissions rules depending on exactly what operation you
are performing - you might think of REINDEX TABLE, ALTER TABLE ..
ALTER COLUMN .. RENAME and DROP TABLE as requiring the same
permissions (i.e. ownership) but in fact they're all slightly
different. It would be nice if the permissions checking machinery
were better separated from the code that actually does the work, so
that you could rip and replace just that part. Dimitri's idea of
using a security definer function as a DDL instead-of trigger is an
interesting one, but does that leave newly created objects owned by
the wrong user? Or what if you want user A to be able to do some but
not all of the things user B can do? Drilling an optimally-sized hole
through the existing permissions system is unfortunately not very
simple.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-12 20:28:53
Message-ID: m262hl8rga.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> It seems to me (and it may seem differently to other people) that what
> most people who want to trap DDL events really want to do is either

[ detailed analysis, mostly right on spot ]

Yeah, I'm proposing a rather crude tool. I think it still solves real problems we have now, problems that no other tool or design is solving for us.

So for me the real questions are:

- do we want the feature in that form?
- are we avoiding painting ourselves into a corner?

I think both answers are positive, because I want the feature badly enough to have spent time working on it, and it already solves two problems at once. Also, the only way to get that feature from an extension implementing ProcessUtility_hook is to duplicate what I've been doing, minus the grammar support (simply because an extension can't add grammar).

Also, that's not the kind of effort that either Slony or Londiste will put into their own project; the amount of work wouldn't be justified by this return alone, as history tells us. (There's a game changer here, though, which is the idea of doing “command triggers” as opposed to “DDL triggers” or even “catalog triggers”; thanks, Jan.)

Then, we can expand the trigger function signature down the road and keep the current one for compatibility. For example, we could add a cascading boolean argument and decide whether or not to call the trigger function when cascading, based upon the trigger procedure's signature.

So I believe it's somewhat coarse or crude, still useful enough, and not
painting us into a corner.

> using a security definer function as a DDL instead-of trigger is an
> interesting one, but does that leave newly created objects owned by
> the wrong user? Or what if you want user A to be able to do some but
> not all of the things user B can do? Drilling an optimally-sized hole
> through the existing permissions system is unfortunately not very
> simple.

You can still use ALTER OBJECT … OWNER TO …; from within the security definer function to get the setup you're interested in. I guess you can also make the code conditional on current_user, but I haven't checked (yet) whether both the user id and the effective user id are available at the SQL level.
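
For illustration only, an untested sketch of that kind of trigger function body (made-up names, argument list per the patch's proposal; session_user is the role that opened the session, as opposed to current_user inside the SECURITY DEFINER function):

CREATE FUNCTION reassign_new_table
(
  IN cmd_string text,
  IN cmd_nodestring text,
  IN schemaname text,
  IN objectname text
)
RETURNS void
LANGUAGE plpgsql SECURITY DEFINER
AS $$
BEGIN
  -- hand the freshly created table back to the calling role
  EXECUTE format('ALTER TABLE %I.%I OWNER TO %I',
                 coalesce(schemaname, 'public'), objectname, session_user);
END;
$$;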

All in all, I understand your reticence here, but I don't share it.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-13 13:25:03
Message-ID: 4EE7522F.2090805@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/12/2011 11:32 AM, Robert Haas wrote:
> I haven't yet thought about your specific proposal here in enough to
> have a fully-formed opinion, but I am a little nervous that this may
> turn out to be one of those cases where the obvious API ends up
> working less well than might have been supposed.

There's another cautionary tale from the sepgsql history worth
mentioning here, which surely I don't have to remind you about. Making
the goal for a first shippable subset include proof you can solve the
hardest problems in that area can lead to a long road without committing
anything. With sepgsql, that was focusing on the worst of the ALTER
TABLE issues. As Dimitri was pointing out, the name change to Command
Triggers includes a sort of admission that DDL Triggers are too hard to
solve in all cases yet. We shouldn't be as afraid to introduce APIs
that are aimed at developers who currently have none.

Yes, there's a risk that this will end with "...and this one has to be broken in the next release because of this case we didn't see". We can't be so afraid of that that we don't do anything, especially when the users who would
be impacted by that theoretical case are currently suffering from an
even worse problem than that. To provide the big picture infrastructure
tools that people are desperate for now, PostgreSQL needs to get a lot
more agile when it comes to revving hooks whose main consumers are not
regular application programs. They're the administrators of the system
instead.

I know what I was just railing against is not what you were arguing for; you just triggered a stored rant of mine. [Bad trigger joke goes here] Regardless, thoughts on where the holes are here are appreciated.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-13 14:59:48
Message-ID: CA+TgmobYG+D3kzcANw3GhMqPrv3AkKa+r_MEf+tMgPM-1Z8Pog@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Dec 13, 2011 at 8:25 AM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> There's another cautionary tale from the sepgsql history worth mentioning
> here, which surely I don't have to remind you about.  Making the goal for a
> first shippable subset include proof you can solve the hardest problems in
> that area can lead to a long road without committing anything.  With
> sepgsql, that was focusing on the worst of the ALTER TABLE issues.  As
> Dimitri was pointing out, the name change to Command Triggers includes a
> sort of admission that DDL Triggers are too hard to solve in all cases yet.
>  We shouldn't be as afraid to introduce APIs that are aimed at developers
> who currently have none.
>
> Yes, there's a risk that will end with "...and this one has to be broken in
> the next release because of this case we didn't see".

Well, the problem is that just because something better comes along
doesn't mean we'll actually deprecate and remove the old
functionality. We still have contrib/xml2, even though the docs say
we're "planning" to remove it in 8.4. Tom even rewrote the memory
handling, because it was easier to rewrite a module he probably
doesn't intrinsically care much about than to convince people we
should remove something that was both planned for deprecation anyway
and a huge security risk because it crashed if you looked at it
sideways. And we still have rules, so people read the documentation
and say to themselves "hmm, should I use triggers or rules for this
project?". And elsewhere we're discussing whether and under what
conditions it would be suitable to get rid of recovery.conf, which
almost everyone seems to agree is a poor design, largely AFAICT
because third-party tools find recovery.conf a convenient way to
circumvent the need to rewrite postgresql.conf, which is a pain in the
neck because of our insistence that it has to contain arbitrary user
comments. In other words, more often than not, we are extremely
reluctant to remove or modify features of even marginal utility
because there will certainly be somebody, somewhere who is relying on
the old behavior.

Of course, it does go the other way sometimes: we removed old-style
VACUUM FULL (which was useful if you were short of disk space and long
on time), flat-file authentication (which was used by third party
tools), and made removing OIDs require a table rewrite. Still, I
think it's entirely appropriate to be cautious about adding new
features that might not turn out to be the design we really want to
have. Odds are good that we will end up supporting them for backward
compatibility reasons for many, many years.

Now, all that having been said, I also agree that the perfect can be
the enemy of the good, and we go there frequently. The question I'm
asking is not whether the feature is perfect, but whether it's
adequate for even the most basic things people might want to do with
it. Dimitri says that he wants it so that we can add support for
CREATE TABLE, ALTER TABLE, and DROP TABLE to Slony, Bucardo, and
Londiste. My fear is that it won't turn out to be adequate to that
task, because there won't actually be enough information in the CREATE
TABLE statement to do the same thing on all servers. In particular,
you won't have the index or constraint names, and you might not have
the schema or tablespace information either. But maybe we ought to
put the question to the intended audience for the feature - is there a
Slony developer in the house?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-13 15:53:24
Message-ID: CAHyXU0w-VXSJ5YObNNaFC-1An5weq7RDFzNMoNnDk4-gTJzceg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Dec 13, 2011 at 8:59 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Now, all that having been said, I also agree that the perfect can be
> the enemy of the good, and we go there frequently.  The question I'm
> asking is not whether the feature is perfect, but whether it's
> adequate for even the most basic things people might want to do with
> it.  Dimitri says that he wants it so that we can add support for
> CREATE TABLE, ALTER TABLE, and DROP TABLE to Slony, Bucardo, and
> Londiste.  My fear is that it won't turn out to be adequate to that
> task, because there won't actually be enough information in the CREATE
> TABLE statement to do the same thing on all servers.  In particular,
> you won't have the index or constraint names, and you might not have
> the schema or tablespace information either.

But you could query all that out of the system catalogs, right? Maybe a better facility should exist for converting a table name into a CREATE TABLE statement than hand-rolling it or invoking pg_dump, but that's a separate issue.
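
For instance, once the table exists, something along these lines (placeholder schema and table names) recovers the generated index names and definitions:

SELECT indexname, indexdef
  FROM pg_indexes
 WHERE schemaname = 'public'
   AND tablename = 'foo';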

This feature fills an important niche given that you can't hook RI
triggers to system catalogs...it comes up (in short, +1).

merlin


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-13 15:57:11
Message-ID: CA+TgmoYo4yiHNJOFUgcEL26YCTMai_YruDt0LEDJSbYoi1F0Tg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Dec 13, 2011 at 10:53 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Tue, Dec 13, 2011 at 8:59 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> Now, all that having been said, I also agree that the perfect can be
>> the enemy of the good, and we go there frequently.  The question I'm
>> asking is not whether the feature is perfect, but whether it's
>> adequate for even the most basic things people might want to do with
>> it.  Dimitri says that he wants it so that we can add support for
>> CREATE TABLE, ALTER TABLE, and DROP TABLE to Slony, Bucardo, and
>> Londiste.  My fear is that it won't turn out to be adequate to that
>> task, because there won't actually be enough information in the CREATE
>> TABLE statement to do the same thing on all servers.  In particular,
>> you won't have the index or constraint names, and you might not have
>> the schema or tablespace information either.
>
> But, you could query all that out from the system catalogs right?

You could probably get a lot of it that way, although first you'll
have to figure out which schema to look up the name in. It seems
likely that everyone who uses the trigger will need to write that
code, though, and they'll all have different implementations with
different bugs, because many of them probably really want the facility that you describe in your next sentence:

> Maybe a better facility should exist to convert a table name to a
> create table statement than hand rolling it or invoking pg_dump, but
> that's a separate issue.
>
> This feature fills an important niche given that you can't hook RI
> triggers to system catalogs...it comes up (in short, +1).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Christopher Browne <cbbrowne(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Mailing Lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-13 17:29:04
Message-ID: CAFNqd5ULdwH_j73MU+awW=C9gGL-fdY5STQW2P7Op4o77VzRnQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Dec 13, 2011 at 9:59 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Now, all that having been said, I also agree that the perfect can be
> the enemy of the good, and we go there frequently.  The question I'm
> asking is not whether the feature is perfect, but whether it's
> adequate for even the most basic things people might want to do with
> it.  Dimitri says that he wants it so that we can add support for
> CREATE TABLE, ALTER TABLE, and DROP TABLE to Slony, Bucardo, and
> Londiste.  My fear is that it won't turn out to be adequate to that
> task, because there won't actually be enough information in the CREATE
> TABLE statement to do the same thing on all servers.  In particular,
> you won't have the index or constraint names, and you might not have
> the schema or tablespace information either.  But maybe we ought to
> put the question to the intended audience for the feature - is there a
> Slony developer in the house?

Yeah, I'm not certain yet what is being provided, and how that corresponds with what would be needed.

- It's probably not sufficient to capture the raw statement, as that
gets invoked within a context replete with GUC values, and you may
need to duplicate that context/environment on a replica. Mind you, a
command trigger is doubtless capable of querying GUCs to duplicate
relevant portions of the environment.

- What I'd much rather have is a form of the query that is replete
with Full Qualification, so that "create table foo (id serial primary
key, data text not null unique default 'better replace this', dns_data
dnsrr not null);" may be transformed into a safer form like: "create
table public.foo (id serial primary key, data text not null unique
default 'better replace this'::text, dns_data dns_rr.dnsrr not null);"

What's not clear, yet, is which transformations are troublesome. For
instance, if there's already a sequence called "foo_id_seq", then the
sequence defined for that table winds up being "foo_id_seq1", and it's
not quite guaranteed that *that* would be identical across databases.

But perhaps it's sufficient to implement what, of COMMAND TRIGGERS,
can be done, and we'll see, as we proceed, whether or not it's enough.

It's conceivable that a first implementation won't be enough to
implement DDL triggers for Slony, and that we'd need to ask for
additional functionality that doesn't make it in until 9.3. That
seems better, to me, than putting it on the shelf, and having
functionality in neither 9.2 nor 9.3...
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Christopher Browne <cbbrowne(at)gmail(dot)com>
Cc: PostgreSQL Mailing Lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-13 17:35:46
Message-ID: CA+TgmoaRVmhO17GV4HVBuciupsSMeb8N+9WwJffYdi_PWouXUg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Dec 13, 2011 at 12:29 PM, Christopher Browne <cbbrowne(at)gmail(dot)com> wrote:
> But perhaps it's sufficient to implement what, of COMMAND TRIGGERS,
> can be done, and we'll see, as we proceed, whether or not it's enough.
>
> It's conceivable that a first implementation won't be enough to
> implement DDL triggers for Slony, and that we'd need to ask for
> additional functionality that doesn't make it in until 9.3.  That
> seems better, to me, than putting it on the shelf, and having
> functionality in neither 9.2 nor 9.3...

The thing is, I don't really see the approach Dimitri is taking as
being something that we can extend to meet the requirement you just
laid out. So it's not like, OK, let's do this, and we'll improve it
later. It's, let's do this, and then later do something completely
different, and that other thing will be the one that really solves the
problem.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-13 17:58:05
Message-ID: 4EE7922D.20604@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/13/2011 09:59 AM, Robert Haas wrote:
> Well, the problem is that just because something better comes along
> doesn't mean we'll actually deprecate and remove the old
> functionality.

The examples you gave fall into three groups, and I think it's useful to demarcate how they're different. Please pardon me going wildly off topic before coming back around.

If you drop xml2 or rules, people lose something. It's primarily the
PostgreSQL developers who gain something. You can make a case that
people who won't get sucked into doing something wrong with rules one
day will gain something, but those people are a future speculation;
they're not here asking to be saved for a problem they don't know will
happen yet. This sort of deprecation battle is nearly impossible to
win. One of the reasons I placed a small bet helping sponsor PGXN is
that I hope it allows some of this should be deprecated stuff to move
there usefully. Let the people who use it maintain it moving forward if
they feel it's important.

The recovery.conf change and other attempts to reorganize the
postgresql.conf are in a second category. These break scripts, without
providing an immediate and obvious gain to everyone. Some say it's
better, some say it's worse; from the list traffic it seems like a zero-sum game. The burden is on the person advocating the change to
justify it if there's not a clear win for everyone. You might note that
my latest attitude toward this area is to provide the mechanism I want
as a new option, and not even try to argue about removing the old thing
anymore. This lets implementation ideas battle it out in the wild.
Let's say a year from now everyone who hasn't switched to using a conf.d
directory approach looks like an old stick in the mud. Then maybe I pull
the sheet I have an enormous bikeshed hidden behind, waiting for just
that day.[1]

When VACUUM FULL was rewritten, it took a recurring large problem that has impacted a lot of people, and replaced it with a much better thing for
most cases. A significantly smaller number of people lost something
that was slightly useful. There weren't as many complaints because the
thing that was lost was replaced with something better by most metrics.
Different, but better. This third category of changes is much easier to sell. We have another such discontinuity coming with
pg_stat_activity. The changes Scott Mead's patch kicked off make it
different and better. Anyone who has a tool using the old thing can
look at the new design and say "well, that makes the whole 'what state
is the connection in' logic I used to worry about go away; that's
progress even if it breaks my old scripts". People need to get
something that offsets the breakage to keep griping down. Anyone who argues against that sort of change has a challenging battle on their
hands.

If there is a Command Trigger implementation that Slony etc. use, and we discover a limitation that requires an API break, that's OK so long as it's expected that it will fall into the last category. Breakage to add support for something new should mean a feature clearly gained, something lost, and a net benefit to most consumers of that feature. People either accept it or they end up blocking obvious forward progress. We don't want to confuse what makes for good progress on that sort of thing with the hard deprecation problems. (Not that I'm saying you are doing that here, just pointing out that it happens.)

> Dimitri says that he wants it so that we can add support for
> CREATE TABLE, ALTER TABLE, and DROP TABLE to Slony, Bucardo, and
> Londiste. My fear is that it won't turn out to be adequate to that
> task, because there won't actually be enough information in the CREATE
> TABLE statement to do the same thing on all servers.

These are all good things to look into; please keep those test set ideas coming, and hopefully we'll get some more input on this. But let's say
this rolls out seeming good enough, and later someone discovers some
weird index thing that isn't supported. If that's followed by "here's a
new API; it breaks your old code, but it allows supporting that index
you couldn't deal with before", that's unlikely to get shot down by that
API's consumers. What you wouldn't be able to do is say "this new API
doesn't work right, let's just yank it out". Your concerns about making
sure at least the fundamentals hold here are on point though.

[1] Look at that, I can now say that 100% of the programs I compose e-mail with have "bikeshed" added to their dictionary. I don't bother with this often, but there are entries for "PostgreSQL" and
"committer" there too.[2]

[2] Would you believe a Google search for "committer" shows the
PostgreSQL wiki page as its second hit? That's only behind the
Wikipedia link, and ahead of the FreeBSD, Chromium, Apache, and Mozilla
pages on that topic.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us


From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-14 02:12:56
Message-ID: 4EE80628.5020104@Yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/13/2011 9:59 AM, Robert Haas wrote:
> it. Dimitri says that he wants it so that we can add support for
> CREATE TABLE, ALTER TABLE, and DROP TABLE to Slony, Bucardo, and
> Londiste. My fear is that it won't turn out to be adequate to that
> task, because there won't actually be enough information in the CREATE
> TABLE statement to do the same thing on all servers. In particular,
> you won't have the index or constraint names, and you might not have
> the schema or tablespace information either. But maybe we ought to
> put the question to the intended audience for the feature - is there a
> Slony developer in the house?

I agree. While it is one of the most "asked for" features among the
trigger based replication systems, I fear that an incomplete solution
will cause more problems than it solves. It is far easier to tell people
"DDL doesn't propagate automatically, do this instead ..." than to try
to support a limited list of commands that may or may not propagate as intended. And all sorts of side effects, like search_path, user names, and even the existing schema in the replica, can cause any given DDL "string" to actually do something completely different from what happened on the origin.

On top of that, the PostgreSQL main project has a built-in replication solution that doesn't need any of this. There is no need for anyone but us trigger-replication folks to keep command triggers in sync with all other features.

I don't think it is going to be reliable enough any time soon to make
this the default for any of the trigger based replication systems.

Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-14 10:22:21
Message-ID: 87liqfv4f6.fsf@hi-media-techno.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> it. Dimitri says that he wants it so that we can add support for
> CREATE TABLE, ALTER TABLE, and DROP TABLE to Slony, Bucardo, and
> Londiste. My fear is that it won't turn out to be adequate to that
> task, because there won't actually be enough information in the CREATE
> TABLE statement to do the same thing on all servers. In particular,
> you won't have the index or constraint names, and you might not have
> the schema or tablespace information either.

In my experience managing lots of trigger-based replication setups (more than 100 nodes in half a dozen different projects), what I can tell from the field is that I don't care about index and constraint names. Being able to replay on the subscriber the same CREATE TABLE statement that the provider just executed is perfectly fine for my use cases.

Again, that's a caveat of the first implementation: you can't have sub-command support without forcing sub-commands through ProcessUtility, and that's a much more invasive patch. Maybe we will need that later.

Also, it's quite easy to add support for the CREATE INDEX command, including index name support, and ALTER TABLE is already under way. So we can document how to organize your DDL scripts so that they just work with the replication system. And you can even implement a command trigger that enforces those limits (RAISE EXCEPTION when a CREATE TABLE command embeds primary key creation rather than using a separate command for that).
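
A crude, untested sketch of that kind of enforcement trigger (plain string matching only, made-up names, argument list per the patch's proposal, and assuming a BEFORE COMMAND variant of the proposed grammar):

CREATE FUNCTION reject_embedded_pkey
(
  IN cmd_string text,
  IN cmd_nodestring text,
  IN schemaname text,
  IN objectname text
)
RETURNS void
LANGUAGE plpgsql
AS $$
BEGIN
  IF cmd_string ~* 'PRIMARY\s+KEY' THEN
    RAISE EXCEPTION 'please add the primary key in a separate command';
  END IF;
END;
$$;

CREATE TRIGGER reject_embedded_pkey
  BEFORE COMMAND CREATE TABLE
  EXECUTE PROCEDURE reject_embedded_pkey();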

As for the schema, you can easily get the current search_path setting
from the command trigger and force it to the same value on the
subscriber before replaying the commands (hint: add current search_path
to the event you're queuing for replay).

select setting from pg_settings where name = 'search_path';
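
So the queuing part of the trigger function could look roughly like this (untested, with a made-up queue table; cmd_string is the command-text argument the trigger function receives, and current_setting() returns the same thing as the pg_settings query above):

-- on the provider, inside the command trigger function:
INSERT INTO replication_queue(cmd, sp)
VALUES (cmd_string, current_setting('search_path'));

-- on the subscriber, inside the replay function, before executing cmd:
PERFORM set_config('search_path', sp, false);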

I appreciate that some use cases won't be possible to implement with the
first version of this patch, really, but I believe we have enough use
cases that are possible to implement with it that it's worth providing
the feature.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Christopher Browne <cbbrowne(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Mailing Lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-14 10:41:58
Message-ID: 87zkevtoy1.fsf@hi-media-techno.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Christopher Browne <cbbrowne(at)gmail(dot)com> writes:
> - What I'd much rather have is a form of the query that is replete
> with Full Qualification, so that "create table foo (id serial primary
> key, data text not null unique default 'better replace this', dns_data
> dnsrr not null);" may be transformed into a safer form like: "create
> table public.foo (id serial primary key, data text not null unique
> default 'better replace this'::text, dns_data dns_rr.dnsrr not null);"

Andres made the same comment, and I've begun working on that. The facilities for fully qualifying object names are not always available in a form that we can use from the parsetree, but that's a SMOP.

> What's not clear, yet, is which transformations are troublesome. For
> instance, if there's already a sequence called "foo_id_seq", then the
> sequence defined for that table winds up being "foo_id_seq1", and it's
> not quite guaranteed that *that* would be identical across databases.
>
> But perhaps it's sufficient to implement what, of COMMAND TRIGGERS,
> can be done, and we'll see, as we proceed, whether or not it's enough.

I'm not convinced that having the same sequence names on the subscribers is something we need here. Let me detail that, because maybe I've just uncovered a major misunderstanding about the use case I'm interested in.

I mostly use trigger based replication in cases where it's not possible
to implement failover or switchover with that technique, because I'm
doing cross-replication or more complex architectures. Failover is
handled with WAL based techniques (wal shipping, streaming rep, etc).

So I don't care that much about the sub-object names (constraints, indexes, sequences): I need them to get created on the subscriber, and that's about it. I think that's an important enough use case here.

> It's conceivable that a first implementation won't be enough to
> implement DDL triggers for Slony, and that we'd need to ask for
> additional functionality that doesn't make it in until 9.3. That
> seems better, to me, than putting it on the shelf, and having
> functionality in neither 9.2 nor 9.3...

Or maybe Slony would end up relying partly on the command trigger
facility and implementing the missing pieces in its own code base. Like
it did with the txid_snapshot data type some years ago, for example.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2011-12-14 10:58:28
Message-ID: 878vmfto6j.fsf@hi-media-techno.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
> I agree. While it is one of the most "asked for" features among the trigger
> based replication systems, I fear that an incomplete solution will cause
> more problems than it solves. It is far easier to tell people "DDL doesn't
> propagate automatically, do this instead ..." than to try to support a
> limited list of commands, that may or may not propagate as intended. And all

Nothing stops you from checking that the command you want to replicate is indeed supported and refusing to run it on the provider when it is not; that's what command triggers are for :)

> sorts of side effects, like search_path, user names and even the existing
> schema in the replica can cause any given DDL "string" to actually do
> something completely different than what happened on the origin.

Grab those on the provider from pg_settings and the like in the command
trigger and restore them on the subscriber before applying the command?

> On top of that, the PostgreSQL main project has a built in replication
> solution that doesn't need any of this. There is no need for anyone, but us
> trigger replication folks, to keep command triggers in sync with all other
> features.

You can't implement cross-replication with the built-in replication. I have yet to work on a medium-sized project where I don't need streaming replication, WAL archiving, and a trigger-based replication system all at once.

> I don't think it is going to be reliable enough any time soon to make this
> the default for any of the trigger based replication systems.

You need to add "yet" and "without some work in the client implementation".

The last time I read such comments, it was about extensions. We still shipped something in 9.1, and by your measure it's quite broken. When you implement an SQL-only extension (no .so) you still have to use a Makefile and deploy your scripts on the server filesystem beforehand; it's not possible to install or update the extension from an SQL connection (even when it contains only self-contained SQL commands).

Also, the dependency system doesn't solve many real-world problems, apart from the very simplest one that we can play with in contrib/.

You can't even list the shared objects (.so, .dll) that have been loaded so far in a session and see which extension they belong to.

Yet I bet that some people will find that the extension system, as we have it in 9.1, is still useful enough to have been worth releasing.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

PS: yes, I intend to work on fixing those extension limitations and caveats with a series of patches.


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-14 14:05:24
Message-ID: 1323871368-sup-4185@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Excerpts from Dimitri Fontaine's message of Wed Dec 14 07:22:21 -0300 2011:

> Again, that's a caveat of the first implementation, you can't have sub
> commands support without forcing them through ProcessUtility and that's
> a much more invasive patch. Maybe we will need that later.

I can get behind this argument: force all stuff through ProcessUtility
for regularity, and not necessarily in the first patch for this feature.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Greg Smith <greg(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-14 21:22:47
Message-ID: CA+TgmobKRKRLv3Kb758wH1WrQ93g6cOorYhV-i_xCVCtnFwdKg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Dec 14, 2011 at 9:05 AM, Alvaro Herrera
<alvherre(at)commandprompt(dot)com> wrote:
> Excerpts from Dimitri Fontaine's message of mié dic 14 07:22:21 -0300 2011:
>> Again, that's a caveat of the first implementation, you can't have sub
>> commands support without forcing them through ProcessUtility and that's
>> a much more invasive patch.  Maybe we will need that later.
>
> I can get behind this argument: force all stuff through ProcessUtility
> for regularity, and not necessarily in the first patch for this feature.

That seems like a pretty heavy dependency on an implementation detail
that we might want to change at some point.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Greg Smith <greg(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-14 22:44:32
Message-ID: m2d3bq7oz3.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> I can get behind this argument: force all stuff through ProcessUtility
>> for regularity, and not necessarily in the first patch for this feature.
>
> That seems like a pretty heavy dependency on an implementation detail
> that we might want to change at some point.

Given ProcessUtility_hook, how much of an implementation detail rather
than a public API are we talking about?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-15 14:36:25
Message-ID: CA+TgmoYAUf6_nG5nTp23i38qpWNtdcG1HUtPWm=iubay5rrMXg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Dec 14, 2011 at 5:44 PM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> I can get behind this argument: force all stuff through ProcessUtility
>>> for regularity, and not necessarily in the first patch for this feature.
>>
>> That seems like a pretty heavy dependency on an implementation detail
>> that we might want to change at some point.
>
> Given ProcessUtility_hook, how much of an implementation detail rather
> than a public API are we talking about?

I think that a hook and an SQL command are not on the same level.
Hooks are not documented; SQL commands are. You may, of course,
disagree.

But the basic point is that it seems pretty weird to say, on the one
hand, these are command triggers. They fire when you execute a
command. So the CREATE TABLE trigger will fire when someone says
CREATE TABLE. But then we say, oh, well if you use CREATE SCHEMA foo
CREATE TABLE blah ... we will fire the trigger anyway. Now it's not a
command trigger any more; it's a trigger that fires when you perform a
certain operation - e.g. create a table. Unless, of course, you
create the table using CREATE TABLE AS SELECT or SELECT .. INTO. Then
it doesn't fire. Unless we decide to make those utility commands,
which I think was just recently under discussion, in which case it
will suddenly start firing for those operations. So now something
that is, right now, essentially an implementation detail which we
can rearrange as we like turns into a user-visible behavior change.
And what if some day we want to change it back?

I think it would be a much better idea to decree from the beginning
that we're trapping the *operation* of creating a table, rather than
the *command* create table. Then, the behavior is clear and
well-defined from day one, and it doesn't depend on how we happen to
implement things internally right at the moment. If there's some way
of creating a table without firing the trigger, it's a bug. If
there's some way of firing the trigger without attempting to create a
table, it's a bug. That might require some more thought about what
information to pass to the trigger function (e.g. if it's a SELECT ..
INTO, you're not going to have pregenerated SQL that starts with the
words "CREATE TABLE") but the fact that gives much more clear
definition to the core feature seems like a big plus in my book.
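
To make that concrete, both of the following statements (table names made
up) perform the very same create-a-table operation, even though only one
of them is spelled CREATE TABLE:

-- two spellings of the same operation: materialize a query as a new table
CREATE TABLE audit_snapshot AS SELECT * FROM some_table;
SELECT * INTO audit_snapshot_2 FROM some_table;

An operation trigger arguably ought to fire for both of these, no matter
how we happen to route them internally.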

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>
Subject: Re: Command Triggers
Date: 2011-12-15 14:50:55
Message-ID: 201112151550.55853.andres@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thursday, December 15, 2011 03:36:25 PM Robert Haas wrote:
> On Wed, Dec 14, 2011 at 5:44 PM, Dimitri Fontaine
>
> <dimitri(at)2ndquadrant(dot)fr> wrote:
> > Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> >>> I can get behind this argument: force all stuff through ProcessUtility
> >>> for regularity, and not necessarily in the first patch for this
> >>> feature.
> >> That seems like a pretty heavy dependency on an implementation detail
> >> that we might want to change at some point.
> > Given ProcessUtility_hook, how much of an implementation detail rather
> > than a public API are we talking about?
> I think that a hook and an SQL command are not on the same level.
> Hooks are not documented; SQL commands are. You may, of course,
> disagree.
I am not disagreeing.

> But the basic point is that it seems pretty weird to say, on the one
> hand, these are command triggers. They fire when you execute a
> command. So the CREATE TABLE trigger will fire when someone says
> CREATE TABLE. But then we say, oh, well if you use CREATE SCHEMA foo
> CREATE TABLE blah ... we will fire the trigger anyway. Now it's not a
> command trigger any more; it's a trigger that fires when you perform a
> certain operation - e.g. create a table. Unless, of course, you
> create the table using CREATE TABLE AS SELECT or SELECT .. INTO. Then
> it doesn't fire. Unless we decide to make those utility commands,
> which I think was just recently under discussion, in which case it
> will suddenly start firing for those operations.
Command triggers were the initial reason for me doing that conversion.

> So now something
> that is, right now, essentially an implementation detail which we
> can rearrange as we like turns into a user-visible behavior change.
> And what if some day we want to change it back?

> I think it would be a much better idea to decree from the beginning
> that we're trapping the *operation* of creating a table, rather than
> the *command* create table. Then, the behavior is clear and
> well-defined from day one, and it doesn't depend on how we happen to
> implement things internally right at the moment.
I am with you there.

> If there's some way
> of creating a table without firing the trigger, it's a bug. If
> there's some way of firing the trigger without attempting to create a
> table, it's a bug.
Again.

> That might require some more thought about what
> information to pass to the trigger function (e.g. if it's a SELECT ..
> INTO, you're not going to have pregenerated SQL that starts with the
> words "CREATE TABLE") but the fact that gives much more clear
> definition to the core feature seems like a big plus in my book.
Not sure what you're getting at here? The command tag in Dim's patch is
already independent of the form of the actual statement that was run.

A big +1 on the general direction of this email. I dislike restricting
command triggers to "top level commands run by the actual user" because
imo that would reduce the potential scope of the patch rather a lot.

On the other hand I think delivering a complete patch covering just about
anything triggered anywhere is not really realistic. Not sure what the
best way to continue is.

Andres


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-15 15:53:15
Message-ID: 87bor9q1as.fsf@hi-media-techno.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Thank you, Robert, for this continued review; I think we're making good
progress in a community discussion that needs to happen!

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Given ProcessUtility_hook, how much of an implementation detail rather
>> than a public API are we talking about?
>
> I think that a hook and an SQL command are not on the same level.
> Hooks are not documented; SQL commands are. You may, of course,
> disagree.

Mmmm, yes. I think that we should document hooks, and I'm going to
propose a pg_extension_module() SRF soon that lists all currently loaded
modules and which extension implements each of them, and I've begun
thinking about what it would take to be able to list which hooks each
module implements.

That would require a hook registration API, and would allow much easier
security auditing of a setup you don't know beforehand.

So I think that hooks not being documented is an implementation detail
here :)

> But the basic point is that it seems pretty weird to say, on the one
> hand, these are command triggers. They fire when you execute a
> command. So the CREATE TABLE trigger will fire when someone says
> CREATE TABLE. But then we say, oh, well if you use CREATE SCHEMA foo
> CREATE TABLE blah ... we will fire the trigger anyway. Now it's not a

Yes, this CREATE SCHEMA <any create command> is weird, and unique, so
it's not that difficult to document, I think. And if we fix things by
having all subcommands go through ProcessUtility and command triggers,
then it's easy to document as the new rule.

My documentation idea would then be to explain what a command is and
what a subcommand is, then the current rule (command triggers do not
support subcommands), and then the exception to that rule (CREATE SCHEMA
subcommands do, in fact, support command triggers).

If we decide that we should in fact support (nested) subcommands in
command triggers, then we will be in a position to prepare a patch
implementing that, with a documentation change saying that the rule is
now that command triggers do in fact support subcommands.

When the command trigger is called on a subcommand, I think you will
want both the main command string and the subcommand string, and we have
to research whether what you really need isn't a whole stack of commands,
because I'm not sure nesting is only ever one level deep here.

That would be done with a special API for the trigger functions, and
with a specific syntax, because it seems to me that having trigger code
that is only ever called on a top-level command is a good thing to
have.

create trigger foo after command CREATE TABLE …
create trigger foo after top level command CREATE TABLE …
create trigger foo after nested command CREATE TABLE …

Either we add support for that kind of syntax now (and not implementing
it in 9.3 would seem weird as hell) or we instruct pg_dump and
pg_upgrade to switch from the current syntax to the new one (adding the
“top level” keywords) when we do implement the feature down the road.

> command trigger any more; it's a trigger that fires when you perform a
> certain operation - e.g. create a table. Unless, of course, you
> create the table using CREATE TABLE AS SELECT or SELECT .. INTO. Then
> it doesn't fire. Unless we decide to make those utility commands,
> which I think was just recently under discussion, in which case it
> will suddenly start firing for those operations. So now something

Andres has sent a complete patch to fix that old oddity of the parser.
And again, that's a very narrow exception to the usual state of things,
and as such, easy to document in the list of things that don't fall
under the general rule.

Even when that is fixed, though, you have 3 different command tags to
deal with, and we identify command triggers by command tag, so that a
command trigger on CREATE TABLE will not be called when doing SELECT
INTO, nor when doing CREATE TABLE AS.

Also, I think that the command triggers in 9.2 will not address each and
every command that PostgreSQL offers (think “alter operator family”), so
we will need to maintain an exhaustive list of supported commands,
identified by command tags.

> that is, right now, essentially an implementation detail which we
> can rearrange as we like turns into a user-visible behavior change.
> And what if some day we want to change it back?

Yes, we need to make a decision about that now. Do we want any
“operation” to go through ProcessUtility so that hooks and command
triggers can get called?

I think it's a good idea in the long run, and Alvaro seems to think so
too. That entails changing the backend code to build a Statement and
then call ProcessUtility on it rather than calling the internal
functions directly, and it also means some more DDL is subject to being
logged in the server logs. Removing those “cross-module” #includes might
be a good thing in the long run though.

And we don't have to do that unless we want to make subcommands
available to ProcessUtility_hook and command triggers.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>
Subject: Re: Command Triggers
Date: 2011-12-15 16:15:24
Message-ID: 201112151715.24497.andres@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thursday, December 15, 2011 04:53:15 PM Dimitri Fontaine wrote:
> Hi,
>
> Thank you, Robert, for this continued review; I think we're making good
> progress in a community discussion that needs to happen!
>
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> >> Given ProcessUtility_hook, how much of an implementation detail rather
> >> than a public API are we talking about?
> >
> > I think that a hook and an SQL command are not on the same level.
> > Hooks are not documented; SQL commands are. You may, of course,
> > disagree.

> Mmmm, yes. I think that we should document hooks, and I'm going to
> propose soon a pg_extension_module() SRF that lists all currently loaded
> modules and which extension implements them, and I've begun thinking
> about what it would take to be able to list what hooks each module is
> implementing.
I don't really see that as possible/realistic.

> > But the basic point is that it seems pretty weird to say, on the one
> > hand, these are command triggers. They fire when you execute a
> > command. So the CREATE TABLE trigger will fire when someone says
> > CREATE TABLE. But then we say, oh, well if you use CREATE SCHEMA foo
> > CREATE TABLE blah ... we will fire the trigger anyway. Now it's not a
>
> Yes, this CREATE SCHEMA <any create command> is weird, and unique, so
> it's not that difficult to document, I think. And if we fix things by
> having all subcommands go through ProcessUtility and command triggers,
> then it's easy to document as the new rule.
I don't think I ever saw that one in real world DDL scripts ;)

> My documentation idea would then be explaining what is a command and
> what is a subcommand, and then the current rule (command triggers do not
> support subcommands) and then the exception to that rule (CREATE SCHEMA
> subcommands do, in fact, support command triggers).

> If we decide that we should in fact support (nested) subcommands in
> command triggers, then we will be in a position to prepare a patch
> implementing that with a documentation change that the rule is now that
> command triggers do in fact support subcommands.

> When the command trigger is called on a subcommand, I think you will
> want both the main command string and the subcommand string, and we have
> to research if what you need isn't a whole stack of commands, because
> I'm not sure you can only have 1 level deep nesting here.
I don't think you need that. If needed you will have to build the data
structure in $pl.

> That would be done with a special API for the trigger functions, and
> with a specific syntax, because it seems to me that having a trigger
> code that is only ever called on a top-level command is a good thing to
> have.
>
> create trigger foo after command CREATE TABLE …
> create trigger foo after top level command CREATE TABLE …
> create trigger foo after nested command CREATE TABLE …
>
> Either we add support for that kind of syntax now (and not implementing
> it in 9.3 would seem weird as hell) or we instruct pg_dump and
> pg_upgrade to switch from current syntax to the new one (add in the “top
> level” keywords) when we do implement the feature down the road.
I personally think there should only be one variant which is always called,
with a parameter that indicates whether it's a subcommand or not.

Why would you ever only want top level commands?

> > that is, right now, essentially an implementation detail which we
> > can rearrange as we like turns into a user-visible behavior change.
> > And what if some day we want to change it back?
> Yes, we need to make a decision about that now. Do we want any
> “operation” to go through ProcessUtility so that hooks and command
> triggers can get called?
I personally think that's a good idea for most stuff.

I don't see that for alter table subcommands and such though.

> I think it's a good idea in the long run, and Alvaro seems to be
> thinking it is too. That entails changing the backend code to build a
> Statement then call ProcessUtility on it rather than calling the
> internal functions directly, and that also means some more DDL are
> subject to being logged on the server logs. Removing those
> “cross-modules” #include might be a good thing in the long run though.
Uhm. I don't think building strings is the way to go here. I think building
*Stmt nodes is better.

Andres


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>
Subject: Re: Command Triggers
Date: 2011-12-15 16:46:05
Message-ID: 871us5pyuq.fsf@hi-media-techno.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
>> create trigger foo after command CREATE TABLE …
>> create trigger foo after top level command CREATE TABLE …
>> create trigger foo after nested command CREATE TABLE …
>>
>> Either we add support for that kind of syntax now (and not implementing
>> it in 9.3 would seem weird as hell) or we instruct pg_dump and
>> pg_upgrade to switch from current syntax to the new one (add in the “top
>> level” keywords) when we do implement the feature down the road.
> I personally think there should only be one variant which is always called.
> With a parameter that indicates whether its a subcommand or not.
>
> Why would you ever only want top level commands?

Because the command "create table foo(id primary key)" could then fire
your "index" and "constraint" command triggers twice and if all you do
is storing the command string in a table for auditing purposes, maybe
you just don't care.
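
To make the auditing case concrete, a minimal sketch (using the proposed
four-text-argument trigger function API and the AFTER COMMAND syntax from
the patch; the audit table and its name are made up) could look like this:

CREATE TABLE command_audit
(
  logged_at   timestamptz DEFAULT now(),
  command     text,
  schema_name text,
  object_name text
);

CREATE FUNCTION log_command
(
 IN cmd_string text,
 IN cmd_nodestring text,
 IN schemaname text,
 IN relname text
)
 RETURNS void
 LANGUAGE plpgsql
AS $$
BEGIN
  -- only the top-level command string is interesting for this use case,
  -- the parse tree argument is simply ignored
  INSERT INTO command_audit(command, schema_name, object_name)
       VALUES (cmd_string, schemaname, relname);
END;
$$;

CREATE TRIGGER log_create_table
         AFTER COMMAND CREATE TABLE
       EXECUTE PROCEDURE log_command();

A trigger like that only cares about the command the user typed, so firing
it again for each implied index or constraint would only add noise.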

>> Yes, we need to make a decision about that now. Do we want any
>> “operation” to go through ProcessUtility so that hooks and command
>> triggers can get called?
> I personally think thats a good idea for most stuff.
>
> I don't see that for alter table subcommands and such though.

By subcommand, I mean any operation that a main command does for you and
that you could have done manually with another command, such as serial
and its sequence, primary key and its ALTER TABLE form, embedded checks
or NOT NULL and their ALTER TABLE forms, etc.
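
As a rough illustration (the expansion shown is conceptual, not the exact
internal steps, and the names are made up), one top-level command:

CREATE TABLE orders
(
  id     serial PRIMARY KEY,
  status text   NOT NULL
);

implies more or less the same work as this explicit spelling:

CREATE SEQUENCE orders_explicit_id_seq;

CREATE TABLE orders_explicit
(
  id     integer NOT NULL DEFAULT nextval('orders_explicit_id_seq'),
  status text    NOT NULL
);

ALTER SEQUENCE orders_explicit_id_seq OWNED BY orders_explicit.id;
ALTER TABLE orders_explicit ADD PRIMARY KEY (id);

The question is whether command triggers on CREATE SEQUENCE, CREATE INDEX
or ALTER TABLE should fire for the implied parts of the first form.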

I don't remember ALTER TABLE implementing facilities that in turn call
another command for you?

>> I think it's a good idea in the long run, and Alvaro seems to be
>> thinking it is too. That entails changing the backend code to build a
>> Statement then call ProcessUtility on it rather than calling the
>> internal functions directly, and that also means some more DDL are
>> subject to being logged on the server logs. Removing those
>> “cross-modules” #include might be a good thing in the long run though.
> Uhm. I don't think building strings is the way to go here. I think building
> *Stmt nodes is better.

Agreed, I meant that exactly.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Noah Misch <noah(at)leadboat(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>
Subject: Re: Command Triggers
Date: 2011-12-15 19:29:49
Message-ID: 20111215192949.GA32454@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Dec 15, 2011 at 05:46:05PM +0100, Dimitri Fontaine wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> >> Yes, we need to make a decision about that now. Do we want any
> >> “operation” to go through ProcessUtility so that hooks and command
> >> triggers can get called?
> > I personally think thats a good idea for most stuff.
> >
> > I don't see that for alter table subcommands and such though.
>
> By subcommand, I mean any operation that a main command do for you and
> that you could have been doing manually with another command, such as
> serial and sequences, primary key and its alter table form, embedded
> checks or not null and its alter table form, etc.
>
> I don't remember that ALTER TABLE implement facilities that are in turn
> calling another command for you?

When ALTER TABLE ALTER TYPE changes an indexed column, it updates the index by
extracting its SQL definition, dropping it, and running that SQL. (Not sure
whether it passes through the code paths to hit your trigger.) We ought not to
promise that ALTER TABLE will always do so. By comparison, we could implement
ALTER TABLE SET NOT NULL on an inheritance tree as several ALTER TABLE ONLY, but
we don't implement it that way. Triggering on top-level commands can be quite
well-defined; triggers on subcommands (as you have defined the term) will risk
firing at different times across major versions. That may just be something to
document.

I think we'll need a way to tell SPI whether a command should be considered a
subcommand or not. A PL/pgSQL function that calls CREATE TABLE had better
fire a trigger, but some SPI usage will qualify as subcommands.
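
For instance, a command trigger should presumably still fire for the
CREATE TABLE issued by a function like this one (a made-up example), even
though it reaches the server through SPI rather than as a top-level
statement sent by the client:

CREATE FUNCTION make_partition(suffix text)
 RETURNS void
 LANGUAGE plpgsql
AS $$
BEGIN
  -- DDL run through SPI from a PL function; arguably still a user-level
  -- command as far as command triggers are concerned
  EXECUTE 'CREATE TABLE '
       || quote_ident('measurements_' || suffix)
       || ' (LIKE measurements)';
END;
$$;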

nm


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-15 20:04:39
Message-ID: CA+TgmoagXCMd5nGqxFSzYcKa5YCr_FiL9XdWL=z-z-Sj=bNWKQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Dec 15, 2011 at 10:53 AM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
> Yes, we need to make a decision about that now. Do we want any
> “operation” to go through ProcessUtility so that hooks and command
> triggers can get called?

No. That's 100% backwards. We should first decide what functionality
we want, and then decide how we're going to implement it. If we
arbitrarily decide to force everything that someone might want to
write a trigger on through ProcessUtility_hook, then we're going to
end up being unable to add triggers for some things because they can't
be conveniently forced through ProcessUtility, or else we're going to
end up contorting the code in bizarre ways because we've drawn some
line in the sand that ProcessUtility is the place where triggers have
to get called.

In doing this project, I think we should pay a lot of attention to the
lessons that have been learned developing sepgsql. I can certainly
understand if your eyes roll back in your head when you hear that,
because that project has been exceedingly long and difficult and shows
no sign of reaching its full potential for at least another few
release cycles. But I think it's really worth the effort to avoid
pounding our heads against the brick wall twice. Two things that leap
out at me in this regard are:

(1) It's probably a mistake to assume that we only need one interface.
sepgsql has several hooks, and will need more before the dust
settles. We have one hook for checking permissions on table names
that appear in DML queries, a second for applying security labels just
after a new SQL object is created, and a third for adjusting the
security context when an sepgsql trusted procedure is invoked. In a
similar way, I think it's probably futile to think that we can come up
with a one-size-fits-all interface where every command (or operation)
trigger can accept the same arguments. CREATE TABLE is going to want
to know different stuff than LOCK TABLE or ALTER OPERATOR FAMILY, and
trigger writers are going to want a *convenient* API to that
information, not just the raw query text.

(2) It's almost certainly a mistake to assume that everything you want
to trigger on is a command. For example, somebody might want to get
control whenever a column gets added to a table, either at table
create time or later. I don't think most of us would consider CREATE
TABLE bob (a int, b int) to be a create-a-table-with-no-columns
operation plus two add-a-column-to-a-table operations. But being able
to enforce a column naming policy is no less useful than being able to
enforce a table naming policy, and there are other things you might
want to do there as well (logging, updating metadata, prohibiting use
of certain types, etc.).
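
That is, the same add-a-column event can originate from more than one
command, for example:

CREATE TABLE bob (a int, b int);     -- two columns created with the table
ALTER TABLE bob ADD COLUMN c int;    -- a third column added later

A column-level policy presumably wants to see all three columns, however
they were created.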

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-15 20:31:26
Message-ID: m2bor960gx.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> No. That's 100% backwards. We should first decide what functionality
> we want, and then decide how we're going to implement it. If we
> arbitrarily decide to force everything that someone might want to
> write a trigger on through ProcessUtility_hook, then we're going to
> end up being unable to add triggers for some things because they can't
> be conveniently forced through ProcessUtility, or else we're going to
> end up contorting the code in bizarre ways because we've drawn some
> line in the sand that ProcessUtility is the place where triggers have
> to get called.

In theory you're right. In practice, we're talking about utility
command triggers, which fire on top-level commands.

We're now enlarging the discussion to what to do with subcommands, that
is, things done by a command as part of its implementation but that you
could have done separately with another, finer-grained, dedicated
top-level command.

I'm not trying to implement a general “event” trigger mechanism where
anyone can come and help define the set of events, and I think what
you're talking about now amounts to doing that.

Here again, trying to generalize before we have anything useful is a
recipe for failure. I concur that “Process Utility Top-Level Only
Command Triggers” is a pretty limited feature in scope, yet that's what
I want to obtain here, and I think it's useful enough on its own.

If you disagree, please propose a user-level scheme into which we can fit
the work I'm doing, so that I can adapt my patch and implement a part of
your scheme in a future-proof way. I'm ready to do that even though I have
no need for what you're talking about, if that's what it takes.

> (1) It's probably a mistake to assume that we only need one interface.
[... useful sepgsql history ...]
> trigger can accept the same arguments. CREATE TABLE is going to want
> to know different stuff than LOCK TABLE or ALTER OPERATOR FAMILY, and
> trigger writers are going to want a *convenient* API to that
> information, not just the raw query text.

Are you familiar with the nodeToString() output? That's close to a
full-blown XML, JSON or YAML document that you can easily use from a
program. Command trigger functions are not given just raw text.

When trying to define a more complex API along the lines of what you're
referencing here, back at the Cluster Hackers Developer Meeting, the
conclusion was that the nodeToString() output is all you need. Having
written code that produces it and code that consumes it, Jan has been
very clear that you can hardly do better than that while still being
generic enough.

> to enforce a column naming policy is no less useful than being able to
> enforce a table naming policy, and there are other things you might
> want to do there as well (logging, updating metadata, prohibiting use
> of certain types, etc.).

Are you familiar with the nodeToString() output? What makes you think
that the use cases you propose here are hard to implement in a command
trigger when given this parse tree string?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-18 05:04:19
Message-ID: 201112180504.pBI54Jg03411@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dimitri Fontaine wrote:
> Here again, trying to generalize before we have anything useful is a
> recipe for failure. I concur that “Process Utility Top-Level Only
> Command Triggers” is a pretty limited feature in scope, yet that's what
> I want to obtain here, and I think it's useful enough on its own.
>
> If you disagree, please propose a user level scheme where we can fit the
> work I'm doing so that I can adapt my patch and implement a part of your
> scheme in a future proof way. I'm ready to do that even when I have no
> need for what you're talking about, if that's what it takes.

We have a big user community and what _you_ want for this feature is
only a small part of our decision on what is needed. Robert's concern
that this might not be useful enough for the general use-cases people
want is a legitimate, if difficult to hear, analysis.

The resistance we get when removing features is sobering. Imagine the
worst case, e.g. xml2, where we properly implement the functionality
later, but there is still something that is better done with the old
interface, and we end up having to support both.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-18 11:20:44
Message-ID: m239ci3z3n.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> We have a big user community and what _you_ want for this feature is
> only a small part of our decision on what is needed. Robert's concern
> that this might not be useful enough for the general use-cases people
> want is a legitimate, if difficult to hear, analysis.

Agreed, his concern is legitimate. Now, I've never tried to implement a
generic event trigger system and I don't know what it would take to
implement such a beast.

Transaction BEGIN, COMMIT, and ROLLBACK triggers, anyone? Not me :)

Exploring whether my proposal would be a pain to maintain once we have a
fully generic event trigger system someday is legitimate; asking me to
design both the generic event system and the command triggers so that
they fit together is just asking too much.

The main part of my answer, though, is that all the more complex use
cases involving command triggers that Robert is offering are in fact
possible to implement with what my patch is providing, as long as you're
ok with understanding the content and format of the nodeToString()
output.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-18 14:21:31
Message-ID: 201112181421.pBIELVX28380@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dimitri Fontaine wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > We have a big user community and what _you_ want for this feature is
> > only a small part of our decision on what is needed. Robert's concern
> > that this might not be useful enough for the general use-cases people
> > want is a legitimate, if difficult to hear, analysis.
>
> Agreed, his concern is legitimate. Now, I've never been trying to
> implement a generic event trigger system and I don't know what it would
> take to implement such a beast.
>
> Transaction BEGIN, COMMIT, and ROLLBACK triggers anyone? not me :)
>
> Exploring if my proposal would be a pain to maintain once we have a fully
> generic event trigger system someday is legitimate, asking me to design
> both the generic event system and the command triggers so that they fit
> in is just asking for too much.

Agreed. I am not against this addition, just pointing out that we have
to be careful.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-18 18:52:32
Message-ID: 3434.1324234352@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr> writes:
> The main part of my answer, though, is that all the more complex use
> cases involving command triggers that Robert is offering are in fact
> possible to implement with what my patch is providing, as soon as you're
> ok with understanding the content and format of the nodeToString()
> output.

Hmm ... I don't think that I *am* ok with that. ISTM that we'd then
find ourselves with any changes in utility statement parse trees
amounting to a user-visible API break, and that's not an acceptable
situation.

We already have this issue of course with respect to C-code add-ons,
but (1) we've established an understanding that people should have to
recompile those for every major release, and (2) changes such as adding
a new field, or even changing an existing field that you don't care
about, don't break C source code. I don't know exactly what you're
imagining that user-written triggers would do with nodeToString strings,
but I'd bet a good lunch that people will use ad-hoc interpretation
methods that are not robust against changes at all. And then they'll
blame us when their triggers break --- not unreasonably, because we
failed to provide a sane API for them to use.

We really need some higher-level API than the raw parse tree, and
I have to admit that I have no idea what that would look like.
But exposing parse trees to user-written triggers is a decision
that we will come to regret, probably as soon as the next release.

regards, tom lane


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-18 19:54:11
Message-ID: m2vcpd1wrg.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Hmm ... I don't think that I *am* ok with that. ISTM that we'd then
> find ourselves with any changes in utility statement parse trees
> amounting to a user-visible API break, and that's not an acceptable
> situation.

Oh, you mean like exposing the parser for syntax coloring etc. I failed
to see that it's the same case. Do we have an acceptable proposal on that
front yet?

> We already have this issue of course with respect to C-code add-ons,
> but (1) we've established an understanding that people should have to
> recompile those for every major release, and (2) changes such as adding
> a new field, or even changing an existing field that you don't care
> about, don't break C source code. I don't know exactly what you're
> imagining that user-written triggers would do with nodeToString strings,
> but I'd bet a good lunch that people will use ad-hoc interpretation
> methods that are not robust against changes at all. And then they'll
> blame us when their triggers break --- not unreasonably, because we
> failed to provide a sane API for them to use.

Could we offer people a sane API?

Another way could be to bypass BEFORE triggers and let people look at
the catalogs in the AFTER command trigger, giving them the object OID,
name and schema name so they can do their own lookups.

You can still RAISE EXCEPTION in an AFTER command trigger to cancel the
command's execution; what you can not do anymore is cancel the command
without killing the current transaction.
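
As a sketch of what that could look like (assuming a hypothetical AFTER
trigger signature that passes the object OID plus schema and object name;
the naming policy enforced here is just an example), a trigger function
could do its own catalog lookups:

CREATE FUNCTION reject_uppercase_columns
(
 IN objectid oid,
 IN schemaname text,
 IN objectname text
)
 RETURNS void
 LANGUAGE plpgsql
AS $$
DECLARE
  bad_column name;
BEGIN
  -- the table already exists when an AFTER command trigger runs, so its
  -- columns can simply be looked up in pg_attribute
  SELECT a.attname
    INTO bad_column
    FROM pg_attribute a
   WHERE a.attrelid = objectid
     AND a.attnum > 0
     AND NOT a.attisdropped
     AND a.attname <> lower(a.attname)
   LIMIT 1;

  IF bad_column IS NOT NULL THEN
    -- raising an error aborts the transaction, undoing the CREATE TABLE
    RAISE EXCEPTION 'column "%" violates the naming policy', bad_column;
  END IF;
END;
$$;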

> We really need some higher-level API than the raw parse tree, and
> I have to admit that I have no idea what that would look like.
> But exposing parse trees to user-written triggers is a decision
> that we will come to regret, probably as soon as the next release.

I was under the illusion that providing users with ready-to-tweak
examples of robust-against-changes code would cut it. I'd like the
command triggers patch not to depend on designing this API we need.

What do you think of removing the parsetree and the BEFORE trigger
support (so that the trigger function can query the catalogs)?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-18 21:17:57
Message-ID: 22282.1324243077@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>> We really need some higher-level API than the raw parse tree, and
>> I have to admit that I have no idea what that would look like.
>> But exposing parse trees to user-written triggers is a decision
>> that we will come to regret, probably as soon as the next release.

> I was under the illusion that providing users with ready to tweak
> examples of robust-against-changes code would cut it. I'd like the
> command triggers patch not to depend on designing this API we need.

Well, we don't have any such examples, because frankly the nodeToString
representation is pretty damn unfriendly. The only code we have that
does anything with it at all is the readfuncs.c code that turns it back
into trees of C structs, and that's no help for triggers not themselves
written in C. Besides which, readfuncs.c is the poster child for code
that does have to change every time we tweak the struct definitions.
We can't tell people to copy that approach.

> What do you think of removing the parsetree and the BEFORE trigger
> support (so that trigger function can query the catalogs)?

Well, it gets us out of the business of inventing a suitable API,
but I think it also reduces the feature to a point of near uselessness.
Essentially we'd be saying to trigger authors "something changed, feel
free to inspect the catalogs and see if you can guess what".

Just because the problem is hard doesn't mean you can get away with
not solving it.

regards, tom lane


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-18 22:11:00
Message-ID: m28vm91qff.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Well, we don't have any such examples, because frankly the nodeToString
> representation is pretty damn unfriendly. The only code we have that

I tend to agree here, but I know that Jan can be quite convincing when
he says that it is in fact very friendly.

> does anything with it at all is the readfuncs.c code that turns it back
> into trees of C structs, and that's no help for triggers not themselves
> written in C. Besides which, readfuncs.c is the poster child for code
> that does have to change every time we tweak the struct definitions.
> We can't tell people to copy that approach.

Providing the same nested C structs thingy in python or perl or tcl
might be feasible and not too sketchy to maintain, but I'm failing to
see how to even approach that for plpgsql.

>> What do you think of removing the parsetree and the BEFORE trigger
>> support (so that trigger function can query the catalogs)?
>
> Well, it gets us out of the business of inventing a suitable API,
> but I think it also reduces the feature to a point of near uselessness.

Not being generic and flexible is not the same thing as not being of any
use at all. Extension whitelisting is still possible to implement
because all you need to know is the extension's name, and then you choose
whether to let the command string you're given execute or not. The same
goes for replication or simple auditing cases: you still have the plain
command string to play with.
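
For example, a minimal whitelisting sketch (using the proposed
four-text-argument API and BEFORE COMMAND syntax, and assuming CREATE
EXTENSION is among the supported command tags; the whitelist table and
its contents are made up) only needs the object name:

CREATE TABLE extension_whitelist(extname text PRIMARY KEY);

INSERT INTO extension_whitelist VALUES ('hstore'), ('pg_trgm');

CREATE FUNCTION check_extension_whitelist
(
 IN cmd_string text,
 IN cmd_nodestring text,
 IN schemaname text,
 IN objectname text
)
 RETURNS void
 LANGUAGE plpgsql
AS $$
BEGIN
  -- block the command before it runs unless the extension is whitelisted;
  -- cmd_nodestring is not needed here at all
  IF NOT EXISTS (SELECT 1
                   FROM extension_whitelist
                  WHERE extname = objectname)
  THEN
    RAISE EXCEPTION 'extension "%" is not whitelisted', objectname;
  END IF;
END;
$$;

CREATE TRIGGER extension_whitelist_check
        BEFORE COMMAND CREATE EXTENSION
       EXECUTE PROCEDURE check_extension_whitelist();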

Not useful enough to be what we ship in 9.2, I can follow you there;
not useful at all, disagreed.

> Essentially we'd be saying to trigger authors "something changed, feel
> free to inspect the catalogs and see if you can guess what".

No, we'd also be providing the main OID of the object that changed (a
pg_class entry for a CREATE TABLE command, etc.), the object name and its
schema name, and the command string too. ALTER TABLE is still
difficult to handle; other, simpler commands might be ok.

> Just because the problem is hard doesn't mean you can get away with
> not solving it.

That is the single simplest way of handling it, though, so I had to try
that first. Now, maybe we can find the right approach to publishing the
parse tree this month still. Any ideas welcome!

I guess XML would be ok but we don't embed powerful enough tools, and
JSON might be perfect but we would need to have a full-blown datatype
and functions to work with it from plpgsql. What other tree-ish data
type could we have?

EXPLAIN is already able to spit out XML and JSON (and YAML) but the
typical client consuming that output is not running as a backend stored
procedure, so I guess that's not a precedent and we still need something
with good support (a type, operators, walking functions…) to back it.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-19 02:02:04
Message-ID: CA+Tgmobfd4xnLUAPxtEJrc+JHhoWnkCiE8qT_ckSR=e1ScFT2w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Dec 18, 2011 at 5:11 PM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
> I guess XML would be ok but we don't embed powerful enough tools, and
> JSON might be perfect but we would need to have a full blown datatype
> and functions to work with that from plpgsql.  What other tree-ish data
> type can we have?
>
> EXPLAIN is already able to spit out XML and JSON (and YAML) but the
> typical client consuming that output is not running as a backend stored
> procedure, so I guess that's not a precedent and we still need something
> with a good support (type, operators, walking functions…) to back it.

Right. If we're actually going to expose the parse tree, I think JSON
(or even XML) would be a far better way to expose that than the
existing nodeToString() output. Sure, you could make do with the
nodeToString() output for some things, especially in PL/perl or
PL/python. But JSON would be far better, since it's a standard format
rather than something we just made up, and could be used in PL/pgsql
as well, given proper support functions.

Another option would be to do something like this:

CREATE TYPE pg_trigger_on_create_table AS (
catalog_name text,
schema_name text,
relation_name text,
...
);

That's not materially different from exposing the parse tree, but it's
more convenient for PL/pgsql and doesn't require adding a new datatype
like JSON. It might require an awful lot of tuple-construction code
and datatype definitions, though.

Still another option would be to expose some of this information
through "magical" variables or functions, sort of like the way that
declaring a function to return trigger causes it to have NEW and OLD.
It could have STUFF.catalog_name, STUFF.schema_name,
STUFF.relation_name, or whatever we want.

None of these approaches really get around the fact that if the
command syntax changes, the trigger API has to change, too. You might
be able to get around that for CREATE commands by having only AFTER
triggers, and just passing the OID; and for DROP commands by having
only BEFORE triggers, and just passing the OID. But I don't see any
way to make it work very well for ALTER commands.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-19 14:31:17
Message-ID: 4EEF4AB5.6030902@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/18/2011 09:02 PM, Robert Haas wrote:
> If we're actually going to expose the parse tree, I think JSON
> (or even XML) would be a far better way to expose that than the
> existing nodeToString() output. Sure, you could make do with the
> nodeToString() output for some things, especially in PL/perl or
> PL/python. But JSON would be far better, since it's a standard format
> rather than something we just made up, and could be used in PL/pgsql
> as well, given proper support functions.

That seems pretty sensible; I wouldn't want to make it a hard
requirement though. There are three major ways this could go for 9.2:

1) Nothing is changed in core; the best that can be done is whatever you
can cram into an extension.

2) 9.2 is released with Command Triggers using nodeToString() output as
the detailed description. That part is labeled "experimental and the
API is expected to change completely in the next release".

3) 9.2 gets enough JSON support to also have an early nodeToJSON API
instead. Now it's labeled "early release and some API changes may be
needed in future releases".

From the perspective of enabling a developer community to spring up
around working in this area, I don't see a huge difference between (2)
and (3). We're going in a direction I don't think is very well explored
yet, by anyone. The idea that we're going to learn enough to accumulate
a comprehensive, stable API before release is rather optimistic.
There's something to be said for Dimitri's suggested approach from the
perspective of "ship it knowing it works for features A, B, then let's
see what else the users think to do with it". We can't determine what
feature-complete looks like from what other people are doing anymore; the
only way to find out is to see what the world at large makes of it after
release.

Think of it as being like the EXPLAIN plan output. That shipped as just
text; programs popped up to parse it anyway, and they suffered some
breakage with each new release. That was still a major win. Then new,
easier-to-parse formats appeared, making the bar to entry for writing
such a tool much lower. And the construction of a better output API
benefited from seeing what people had actually been struggling with
before then.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-19 14:39:44
Message-ID: 1324305473-sup-3773@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Excerpts from Dimitri Fontaine's message of Sun Dec 18 16:54:11 -0300 2011:
>
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> > Hmm ... I don't think that I *am* ok with that. ISTM that we'd then
> > find ourselves with any changes in utility statement parse trees
> > amounting to a user-visible API break, and that's not an acceptable
> > situation.
>
> Oh, you mean like exposing the parser for syntax coloring etc. I failed
> to see it's the same case. Do we have an acceptable proposal on that
> front yet?

The conclusion that was reached during the developers' meeting was that
those interested should use a technique similar to the one used by the
ecpg parser, namely use some sort of tool to transform the gram.y source
code into something else (different productions). That idea is not
useful to you here, I'm afraid.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-19 16:12:13
Message-ID: CA+TgmobNRf5y8skr0wqfh_wExg6JvASKqPuKRUb2kaSmz=Jrqg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Dec 19, 2011 at 9:31 AM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> That seems pretty sensible; I wouldn't want to make it a hard requirement
> though.  There are three major ways this could go for 9.2:
>
> 1) Nothing is changed in core, the best that can be done is whatever you can
> cram into an extension.
>
> 2) 9.2 is released with Command Triggers using nodeToString() output as the
> detailed description.  That part is labeled "experimental and the API is
> expected to change completely in the next release"
>
> 3) 9.2 gets enough JSON support to also have an early nodeToJSON API
> instead.  Now it's labeled "early release and some API changes may be needed
> in future releases".
>
> From the perspective of enabling a developer community to spring up around
> working in this area, I don't see a huge difference between (2) and (3).

I do. Anyone coding in PL/pgsql is going to find the nodeToString()
output unusable, and we can easily provide a built-in JSON datatype
with enough richness to make that problem go away in time for 9.2.
People in PL/python and PL/perl may be a bit better off, but I see no
reason to ship something for 9.2 and then break it for 9.3 when we
could perfectly well make that compatibility break before the release.
(And, in case it's not clear, yes, I am volunteering to do the work,
if it comes down to that.)

But before we get bogged down in the data representation, I think we
need to make a more fundamental decision: to what extent are we OK
with exporting the parse tree at all, in any form? Tom is arguing
that we shouldn't, and I see his point: the recent DROP command rework
would have broken everybody's command triggers if we had adopted this
proposal, and that would be a real shame, because I don't particularly
like the idea that we can't continue to improve the code and refactor
things because someone out there might be depending on an older and
less well-considered behavior.

On the flip side, I don't really see any other way to make this
feature work at all. I think that AFTER CREATE triggers and BEFORE
DROP triggers could potentially be implemented by just passing OIDs
in, and that might be useful enough for many people. But what about
ALTER? I don't see that you're going to be able to write any sort of
meaningful triggers around ALTER without passing at least some of the
parse tree information to the trigger function.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: Command Triggers
Date: 2011-12-19 16:20:51
Message-ID: 201112191720.52175.andres@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On Monday, December 19, 2011 05:12:13 PM Robert Haas wrote:
> On Mon, Dec 19, 2011 at 9:31 AM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> > That seems pretty sensible; I wouldn't want to make it a hard requirement
> > though. There are three major ways this could go for 9.2:
> >
> > 1) Nothing is changed in core, the best that can be done is whatever you
> > can cram into an extension.
> >
> > 2) 9.2 is released with Command Triggers using nodeToString() output as
> > the detailed description. That part is labeled "experimental and the
> > API is expected to change completely in the next release"
> >
> > 3) 9.2 gets enough JSON support to also have an early nodeToJSON API
> > instead. Now it's labeled "early release and some API changes may be
> > needed in future releases".
> >
> > From the perspective of enabling a developer community to spring up
> > around working in this area, I don't see a huge difference between (2)
> > and (3).
>
> I do. Anyone coding in PL/pgsql is going to find the nodeToString()
> output unusable, and we can easily provide a built-in JSON datatype
> with enough richness to make that problem go away in time for 9.2.
> People in PL/python and PL/perl may be a bit better off, but I see no
> reason to ship something for 9.2 and then break it for 9.3 when we
> could perfectly well make that compatibility break before the release.
> (And, in case it's not clear, yes, I am volunteering to do the work,
> if it comes down to that.)
I personally think you're underestimating the complexity of providing a
sensible JSON compatibility shim on top of the nodestring representation.
But I would like to be proven wrong ;)
What's your idea?

> But before we get bogged down in the data representation, I think we
> need to make a more fundamental decision: to what extent are we OK
> with exporting the parse tree at all, in any form? Tom is arguing
> that we shouldn't, and I see his point: the recent DROP command rework
> would have broken everybody's command triggers if we had adopted this
> proposal, and that would be a real shame, because I don't particularly
> like the idea that we can't continue to improve the code and refactor
> things because someone out there might be depending on an older and
> less well-considered behavior.
I don't see any realistic way to present the data in a way that's abstracted from
the internals for now. The infrastructure is completely new and we don't
really know what it will be used for. I personally have no problem requiring
that any nontrivial nodestring access has to be done via C functions for now.
Building a completely new form of parse tree seems way too much effort/flamebait
to me.

Andres


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-19 16:22:54
Message-ID: CAFj8pRBZM6vR93xt2N2iHr6tXehqAt=jAhtGNvi4EubbWqHqFA@mail.gmail.com
Lists: pgsql-hackers

2011/12/19 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Mon, Dec 19, 2011 at 9:31 AM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
>> That seems pretty sensible; I wouldn't want to make it a hard requirement
>> though.  There are three major ways this could go for 9.2:
>>
>> 1) Nothing is changed in core, the best that can be done is whatever you can
>> cram into an extension.
>>
>> 2) 9.2 is released with Command Triggers using nodeToString() output as the
>> detailed description.  That part is labeled "experimental and the API is
>> expected to change completely in the next release"
>>
>> 3) 9.2 gets enough JSON support to also have an early nodeToJSON API
>> instead.  Now it's labeled "early release and some API changes may be needed
>> in future releases".
>>
>> From the perspective of enabling a developer community to spring up around
>> working in this area, I don't see a huge difference between (2) and (3).
>
> I do.  Anyone coding in PL/pgsql is going to find the nodeToString()
> output unusable, and we can easily provide a built-in JSON datatype
> with enough richness to make that problem go away in time for 9.2.
> People in PL/python and PL/perl may be a bit better off, but I see no
> reason to ship something for 9.2 and then break it for 9.3 when we
> could perfectly well make that compatibility break before the release.
>  (And, in case it's not clear, yes, I am volunteering to do the work,
> if it comes down to that.)

Absolutely.

Parsing such a document is not a good approach for PL/pgSQL. We could
instead enhance a diagnostics statement with trigger values.

There could be a lot of variables, and it is cheap because we can
compute the content only when it is requested:

STATEMENT_NAME,
OBJECT_SCHEMA,
OBJECT_NAME,
OBJECT_TYPE,
OBJECT_OID,
ATTRIBUTE_NAME,
ATTRIBUTE_OID,
ATTRIBUTE_TYPE,
STATEMENT_PARAMETERS_ARRAY

PL/pgSQL is not good with recursive data; it would be nice if all the
necessary data could be flat.

Regards

Pavel

>
> But before we get bogged down in the data representation, I think we
> need to make a more fundamental decision: to what extent are we OK
> with exporting the parse tree at all, in any form?  Tom is arguing
> that we shouldn't, and I see his point: the recent DROP command rework
> would have broken everybody's command triggers if we had adopted this
> proposal, and that would be a real shame, because I don't particularly
> like the idea that we can't continue to improve the code and refactor
> things because someone out there might be depending on an older and
> less well-considered behavior.
>
> On the flip side, I don't really see any other way to make this
> feature work at all.  I think that AFTER CREATE triggers and BEFORE
> DROP triggers could potentially be implemented by just passing OIDs
> in, and that might be useful enough for many people.  But what about
> ALTER?  I don't see that you're going to be able to write any sort of
> meaningful triggers around ALTER without passing at least some of the
> parse tree information to the trigger function.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Greg Smith <greg(at)2ndquadrant(dot)com>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: Command Triggers
Date: 2011-12-19 16:41:43
Message-ID: CA+TgmoZYXn=+Po7gS93_1Bb1mYsai8Y=hgd9b8OnjydCtM+Y0A@mail.gmail.com
Lists: pgsql-hackers

On Mon, Dec 19, 2011 at 11:20 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> I do.  Anyone coding in PL/pgsql is going to find the nodeToString()
>> output unusable, and we can easily provide a built-in JSON datatype
>> with enough richness to make that problem go away in time for 9.2.
>> People in PL/python and PL/perl may be a bit better off, but I see no
>> reason to ship something for 9.2 and then break it for 9.3 when we
>> could perfectly well make that compatibility break before the release.
>>  (And, in case it's not clear, yes, I am volunteering to do the work,
>> if it comes down to that.)
> I personally think you're underestimating the complexity of providing a
> sensible JSON compatibility shim on top of the nodestring representation. But I
> would like to be proven wrong ;)
> What's your idea?

I haven't gotten that far down into the minutiae yet. :-)

But the reason I think it won't be too bad is because the current
representation is darn close to JSON already:

{VAR :varno 2 :varattno 1 :vartype 25 :vartypmod -1 :varcollid 100
:varlevelsup 0 :varnoold 2 :varoattno 1 :location 9998}

=>

{"_":"VAR","varno":2,"varattno":1,"vartype":25,"vartypmod":-1,"varcollid":100,"varlevelsup":0,"varnoold":2,"varoattno":1,"location":9998}

If you've got something like:

{OPEXPR :opno 98 :opfuncid 67 :opresulttype 16 :opretset false
:opcollid 0 :inputcollid 100 :args ({VAR :varno 2 :varattno 1 :vartype
25 :vartypmod -1 :varcollid 100 :varlevelsup 0 :varnoold 2 :varoattno
1 :location 9998} {VAR :varno 1 :varattno 1 :vartype 25 :vartypmod -1
:varcollid 100 :varlevelsup 0 :varnoold 1 :varoattno 1 :location
10009}) :location 10007}

...then the value for the "args" label will just be an object rather
than an integer.

>> But before we get bogged down in the data representation, I think we
>> need to make a more fundamental decision: to what extent are we OK
>> with exporting the parse tree at all, in any form?  Tom is arguing
>> that we shouldn't, and I see his point: the recent DROP command rework
>> would have broken everybody's command triggers if we had adopted this
>> proposal, and that would be a real shame, because I don't particularly
>> like the idea that we can't continue to improve the code and refactor
>> things because someone out there might be depending on an older and
>> less well-considered behavior.
> I don't see any realistic way to present the data in a way that's abstracted from
> the internals for now. The infrastructure is completely new and we don't
> really know what it will be used for.

That's my feeling as well, but I'm hoping Tom or someone else has a better idea.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: Command Triggers
Date: 2011-12-19 17:52:11
Message-ID: 6471.1324317131@sss.pgh.pa.us
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> On Monday, December 19, 2011 05:12:13 PM Robert Haas wrote:
>> I do. Anyone coding in PL/pgsql is going to find the nodeToString()
>> output unusable, and we can easily provide a built-in JSON datatype
>> with enough richness to make that problem go away in time for 9.2.

I hear the sound of goalposts moving.

> I personally think you're underestimating the complexity of providing a
> sensible JSON compatibility shim on top of the nodestring representation. But I
> would like to be proven wrong ;)

If we were going to do this at all, the way to do it is to flush the
existing nodestring representation and redefine it as being JSON. No?
If this is as easy as people are claiming, it should not be hard to
revise the lower-level bits of outfuncs/readfuncs to make the text
representation compatible. And there's no reason we can't change
what is stored in pg_rewrite.
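To give a sense of how localized such a change might be: the per-field output
already goes through a handful of macros, so retargeting them is mostly a matter
of changing their bodies. The second definition below is purely a sketch; the
node-tag macro, the comma before the first field, and readfuncs.c's scanner
would all need matching adjustments.

/* Roughly today's definition in src/backend/nodes/outfuncs.c: */
#define WRITE_INT_FIELD(fldname) \
    appendStringInfo(str, " :" CppAsString(fldname) " %d", node->fldname)

/* ...which, in a JSON-emitting outfuncs.c, might become something like: */
#define WRITE_INT_FIELD(fldname) \
    appendStringInfo(str, ",\"" CppAsString(fldname) "\":%d", node->fldname)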

>> But before we get bogged down in the data representation, I think we
>> need to make a more fundamental decision: to what extent are we OK
>> with exporting the parse tree at all, in any form? Tom is arguing
>> that we shouldn't, and I see his point: the recent DROP command rework
>> would have broken everybody's command triggers if we had adopted this
>> proposal, and that would be a real shame, because I don't particularly
>> like the idea that we can't continue to improve the code and refactor
>> things because someone out there might be depending on an older and
>> less well-considered behavior.

The problem that changing the nodestring representation could help with
is making user-written triggers roughly as robust as C code is, to wit
that you don't have to change it as long as the specific fields it
touches aren't redefined. The question is whether that's good enough.
The DROP command changes provide a pretty strong clue that it isn't.
Admittedly, that's not the sort of change we make too often. But I
will be seriously annoyed if we start getting the sort of pushback
on parsetree changes that we've been hearing from certain quarters about
configuration file changes. Those structures are *internal* and we have
got to have the flexibility to whack them around.

regards, tom lane


From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command Triggers
Date: 2011-12-19 18:25:38
Message-ID: 4EEF81A2.1040908@2ndQuadrant.com
Lists: pgsql-hackers

On 12/19/2011 11:12 AM, Robert Haas wrote:
> But before we get bogged down in the data representation, I think we
> need to make a more fundamental decision: to what extent are we OK
> with exporting the parse tree at all, in any form? Tom is arguing
> that we shouldn't, and I see his point: the recent DROP command rework
> would have broken everybody's command triggers if we had adopted this
> proposal, and that would be a real shame, because I don't particularly
> like the idea that we can't continue to improve the code and refactor
> things because someone out there might be depending on an older and
> less well-considered behavior.

Any interface here would need to be an interface in the same sense Linux
uses the term: subject to change in every major version, and maybe even in
a minor one if that's the best way to solve a higher-priority issue. An
example we've been consuming that comes to mind is the "API" for keeping
processes from being killed by the OOM killer. It's far from stable:
http://archives.postgresql.org/message-id/4CE5E437.7080902@2ndquadrant.com
but it's still possible for users of it to keep up with new releases, and
when feasible some work toward backward compatibility is done (but not
always).

As a tool author, I would expect that anything working at a level where
the needed data is only available from the parse tree would need to be
re-tested against each new version, and then get version-specific
changes as needed. Setting the expectations bar any higher for
consumers of that interface would be unrealistic. The minority of
people who'd like to use this feature shouldn't want to see PostgreSQL
development hamstrung for the majority either, and the standards for
breakage here should be clear from the beginning--unlike those for, say,
GUC changes between releases.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Greg Smith <greg(at)2ndquadrant(dot)com>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: Command Triggers
Date: 2011-12-19 18:37:46
Message-ID: CA+Tgmobb=YcFyr2ygw9e6Y4hBaL6KO8NK8a6GNHUu=3bhp9SwA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Dec 19, 2011 at 12:52 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I personally think you're underestimating the complexity of providing a
>> sensible JSON compatibility shim on top of the nodestring representation. But I
>> would like to be proven wrong ;)
>
> If we were going to do this at all, the way to do it is to flush the
> existing nodestring representation and redefine it as being JSON.  No?
> If this is as easy as people are claiming, it should not be hard to
> revise the lower-level bits of outfuncs/readfuncs to make the text
> representation compatible.  And there's no reason we can't change
> what is stored in pg_rewrite.

I thought about that. A quick inspection suggests that this would
slightly increase the size of the stored rules, because the node type
would need to become a key, which would add at least a few more
characters, and also because we'd need more quoting. That is:

{THING :wump 1}

becomes, at a minimum:

{"_": "THING", "wump": 1}

And that's using a somewhat-lame single character label for the node
tag. Now, I suspect that in practice the cost of the stored rules
becoming slightly larger is negligible, so maybe it's worth it.

>>> But before we get bogged down in the data representation, I think we
>>> need to make a more fundamental decision: to what extent are we OK
>>> with exporting the parse tree at all, in any form?  Tom is arguing
>>> that we shouldn't, and I see his point: the recent DROP command rework
>>> would have broken everybody's command triggers if we had adopted this
>>> proposal, and that would be a real shame, because I don't particularly
>>> like the idea that we can't continue to improve the code and refactor
>>> things because someone out there might be depending on an older and
>>> less well-considered behavior.
>
> The problem that changing the nodestring representation could help with
> is making user-written triggers roughly as robust as C code is, to wit
> that you don't have to change it as long as the specific fields it
> touches aren't redefined.  The question is whether that's good enough.

Agreed.

> The DROP command changes provide a pretty strong clue that it isn't.
> Admittedly, that's not the sort of change we make too often.  But I
> will be seriously annoyed if we start getting the sort of pushback
> on parsetree changes that we've been hearing from certain quarters about
> configuration file changes.  Those structures are *internal* and we have
> got to have the flexibility to whack them around.

Yes.

Maybe we should try to split the baby here and defer the question of
whether to expose any of the parse tree internals, and if so how much,
to a future release. It seems to me that we could design a fairly
useful set of functionality around AFTER-CREATE, BEFORE-DROP, and
maybe even AFTER-ALTER triggers without exposing any parse tree
details. For CREATE and ALTER, that would make it the client's
responsibility to inspect the system catalogs and figure out what had
happened and what to do about it, which is admittedly not ideal, but
it would be more than we have now, and it would then give us the
option to consider requests to expose more information in future
releases on a case-by-case basis, rather than making a blanket
decision about whether to expose the parse tree or not. I have a
sneaking suspicion that, while we probably can't get by without
exposing any parse tree information ever, the amount we truly need to
expose might end up being only a small percentage of the total...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Greg Smith <greg(at)2ndquadrant(dot)com>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: Command Triggers
Date: 2012-01-10 21:11:57
Message-ID: m2obubxnxe.fsf@2ndQuadrant.fr
Lists: pgsql-hackers

Hi,

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Maybe we should try to split the baby here and defer the question of
> whether to expose any of the parse tree internals, and if so how much,
> to a future release. It seems to me that we could design a fairly
> useful set of functionality around AFTER-CREATE, BEFORE-DROP, and
> maybe even AFTER-ALTER triggers without exposing any parse tree
> details.

+1

Also remember that you have a “normalized” command string to play with.
Lots of use cases are already covered there. The other ones would need a
tree representation that's easy to consume, which in the current state
of affairs (I have seen no progress on the JSON data type and
facilities) is very hard to imagine when you consider PL/pgSQL.

So unless I hear about a show stopper, I'm going to work some more on
the command trigger patch, which still has some rough edges to polish.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Greg Smith <greg(at)2ndquadrant(dot)com>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: Command Triggers
Date: 2012-01-13 22:17:03
Message-ID: m2r4z3tfhc.fsf@2ndQuadrant.fr
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Maybe we should try to split the baby here and defer the question of
> whether to expose any of the parse tree internals, and if so how much,
> to a future release. It seems to me that we could design a fairly
> useful set of functionality around AFTER-CREATE, BEFORE-DROP, and
> maybe even AFTER-ALTER triggers without exposing any parse tree
> details.

Please find attached v5 of the patch, with nodeToString() support removed.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

Attachment Content-Type Size
command-trigger.v5.patch.gz application/octet-stream 32.9 KB

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Greg Smith <greg(at)2ndquadrant(dot)com>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: Command Triggers
Date: 2012-01-13 22:22:37
Message-ID: 201201132322.38161.andres@anarazel.de
Lists: pgsql-hackers

On Monday, December 19, 2011 07:37:46 PM Robert Haas wrote:
> Maybe we should try to split the baby here and defer the question of
> whether to expose any of the parse tree internals, and if so how much,
> to a future release.
I personally think this is an error and those details should at least be
available at the C level (e.g. some pg_command_trigger_get_plan() function,
only available via C) to allow sensible playing around with that knowledge. I
don't really see us making progress towards a nice interface unless we get
something to play around with out there.

Andres


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Greg Smith <greg(at)2ndquadrant(dot)com>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: Command Triggers
Date: 2012-01-13 22:53:32
Message-ID: m2lipbrz83.fsf@2ndQuadrant.fr
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> I personally think this is an error and those details should at least be
> available on the c level (e.g. some pg_command_trigger_get_plan() function,
> only available via C) to allow sensible playing around with that knowledge. I
> don't really see making progress towards a nice interface unless we get
> something to play around with out there.

If you target C coded triggers then all you need to do is provide a
pointer to the Node *parsetree, I would think. What else?

The drawback though is still the same: the day you do that, you've
proposed a public API, and changing the parsetree stops being internal
refactoring. The way around this problem is that if you want a command
trigger in C, you can just write an extension that implements the
ProcessUtility hook. Bonus: you can have that working with already
released versions of PostgreSQL.
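For reference, here is a minimal sketch of such a hook-based extension, written
against the 9.1-era ProcessUtility_hook signature; the module name and the
NOTICE text are placeholders, not anything from the patch under discussion.

#include "postgres.h"
#include "fmgr.h"
#include "tcop/utility.h"

PG_MODULE_MAGIC;

void _PG_init(void);
void _PG_fini(void);

static ProcessUtility_hook_type prev_ProcessUtility = NULL;

/* runs for every utility command, with full access to the raw parse tree */
static void
snitch_ProcessUtility(Node *parsetree, const char *queryString,
                      ParamListInfo params, bool isTopLevel,
                      DestReceiver *dest, char *completionTag)
{
    ereport(NOTICE,
            (errmsg("snitch: about to run %s", CreateCommandTag(parsetree))));

    /* chain to the previous hook, or to the standard implementation */
    if (prev_ProcessUtility)
        prev_ProcessUtility(parsetree, queryString, params,
                            isTopLevel, dest, completionTag);
    else
        standard_ProcessUtility(parsetree, queryString, params,
                                isTopLevel, dest, completionTag);
}

void
_PG_init(void)
{
    prev_ProcessUtility = ProcessUtility_hook;
    ProcessUtility_hook = snitch_ProcessUtility;
}

void
_PG_fini(void)
{
    ProcessUtility_hook = prev_ProcessUtility;
}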

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: Command Triggers
Date: 2012-01-14 00:23:20
Message-ID: 201201140123.20514.andres@anarazel.de
Lists: pgsql-hackers

On Friday, January 13, 2012 11:53:32 PM Dimitri Fontaine wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > I personally think this is an error and those details should at least be
> > available on the c level (e.g. some pg_command_trigger_get_plan()
> > function, only available via C) to allow sensible playing around with
> > that knowledge. I don't really see making progress towards a nice
> > interface unless we get something to play around with out there.
>
> If you target C coded triggers then all you need to do is provide a
> pointer to the Node *parsetree, I would think. What else?
Yes.

Being able to turn that into a statement again is still valuable imo.

> The drawback though is still the same, the day you do that you've
> proposed a public API and changing the parsetree stops being internal
> refactoring.
Yes, sure. I don't particularly care, though, actually. Changing some generic
guts of trigger functions is not really that much of a problem compared to the
other stuff involved in a version migration.

> The way around this problem is that if you want a command
> trigger in C, just write an extension that implements the Process
> Utility hook.
The point is that with CREATE COMMAND TRIGGER only the internal part of the
triggers needs to change, not the external interface. Which is a big difference
from my POV.
Also, hooks are relatively hard to stack, i.e. it's hard to use them sensibly
from multiple independent projects. They also cannot be installed purely via
extensions/SQL.

Andres


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: Command Triggers
Date: 2012-01-14 11:09:28
Message-ID: m27h0usfpz.fsf@2ndQuadrant.fr
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
>> If you target C coded triggers then all you need to do is provide a
>> pointer to the Node *parsetree, I would think. What else?
> Yes.
>
> Being able to turn that into a statement again is still valuable imo.

That part of the WIP code is still in the patch, yes.

>> The drawback though is still the same, the day you do that you've
>> proposed a public API and changing the parsetree stops being internal
>> refactoring.
> Yes, sure. I don't particularly care, though, actually. Changing some generic
> guts of trigger functions is not really that much of a problem compared to the
> other stuff involved in a version migration.

Let's hear about it from Tom, who's mainly been against publishing that.

> The point is that with CREATE COMMAND TRIGGER only the internal part of the
> triggers needs to change, not the external interface. Which is a big difference
> from my POV.

I'm not sure. The way you get the arguments would stay rather stable,
but the parsetree would change at each release: that's not a long-term
API here. I fail to see much difference between a hook and a command
trigger as soon as you've chosen to implement the feature in C.

> Also hooks are relatively hard to stack, i.e. its hard to use them sensibly
> from multiple independent projects. They also cannot be purely installed via
> extensions/sql.

That remains true: you can't easily know in which order your hooks will
get fired, contrary to triggers, and you can't even list the hooks.

I fear that we won't be able to answer your need here in 9.2, though.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: Command Triggers
Date: 2012-01-14 15:02:34
Message-ID: CA+TgmoaE-1f+2d3arqyWFqExo0YX75o-q5Bz=ZpSRn3A3+Qpjw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jan 13, 2012 at 5:53 PM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
>> I personally think this is an error and those details should at least be
>> available on the c level (e.g. some pg_command_trigger_get_plan() function,
>> only available via C) to allow sensible playing around with that knowledge. I
>> don't really see making progress towards a nice interface unless we get
>> something to play around with out there.
>
> If you target C coded triggers then all you need to do is provide a
> pointer to the Node *parsetree, I would think.  What else?
>
> The drawback though is still the same, the day you do that you've
> proposed a public API and changing the parsetree stops being internal
> refactoring.  The way around this problem is that if you want a command
> trigger in C, just write an extension that implements the Process
> Utility hook.  Bonus, you can have that working with already released
> versions of PostgreSQL.

But on the flip side, I think we're generally a bit more flexible
about exposing things via C than through the procedural languages. I
think it's reasonable for people to complain about their PL/pgsql
functions breaking between major releases, but I have less sympathy
for someone who has the same problem when coding in C. When you
program in C, you're interfacing with the guts of the system, and we
can't improve the system without changing the guts.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: Command Triggers
Date: 2012-01-14 21:46:18
Message-ID: m2ipkeq7o5.fsf@2ndQuadrant.fr
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> But on the flip side, I think we're generally a bit more flexible
> about exposing things via C than through the procedural languages.

So we could still expose the parsetree of the current command. I wonder
if it's already possible to get that from a C-coded trigger, but I'll
admit I have yet to code a trigger in C. Will look into that.

Then, as Andres proposed, a new function would be available to get the
value, so we're not changing the trigger procedure's function API when
the language is C…

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: Command Triggers
Date: 2012-01-17 21:43:48
Message-ID: m2r4yygg2z.fsf@2ndQuadrant.fr
Lists: pgsql-hackers

Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr> writes:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> But on the flip side, I think we're generally a bit more flexible
>> about exposing things via C than through the procedural languages.
>
> Then as Andres proposed, a new function would be available to get the
> value, we're not changing the trigger procedure function API in case the
> language is C…

I've been updating my github branch with a patch that provides the
parsetree to C coded command trigger functions only, as their 5th
argument, of type INTERNAL (so that only C coded procs apply).

https://github.com/dimitri/postgres/compare/master...command_triggers
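As a rough illustration of what a C-coded command trigger function could look
like under that WIP design: the argument positions, the function name and the
cast below are assumptions taken from the description above, not a settled API.

#include "postgres.h"
#include "fmgr.h"
#include "nodes/nodes.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(snitch_cmdtrigger);

Datum snitch_cmdtrigger(PG_FUNCTION_ARGS);

Datum
snitch_cmdtrigger(PG_FUNCTION_ARGS)
{
    /* args 0..3: command string, node string, schema name, object name */
    Node   *parsetree = (Node *) PG_GETARG_POINTER(4);

    ereport(NOTICE,
            (errmsg("command trigger saw parse tree with node tag %d",
                    (int) nodeTag(parsetree))));

    PG_RETURN_VOID();
}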

I still have some cleaning to do before preparing the next patch
version, such as updating the documentation and dealing with rewrites of
CHECK and DEFAULT column constraints in CREATE TABLE. I had to add
support for the T_A_Const parser node, and now I'm about to see about
adding support for the T_A_Expr one, but I can't help wondering how the
rewriter could work without them.

What simple thing am I missing here, if any?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: Command Triggers
Date: 2012-01-17 21:57:05
Message-ID: 2281.1326837425@sss.pgh.pa.us
Lists: pgsql-hackers

Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr> writes:
> I still have some cleaning to do before preparing the next patch
> version, such as updating the documentation and dealing with rewrites of
> CHECK and DEFAULT column constraints in CREATE TABLE. I had to add
> support for the T_A_Const parser node, and now I'm about to see about
> adding support for the T_A_Expr one, but I can't help wondering how the
> rewriter could work without them.

It doesn't, and it shouldn't have to. If those nodes get to the
rewriter then somebody forgot to apply parse analysis. What's your test
case?

regards, tom lane


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: Command Triggers
Date: 2012-01-18 09:09:34
Message-ID: 8762g92x81.fsf@hi-media-techno.com
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr> writes:
>> I still have some cleaning to do before preparing the next patch
>> version, such as updating the documentation and dealing with rewrites of
>> CHECK and DEFAULT column constraints in CREATE TABLE. I had to add
>> support for the T_A_Const parser node, and now I'm about to see about
>> adding support for the T_A_Expr one, but I can't help wondering how the
>> rewriter could work without them.
>
> It doesn't, and it shouldn't have to. If those nodes get to the
> rewriter then somebody forgot to apply parse analysis. What's your test
> case?

I'm trying to rewrite the command string from the parse tree, and the
simple example that I use to raise an ERROR is the following:

create table foo (id serial, foo integer default 1, primary key(id));

I don't know whether the parse tree handed to ProcessUtility() has gone
through parse analysis, nor whether that's needed to execute the
commands, so maybe there is some high-level function to call before
walking the parse tree in my new rewriting support?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: Command Triggers
Date: 2012-01-18 14:46:14
Message-ID: 20982.1326897974@sss.pgh.pa.us
Lists: pgsql-hackers

Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>> It doesn't, and it shouldn't have to. If those nodes get to the
>> rewriter then somebody forgot to apply parse analysis. What's your test
>> case?

> I'm trying to rewrite the command string from the parse tree, and the
> simple example that I use to raise an ERROR is the following:

> create table foo (id serial, foo integer default 1, primary key(id));

That needs to go through transformCreateStmt(). The comments at the
head of parse_utilcmd.c might be informative.

While I've not looked at your patch, I can't escape the suspicion that
this means you are trying to do the wrong things in the wrong places.
Calling transformCreateStmt() from some random place is not going to
make things better; it is going to break CREATE TABLE, which expects to
do that for itself.

regards, tom lane


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: Command Triggers
Date: 2012-01-18 19:03:29
Message-ID: m2boq0hlz2.fsf@2ndQuadrant.fr
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>> create table foo (id serial, foo integer default 1, primary key(id));
>
> That needs to go through transformCreateStmt(). The comments at the
> head of parse_utilcmd.c might be informative.

Indeed, thanks for the heads up here.

> While I've not looked at your patch, I can't escape the suspicion that
> this means you are trying to do the wrong things in the wrong places.
> Calling transformCreateStmt() from some random place is not going to
> make things better; it is going to break CREATE TABLE, which expects to
> do that for itself.

From the comments in the file, it seems like I could either call the
function where I need it on a copy of the parse tree (as made by the
copyObject() function), or reshuffle either when that call happens or
when the command trigger user functions get called.

At the moment the trigger functions are called from
standard_ProcessUtility() and are given the parse tree as handed over to
that function, before the parse analysis.

We can easily enough copy the parse tree and do another round of parse
analysis on it only when some command triggers are going to get called.
Is the cost of doing so acceptable?
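Concretely, the idea would be something like the sketch below, using the
9.1-era transformCreateStmt() signature; the function name and where it would
be called from are illustrative only, and whether this is the right place to
do it at all is exactly what's in question.

#include "postgres.h"
#include "nodes/parsenodes.h"
#include "parser/parse_utilcmd.h"

static void
call_create_table_triggers(Node *parsetree, const char *queryString)
{
    if (IsA(parsetree, CreateStmt))
    {
        /* work on a copy so the command's own parse analysis is untouched */
        CreateStmt *stmt = (CreateStmt *) copyObject(parsetree);

        /* expands serial columns, constraints, etc. into concrete statements */
        List       *stmts = transformCreateStmt(stmt, queryString);

        /* ... walk stmts and call the registered trigger procedures ... */
        (void) stmts;
    }
}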

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-01-18 19:17:36
Message-ID: 1326914211-sup-6553@alvh.no-ip.org
Lists: pgsql-hackers


Excerpts from Dimitri Fontaine's message of Wed Jan 18 16:03:29 -0300 2012:

> At the moment the trigger functions are called from
> standard_ProcessUtility() and are given the parse tree as handed over to
> that function, before the parse analysis.
>
> We can easily enough copy the parse tree and do another round of parse
> analysis on it only when some command triggers are going to get called.
> Is the cost of doing so acceptable?

Huh, isn't it simpler to just pass the triggers the parse tree *after*
parse analysis? I don't understand what you're doing here.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Andres Freund <andres(at)anarazel(dot)de>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-01-18 19:25:48
Message-ID: 201201182025.49696.andres@anarazel.de
Lists: pgsql-hackers

On Wednesday, January 18, 2012 08:17:36 PM Alvaro Herrera wrote:
> Excerpts from Dimitri Fontaine's message of Wed Jan 18 16:03:29 -0300 2012:
> > At the moment the trigger functions are called from
> > standard_ProcessUtility() and are given the parse tree as handed over to
> > that function, before the parse analysis.
> >
> > We can easily enough copy the parse tree and do another round of parse
> > analysis on it only when some command triggers are going to get called.
> > Is the cost of doing so acceptable?
>
> Huh, isn't it simpler to just pass the triggers the parse tree *after*
> parse analysis? I don't understand what you're doing here.
Parse analysis is not exactly nicely separated for utility statements.

Andres


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: Command Triggers
Date: 2012-01-18 19:31:49
Message-ID: 27084.1326915109@sss.pgh.pa.us
Lists: pgsql-hackers

Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr> writes:
> We can easily enough copy the parse tree and do another round of parse
> analysis on it only when some command triggers are going to get called.
> Is the cost of doing so acceptable?

It's not the costs I'm worried about so much as the side effects ---
locks and so forth. Also, things like assignment of specific names
for indexes and sequences seem rather problematic. In the worst case
the trigger could run seeing "foo_bar_idx1" as the name of an index
to be created, and then when the action actually happens, the name
turns out to be "foo_bar_idx2" because someone else took the first name
meanwhile.

As I said, I think this suggests that you're trying to do the triggers
in the wrong place.

regards, tom lane


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: Command Triggers
Date: 2012-01-18 20:51:49
Message-ID: 201201182151.49439.andres@anarazel.de
Lists: pgsql-hackers

On Wednesday, January 18, 2012 08:31:49 PM Tom Lane wrote:
> Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr> writes:
> > We can easily enough copy the parse tree and do another round of parse
> > analysis on it only when some command triggers are going to get called.
> > Is the cost of doing so acceptable?
>
> It's not the costs I'm worried about so much as the side effects ---
> locks and so forth. Also, things like assignment of specific names
> for indexes and sequences seem rather problematic. In the worst case
> the trigger could run seeing "foo_bar_idx1" as the name of an index
> to be created, and then when the action actually happens, the name
> turns out to be "foo_bar_idx2" because someone else took the first name
> meanwhile.
You can't generally assume such a thing anyway. Remember there can be BEFORE
command triggers. It would be easy to create a conflict there.

The CREATE TABLE will trigger command triggers on CREATE SEQUENCE and ALTER
SEQUENCE while creating the table. If one actually wants to do anything about
those that would be the place.

> As I said, I think this suggests that you're trying to do the triggers
> in the wrong place.
In my opinion it mostly shows that parse analysis of utility statements is too
intermingled with other stuff... Not sure what to do about that.

Andres


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-01-18 21:23:14
Message-ID: m2pqegg0xp.fsf@2ndQuadrant.fr
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> Huh, isn't it simpler to just pass the triggers the parse tree *after*
> parse analysis? I don't understand what you're doing here.

I didn't realize that the parse analysis is in fact done from within
standard_ProcessUtility() directly, which means your suggestion is
indeed workable.

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> It's not the costs I'm worried about so much as the side effects ---

Ok, so I'm now calling the command trigger procedures once the parse
analysis is done, and guess what, I'm back to the same problem as
before:

https://github.com/dimitri/postgres/commit/4bfab6344a554c09f7322e861f9d09468f641bd9

CREATE TABLE public."ab_foo-bar"
(
    id serial NOT NULL,
    foo integer default 1,
    PRIMARY KEY(id)
);
NOTICE: CREATE TABLE will create implicit sequence "ab_foo-bar_id_seq" for serial column "ab_foo-bar.id"
NOTICE: snitch: CREATE SEQUENCE
ERROR: unrecognized node type: 904

I'm not sure about the next step, and I'm quite sure I need to stop here
for tonight. Any advice welcome; I'll be working on this again as soon
as tomorrow.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-01-20 03:00:15
Message-ID: CA+TgmoZyVyNLTdd=480EmMiXOKRBaA28XM6Aiw049SHKZyBPSQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 18, 2012 at 4:23 PM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
>> Huh, isn't it simpler to just pass the triggers the parse tree *after*
>> parse analysis?  I don't understand what you're doing here.
>
> I didn't realize that the parse analysis is in fact done from within
> standard_ProcessUtility() directly, which means your suggestion is
> indeed workable.
>
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>> It's not the costs I'm worried about so much as the side effects ---
>
> Ok, so I'm now calling the command trigger procedures once the parse
> analysis is done, and guess what, I'm back to the same problem as
> before:
>
>  https://github.com/dimitri/postgres/commit/4bfab6344a554c09f7322e861f9d09468f641bd9
>
>  CREATE TABLE public."ab_foo-bar"
>  (
>    id serial NOT NULL,
>    foo integer default 1,
>    PRIMARY KEY(id)
>  );
>  NOTICE:  CREATE TABLE will create implicit sequence "ab_foo-bar_id_seq" for serial column "ab_foo-bar.id"
>  NOTICE:  snitch: CREATE SEQUENCE
>  ERROR:  unrecognized node type: 904
>
> I'm not sure about the next step, and I'm quite sure I need to stop here
> for tonight. Any advice welcome, I'll be working on that again as soon
> as tomorrow.

My advice is to forget about trying to provide the command string to
the user for the first version of this patch. As you're finding out,
there's no simple, easy, obvious way of doing it, and there are N>0
useful things that can be done without that functionality. I continue
to think that for a first version of this, we should be satisfied to
pass just the OID. I know that's not really what you want, but
there's much to be said for picking a goal that is achievable in the
limited time available, and I fear that setting your sights too high
will lead to something that either (a) doesn't get committed, or (b)
gets committed, but turns out not to work very well, either of which
would be less than ideal.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-01-20 14:28:40
Message-ID: m2ipk6cusn.fsf@2ndQuadrant.fr
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> My advice is to forget about trying to provide the command string to
> the user for the first version of this patch. As you're finding out,
> there's no simple, easy, obvious way of doing it, and there are N>0
> useful things that can be done without that functionality.

Actually, none of my current examples and tests are relying on it. For
the trigger-based replication Jan said he would need the full parse
tree rather than just the command string anyway, so we're not losing
anything more than what we were already prepared to postpone.

> I continue
> to think that for a first version of this, we should be satisfied to
> pass just the OID. I know that's not really what you want, but
> there's much to be said for picking a goal that is achievable in the
> limited time available, and I fear that setting your sights too high
> will lead to something that either (a) doesn't get committed, or (b)
> gets committed, but turns out not to work very well, either of which
> would be less than ideal.

Agreed, mostly.

From the code I already have I think we should still pass in the command
tag, the schema name (or null) and the object name (or null) as input
parameters to the trigger's procedure.

I'm now working on that and then the concurrency angle of the command
triggers.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-01-20 15:03:53
Message-ID: CA+TgmoZSvowsLcyeYL3C8ma4sx4+aCvRmY73ZZ8QJf82bo0=QQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jan 20, 2012 at 9:28 AM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> My advice is to forget about trying to provide the command string to
>> the user for the first version of this patch.  As you're finding out,
>> there's no simple, easy, obvious way of doing it, and there are N>0
>> useful things that can be done without that functionality.
>
> Actually, none of my current examples and tests are relying on it. For
> the trigger-based replication Jan said he would need the full parse
> tree rather than just the command string anyway, so we're not losing
> anything more than what we were already prepared to postpone.

Cool.

>>  I continue
>> to think that for a first version of this, we should be satisfied to
>> pass just the OID.  I know that's not really what you want, but
>> there's much to be said for picking a goal that is achievable in the
>> limited time available, and I fear that setting your sights too high
>> will lead to something that either (a) doesn't get committed, or (b)
>> gets committed, but turns out not to work very well, either of which
>> would be less than ideal.
>
> Agreed, mostly.
>
> From the code I already have I think we should still pass in the command
> tag, the schema name (or null) and the object name (or null) as input
> parameters to the trigger's procedure.

I think the OID is better than the name, but if it's easy to pass the
name and schema, then I'm fine with it. But I do think this is one of
those problems that is complex enough that we should be happy to get
the core infrastructure in in the first commit, even with an extremely
limited amount of functionality, and then build on it.
Enhance-and-extend is so much easier than a monolithic code drop.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-01-20 17:14:47
Message-ID: m2ty3q9tyw.fsf@2ndQuadrant.fr
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> I think the OID is better than the name, but if it's easy to pass the
> name and schema, then I'm fine with it. But I do think this is one of

It's quite easy to get the name and schema from the command, yes; here's
an example of how I'm doing it for some commands:

    case T_CreateStmt:
    {
        CreateStmt *node = (CreateStmt *)parsetree;
        cmd->schemaname = RangeVarGetNamespace(node->relation);
        cmd->objectname = node->relation->relname;
        break;
    }

    case T_AlterTableStmt:
    {
        AlterTableStmt *node = (AlterTableStmt *)parsetree;
        cmd->schemaname = RangeVarGetNamespace(node->relation);
        cmd->objectname = node->relation->relname;
        break;
    }

    case T_CreateExtensionStmt:
    {
        cmd->schemaname = NULL;
        cmd->objectname = ((CreateExtensionStmt *)parsetree)->extname;
        break;
    }

Getting the OID, on the other hand, is much harder, because each command
implements that part as it wants to, and the DefineWhatever() functions
return void. We could maybe have them return the main OID of the object
created, or we could have the CommandContext structure I'm using be a
backend global variable that any command would fill in.
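
As a purely illustrative sketch of that second option: the CommandContext name
comes from the paragraph above, but the field names and the helper function
below are made up for the example.

#include "postgres.h"

typedef struct CommandContextData
{
    char       *tag;            /* command tag, e.g. "CREATE TABLE" */
    char       *schemaname;     /* schema name, or NULL */
    char       *objectname;     /* object name, or NULL */
    Oid         objectId;       /* filled in by the command itself */
} CommandContextData;

/* one global slot, reset by ProcessUtility() before running each command */
static CommandContextData currentCommandContext;

/* each DefineWhatever() would then report the OID it just created */
static void
report_created_object(Oid objectId)
{
    currentCommandContext.objectId = objectId;
}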

In any case, adding support for the OID only works for AFTER triggers
when talking about CREATE commands and for BEFORE triggers when talking
about DROP commands, assuming that the trigger procedure is run after
we've located said OID.

It seems to me to be a couple of orders of magnitude more work to get
the OID here compared to just getting the schema name and object name.
And getting those works in all cases, as we take them from the command
itself (we fill in the schema with the first search_path entry when it's
not given explicitly in the command).

CREATE TABLE foo();
NOTICE: tag: CREATE TABLE
NOTICE: enforce_local_style{public.foo}: foo

> those problems that is complex enough that we should be happy to get
> the core infrastructure in in the first commit, even with an extremely
> limited amount of functionality, and then build on it.
> Enhance-and-extend is so much easier than a monolithic code drop.

I'm happy to read that, and I'm preparing next patch version (v6) with
that goal in mind.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-01-20 18:14:20
Message-ID: CA+TgmoZyS18rYFxaAjOVRLtdEe=oTNKoG2nFu_d6UzE-zyVvKw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jan 20, 2012 at 12:14 PM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> I think the OID is better than the name, but if it's easy to pass the
>> name and schema, then I'm fine with it.  But I do think this is one of
>
> It's quite easy to get name and schema from the command yes, here's an
> example of how I'm doing it for some commands:
>
>                case T_CreateStmt:
>                {
>                        CreateStmt *node = (CreateStmt *)parsetree;
>                        cmd->schemaname = RangeVarGetNamespace(node->relation);
>                        cmd->objectname = node->relation->relname;
>                        break;
>                }
>
>                case T_AlterTableStmt:
>                {
>                        AlterTableStmt *node = (AlterTableStmt *)parsetree;
>                        cmd->schemaname = RangeVarGetNamespace(node->relation);
>                        cmd->objectname = node->relation->relname;
>                        break;
>                }
>
>                case T_CreateExtensionStmt:
>                {
>                        cmd->schemaname = NULL;
>                        cmd->objectname = ((CreateExtensionStmt *)parsetree)->extname;
>                        break;
>                }
>
> Getting the OID on the other hand is much harder, because each command
> implements that part as it wants to, and DefineWhatever() functions are
> returning void. We could maybe have them return the main Oid of the
> object created, or we could have the CommandContext structure I'm using
> be a backend global variable that any command would stuff.
>
> In any case, adding support for the OID only works for after trigger
> when talking about CREATE commands and for before trigger if talking
> about DROP commands, assuming that the trigger procedure is run after
> we've been locating said Oid.
>
> It seems to me to be a couple orders of magnitude more work to get the
> Oid here compared to just get the schemaname and objectname. And getting
> those works in all cases as we take them from the command itself (we
> fill in the schema with the first search_path entry when it's not given
> explicitly in the command)
>
>  CREATE TABLE foo();
>  NOTICE:  tag:  CREATE TABLE
>  NOTICE:  enforce_local_style{public.foo}: foo

Hmm, OK. But what happens if the user doesn't specify a schema name explicitly?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-01-20 19:27:50
Message-ID: 1699.1327087670@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Hmm, OK. But what happens if the user doesn't specify a schema name explicitly?

For the creation case, RangeVarGetCreationNamespace should handle that.
The code Dimitri quoted is wrong, but not that hard to fix.
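
For the CREATE case the fix would look roughly like this, reusing the
cmd fields from Dimitri's snippet; this is only a sketch, but
RangeVarGetCreationNamespace() resolves the target schema the same way
the command itself will:

    case T_CreateStmt:
    {
        CreateStmt *node = (CreateStmt *)parsetree;

        /* resolve the creation namespace instead of trusting the RangeVar */
        Oid         nspid = RangeVarGetCreationNamespace(node->relation);

        cmd->schemaname = get_namespace_name(nspid);
        cmd->objectname = node->relation->relname;
        break;
    }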

Unfortunately, the code he quoted for the ALTER case is also wrong,
and harder to fix. Until you've done the lookup you don't know which
schema the referenced object is in. And I don't care for the idea of
doing the lookup twice, as (a) it'll be slower and (b) there are race
cases where it will give the wrong answer, ie return an object other
than the one the ALTER eventually acts on.

Really I think there is not any single point where you can put the
command-trigger hook and be done. In almost every case, the right
place is going to be buried somewhere within the execution of the
command.

regards, tom lane


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-01-20 19:59:59
Message-ID: m28vl29mbk.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> For the creation case, RangeVarGetCreationNamespace should handle that.
> The code Dimitri quoted is wrong, but not that hard to fix.

Ok.

> Unfortunately, the code he quoted for the ALTER case is also wrong,
> and harder to fix. Until you've done the lookup you don't know which
> schema the referenced object is in. And I don't care for the idea of
> doing the lookup twice, as (a) it'll be slower and (b) there are race
> cases where it will give the wrong answer, ie return an object other
> than the one the ALTER eventually acts on.

Oh. Yes.

> Really I think there is not any single point where you can put the
> command-trigger hook and be done. In almost every case, the right
> place is going to be buried somewhere within the execution of the
> command.

I'm finally realizing it. I already introduced a structure called
CommandContextData to keep track of the current command elements we want
to pass down to command triggers; I think we're saying that this should
be a static variable that the commands themselves will need to fill in.

The main problem with that approach is that we will need to spread calls
to the command trigger entry points into every supported command, which
is what I wanted to avoid at first. It's no big deal though, as the API
is simple enough.

Expect a new patch made this way early next week.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-01-26 22:00:39
Message-ID: m21uqmunso.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr> writes:
>> Really I think there is not any single point where you can put the
>> command-trigger hook and be done. In almost every case, the right
>> place is going to be buried somewhere within the execution of the
>> command.
>
> I'm finally realizing it. I already introduced a structure called
> CommandContextData to keep track of the current command elements we want
> to pass down to command triggers, I think we're saying that this should
> be a static variable that commands will need to be editing.

In fact, passing it as an argument to the command trigger API is much
simpler, and that's what the attached does. I'm having problems with my
install here and not enough time this week (you don't speak English
unless you use understatements here and there, right?), so please expect
a patch that is one step further along and shows the integration
concept, not one that is ready for commit just yet.

Next steps are about adding support for more commands, and now that we
have settled on a simpler integration that will be easy. The parameters
sent to the trigger procedure are now the command tag, the main object
Oid, its schema name and its object name. Only the command tag is
guaranteed to be non-NULL; the other parameters can be left out when
calling an ANY COMMAND trigger, or for a command on a schemaless object.

Note: the per-command integration means the Oid is generally available,
so let's just export it.
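
For reference, the structure could hold roughly the following; the field
names below are an approximation of what the patch uses, not necessarily
the final ones:

    typedef struct CommandContextData
    {
        char       *tag;            /* command tag, e.g. "CREATE TABLE"; never NULL */
        Oid         objectId;       /* main object Oid, or InvalidOid */
        char       *schemaname;     /* schema name, or NULL */
        char       *objectname;     /* object name, or NULL */
    } CommandContextData;

    typedef struct CommandContextData *CommandContext;

Each command's call site then only has to fill in whichever fields apply
to the object it is working on.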

An ack on the way it's now implemented would be awesome, and then we
could start discussing exactly which set of commands we want supported
from the get-go (defaulting to all of them of course, but I don't think
that's necessarily the best use of our time given the size of our
command list).

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

Attachment Content-Type Size
command-trigger.v6.patch.gz application/octet-stream 31.4 KB

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-07 12:51:33
Message-ID: CA+U5nMKSsByGBtj1dAXn5X-+oY77gDgdxTquJ0tB6xt+=OSWUQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 26, 2012 at 10:00 PM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
> Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr> writes:
>>> Really I think there is not any single point where you can put the
>>> command-trigger hook and be done.  In almost every case, the right
>>> place is going to be buried somewhere within the execution of the
>>> command.
>>
>> I'm finally realizing it. I already introduced a structure called
>> CommandContextData to keep track of the current command elements we want
>> to pass down to command triggers, I think we're saying that this should
>> be a static variable that commands will need to be editing.
>
> In fact passing it as an argument to the command trigger API is much
> simpler and done in the attached. I'm having problems here with my
> install and not enough time this week (you don't speak English if you
> don't use understatements here and there, right?) so please expect a
> one-step-further patch to show the integration concept, not a ready for
> commit one just yet.
>
> Next steps are about adding support for more commands, and now that we
> have settled on a simpler integration that will be easy. The parameters
> sent to the trigger procedure are now the command tag, the main object
> Oid, its schema name and its object name. Only the command tag will
> never be NULL, all the other columns could be left out when calling an
> ANY COMMAND trigger, or a command on a schemaless object.
>
> Note: the per-command integration means the Oid is generally available,
> so let's just export it.
>
> An ack about the way it's now implemented would be awesome, and we could
> begin to start about which set of command exactly we want supported from
> the get go (default to all of them of course, but well, I don't think
> that's necessarily the best use of our time given our command list).

Looks good so far.

A user centric review of this patch...

I would like a way to say ALL COMMANDS rather than writing out the list
of supported commands. That list is presumably subject to change, so not
having this could result in bugs in later code.

The example given in the docs has no explanation. More examples and
explanation please. Would specifically be interested in a command
trigger which simply prevents all commands, which has many uses and is
simple enough to be in docs.

Why do we exclude SEQUENCEs? That could well be a problem for intended users.

Please explain/add to docs/comments why ALTER VIEW is not supported.
Why not other commands?
Not a problem, but users will ask, so we should write down the answer.

Please document why pg_cmdtrigger is a separate table and not an
enhancement of pg_trigger.

Thanks

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-14 21:29:05
Message-ID: m2aa4l5cke.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>> An ack about the way it's now implemented would be awesome

I'm still missing that, which is only fair; consider this just a ping
from me. The question is where to hook in the command trigger calls and
which parameters to call the procedures with. I've now done it the way
Tom hinted at, so I guess it's ok.

> Would like a way to say ALL COMMANDS rather than write out list of
> supported commands. That list is presumably subject to change, so not
> having this could result in bugs in later code.

That was already included in the patch, but the docs were not updated; they are now.

> The example given in the docs has no explanation. More examples and
> explanation please. Would specifically be interested in a command
> trigger which simply prevents all commands, which has many uses and is
> simple enough to be in docs.

Done. The docs still need some more love before commit; I'm planning to
do that tomorrow and/or over the next few days.

> Why do we exclude SEQUENCEs? That could well be a problem for intended users
>
> Please explain/add to docs/comments why ALTER VIEW is not supported?
> Why not other commands??
> Not a problem, but users will ask, so we should write down the answer

I've added support for many more commands (49 of them or so, including
SEQUENCEs), and will continue doing so tomorrow. I think we should be
able to support any and all commands that make sense for 9.2.

I still have some basic CREATE commands to add support for, then CLUSTER
and VACUUM and REINDEX, then it's down to the ALTER commands, where only
ALTER TABLE is currently finished. That's almost mechanical editing
here; the brain only gets used to pick the right place to hook in.

The other change I need to make is support for regression tests. I guess
we will want one test per supported command. The tests will need to be
in their own file and excluded from the parallel schedule too, or we
would be changing the behavior of the other tests running alongside them
and couldn't compare the outputs anymore.

> Please document why pg_cmdtriggers is a separate table and not
> enhancement of pg_triggers?

Because pg_trigger is made for triggers attached to a relation, as we
can see from its unique index:

"pg_trigger_tgrelid_tgname_index" UNIQUE, btree (tgrelid, tgname)

A command trigger is attached to a command rather than a relation, so it
needs to live in its own place:

"pg_cmdtrigger_ctgcommand_ctgname_index" UNIQUE, btree (ctgcommand, ctgname)

I'm not sure how and where to include that in the docs; any advice would
be appreciated.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

Attachment Content-Type Size
command-trigger.v7.patch.gz application/octet-stream 41.2 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-15 16:36:46
Message-ID: CA+TgmobNK5FJAZTfPzb2nn=SGrneCtf=rUseY9_fakBP_NwaeA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 14, 2012 at 4:29 PM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
>>> An ack about the way it's now implemented would be awesome
> I'm still missing that, which is only fair, just a ping from me here.

I took a brief look at this just now, and in general I'm pleasantly
surprised. But, as you might imagine, I have some reservations:

1. I fear that the collection of commands to which this applies is
going to end up being kind of a random selection. I suggest that for
the first version of the patch, just pick a couple of simple commands
like VACUUM and ANALYZE, and do just those. We can add on more later
easily enough once the basic patch is committed. Also, that way, if
you don't get to absolutely everything, we'll have a coherent subset
of the functionality, rather than a subset defined by "what Dimitri
got to".

2. Currently, we have some objects (views) that support INSTEAD OF
triggers and others (tables) that support BEFORE and AFTER triggers.
I don't see any advantage in supporting both.

3. I am not at all convinced that it's safe to allow command triggers
to silently (i.e. without throwing an error) cancel the operation. I
don't have very much confidence that that is in general a safe thing
to do; there may be code calling this code that expects that such
things will not happen. This diff hunk, for example, scares the crap
out of me:

- /* check that the locales can be loaded */
- CommandCounterIncrement();
- (void) pg_newlocale_from_collation(newoid);
+ /* before or instead of command trigger might have cancelled the command */
+ if (OidIsValid(newoid))
+ {
+ /* check that the locales can be loaded */
+ CommandCounterIncrement();
+ (void) pg_newlocale_from_collation(newoid);
+ }

I don't immediately understand why that's necessary, but I'm sure
there's a good reason, and I bet a nickel that there are other places
where similar adjustments are necessary but you haven't found them. I
think we should rip out the option to silently cancel the command. We
can always add that later, but if we add it here and in the first
commit I think we are going to be chasing bugs for months.

4. This API forces all the work of setting up the CommandContext to be
done regardless of whether any command triggers are present:

+ cmd->objectId = InvalidOid;
+ cmd->objectname = (char *)aggName;
+ cmd->schemaname = get_namespace_name(aggNamespace);

I'll grant you that most people probably do not execute enough DDL for
the cost of those extra get_namespace_name() calls to add up, but I'm
not completely sure it's a negligible overhead in general, and in any
case I think it's a safe bet that there will be continuing demand to
add more information to the set of things that are supplied to the
trigger. So I think that should be rethought. It'd be nice to have a
cheap way to say if (AnyCommandTriggersFor(command_tag)) { ... do
stuff ... }, emphasis on cheap. There's a definitional problem here,
too, which is that you're supplying to the trigger the aggregate name
*as supplied by the user* and the schema name as it exists at the time
we do the reverse lookup. That reverse lookup isn't guaranteed to
work at all, and it's definitely not guaranteed to produce an answer
that's consistent with the aggName field. Maybe there's no better
way, but it would at least be better to pass the namespace OID rather
than the name. That way, the name lookup can be deferred until we are
sure that we actually need to call something.
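
To make that concrete, the shape I'm imagining is roughly this, where
the cheap existence test and the trigger-firing call are hypothetical
names rather than anything the patch currently has:

    /* sketch only: CommandTriggersExistFor() and ExecBeforeCommandTriggers()
     * stand in for a cheap existence test and the actual trigger call */
    if (CommandTriggersExistFor(commandTag))
    {
        cmd->objectId = InvalidOid;
        cmd->objectname = (char *) aggName;
        cmd->schemaname = get_namespace_name(aggNamespace); /* deferred lookup */
        ExecBeforeCommandTriggers(cmd);
    }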

5. I'm not entirely convinced that it makes sense to support command
triggers on commands that affect shared objects. It seems odd that
such triggers will fire or not fire depending on which database is
currently selected. I think that could lead to user confusion, too.

6. Why do we need all this extra
copyfuncs/equalfuncs/outfuncs/readfuncs support? I thought we were
dropping passing node trees for 9.2.

7. I don't have a strong feeling on what the psql command should be
called, but \dcT seems odd. Why one lower-case letter and one
upper-case letter?

In general, I like the direction that this is going. But I think it
will greatly improve its chances of being successful and relatively
non-buggy if we strip it down to something very simple for an initial
commit, and then add more stuff piece by piece.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-15 20:25:08
Message-ID: m2obszzvx7.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Thanks for your time reviewing that patch!

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> I took a brief look at this just now, and in general I'm pleasantly
> surprised. But, as you might imagine, I have some reservations:

Good starting point, let's see about the details :)

> 1. I fear that the collection of commands to which this applies is
> going to end up being kind of a random selection. I suggest that for
> the first version of the patch, just pick a couple of simple commands
> like VACUUM and ANALYZE, and do just those. We can add on more later
> easily enough once the basic patch is committed. Also, that way, if
> you don't get to absolutely everything, we'll have a coherent subset
> of the functionality, rather than a subset defined by "what Dimitri
> got to".

I share that feeling here. Now, given the current scope of the patch, I
think we can add in 9.2 all the commands that make sense to support
(yes, including ALTER OPERATOR FAMILY). FWIW, I've been adding support
for 16 forms of ALTER commands today, in triplicate (the implementations
of the ALTER rename, owner and namespace variants are separate).

So while your reaction is perfectly understandable, I don't think that's
the main issue here; you've just happened to see an intermediate state
of things.

> 2. Currently, we have some objects (views) that support INSTEAD OF
> triggers and others (tables) that support BEFORE and AFTER triggers.
> I don't see any advantage in supporting both.

That's because you can cancel an INSERT from a BEFORE trigger, which is
how you can effectively write an INSTEAD OF trigger on a TABLE. As a
VIEW does not support INSERT (directly), an INSTEAD OF trigger is what
makes sense there.

I don't think it's possible to transpose that thinking to command
triggers, so I've been providing for either INSTEAD OF triggers or
BEFORE/AFTER triggers. Note that you can't have both with my current
patch. It's a convenience-only thing, but I think it's worth having: you
don't need to read the trigger procedure to decide whether that BEFORE
command trigger is in fact an INSTEAD OF command trigger.

Also, the code footprint for this feature is very small.

> 3. I am not at all convinced that it's safe to allow command triggers
> to silently (i.e. without throwing an error) cancel the operation. I
> don't have very much confidence that that is in general a safe thing

I was trying to offer a parallel with returning NULL from a BEFORE
INSERT trigger, which allows you to cancel it. If that turns out to be
so bad an idea that we need to withdraw it, so be it.

It's still possible to cancel a command by means of RAISE EXCEPTION in a
BEFORE command trigger, or by means of an INSTEAD OF trigger that does
nothing.

> to do; there may be code calling this code that expects that such
> things will not happen. This diff hunk, for example, scares the crap
> out of me:
[...]
> I don't immediately understand why that's necessary, but I'm sure
> there's a good reason, and I bet a nickel that there are other places
> where similar adjustments are necessary but you haven't found them. I

Might be. That idea seemed sound to me when I was under the illusion
that I wouldn't have to edit each and every command implementation in
order to implement command triggers. I'm ok to step back now, but read on.

> think we should rip out the option to silently cancel the command. We
> can always add that later, but if we add it here and in the first
> commit I think we are going to be chasing bugs for months.

Ok, I'm going to remove that from the BEFORE command implementation.

What about having both INSTEAD OF triggers (the command is not executed)
and BEFORE triggers (you can cancel execution only by raising an
exception, which cancels the whole transaction)? I'd like us to be able
to keep that feature.

> 4. This API forces all the work of setting up the CommandContext to be
> done regardless of whether any command triggers are present:
>
> + cmd->objectId = InvalidOid;
> + cmd->objectname = (char *)aggName;
> + cmd->schemaname = get_namespace_name(aggNamespace);

At some point while developing the patch I had something way smarter
than that, and centralized. I've not revised the API to get back that
smartness when breaking out of the centralized implementation, because
those elements can only be set from the innards of each command.

The API to list which triggers to run already exists and only needs the
command tag, which we might have earlier in the processing. Note that we
have to deal with commands acting on more than one object, as in
RemoveObjects(), where we loop over a list and call the triggers each time:

https://github.com/dimitri/postgres/compare/master...command_triggers#diff-19

The best I can think of would be to build the list of triggers to call
as soon as we have the command tag, and skip preparing the rest of the
CommandContextData structure when that list comes up empty.

Still no general stopgap, but I think that would address your point.

Now, about INSTEAD OF command triggers, we still need to implement them
in exactly the same spot as BEFORE command triggers, and they're not
making things any more complex in the per-command support code.

> I'll grant you that most people probably do not execute enough DDL for
> the cost of those extra get_namespace_name() calls to add up, but I'm
> not completely sure it's a negligible overhead in general, and in any
> case I think it's a safe bet that there will be continuing demand to
> add more information to the set of things that are supplied to the
> trigger. So I think that should be rethought. It'd be nice to have a
> cheap way to say if (AnyCommandTriggersFor(command_tag)) { ... do
> stuff ... }, emphasis on cheap. There's a definitional problem here,

It's as cheap as scanning a catalog, as of now. I didn't install a new
catalog cache for command triggers, as I don't think this code path is
performance critical enough to pay back the maintenance cost. Also
consider that a big user of command triggers might have as many as a
couple of triggers (BEFORE/AFTER) per command; that's about 300 of them.

If they go crazy and have more than that, with special conditions each
time, let's say that's 1000 rows in the catalog. I don't think that kind
of size and use case (DDL) calls for a catalog cache here, nor for
micro-optimizing things.

> too, which is that you're supplying to the trigger the aggregate name
> *as supplied by the user* and the schema name as it exists at the time
> we do the reverse lookup. That reverse lookup isn't guaranteed to
> work at all, and it's definitely not guaranteed to produce an answer
> that's consistent with the aggName field. Maybe there's no better
> way, but it would at least be better to pass the namespace OID rather
> than the name. That way, the name lookup can be deferred until we are
> sure that we actually need to call something.

I'm confused here, because all the error messages that need to contain
the namespace are doing exactly the same thing as I'm doing in my patch.

> 5. I'm not entirely convinced that it makes sense to support command
> triggers on commands that affect shared objects. It seems odd that
> such triggers will fire or not fire depending on which database is
> currently selected. I think that could lead to user confusion, too.

You mean tablespaces here, I guess; what else? I don't think I've added
other shared objects in there yet. I share your analysis, BTW, and will
remove that support.

> 6. Why do we need all this extra
> copyfuncs/equalfuncs/outfuncs/readfuncs support? I thought we were
> dropping passing node trees for 9.2.

I haven't cleaned that out yet; that's the only reason it's still there.
I would like this work not to have been useless :)

> 7. I don't have a strong feeling on what the psql command should be
> called, but \dcT seems odd. Why one lower-case letter and one
> upper-case letter?

Well, \dt and \dT are already taken, so that's how I got there. I will
change it to something else; \dct would be your choice here?

> In general, I like the direction that this is going. But I think it
> will greatly improve its chances of being successful and relatively
> non-buggy if we strip it down to something very simple for an initial
> commit, and then add more stuff piece by piece.

I had the feeling we did exactly that when we reduced the API so that it
no longer provides the parse tree nor the (rewritten) command string. I
can see more features to drop, like the ability to silently cancel a
statement in a BEFORE command trigger.

I'd like to keep the INSTEAD OF command trigger feature, and I think we
can support lots of commands as soon as the integration becomes simple
enough, which is where we stand once we remove the statement cancelling.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-15 21:11:43
Message-ID: CA+TgmoaUrwgUOTXtbJ9C1AR9MPaVPAOoYXmQw5g5-8spyqiajQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 15, 2012 at 3:25 PM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
>> 1. I fear that the collection of commands to which this applies is
>> going to end up being kind of a random selection.  I suggest that for
>> the first version of the patch, just pick a couple of simple commands
>> like VACUUM and ANALYZE, and do just those.  We can add on more later
>> easily enough once the basic patch is committed.  Also, that way, if
>> you don't get to absolutely everything, we'll have a coherent subset
>> of the functionality, rather than a subset defined by "what Dimitri
>> got to".
>
> I share that feeling here. Now, given the current scope of the patch, I
> think we can add in 9.2 all the commands that make sense to support
> (yes, including alter operator family). FWIW, I've been adding support
> for 16 forms of ALTER commands today, triple (implementation of alter
> rename, owner and namespace are separated).
>
> So while your reaction is perfectly understandable, I don't think that's
> the main thing here, you've just happened to see an intermediate state
> of things.

I'm just saying that nobody's realistically going to be able to verify
a patch of this size. It's either going to get committed warts and
all, or it's going to not get committed. Decomposing it into a series
of patches would make it possible to actually verify the logic. I
guess on reflection I don't really care whether you decompose it at
this point; the parts are pretty independent and it's easy enough to
revert pieces of it. But if I commit any of this it's certainly not
going to be the whole thing in one go.

> It's still possible to cancel a command by means of RAISE EXCEPTION in a
> BEFORE command trigger, or by means of an INSTEAD OF trigger that does
> nothing.

I think that there is no problem with cancelling a command via RAISE
EXCEPTION. It's an established precedent that errors can be thrown
anywhere, and any code that doesn't deal with that is flat broken.
But I think letting either a BEFORE or INSTEAD trigger cancel the
command is going to break things, and shouldn't be allowed without a
lot of careful study. So -1 from me on supporting INSTEAD triggers in
the first version of this.

>> I'll grant you that most people probably do not execute enough DDL for
>> the cost of those extra get_namespace_name() calls to add up, but I'm
>> not completely sure it's a negligible overhead in general, and in any
>> case I think it's a safe bet that there will be continuing demand to
>> add more information to the set of things that are supplied to the
>> trigger.  So I think that should be rethought.  It'd be nice to have a
>> cheap way to say if (AnyCommandTriggersFor(command_tag)) { ... do
>> stuff ... }, emphasis on cheap.  There's a definitional problem here,
>
> It's as cheap as scanning a catalog, as of now. I didn't install a new
> catalog cache for command triggers, as I don't think this code path is
> performance critical enough to pay back for the maintenance cost. Also
> consider that a big user of command triggers might have as much as a
> couple of triggers (BEFORE/AFTER) per command, that's about 300 of them.

Yowza. A catalog scan is WAY more expensive than a syscache lookup.
I definitely don't think you can afford to have every command result
in an extra index probe into pg_cmdtrigger. You definitely need some
kind of caching there.

Or at least, I think you do. You could try pgbench -f foo.sql, where
foo.sql repeatedly creates and drops a function. See if there's a
significant slowdown with your patch vs. HEAD. If there is, you need
some caching. You might actually need some whole new type of sinval
message to make this work efficiently.

>> too, which is that you're supplying to the trigger the aggregate name
>> *as supplied by the user* and the schema name as it exists at the time
>> we do the reverse lookup.  That reverse lookup isn't guaranteed to
>> work at all, and it's definitely not guaranteed to produce an answer
>> that's consistent with the aggName field.  Maybe there's no better
>> way, but it would at least be better to pass the namespace OID rather
>> than the name.  That way, the name lookup can be deferred until we are
>> sure that we actually need to call something.
>
> I'm confused here, because all error messages that needs to contain the
> namespace are doing exactly the same thing as I'm doing in my patch.

Hmm. I wonder what happens if those errors fire after the schema has
been dropped? I suppose the real answer here is probably to add
enough locking that that can't happen in the first place... so maybe
this isn't an issue for your patch to worry about.

>> 5. I'm not entirely convinced that it makes sense to support command
>> triggers on commands that affect shared objects.  It seems odd that
>> such triggers will fire or not fire depending on which database is
>> currently selected.  I think that could lead to user confusion, too.
>
> You mean tablespace here, I guess, what else? I don't think I've added
> other shared objects in there yet. I share your analyze btw, will
> remove support.

And databases.

>> 7. I don't have a strong feeling on what the psql command should be
>> called, but \dcT seems odd.  Why one lower-case letter and one
>> upper-case letter?
>
> Well \dt and \dT are already taken, so I got there. Will change to
> something else, e.g. \dct would be your choice here?

Yeah, probably.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-15 21:32:33
Message-ID: m2r4xvye8e.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> I'm just saying that nobody's realistically going to be able to verify
> a patch of this size. It's either going to get committed warts and
> all, or it's going to not get committed. Decomposing it into a series
> of patches would make it possible to actually verify the logic. I
> guess on reflection I don't really care whether you decompose it at
> this point; the parts are pretty independent and it's easy enough to
> revert pieces of it. But if I commit any of this it's certainly not
> going to be the whole thing in one go.

Ok, I can perfectly understand that. The principled implementation is
not saving us here; we still need to review each call site. The way I
read your comment, I should continue working on my big patch and we'll
see which pieces get in, right?

> I think that there is no problem with cancelling a command via RAISE
> EXCEPTION. It's an established precedent that errors can be thrown
> anywhere, and any code that doesn't deal with that is flat broken.

Sure.

> But I think letting either a BEFORE or INSTEAD trigger cancel the
> command is going to break things, and shouldn't be allowed without a
> lot of careful study. So -1 from me on supporting INSTEAD triggers in
> the first version of this.

I'm sad about that, but I hear you. Removing.

> Yowza. A catalog scan is WAY more expensive than a syscache lookup.
> I definitely don't think you can afford to have every command result
> in an extra index probe into pg_cmdtrigger. You definitely need some
> kind of caching there.
>
> Or at least, I think you do. You could try pgbench -f foo.sql, where
> foo.sql repeatedly creates and drops a function. See if there's a
> significant slowdown with your patch vs. HEAD. If there is, you need
> some caching. You might actually need some whole new type of sinval
> message to make this work efficiently.

Ok, I will test that down the road (before the end of this week).

>> I'm confused here, because all error messages that needs to contain the
>> namespace are doing exactly the same thing as I'm doing in my patch.
>
> Hmm. I wonder what happens if those errors fire after the schema has
> been dropped? I suppose the real answer here is probably to add
> enough locking that that can't happen in the first place... so maybe
> this isn't an issue for your patch to worry about.

I take it that I should continue doing things this way;
get_namespace_name() and friends are trustworthy as far as I'm concerned.

>> You mean tablespace here, I guess, what else? I don't think I've added
>> other shared objects in there yet. I share your analyze btw, will
>> remove support.
>
> And databases.

Right, on their way.

>> something else, e.g. \dct would be your choice here?
>
> Yeah, probably.

The next version will sport that.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-15 22:21:42
Message-ID: CA+TgmoZyrZLFvhFeD8YtG3k+A18x-CyTVNz=4q416PByNCnwHA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 15, 2012 at 4:32 PM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
> Ok, I can perfectly understand that.  The principled implementation is
> not saving us here, we still need to review each call site.  The way I
> read your comment, I continue working on my big patch and we'll see what
> pieces get in, right?

Yeah, I think that makes sense. I'll have a look at your next version
when it shows up.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-16 17:42:26
Message-ID: m21upuwu7x.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Please find attached version 8 of the patch, which fixes most of your
complaints. Remaining tasks for me here: completing support for all the
interesting commands, and improving the docs with a section about command
triggers in the general trigger discussion in doc/src/sgml/trigger.sgml.

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> 1. I fear that the collection of commands to which this applies is
> going to end up being kind of a random selection. I suggest that for

I didn't change that (except for removing the shared objects), and I
still think it's not too much work to add all the objects we care to
offer command triggers for in this release.

> 2. Currently, we have some objects (views) that support INSTEAD OF
> triggers and others (tables) that support BEFORE and AFTER triggers.
> I don't see any advantage in supporting both.

INSTEAD OF command triggers are now gone.

> 3. I am not at all convinced that it's safe to allow command triggers
> to silently (i.e. without throwing an error) cancel the operation.

That's gone too.

> 4. This API forces all the work of setting up the CommandContext to be
> done regardless of whether any command triggers are present:

Fixed, I've spent quite some time refining the API.

> trigger. So I think that should be rethought. It'd be nice to have a
> cheap way to say if (AnyCommandTriggersFor(command_tag)) { ... do
> stuff ... }, emphasis on cheap. There's a definitional problem here,

The CommandContext initialization now does the index scan first, and the
command context properties are only filled in when we actually want to
run some user functions. Context filling for both BEFORE and AFTER
command triggers is now guarded that way.

> 5. I'm not entirely convinced that it makes sense to support command
> triggers on commands that affect shared objects. It seems odd that
> such triggers will fire or not fire depending on which database is
> currently selected. I think that could lead to user confusion, too.

Agreed, they're now gone.

> 6. Why do we need all this extra
> copyfuncs/equalfuncs/outfuncs/readfuncs support? I thought we were
> dropping passing node trees for 9.2.

I still have to clean that up, and as it's not as simple as reverting
every change I've made in those files, allow me to defer that to another
version of the patch.

> 7. I don't have a strong feeling on what the psql command should be
> called, but \dcT seems odd. Why one lower-case letter and one
> upper-case letter?

The psql command is now \dct.

> In general, I like the direction that this is going. But I think it
> will greatly improve its chances of being successful and relatively
> non-buggy if we strip it down to something very simple for an initial
> commit, and then add more stuff piece by piece.

Having now removed the ability to cancel a command without raising an
error, and after some API cleaning, I think it's much easier to review
the patch and grow confidence that it is sane.

Also, I've added regression tests. They are only exercised by the
serial schedule (make installcheck) because they would randomly ruin the
output of the parallel schedule and make it impossible for pg_regress to
do its job here.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

Attachment Content-Type Size
command-trigger.v8.patch.gz application/octet-stream 53.2 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-16 20:53:53
Message-ID: CA+TgmoYi7K-wnYmLRQC+zqnA7p-4zo9=Xqi7R2xb=FtH9=szxw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 16, 2012 at 12:42 PM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
> Fixed, I've spent quite some time refining the API.

That is better.

What could still be done here is to create a cache (along the lines of
attoptcache.c) that stores a big array with one slot for each
supported command type. The first time a command is executed in a
particular session, we construct a cache entry for that command type,
storing the OIDs of the before and after triggers. Then, instead of
having to do a catalog scan to notice that there are no triggers for a
given command type, we can just reference into the array and see, oh,
we probed before and found no triggers. CacheRegisterSyscacheCallback
or some other mechanism can be used to flush all the cached
information whenever we notice that pg_cmdtrigger has been updated.
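
As a sketch, with every name below being a placeholder rather than
anything the patch has to use:

    /* one slot per supported command tag; all names here are placeholders */
    #define N_SUPPORTED_COMMAND_TAGS   64   /* sized for whatever the patch supports */

    typedef struct CmdTriggerCacheEntry
    {
        bool        valid;          /* have we probed pg_cmdtrigger for this tag? */
        List       *before_procs;   /* OIDs of BEFORE trigger procedures */
        List       *after_procs;    /* OIDs of AFTER trigger procedures */
    } CmdTriggerCacheEntry;

    static CmdTriggerCacheEntry cmd_trigger_cache[N_SUPPORTED_COMMAND_TAGS];

    /* invalidation callback: wipe the whole cache when pg_cmdtrigger changes */
    static void
    cmd_trigger_cache_callback(Datum arg, int cacheid, uint32 hashvalue)
    {
        memset(cmd_trigger_cache, 0, sizeof(cmd_trigger_cache));
    }

The first probe for a given tag would fill its slot from pg_cmdtrigger,
and the callback would be registered once per session via
CacheRegisterSyscacheCallback() or whatever invalidation mechanism turns
out to fit; none of this is meant as more than a starting point.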

Now, I don't know for sure that this is necessary without performance
testing, but I think we ought to try to figure that out. This version
doesn't apply cleanly for me; there is a conflict in functioncmds.c,
which precludes my easily benchmarking it.

It strikes me that this patch introduces a possible security hole: it
allows command triggers to be installed by the database owner, but that
seems like it might allow the database owner to usurp the privileges of
any user who runs one of these commands in his or her database, including
the superuser. Perhaps that could be fixed by running command triggers as
the person who created them rather than as the superuser, but that seems
like it might lead to other kinds of privilege-escalation bugs.

If I install a command trigger that prevents all DDL, how do I
uninstall it? Or say I'm the superuser and I want to undo something
my disgruntled DBA did before he quit.

I would much prefer to have DropCmdTrigStmt wedge itself into the
existing DropStmt infrastructure which I just recently worked so hard
on. If you do that, you should find that you can then easily also
support comments on command triggers, security labels on command
triggers (though I don't know if that's useful), and the ability to
include command triggers in extensions.

I am a bit worried about supporting command triggers on statements
that do internal transaction control, such as VACUUM and CREATE INDEX
CONCURRENTLY. Obviously that's very useful and I'd like to have it,
but there's a problem: if the AFTER trigger errors out, it won't undo
the whole command. That might be very surprising. BEFORE triggers
seem OK, and AFTER triggers might be OK too but we at least need to
think hard about how to document that.

I think it would be better to bail on trying to use CREATE TRIGGER and
DROP TRIGGER as a basis for this functionality, and instead create
completely new toplevel statements CREATE COMMAND TRIGGER and DROP
COMMAND TRIGGER. Then, you could decide that all command triggers
live in the same namespace, and therefore to get rid of the command
trigger called bob you can just say "DROP COMMAND TRIGGER bob",
without having to specify the type of command it applies to. It's
still clear that you're dropping a *command* trigger because that's
right in the statement name, whereas it would make me a bit uneasy to
decide that "DROP TRIGGER bob" means a command trigger just by virtue
of the fact that no table name was specified. That would probably
also make it easier to accomplish the above-described goal of
integrating this into the DropStmt infrastructure.

+       if (!superuser())
+               if (!pg_database_ownercheck(MyDatabaseId, GetUserId()))
+                       aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_DATABASE,
+                                      get_database_name(MyDatabaseId));

The separate superuser check is superfluous; pg_database_ownercheck()
already handles that.
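
In other words, the hunk can simply become:

+       if (!pg_database_ownercheck(MyDatabaseId, GetUserId()))
+               aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_DATABASE,
+                              get_database_name(MyDatabaseId));

pg_database_ownercheck() returns true for superusers anyway, so the
behaviour doesn't change.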

Can we call InitCommandContext in some centralized place, rather than
separately at lots of different call sites?

I am confused why this is adding a new file called dumpcatalog.c which
looks suspiciously similar to some existing pg_dump code I've been
staring at all day.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-16 21:19:57
Message-ID: 1329426351-sup-5633@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Excerpts from Dimitri Fontaine's message of jue feb 16 14:42:26 -0300 2012:
> Hi,
>
> Please find attached version 8 of the patch, which fixes most of your
> complaints.

Hi,

I didn't like the new cmdtrigger.h file. It's included by a lot of
other headers, and it also itself includes execnodes.h and
parsenodes.h, which means practically the whole source tree gets
included in turn. If you could split it, so that the struct
definition is in a different file that's only included by the few .c
files that actually use that struct, I'd be much happier.

... after looking at it more closely, I think only this line needs to be
in a separate file:

typedef struct CommandContextData *CommandContext;

and that file is included by other headers; they don't need the function
declarations or the struct definition.
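
Something like this, say (the file name is only a suggestion):

    /* cmdtrigger_fwd.h: the only declaration other headers need to see */
    #ifndef CMDTRIGGER_FWD_H
    #define CMDTRIGGER_FWD_H

    typedef struct CommandContextData *CommandContext;

    #endif   /* CMDTRIGGER_FWD_H */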

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-16 21:21:30
Message-ID: m2obsyscdh.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

First answer now, new patch version tomorrow.

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> That is better.

Cool :)

> What could still be done here is to create a cache (along the lines of
> attoptcache.c) that stores a big array with one slot for each
> supported command type. The first time a command is executed in a
> particular session, we construct a cache entry for that command type,
> storing the OIDs of the before and after triggers. Then, instead of
> having to do a catalog scan to notice that there are no triggers for a
> given command type, we can just reference into the array and see, oh,
> we probed before and found no triggers. CacheRegisterSyscacheCallback
> or some other mechanism can be used to flush all the cached
> information whenever we notice that pg_cmdtrigger has been updated.

I guess it's another spelling for a catalog cache, so I'll look at what
it takes to build either a full catcache or this array cache you're
talking about tomorrow.

> Now, I don't know for sure that this is necessary without performance
> testing, but I think we ought to try to figure that out. This version
> doesn't apply cleanly for me; there is a conflict in functioncmds.c,
> which precludes my easily benchmarking it.

That means I need to update my master branch and resolve the merge
conflicts. You could also test right from my github branch, I guess.

https://github.com/dimitri/postgres
https://github.com/dimitri/postgres/tree/command_triggers

> It strikes me that this patch introduces a possible security hole: it
> allows command triggers to be installed by the database owner, but
> that seems like it might allow the database owner can usurp the
> privileges of any user who runs one of these commands in his or her
> database, including the superuser. Perhaps that could be fixed by
> running command triggers as the person who created them rather than as
> the superuser, but that seems like it might be lead to other kinds of
> privilege-escalation bugs.

We could decide command triggers are superuser only. Security is not
something I'm very strong at, so I'll leave it up to you to decide.

> If I install a command trigger that prevents all DDL, how do I
> uninstall it? Or say I'm the superuser and I want to undo something
> my disgruntled DBA did before he quit.

Good catch. I guess we need to remove creating and dropping a command
trigger from the set of commands covered by the ANY COMMAND list.

> I would much prefer to have DropCmdTrigStmt wedge itself into the
> existing DropStmt infrastructure which I just recently worked so hard
> on. If you do that, you should find that you can then easily also
> support comments on command triggers, security labels on command
> triggers (though I don't know if that's useful), and the ability to
> include command triggers in extensions.

Ah yes, that too was on the TODO list, I just forgot about it. I still
remember the merge conflicts when that patch went in, you know… :)

> I am a bit worried about supporting command triggers on statements
> that do internal transaction control, such as VACUUM and CREATE INDEX
> CONCURRENTLY. Obviously that's very useful and I'd like to have it,
> but there's a problem: if the AFTER trigger errors out, it won't undo
> the whole command. That might be very surprising. BEFORE triggers
> seem OK, and AFTER triggers might be OK too but we at least need to
> think hard about how to document that.

I think we should limit the support to BEFORE command triggers only. It
was unclear to me how to solve the problem for the AFTER command case,
and if it's unclear to you too, then there aren't that many open questions.

> I think it would be better to bail on trying to use CREATE TRIGGER and
> DROP TRIGGER as a basis for this functionality, and instead create
> completely new toplevel statements CREATE COMMAND TRIGGER and DROP
> COMMAND TRIGGER. Then, you could decide that all command triggers
> live in the same namespace, and therefore to get rid of the command
> trigger called bob you can just say "DROP COMMAND TRIGGER bob",
> without having to specify the type of command it applies to. It's

I have no strong feeling about that. Would that require that COMMAND be
more reserved than it currently is (UNRESERVED_KEYWORD)?

> still clear that you're dropping a *command* trigger because that's
> right in the statement name, whereas it would make me a bit uneasy to
> decide that "DROP TRIGGER bob" means a command trigger just by virtue
> of the fact that no table name was specified. That would probably
> also make it easier to accomplish the above-described goal of
> integrating this into the DropStmt infrastructure.

The other thing is that you might want to drop the trigger for only one
command; of course we could support both syntaxes. We could also add
support for the following one:

DROP TRIGGER bob ON ALL COMMANDS;

As ALL is already a reserved word, that will work.

> +    if (!superuser())
> +        if (!pg_database_ownercheck(MyDatabaseId, GetUserId()))
> +            aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_DATABASE,
> +                           get_database_name(MyDatabaseId));
>
> The separate superuser check is superfluous; pg_database_ownercheck()
> already handles that.

Ok, let's see if we keep the feature open to database owners then.

> Can we call InitCommandContext in some centralized place, rather than
> separately at lots of different call sites?

Then you have to either make the current command context a
backend-private global variable, or amend even more call sites to pass
it down. The global variable idea does not work; see RemoveRelations()
and RemoveObjects(), which handle an array of command contexts.

So do you prefer lots of InitCommandContext() or adding another parameter
to almost all functions called by standard_ProcessUtility()?

> I am confused why this is adding a new file called dumpcatalog.c which
> looks suspiciously similar to some existing pg_dump code I've been
> staring at all day.

A merge or diff issue, I think; I'm not sure, I will look into it.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-16 21:30:49
Message-ID: m2ipj6sbxy.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> I didn't like the new cmdtrigger.h file. It's included by a lot of
> other headers, and it's also itself including execnodes.h and
> parsenodes.h which means practically the whole lot of the source tree
> is included in turn. If you could split it, so that the struct
> definition is in a different file that's only included by the few .c
> files that actually use that struct, I'd be much happier.

I didn't realize that, thanks for reviewing!

> ... after looking at it more closely, I think only this line needs to be
> in a separate file:
>
> typedef struct CommandContextData *CommandContext;
>
> and that file is included by other headers; they don't need the function
> declarations or the struct definition.

I'll look into that tomorrow then. The same trick is already applied to
Relation and RelationData (resp. in src/include/utils/relcache.h and
src/include/utils/rel.h), and only now I understand why :)

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-16 22:19:54
Message-ID: CA+TgmoZad+su11+0hjwy1LGKBKyp9oVE669YbGMvo8MkeU6--Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 16, 2012 at 4:21 PM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
> We could decide command triggers are superuser only.  Security is not
> something I'm very strong at, so I'll leave it up to you to decide.

That's certainly the easiest option. If you don't feel passionate
about spending a lot of energy figuring out how to make it secure,
then I suggest we just restrict it to superusers until someone does.

>> If I install a command trigger that prevents all DDL, how do I
>> uninstall it?  Or say I'm the superuser and I want to undo something
>> my disgruntled DBA did before he quit.
>
> Good catch, I guess we need to remove creating and dropping a command
> trigger to the list of supported commands in the ANY COMMAND list.

Another option would be to add a PGC_SUSET boolean GUC that can be
used to disable command triggers. I think that might be more
flexible, not to mention useful for recursion prevention.

>> I am a bit worried about supporting command triggers on statements
>> that do internal transaction control, such as VACUUM and CREATE INDEX
>> CONCURRENTLY.  Obviously that's very useful and I'd like to have it,
>> but there's a problem: if the AFTER trigger errors out, it won't undo
>> the whole command.  That might be very surprising.  BEFORE triggers
>> seem OK, and AFTER triggers might be OK too but we at least need to
>> think hard about how to document that.
>
> I think we should limit the support to BEFORE command trigger only.  It
> was unclear to me how to solve the problem for the AFTER command case,
> and if you're unclear to, then there's not that many open questions.

Works for me.

>> I think it would be better to bail on trying to use CREATE TRIGGER and
>> DROP TRIGGER as a basis for this functionality, and instead create
>> completely new toplevel statements CREATE COMMAND TRIGGER and DROP
>> COMMAND TRIGGER.  Then, you could decide that all command triggers
>> live in the same namespace, and therefore to get rid of the command
>> trigger called bob you can just say "DROP COMMAND TRIGGER bob",
>> without having to specify the type of command it applies to.  It's
>
> I have no strong feeling about that.  Would that require that COMMAND be
> more reserved than it currently is (UNRESERVED_KEYWORD)?

It shouldn't.

>> still clear that you're dropping a *command* trigger because that's
>> right in the statement name, whereas it would make me a bit uneasy to
>> decide that "DROP TRIGGER bob" means a command trigger just by virtue
>> of the fact that no table name was specified.  That would probably
>> also make it easier to accomplish the above-described goal of
>> integrating this into the DropStmt infrastructure.
>
> The other thing is that you might want to drop the trigger from only one
> command, of course we could support both syntax.  We could also add
> support for the following one:
>
>  DROP TRIGGER bob ON ALL COMMANDS;

Uh, hold on. Creating a trigger on multiple commands ought to only
create one entry in pg_cmdtrigger. If you drop it, you drop the whole
thing. Changing which commands the trigger applies to would be the
job for ALTER, not DROP. But note that we have no similar
functionality for regular triggers, so I don't think we really need it
here either.

>> Can we call InitCommandContext in some centralized place, rather than
>> separately at lots of different call sites?
>
> Then you have to either make the current command context a backend
> private global variable, or amend even more call sites to pass it down.
> The global variable idea does not work, see RemoveRelations() and
> RemoveObjects() which are handling an array of command contexts.
>
> So do you prefer lots of InitCommandContext() or adding another parameter
> to almost all functions called by standard_ProcessUtility()?

Blech. Maybe we should just have CommandFiresTriggers initialize it
if not already done?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-16 22:38:47
Message-ID: m2obsyqu88.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Thu, Feb 16, 2012 at 4:21 PM, Dimitri Fontaine
> <dimitri(at)2ndquadrant(dot)fr> wrote:
> That's certainly the easiest option. If you don't feel passionate
> about spending a lot of energy figuring out how to make it secure,
> then I suggest we just restrict it to superusers until someone does.

Works for me.

>>> If I install a command trigger that prevents all DDL, how do I
>>> uninstall it?  Or say I'm the superuser and I want to undo something
>>> my disgruntled DBA did before he quit.
>>
>> Good catch, I guess we need to remove creating and dropping a command
>> trigger to the list of supported commands in the ANY COMMAND list.
>
> Another option would be to add a PGC_SUSET boolean GUC that can be
> used to disable command triggers. I think that might be more
> flexible, not to mention useful for recursion prevention.

Wait, we already have ALTER TRIGGER bob ON ANY COMMAND SET DISABLED,
which I had forgotten about in the previous answer, so I think we're
good as it is. That's how I prevented recursion in some of my tests
(not included in the regress tests; they were using INSTEAD OF).
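
Roughly, the test looked like this (a sketch from memory: the four text
arguments are the calling convention from the patch, and the SET ENABLED
spelling is a guess mirroring SET DISABLED):

CREATE FUNCTION bob_instead
 (cmd_string text, cmd_nodestring text, schemaname text, objectname text)
RETURNS void LANGUAGE plpgsql AS $$
BEGIN
    -- disable ourselves before re-issuing the command, so that the
    -- EXECUTE below does not fire this trigger again
    EXECUTE 'ALTER TRIGGER bob ON ANY COMMAND SET DISABLED';
    EXECUTE cmd_string;
    EXECUTE 'ALTER TRIGGER bob ON ANY COMMAND SET ENABLED';
END;
$$;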

>>  DROP TRIGGER bob ON ALL COMMANDS;
>
> Uh, hold on. Creating a trigger on multiple commands ought to only
> create one entry in pg_cmdtrigger. If you drop it, you drop the whole
> thing. Changing which commands the trigger applies to would be the
> job for ALTER, not DROP. But note that we have no similar
> functionality for regular triggers, so I can't think we really need it
> here either.

Why would we do it that way (a single entry for multiple commands)? The
way it is now, it's only syntactic sugar, so I think it's easier to
implement, document and use.

>> So do you prefer lots of InitCommandContext() or adding another parameter
>> to almost all functions called by standard_ProcessUtility()?
>
> Blech. Maybe we should just have CommandFiresTriggers initialize it
> if not already done?

Thing is, in a vast number of cases (almost all of the ALTER OBJECT
name, namespace and owner variants) you no longer have the Node * parse
tree at the place where you check CommandFiresTriggers(), so you can't
initialize there. That's part of the fun.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-17 05:02:27
Message-ID: CA+TgmoaKAcQBTeTyKUhJWfs+N=_PY8EPALcTPC0tnzuMQiuPcw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 16, 2012 at 5:38 PM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
>> Another option would be to add a PGC_SUSET boolean GUC that can be
>> used to disable command triggers.  I think that might be more
>> flexible, not to mention useful for recursion prevention.
>
> Wait, we already have ALTER TRIGGER bob ON ANY COMMAND SET DISABLED;
> which I had forgotten about in the previous answer, so I think we're
> good as it is.  That's how I prevented recursion in some of my tests
> (not included in the regress tests, was using INSTEAD OF).

Eh, so what happens then if someone sets a command trigger on ALTER TRIGGER?

>>>  DROP TRIGGER bob ON ALL COMMANDS;
>>
>> Uh, hold on.  Creating a trigger on multiple commands ought to only
>> create one entry in pg_cmdtrigger.  If you drop it, you drop the whole
>> thing.  Changing which commands the trigger applies to would be the
>> job for ALTER, not DROP.  But note that we have no similar
>> functionality for regular triggers, so I can't think we really need it
>> here either.
>
> Why would we do it that way (a single entry for multiple commands)?  The
> way it is now, it's only syntactic sugar, so I think it's easier to
> implement, document and use.

Well, for one thing, it's consistent with how we handle it for regular
triggers. For two things, if you create an object named bob, you
expect to end up with an object named bob - not 47 objects (or
whatever) that are all named bob. Also, suppose you create a trigger
on ALL COMMANDS, and then a new version of PG adds a new command.
When you dump and reload, do you expect to end up with a trigger on
all commands that existed in the old version, or all the commands that
exist in the new version? Or conversely, suppose we get rid of a
command in a future release. How will we handle that? I can't think
of another example of where a CREATE command creates multiple objects
like that.

>>> So do you prefer lots of InitCommandContext() or adding another parameter
>>> to almost all functions called by standard_ProcessUtility()?
>>
>> Blech.  Maybe we should just have CommandFiresTriggers initialize it
>> if not already done?
>
> Thing is, in a vast number of cases (almost of ALTER OBJECT name,
> namespace and owner) you don't have the Node * parse tree any more from
> the place where you check for CommandFiresTriggers(), so you can't
> initialize there. That's part of the fun.

Hmm, I'll look at this in more detail next time through.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-17 08:54:17
Message-ID: m239a9rgau.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Wait, we already have ALTER TRIGGER bob ON ANY COMMAND SET DISABLED;
>
> Eh, so what happens then if someone sets a command trigger on ALTER TRIGGER?

We should remove support for command triggers on the ALTER command
trigger command itself. Well, I could also go with the GUC idea; it's
only that I'm not entirely sold it's the best we can do yet, and I'd
like to avoid yet another GUC.

>> Why would we do it that way (a single entry for multiple commands)?  The
>> way it is now, it's only syntactic sugar, so I think it's easier to
>> implement, document and use.
>
> Well, for one thing, it's consistent with how we handle it for regular
> triggers. For two things, if you create an object named bob, you

I don't think so: if you attach the same procedure to more than one
table, each time under the same trigger name, you get multiple entries
in pg_trigger:

"pg_trigger_tgrelid_tgname_index" UNIQUE, btree (tgrelid, tgname)

create trigger footg after insert on tg.foo for each row execute procedure tg.trigfunc();
create trigger footg after insert on tg.bar for each row execute procedure tg.trigfunc();
create trigger footg after insert on tg.baz for each row execute procedure tg.trigfunc();

select oid, tgrelid::regclass, tgname, tgfoid, tgtype, tgenabled from pg_trigger;
  oid   | tgrelid | tgname | tgfoid | tgtype | tgenabled
--------+---------+--------+--------+--------+-----------
 533210 | tg.foo  | footg  | 533209 |      5 | O
 533211 | tg.bar  | footg  | 533209 |      5 | O
 533212 | tg.baz  | footg  | 533209 |      5 | O
(3 rows)

The difference I see is that in the table trigger case you don't have a
syntax that allows you to do the 3 operations above in a single command,
and it's easy to provide that capability for command triggers (the use
case is much bigger too, as all command triggers are given the same
arguments and are all expected to return void).
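
For instance, the single-statement form I have in mind would look
something like this (the comma-separated command list is the sugar; its
exact spelling is not settled, and log_ddl() stands for any command
trigger function with the usual four text arguments):

CREATE TRIGGER log_ddl
    BEFORE COMMAND CREATE TABLE, ALTER TABLE, DROP TABLE
    EXECUTE PROCEDURE log_ddl();
-- with the patch as it stands this expands into three pg_cmdtrigger
-- entries, one per command, all sharing the name log_ddl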

> expect to end up with an object named bob - not 47 objects (or
> whatever) that are all named bob. Also, suppose you create a trigger
> on ALL COMMANDS, and then a new version of PG adds a new command.

You create a trigger on ANY command :)

> When you dump and reload, do you expect to end up with a trigger on
> all commands that existed in the old version, or all the commands that
> exist in the new version? Or conversely, suppose we get rid of a
> command in a future release. How will we handle that? I can't think
> of another example of where a CREATE command creates multiple objects
> like that.

An ANY COMMAND trigger is just one entry in pg_cmdtrigger, with the
command name registered as "ANY", which is only safe as long as we don't
add a new SQL command whose command tag is ANY. We could decide to name
this magic ANY command "__ANY__", but that does not look like it fits
the project's usual naming style.
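
Schematically (the BEFORE ANY COMMAND spelling below is an assumption
about the grammar, and bob_proc() stands for any command trigger
function):

CREATE TRIGGER bob BEFORE ANY COMMAND EXECUTE PROCEDURE bob_proc();
-- stored as a single pg_cmdtrigger row whose command-name field holds
-- the literal string 'ANY'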

--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-17 13:19:21
Message-ID: m2ty2poaw6.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> ... after looking at it more closely, I think only this line needs to be
> in a separate file:
>
> typedef struct CommandContextData *CommandContext;

Files like src/backend/commands/tablecmds.c and others need both the
structure and the pointer, so we need both. What about putting those
definitions into src/include/catalog/pg_cmdtrigger.h?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-17 13:31:56
Message-ID: m2k43loab7.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> I didn't like the new cmdtrigger.h file. It's included by a lot of
> other headers, and it's also itself including execnodes.h and

It turns out that this file does not need to include execnodes.h; I've
cleaned that up now (compile ok, make installcheck ok).

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-17 13:37:23
Message-ID: CACMqXCLQzHR8gmZne2EZARZon8J_0H+q0=dg55cbcG98GtYWPQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 17, 2012 at 10:54 AM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> Wait, we already have ALTER TRIGGER bob ON ANY COMMAND SET DISABLED;
>>
>> Eh, so what happens then if someone sets a command trigger on ALTER TRIGGER?
>
> We should remove support for command triggers on alter command triggers.
> Well I could also go with the GUC idea, it's only that I'm not entirely
> sold it's the best we can do yet and I'd like to avoid yet another GUC.

Btw, we already have a GUC for triggers: session_replication_role.
How will command triggers follow that?

--
marko


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-17 14:04:48
Message-ID: m21upto8sf.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Marko Kreen <markokr(at)gmail(dot)com> writes:
> Btw, we already have a GUC for triggers: session_replication_role,
> how will the command triggers follow that?

Note that the replica here, in my mind, would have been a Hot Standby
node, and having the standby run the replica/always command triggers is
not implemented yet, because you can't run DDL on the standby.

Now that you mention it, we should also support the GUC here and only
fire the triggers matching it. I'm working on that now.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-17 14:11:08
Message-ID: CA+TgmoafP6f1qpME-ZYm--P7SyKEB3iZ9Y+BArdkqYjc0cjF8Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 17, 2012 at 3:54 AM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> Wait, we already have ALTER TRIGGER bob ON ANY COMMAND SET DISABLED;
>>
>> Eh, so what happens then if someone sets a command trigger on ALTER TRIGGER?
>
> We should remove support for command triggers on alter command triggers.
> Well I could also go with the GUC idea, it's only that I'm not entirely
> sold it's the best we can do yet and I'd like to avoid yet another GUC.

I'm OK with not supporting command triggers on command triggers, but I
still think the GUC is useful. Keep in mind that flipping a GUC is
really cheap compared to a catalog change, and can affect just one
session. Those are significant advantages. However, if you want to
just not support triggers on statements that modify command triggers,
I'm OK with that, too.

>>> Why would we do it that way (a single entry for multiple commands)?  The
>>> way it is now, it's only syntactic sugar, so I think it's easier to
>>> implement, document and use.
>>
>> Well, for one thing, it's consistent with how we handle it for regular
>> triggers. For two things, if you create an object named bob, you
>
> I don't think so, if you attach the same procedure to more than one
> table each time with the same name, you get multiple entries in
> pg_trigger:
>
>    "pg_trigger_tgrelid_tgname_index" UNIQUE, btree (tgrelid, tgname)
>
> create trigger footg after insert on tg.foo for each row execute procedure tg.trigfunc();
> create trigger footg after insert on tg.bar for each row execute procedure tg.trigfunc();
> create trigger footg after insert on tg.baz for each row execute procedure tg.trigfunc();

Sure, but if you run the same trigger on multiple operations - INSERT
OR UPDATE OR DELETE - that is still a single pg_trigger entry.

>> expect to end up with an object named bob - not 47 objects (or
>> whatever) that are all named bob.  Also, suppose you create a trigger
>> on ALL COMMANDS, and then a new version of PG adds a new command.
>
> You create a trigger on ANY command :)

Oh. Well, then +1 from me on the ANY COMMAND thing, but -1 on ALL
COMMANDS. I can't see enough utility in a bulk-create facility to
justify its existence. The ANY COMMAND thing, I think, is what people
will want.

> ANY COMMAND triggers are just one entry in pg_cmdtrigger, with the
> command name registered as "ANY", which is only safe as long as we don't
> provide a new SQL command whose command tag is ANY. We could decide that
> we want to name this magic ANY command "__ANY__", but it does not look
> like it fits the project usual naming style.

I am thinking that we should ditch the idea of keeping track of
commands using strings and instead assign a bunch of integer constants
using a big enum. The parser can translate from what the user enters
to these constants and then use those throughout, including in the
system catalogs.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-17 14:12:06
Message-ID: CACMqXCJ+c6gOAKvyGJs=bM+bbkez2yWdqhLZUWYhJhfRwLB=VQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 17, 2012 at 4:04 PM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
> Marko Kreen <markokr(at)gmail(dot)com> writes:
>> Btw, we already have a GUC for triggers: session_replication_role,
>> how will the command triggers follow that?
>
> Note that the replica here in my mind would have been an Hot Standby
> node, and having the standby run the replica/always command triggers is
> not implemented yet, because you can't run DDL on the standby.

But we will be able to? That's news to me.

I'm more interested in whether it follows ordinary trigger behaviour
on a Slony/Londiste slave node.

--
marko


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-17 15:42:59
Message-ID: m2sji9lb3w.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Please find attached version 9 of this patch, addressing most of the
open comments about it.

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> That's certainly the easiest option. If you don't feel passionate
> about spending a lot of energy figuring out how to make it secure,
> then I suggest we just restrict it to superusers until someone does.

Done.

>> Good catch, I guess we need to remove creating and dropping a command
>> trigger to the list of supported commands in the ANY COMMAND list.

Done.

> Another option would be to add a PGC_SUSET boolean GUC that can be
> used to disable command triggers. I think that might be more
> flexible, not to mention useful for recursion prevention.

I implemented support for the session_replication_role GUC, so that you
can SET session_replication_role TO replica; and prevent any command
trigger from running (provided you didn't ALTER them away from their
default setting).

Of course you can also use the GUC for its intended purpose, as Marko
said.
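
Concretely, with the command triggers left at their default ORIGIN
setting:

SET session_replication_role TO replica;
CREATE TABLE replicated_copy(id int);  -- no command trigger fires here
RESET session_replication_role;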

>>> [CREATE INDEX CONCURRENTLY and VACUUM]
>> I think we should limit the support to BEFORE command trigger only.  It
>> was unclear to me how to solve the problem for the AFTER command case,
>> and if you're unclear to, then there's not that many open questions.
>
> Works for me.

Done. Of course at the time the command trigger is created you can't
distinguish if the CREATE INDEX command will be run CONCURRENTLY or not,
so I've decided to issue a WARNING about it.

>>> I think it would be better to bail on trying to use CREATE TRIGGER and
>>> DROP TRIGGER as a basis for this functionality, and instead create
>>> completely new toplevel statements CREATE COMMAND TRIGGER and DROP
>>> COMMAND TRIGGER.  Then, you could decide that all command triggers
>>> live in the same namespace, and therefore to get rid of the command
>>> trigger called bob you can just say "DROP COMMAND TRIGGER bob",
>>> without having to specify the type of command it applies to.  It's

Not done yet; I'm not sure we reached a decision on this one. Also note
that I chose to give the command trigger DDL commands their own command
tags so that it's easy to separate them out in the command trigger
context.

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Wait, we already have ALTER TRIGGER bob ON ANY COMMAND SET DISABLED;
> Eh, so what happens then if someone sets a command trigger on ALTER TRIGGER?

You would need to set a command trigger on ALTER COMMAND TRIGGER, and
that's not supported. Triggers on the command "ALTER TRIGGER" will in
fact not get fired by ALTER TRIGGER ... ON COMMAND ...

I guess that's a point in favor of changing the grammar the way you're hinting:

CREATE COMMAND TRIGGER
DROP COMMAND TRIGGER
ALTER COMMAND TRIGGER

Each of those will also need its own reference page. It will be easier
on the users, I guess. Will work on that.
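
Something along these lines, then (a sketch of the reworked statements;
the exact clauses are not settled, and snitch() stands for any suitable
command trigger function):

CREATE COMMAND TRIGGER bob BEFORE CREATE TABLE EXECUTE PROCEDURE snitch();
ALTER COMMAND TRIGGER bob SET DISABLED;
DROP COMMAND TRIGGER bob;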

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> I'm OK with not supporting command triggers on command triggers, but I
> still think the GUC is useful. Keep in mind that flipping a GUC is
> really cheap compared to a catalog change, and can affect just one
> session. Those are significant advantages. However, if you want to
> just not support triggers on statements that modify command triggers,
> I'm OK with that, too.

Both done, if you agree with using session_replication_role here.

> Sure, but if you run the same trigger on multiple operations - INSERT
> OR UPDATE OR DELETE.

I failed to see that analogy. The other problem with the current way of
doing things is that I can't integrate with RemoveObjects(), and I think
you won't like that :)

>> You create a trigger on ANY command :)
>
> Oh. Well, then +1 for me on the ANY COMMAND thing, but -1 on ALL
> COMMANDS. I can't see that there's enough utility to having a
> bulk-create functionality to justify its existence. The ANY COMMAND
> thing I think is what people will want.

There's no such thing as ALL COMMANDS in the patch; there's syntactic
sugar allowing you to create and drop more than one command trigger in
a single statement, much as we have DROP TABLE foo, bar, baz;

> I am thinking that we should ditch the idea of keeping track of
> commands using strings and instead assign a bunch of integer constants
> using a big enum. The parser can translate from what the user enters
> to these constants and then use those throughout, including in the
> system catalogs.

It's not really command strings but the Command Tag we've historically
been using up until now. You're saying that it should remain the same
for users but change internally. No strong opinion from me here, apart
from it being more code for doing the same thing.

I mean we will need to get the command tag from the parse tree, then
turn that into an integer thanks to yet another huge switch that will
have to be maintained every time we add a new command. We already have
too many of those (and I'm not proposing to remove any).

> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
>> I didn't like the new cmdtrigger.h file. It's included by a lot of
>> other headers, and it's also itself including execnodes.h and

Fixed in the attached patch.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

Attachment Content-Type Size
command-trigger.v9.patch.gz application/octet-stream 50.0 KB

From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-17 17:06:05
Message-ID: m2ipj5jsoy.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr> writes:
>>>> I think it would be better to bail on trying to use CREATE TRIGGER and
>>>> DROP TRIGGER as a basis for this functionality, and instead create
>>>> completely new toplevel statements CREATE COMMAND TRIGGER and DROP
>>>> COMMAND TRIGGER.  Then, you could decide that all command triggers
>>>> live in the same namespace, and therefore to get rid of the command
>>>> trigger called bob you can just say "DROP COMMAND TRIGGER bob",
>>>> without having to specify the type of command it applies to.  It's
>
> I guess that's a point to change the grammar the way you're hinting:
>
> CREATE COMMAND TRIGGER
> DROP COMMAND TRIGGER
> ALTER COMMAND TRIGGER
>
> That also needs each their own reference page. It will be easier on the
> users I guess. Will work on that.

FWIW I've pushed such a change to my github repository; I'm not
spamming the list with v10 already, though, unless someone wants to see
it.

https://github.com/dimitri/postgres/commit/82996b45aae10f12818f1e3097ba805fff22a97b

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-17 17:07:44
Message-ID: CA+TgmoZSoX_rR-kFXyWSRLTKasfN38GQZ=H1Kh42=ahuGLCcQw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 17, 2012 at 10:42 AM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
> Done.  Of course at the time the command trigger is created you can't
> distinguish if the CREATE INDEX command will be run CONCURRENTLY or not,
> so I've decided to issue a WARNING about it.

That seems icky. Whatever warnings need to be given should be in the
documentation, not at runtime.

Another idea here would be to treat CREATE INDEX CONCURRENTLY as if it
were a separate toplevel command, for command-trigger purposes only.
But I'm not sure that's any better.

> You would need to set a command trigger on ALTER COMMAND TRIGGER and
> that's not supported. Triggers on command "ALTER TRIGGER" in fact will
> not get fired on ALTER TRIGGER ... ON COMMAND ...
>
> I guess that's a point to change the grammar the way you're hinting:

Indeed it is. :-)

>  CREATE COMMAND TRIGGER
>  DROP COMMAND TRIGGER
>  ALTER COMMAND TRIGGER
>
> That also needs each their own reference page.  It will be easier on the
> users I guess.  Will work on that.

Yeah, I think that will be much more clear, and not really that much
work for you. It will also make the reference pages simpler, I think,
since there are significant behavioral differences between ordinary
triggers and command triggers.

> Both done, if you agree with using session_replication_role here.

It's better than a sharp stick in the eye. I'm not convinced it's
ideal, but I don't feel strongly enough about the issue to push on it
for now, as long as we disallow command triggers on CREATE/ALTER/DROP
COMMAND TRIGGER.

>> Sure, but if you run the same trigger on multiple operations - INSERT
>> OR UPDATE OR DELETE.
>
> I failed to see that analogy. The other problem with the current way of
> doing things is that I can't integrate with RemoveObjects(), and I think
> you won't like that :)

I sure won't. I think ultimately you won't like it either, since the
objectaddress infrastructure is also needed to make this work with
extensions. And I assume you would agree with me that extensions are
an important feature. :-)

>>> You create a trigger on ANY command :)
>>
>> Oh.  Well, then +1 for me on the ANY COMMAND thing, but -1 on ALL
>> COMMANDS.  I can't see that there's enough utility to having a
>> bulk-create functionality to justify its existence.  The ANY COMMAND
>> thing I think is what people will want.
>
> There's no such thing as ALL COMMANDS in the patch, there's a syntactic
> sugar allowing you to create and drop more than one command trigger in a
> single command, much as we have DROP TABLE foo, bar, baz;

OK, I'll look more carefully.

>> I am thinking that we should ditch the idea of keeping track of
>> commands using strings and instead assign a bunch of integer constants
>> using a big enum.  The parser can translate from what the user enters
>> to these constants and then use those throughout, including in the
>> system catalogs.
>
> It's not really command strings but the Command Tag we've historically
> been using up until now.  You're saying that it should remain the same
> for users but change internally.  No strong opinion from me here, apart
> from it being more code for doing the same thing.

Well, the reason I thought it might be better is for caching purposes.
If you have a cache of which triggers need to be run for which
commands, an integer index into an array will be a lot faster than a
hash table lookup. But it may bear more examination, so I don't feel
this is a must-do at this point.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-17 17:45:29
Message-ID: m2ty2picau.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Fri, Feb 17, 2012 at 10:42 AM, Dimitri Fontaine
> <dimitri(at)2ndquadrant(dot)fr> wrote:
>> Done.  Of course at the time the command trigger is created you can't
>> distinguish if the CREATE INDEX command will be run CONCURRENTLY or not,
>> so I've decided to issue a WARNING about it.
>
> That seems icky. Whatever warnings need to be given should be in the
> documentation, not at runtime.

Agreed. We're still making user-visible changes though, so I wanted to
defer the docs editing some more. Even once documented, a WARNING seems
a good idea to me, but maybe you would prefer a NOTICE?

> Another idea here would be to treat CREATE INDEX CONCURRENTLY as if it
> were a separate toplevel command, for command-trigger purposes only.
> But I'm not sure that's any better.

The patch as it stands will fire the AFTER command trigger only when not
using the CONCURRENTLY variant, which I think is enough, once documented.
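
To spell the behaviour out (names are illustrative):

CREATE INDEX CONCURRENTLY t_col_idx ON t (col);  -- BEFORE command triggers fire, AFTER ones are skipped
CREATE INDEX t_col_idx2 ON t (col);              -- both BEFORE and AFTER command triggers fire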

> Yeah, I think that will be much more clear, and not really that much
> work for you. It will also make the reference pages simpler, I think,
> since there are significant behavioral differences between ordinary
> triggers and command triggers.

Yeah, done this way; an overview section in triggers.sgml is still
needed, I think.

>> Both done, if you agree with using session_replication_role here.
>
> It's better than a sharp stick in the eye. I'm not convinced it's
> ideal, but I don't feel strongly enough about the issue to push on it
> for now, as long as we disallow command triggers on CREATE/ALTER/DROP
> COMMAND TRIGGER.

We simply don't support those commands as far as command triggers are
concerned, which seems like a sane limitation.

> I sure won't. I think ultimately you won't like it either, since the
> objectaddress infrastructure is also needed to make this work with
> extensions. And I assume you would agree with me that extensions are
> an important feature. :-)

How did you guess :)

Will see about it later tonight; I'd like to keep the multiple-command
DROP COMMAND TRIGGER spelling.

>> It's not really command strings but the Command Tag we've historically
>> been using up until now.  You're saying that it should remain the same
>> for users but change internally.  No strong opinion from me here, apart
>> from it being more code for doing the same thing.
>
> Well, the reason I thought it might be better is for caching purposes.
> If you have a cache of which triggers need to be run for which
> commands, an integer index into an array will be a lot faster than a
> hash table lookup. But it may bear more examination, so I don't feel
> this is a must-do at this point.

I've been trying to get a feel for the runtime performance of command
triggers along the lines you suggested, even if I'd be very surprised
if a couple of index scans were anything but noise when completing a
DDL command.

I'm getting these results on my development machine:

duration: 30 s
number of transactions actually processed: 42390
tps = 1413.004051 (including connections establishing)
tps = 1413.505517 (excluding connections establishing)
statement latencies in milliseconds:
0.705843 create or replace function plus1(int) returns bigint language sql as $$ select $1::bigint + 1; $$;

I don't have the setup to compare that easily to the current master
branch; I was hoping you would run tests on your side (btw the previous
patch version is rebased against master and cleaned up, should be fine
now — oh, and in context format).

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Smith <greg(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Command Triggers
Date: 2012-02-18 21:24:01
Message-ID: m2boovhm32.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> That seems icky. Whatever warnings need to be given should be in the
> documentation, not at runtime.

So, a new patch is attached, with some more docs, though it's still
light on documentation, mainly because I don't have a setup to build
the docs easily here. I intend to complete the docs on Monday.

If you agree with the way it's implemented now, I'll integrate some
more commands (all our commands except those working with shared
objects, and with AFTER command trigger support removed for commands
that handle transactions themselves).

Once those two items are done, I believe we will reach the Grail^W
“ready for commit” status, and that should happen early next week.

>>  CREATE COMMAND TRIGGER
>>  DROP COMMAND TRIGGER
>>  ALTER COMMAND TRIGGER
>
> Yeah, I think that will be much more clear, and not really that much
> work for you. It will also make the reference pages simpler, I think,
> since there are significant behavioral differences between ordinary
> triggers and command triggers.

This is implemented in the attached patch, v10.

>> I failed to see that analogy. The other problem with the current way of
>> doing things is that I can't integrate with RemoveObjects(), and I think
>> you won't like that :)
>
> I sure won't. I think ultimately you won't like it either, since the
> objectaddress infrastructure is also needed to make this work with
> extensions. And I assume you would agree with me that extensions are
> an important feature. :-)

So now I've integrated with RemoveObjects(), which means it's no longer
possible to drop a command trigger attached to more than one command in
one go. Not a show stopper, but maybe worth noting here.

>> It's not really command strings but the Command Tag we've historically
>> been using up until now.  You're saying that it should remain the same
>> for users but change internally.  No strong opinion from me here, apart
>> from it being more code for doing the same thing.
>
> Well, the reason I thought it might be better is for caching purposes.
> If you have a cache of which triggers need to be run for which
> commands, an integer index into an array will be a lot faster than a
> hash table lookup. But it may bear more examination, so I don't feel
> this is a must-do at this point.

I'm still not convinced that two index scans will be noticeable when
executing a DDL command. So it all feels like premature optimization to
me, even if I completely understand where you're coming from (most of
the patches you worked on for this release share an easy-to-spot common
theme, after all).

So I didn't implement a cache here yet in patch v10. I would hate to
spend time on that to chase a 1% DDL-only performance regression.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

Attachment Content-Type Size
command-trigger.v10.patch.gz application/octet-stream 51.7 KB

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: Command Triggers
Date: 2012-02-27 23:26:05
Message-ID: 201202280026.05421.andres@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sunday, December 04, 2011 02:09:08 AM Andres Freund wrote:
> First, does anybody think it would be worth getting rid of the duplication
> from OpenIntoRel (formerly from execMain.c) in regard to DefineRelation()?
> I noticed that there already is some diversion between both. E.g. CREATE
> TABLE frak TABLESPACE pg_global AS SELECT 1; is possible while it would
> be forbidden via a plain CREATE TABLE. (I will send a fix for this
> separately).
Sorry for letting this slide.

Is it worth adding this bit to OpenIntoRel? Not sure if there is danger in
allowing anyone to create shared tables

    /* In all cases disallow placing user relations in pg_global */
    if (tablespaceId == GLOBALTABLESPACE_OID)
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("only shared relations can be placed in pg_global tablespace")));

Andres


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: Command Triggers
Date: 2012-02-27 23:30:36
Message-ID: 21501.1330385436@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> Sorry for letting this slide.

> Is it worth adding this bit to OpenIntoRel? Not sure if there is danger in
> allowing anyone to create shared tables

>    /* In all cases disallow placing user relations in pg_global */
>    if (tablespaceId == GLOBALTABLESPACE_OID)
>        ereport(ERROR,
>                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>                 errmsg("only shared relations can be placed in pg_global tablespace")));

Ugh ... if that's currently allowed, we definitely need to fix it.
But I'm not sure OpenIntoRel is the right place. I'd have expected
the test to be at some lower level, like heap_create_with_catalog
or so.

regards, tom lane


From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: Command Triggers
Date: 2012-02-27 23:43:02
Message-ID: 201202280043.02617.andres@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tuesday, February 28, 2012 12:30:36 AM Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > Sorry for letting this slide.
> >
> > Is it worth adding this bit to OpenIntoRel? Not sure if there is danger
> > in allowing anyone to create shared tables
> >
> > /* In all cases disallow placing user relations in pg_global */
> > if (tablespaceId == GLOBALTABLESPACE_OID)
> >     ereport(ERROR,
> >             (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> >              errmsg("only shared relations can be placed in pg_global tablespace")));
>
> Ugh ... if that's currently allowed, we definitely need to fix it.
> But I'm not sure OpenIntoRel is the right place. I'd have expected
> the test to be at some lower level, like heap_create_with_catalog
> or so.
It's definitely allowed right now:

test-upgrade=# CREATE TABLE foo TABLESPACE pg_global AS SELECT 1;
SELECT 1
Time: 354.097 ms

The check that OpenIntoRel is missing exists for plain relations in
DefineRelation; heap_create_with_catalog only contains the inverse
check:

    /*
     * Shared relations must be in pg_global (last-ditch check)
     */
    if (shared_relation && reltablespace != GLOBALTABLESPACE_OID)
        elog(ERROR, "shared relations must be placed in pg_global tablespace");

Moving it there sounds like a good idea, with no problem I can see
right now. Want me to prepare a patch, or is it just as easy for you to
do it yourself?

Andres


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: Command Triggers
Date: 2012-02-27 23:44:18
Message-ID: 201202280044.18423.andres@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tuesday, February 28, 2012 12:30:36 AM Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > Sorry for letting this slide.
> >
> > Is it worth adding this bit to OpenIntoRel? Not sure if there is danger
> > in allowing anyone to create shared tables
> >
> > /* In all cases disallow placing user relations in pg_global */
> > if (tablespaceId == GLOBALTABLESPACE_OID)
> >     ereport(ERROR,
> >             (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> >              errmsg("only shared relations can be placed in pg_global tablespace")));
>
> Ugh ... if that's currently allowed, we definitely need to fix it.
Btw, what's the danger you're seeing?

Andres


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: Command Triggers
Date: 2012-02-27 23:54:41
Message-ID: 21917.1330386881@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> On Tuesday, February 28, 2012 12:30:36 AM Tom Lane wrote:
>> Ugh ... if that's currently allowed, we definitely need to fix it.

> Btw, what's the danger you're seeing?

Well, I'm not sure that it would actively break anything, but we
definitely meant to disallow the case. Also, I seem to recall some
places that intuit a relation's shared marker in the opposite direction
(if it's in pg_global it must be shared), and that could definitely
cause issues if we treat a rel as shared when it isn't.

regards, tom lane


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: Command Triggers
Date: 2012-02-28 02:12:08
Message-ID: 1330394958-sup-6995@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Excerpts from Tom Lane's message of lun feb 27 20:54:41 -0300 2012:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On Tuesday, February 28, 2012 12:30:36 AM Tom Lane wrote:
> >> Ugh ... if that's currently allowed, we definitely need to fix it.
>
> > Btw, whats the danger youre seing?
>
> Well, I'm not sure that it would actively break anything, but we
> definitely meant to disallow the case. Also, I seem to recall some
> places that intuit a relation's shared marker in the opposite direction
> (if it's in pg_global it must be shared), and that could definitely
> cause issues if we treat a rel as shared when it isn't.

The list of shared rels is hardcoded -- see IsSharedRelation.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: Command Triggers
Date: 2012-03-20 16:02:01
Message-ID: 201203201702.02130.andres@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tuesday, February 28, 2012 12:43:02 AM Andres Freund wrote:
> On Tuesday, February 28, 2012 12:30:36 AM Tom Lane wrote:
> > Andres Freund <andres(at)anarazel(dot)de> writes:
> > > Sorry for letting this slide.
> > >
> > > Is it worth adding this bit to OpenIntoRel? Not sure if there is danger
> > > in allowing anyone to create shared tables
> > >
> > > /* In all cases disallow placing user relations in pg_global */
> > > if (tablespaceId == GLOBALTABLESPACE_OID)
> > >     ereport(ERROR,
> > >             (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > >              errmsg("only shared relations can be placed in pg_global tablespace")));
> >
> > Ugh ... if that's currently allowed, we definitely need to fix it.
> > But I'm not sure OpenIntoRel is the right place. I'd have expected
> > the test to be at some lower level, like heap_create_with_catalog
> > or so.
>
> It's definitely allowed right now:
>
> test-upgrade=# CREATE TABLE foo TABLESPACE pg_global AS SELECT 1;
> SELECT 1
> Time: 354.097 ms
>
> The analogous check for the missing one in OpenIntoRel for plain relations
> is in defineRelation. heap_create_with_catalog only contains the inverse
> check:
>
>     /*
>      * Shared relations must be in pg_global (last-ditch check)
>      */
>     if (shared_relation && reltablespace != GLOBALTABLESPACE_OID)
>         elog(ERROR, "shared relations must be placed in pg_global tablespace");
>
>
> Moving it there sounds like a good idea without any problem I can see right
> now. Want me to prepare a patch or is it just the same for you if you do it
> yourself?
Sorry to bother you with that dreary topic further, but this should probably
be fixed before the next set of stable releases.

The check cannot easily be moved to heap_create_with_catalog because e.g.
cluster.c's make_new_heap does heap_create_with_catalog for a temporary copy
of shared relations without being able to mark them as such (because they
don't have the right oid and thus IsSharedRelation would cry). So I think just
adding the same check to the ctas code as the normal DefineRelation contains
is the best way forward.

The attached patch applies from 8.3 to 9.1 (8.2 has conflicts but
thankfully...).

Andres

Attachment Content-Type Size
0001-Check-that-the-specified-tablespace-in-a-CREATE-TABL.patch text/x-patch 1.6 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Subject: Re: Command Triggers
Date: 2012-03-21 15:54:00
Message-ID: CA+TgmoYhJkkCuNUQrq2R2Z5Q1YY1pSV9mzXG0kkcD2gD+ZAi8A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 20, 2012 at 12:02 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> The attached patch applies from 8.3 to 9.1 (8.2 has conflicts but
> thankfully...).

It seems it doesn't apply to master (any more?).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2012-03-21 15:58:22
Message-ID: 201203211658.22265.andres@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wednesday, March 21, 2012 04:54:00 PM Robert Haas wrote:
> On Tue, Mar 20, 2012 at 12:02 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > The attached patch applies from 8.3 to 9.1 (8.2 has conflicts but
> > thankfully...).
>
> It seems it doesn't apply to master (any more?).
It's not required there because of the unification of CTAS with the
normal code path. Sorry, that was only clear from a few mails back; I
should have made that explicit.

Andres


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Command Triggers
Date: 2012-03-21 16:40:57
Message-ID: CA+TgmoYwhrwLU7oO5W56xU_goFSuNd0VfYKXz1z=0Rs+HPRSgQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 21, 2012 at 11:58 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On Wednesday, March 21, 2012 04:54:00 PM Robert Haas wrote:
>> On Tue, Mar 20, 2012 at 12:02 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> > The attached patch applies from 8.3 to 9.1 (8.2 has conflicts but
>> > thankfully...).
>>
>> It seems it doesn't apply to master (any more?).
> Its not required there because of the unification of CTAS with normal code.
> Sorry, that was only clear from a few mails away, should have made that
> explicit.

Or maybe I should have tested it before jumping to conclusions.

Anyway, I've committed the patch to 9.1, 9.0, 8.4, and 8.3.

Thanks,

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company