Re: Add CREATE support to event triggers

From: Christopher Browne <cbbrowne(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add CREATE support to event triggers
Date: 2013-11-20 18:36:46
Message-ID: CAFNqd5VjebDDNF2AGjuAMq9T-s6ii7i+xC5A_WDkizW5LJj8dg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Nov 8, 2013 at 10:33 AM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> Hello,
>
> Attached you can find a very-much-WIP patch to add CREATE info support
> for event triggers (normalized commands). This patch builds mainly on
> two things:
>
> 1. Dimitri's "DDL rewrite" patch he submitted way back, in
> http://www.postgresql.org/message-id/m2zk1j9c44.fsf@2ndQuadrant.fr
>
> I borrowed the whole ddl_rewrite.c code, and tweaked it a bit. There
> are several things still wrong with it and which will need to be fixed
> before a final patch can even be contemplated; but there are some
> questions that require a consensus answer before I go and fix it all,
> because what it will look like will depend on said answers.

I have tried this out; the patch applies fine.

Note that it induces (modulo my environment) a failure in "make check".

The "opr_sanity" test fails.

postgres(at)cbbrowne ~/p/s/t/regress> diff expected/opr_sanity.out
results/opr_sanity.out
348,350c348,351
< oid | proname
< -----+---------
< (0 rows)
---
> oid | proname
> ------+------------------------------------------
> 3567 | pg_event_trigger_get_normalized_commands
> (1 row)

That's a minor problem; the trouble there is that the new function is not
yet documented. Not a concern at this stage.

> 2. The ideas we used to build DROP support. Mainly, the interesting
> thing here is the fact that we use a SRF to report, at
> ddl_command_end, all the objects that were created during execution
> of that command. We do this by collecting them in a list in some raw
> form somewhere during ProcessUtility, and then spitting them out if
> the SRF is called. I think the general idea is sound, although of
> course I admit there might be bugs in the implementation.
>
> Note this patch doesn't try to add any kind of ALTER support. I think
> this is fine in principle, because we agreed that we would attack each
> kind of command separately (divide to conquer and all that); but there
> is a slight problem for some kind of objects that are represented partly
> as ALTER state during creation; for example creating a table with a
> sequence uses ALTER SEQ/OWNED BY internally at some point. There might
> be other cases I'm missing, also. (The REFRESH command is nominally
> also supported.)

I imagine that the things we create in earlier stages may help with later
stages, so it's worth *some* planning so we can hope not to build bits
now that push later enhancements into corners that they can't get out of.

But I'm not disagreeing at all.

> Now about the questions I mentioned above:
>
> a) It doesn't work to reverse-parse the statement nodes in all cases;
> there are several unfixable bugs if we only do that. In order to create
> always-correct statements, we need access to the catalogs for the
> created objects. But if we are doing catalog access, then it seems to
> me that we can do away with the statement parse nodes completely and
> just reconstruct the objects from catalog information. Shall we go that
> route?

Here's a case where it doesn't work.

testevent(at)localhost-> create schema foo;
CREATE SCHEMA
testevent(at)localhost-> create domain foo.bar integer;
CREATE DOMAIN
testevent(at)localhost-> CREATE OR REPLACE FUNCTION snitch() RETURNS
event_trigger LANGUAGE plpgsql AS $$
testevent$# DECLARE
testevent$# r RECORD;
testevent$# BEGIN
testevent$# FOR r IN SELECT * FROM
pg_event_trigger_get_normalized_commands()
testevent$# LOOP
testevent$# RAISE NOTICE 'object created: id %, statement %',
testevent$# r.identity, r.command;
testevent$# END LOOP;
testevent$# END;
testevent$# $$;
CREATE FUNCTION
testevent(at)localhost-> CREATE EVENT TRIGGER snitch ON ddl_command_end
EXECUTE PROCEDURE snitch();
CREATE EVENT TRIGGER
testevent(at)localhost-> set search_path to public, foo;
SET
testevent(at)localhost-> create table foo.foo2 (acolumn bar);
NOTICE: object created: id foo.foo2, statement CREATE TABLE foo.foo2
(acolumn bar)
CREATE TABLE

The trouble is that you have only normalized the table name. The
domain, bar, needs its name normalized as well.

> b) What's the best design of the SRF output? This patch proposes two
> columns, object identity and create statement. Is there use for
> anything else? Class/object OIDs perhaps, schema OIDs for objects types
> that have it? I don't see any immediate need to that info, but perhaps
> someone does.

Probably an object type is needed as well, to know if it's a table or
a domain or a sequence or whatever.

I suspect that what will be needed to make it all usable is some sort of
"structured" form. That is in keeping with Robert Haas' discomfort with
the normalized form.

My "minor" gripe is that you haven't normalized enough (e.g. - it should be
CREATE TABLE foo.foo2 (acolumn foo.bar), capturing the normalization of
data types that are referenced).

But Robert's quite right that users may want more than just to capture that
literally; they may want to modify it, for instance, by shifting to another
schema. And it will be no fun at all if you have to construct an SQL parser
in order to change it.

The "big change" in Slony 2.2 was essentially to solve that problem; we had
been capturing logged updates as near-raw SQL, and discovered that we
really needed to capture it in a form that allows restructuring it without
needing to reparse it. It seems to me that the same need applies here.

The answer we came up with in Slony was to change from "SQL fragment"
to "array of attribute names and attribute values." I expect that is quite
unsuitable for CREATE events, being too "flat" a representation.

> c) The current patch stashes all objects in a list, whenever there's an
> event trigger function. But perhaps some users want to have event
> triggers and not have any use for the CREATE statements. So one idea is
> to require users that want the objects reported to call a special
> function in a ddl_command_start event trigger which enables collection;
> if it's not called, objects are not reported. This eliminates
> performance impact for people not using it, but then maybe it will be
> surprising for people that call the SRF and find that it always returns
> empty.

Hmm. I'm not sure I follow what the issue is here (I think Robert has
the same problem; if that's so, then it's probably worth a separate
explanation/discussion).

I think there are actually more options here than just worrying about
CREATE...

I can see caring/not caring about CREATE events. Also caring/not
caring about different sorts of objects.

Thus, an event trigger might either filter on event type
(CREATE versus DROP versus ALTER vs ...), as well as on object
type (TABLE vs VIEW vs SCHEMA vs DOMAIN vs ...).

I could see that being either part of the trigger definition or as an
attribute established within the trigger.

Thus...

create event trigger foo snitch on ddl_command_end
on create, drop, alter
for table, view
execute procedure snitch();

And having internals

create or replace function snitch() returns event_trigger language plpgsql as
$$
declare
r record;
begin
for r in select * from pg_event_trigger_get_normalized_commands loop
raise notice 'action: % object type % object name % object id %
normalized statement %',
r.type, r.identity, r.id, r.command;
end loop
end;
$$;

And I suspect that there needs to be another value returned (not
necessarily by the same function) that is some form of the parse tree.

If the parse tree is already available in memory on the C side of things, then
making it accessible by calling a function that allows "walking" the tree and
invoking methods on the elements is a perfectly reasonable idea. Ideally,
there's something that can be usefully called that is written in pl/pgsql;
mandating that we punt to C to do arbitrary magic doesn't seem nice.
(If that's not reading coherently, well, blame it on my mayor being on
crack! ;-) )

> d) There's a new routine uncookConstraintOrDefault. This takes a raw
> expression, runs transformExpr() on it, and then deparses it (possibly
> setting up a deparse context based on some relation). This is a
> somewhat funny thing to be doing, so if there are other ideas on how to
> handle this, I'm all ears.

Is that perhaps some of the "magic" to address my perhaps incoherent
bit about "punt to C"? It looks a bit small to be that...
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David E. Wheeler 2013-11-20 18:40:16 Re: nested hstore patch
Previous Message Tom Lane 2013-11-20 18:36:05 Re: additional json functionality