Re: Event Triggers: adding information

From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Event Triggers: adding information
Date: 2012-12-25 15:34:09
Message-ID: m21uee6t8u.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> That's a good idea. I only started quite late in that patch to work on
>> the ObjectID piece of information, that's why I didn't split it before
>> doing so.
>
> That's fine. I've committed that part. I think the remainder of the
> patch is really several different features that ought to be split
> apart and considered independently. We may want some but not others,
> or some may be ready to go in but not others.

Thank you for this partial commit, and Simon and Andres to fill in the
gaps. I should mention that the missing header parts were all in my
patch, and that headers hacking is proving suprisingly uneasy.

I've been having several rounds of problems with the gcc tool chain when
modifying headers. Worst case has been a successful compiling followed
by random errors. Then gdb was not giving any clue (botched memory when
returned from a function, palloc'ed memory not freed yet). After
spending more time on it that I care to admit, I did a `make
maintainer-clean`, built again and the problem disappeared all by
itself.

The gcc tool chain is not my favorite developer environment.

>> Also, keep in mind we want the ObjectID in all CREATE, ALTER and DROP
>> statements, so my current patch is still some bricks shy of a load… (I
>> added ObjectID only in the commands I added rewrite support for, apart
>> from DROP).
>
> I shall rely on you to provide those bricks which are still missing.

Cool, will prepare a separate patch implementing that API now, and base
the following patches on top of that. My thinking is to do as following:

- API patch to get Oid from all CREATE/ALTER commands
- API patch for getting the Oid(s) in (before) DROP commands
- Event Trigger Infos patch, depending on the previous ones
- Command String Rewrite and its publishing in Event Triggers

Of course for the DROP parts, we'd better have the Oid and use it before
the command runs through completion or it will not be usable from the
event trigger (the target is ddl_command_start in that case).

>> We might be able to have a better way of testing the feature here, but I
>> tried to stay into the realms of what I know pg_regress able to do.
>
> I was thinking that we might need to go beyond what pg_regress can do
> in this case. For example, one idea would be to install an audit
> trigger that records all the DDL that happens during the regression
> tests. Then, afterwards, replay that DDL into a new database. Then
> do a schema-only dump of the old and new databases and diff the dump
> files. That's a little wacky by current community standards but FWIW
> EDB has a bunch of internal tests that we run to check our proprietary
> stuff; some of them work along these lines and it's pretty effective
> at shaking out bugs.

Are you in a position to contribute those parts to the community?
Implementing them again then having to support two different variants of
them does not look like the best use of both our times.

> Hmm. I have to study that more, maybe. I certainly agree that if you
> can look at the catalogs, you should be able to reliably reconstruct
> what happened - which isn't possible just looking at the parse tree.

Well in fact we're looking at the parsetree once edited to host the
exact data that goes into the catalogs. In most cases that data is
easily available to further consumption, or it's possible to use the
transform API without touching catalogs again.

In some places I'm doing the same processing twice, which I don't like
very much, but it's only happening if an Event Trigger is going to fire
so I though we could optimize that later. The aim here has been to limit
the changes to review, it looks like we have enough of them already.

Concerned parts are dealing with raw and cooked expressions for CHECK
and DEFAULT and WHERE clauses (create table, create index, alter table,
create constraint, create domain).

> However, it feels weird to me to do something that's partly based on
> the parse tree and partly based on the catalog contents. Moreover,
> the current pre-create hook isn't the right place to gather
> information from the catalogs anyway as it happens before locks have
> been taken.

So all the information is derived from the parse tree. The trick is that
this information is only valid once it has hit the catalogs (think of a
CHECK constraint in a CREATE TABLE statement, referencing some column
attribute, same with a DEFAULT expression depending on another col).

The case where we should certainly prefer looking at the catalog caches
are when we want to know the actual schema where the object has been
created. The current patch is deriving that information mostly from the
parse tree, using the first entry of the search_path when the schema is
not given in the command. It's ok because we are still in the same
transaction and no command has been able to run in between the user
command and our lookup, but I could easily get convinced to look up the
catalogs instead, even more so once we have the OID of the object easily
available in all places.

> Now, there's another thing that is bothering me here, too. How
> complete is the support you've implemented in this patch? Does it
> cover ALL CREATE/ALTER/DROP commands, including all options to and all
> variants of those commands? Because while I'm not very sanguine about
> doing this at all, it somehow feels like doing it only halfway would
> be even worse.

This patch is implementing complete support for all commands for most of
the information, and partial support for some information that comes
from the parse tree analysis. The goal for this commit fest is to agree
on how to do it, then finish the whole command set for the next one.

As decided at the hackers meeting, now is the time to address anything
controversial, once done we can "return with feedback". So that's my
goal for this commit fest, concerning the Normalized Command String.

Given your review, I think the way forward is pretty clear now, and my
plan is as following:

- commit Oid related API changes in this commit fest
- commit Event Trigger Information changes in this commit fest
- agree on how to tackle exposing the normalized command string

Merry Christmas!
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2012-12-25 18:22:54 Re: proposal: regrole type?
Previous Message Vlad Arkhipov 2012-12-25 11:48:08 Temporal features in PostgreSQL