Re: WIP: Triggers on VIEWs

Lists: pgsql-hackers
From: Bernd Helmle <mailings(at)oopsware(dot)de>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: David Christensen <david(at)endpoint(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: Triggers on VIEWs
Date: 2010-10-05 20:17:17
Message-ID: 86DCE0534FDCCCF9A19F0DE8@amenophis
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

--On 30. September 2010 07:38:18 +0100 Dean Rasheed
<dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:

>> This version doesn't apply clean anymore due to some rejects in
>> plainmain.c. Corrected version attached.
>>
>
> Ah yes, those pesky bits have been rotting while I wasn't looking.
> Thanks for fixing them!

Basic summary of this patch:

* The patch includes a fairly complete discussion about INSTEAD OF triggers
and their usage on views. There are also additional enhancements to the
RULE documentation, which seems, given that this might supersede the usage
of RULES for updatable views, reasonable.

* The patch passes regressions tests and comes with a bunch of its own
regression tests. I think they are complete, they cover statement and row
Level trigger and show the usage for JOINed views for example.

* I've checked against a draft of the SQL standard, the behavior of the
patch seems to match the spec (my copy might be out of date, however).

* The code looks pretty good to me, there are some low level error messages
exposing some implementation details, which could be changed (e.g.
"wholerow is NULL"), but given that this is most of the time unexpected and
is used in some older code as well, this doesn't seem very important.

* The implementation introduces the notion of "wholerow". This is a junk
target list entry which allows the executor to carry the view information
to an INSTEAD OF trigger. In case of DELETE/UPDATE, the rewriter is
responsible to inject the new "wholerow" TLE and make the original query to
point to a new range table entry (correct me, when i'm wrong), which is
based on the view's query. I'm not sure i'm happy with the notion of
"wholerow" here, maybe "viewrow" or "viewtarget" is more descriptive?

* I'm inclined to say that INSTEAD OF triggers have less overhead than
RULES, but this is not proven yet with a reasonable benchmark.

I would like to do some more tests/review, but going to mark this patch as
"Ready for Committer", so that someone more qualified on the executor part
can have a look on it during this commitfest, if that's okay for us?

--
Thanks

Bernd


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Bernd Helmle <mailings(at)oopsware(dot)de>
Cc: David Christensen <david(at)endpoint(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: Triggers on VIEWs
Date: 2010-10-06 08:20:38
Message-ID: AANLkTin1VeyehQA1wQmVhPXXUjQmf1anvVZhXQ-hYoGa@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 5 October 2010 21:17, Bernd Helmle <mailings(at)oopsware(dot)de> wrote:
> Basic summary of this patch:
>

Thanks for the review.

> * The patch includes a fairly complete discussion about INSTEAD OF triggers
> and their usage on views. There are also additional enhancements to the RULE
> documentation, which seems, given that this might supersede the usage of
> RULES for updatable views, reasonable.
>
> * The patch passes regressions tests and comes with a bunch of its own
> regression tests. I think they are complete, they cover statement and row
> Level trigger and show the usage for JOINed views for example.
>
> * I've checked against a draft of the SQL standard, the behavior of the
> patch seems to match the spec (my copy might be out of date, however).
>
> * The code looks pretty good to me, there are some low level error messages
> exposing some implementation details, which could be changed (e.g. "wholerow
> is NULL"), but given that this is most of the time unexpected and is used in
> some older code as well, this doesn't seem very important.
>

Hopefully that error should never happen, since it would indicate a
bug in the code rather than a user error.

> * The implementation introduces the notion of "wholerow". This is a junk
> target list entry which allows the executor to carry the view information to
> an INSTEAD OF trigger. In case of DELETE/UPDATE, the rewriter is responsible
> to inject the new "wholerow" TLE and make the original query to point to a
> new range table entry (correct me, when i'm wrong), which is based on the
> view's query. I'm not sure i'm happy with the notion of "wholerow" here,
> maybe "viewrow" or "viewtarget" is more descriptive?
>

That's a good description of how it works. I chose "wholerow" because
that matched similar terminology used already, for example in
preptlist.c when doing FOR UPDATE/SHARE on a view. I don't feel
strongly about it, but my inclination is to stick with "wholerow"
unless someone feels strongly otherwise.

> * I'm inclined to say that INSTEAD OF triggers have less overhead than
> RULES, but this is not proven yet with a reasonable benchmark.
>

It's difficult to come up with a general statement on performance
because there are so many variables that might affect it. In a few
simple tests, I found that for repeated small updates RULEs and
TRIGGERs perform roughly the same, but for bulk updates (one query
updating 1000s of rows) a RULE is best.

> I would like to do some more tests/review, but going to mark this patch as
> "Ready for Committer", so that someone more qualified on the executor part
> can have a look on it during this commitfest, if that's okay for us?
>

Thanks for looking at it. I hope this is useful functionality to make
it easier to write updatable views, and perhaps it will help with
implementing auto-updatable views too.

Cheers,
Dean

> --
> Thanks
>
>        Bernd
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Bernd Helmle <mailings(at)oopsware(dot)de>, David Christensen <david(at)endpoint(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: Triggers on VIEWs
Date: 2010-10-08 15:50:15
Message-ID: 23980.1286553015@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bernd Helmle <mailings(at)oopsware(dot)de> writes:
> I would like to do some more tests/review, but going to mark this patch as
> "Ready for Committer", so that someone more qualified on the executor part
> can have a look on it during this commitfest, if that's okay for us?

I've started looking at this patch now. I think it would have been best
submitted as two patches: one to add the SQL-spec "INSTEAD OF" trigger
functionality, and a follow-on to extend INSTEAD OF triggers to views.
I'm thinking of breaking it apart into two separate commits along that
line, mainly because I think INSTEAD OF is probably committable but
I'm much less sure about the other part.

In the INSTEAD OF part, the main gripe I've got is the data
representation choice:

***************
*** 1609,1614 ****
--- 1609,1615 ----
List *funcname; /* qual. name of function to call */
List *args; /* list of (T_String) Values or NIL */
bool before; /* BEFORE/AFTER */
+ bool instead; /* INSTEAD OF (overrides BEFORE/AFTER) */
bool row; /* ROW/STATEMENT */
/* events uses the TRIGGER_TYPE bits defined in catalog/pg_trigger.h */
int16 events; /* INSERT/UPDATE/DELETE/TRUNCATE */

This pretty much sucks, because not only is this a rather confusing
definition, it's not really backwards compatible: any code that thinks
"!before" must mean "after" is going to be silently broken. Usually,
when you do something that necessarily breaks old code, what you want
is to make sure the breakage is obvious at compile time. I think the
right thing here is to replace "before" with a three-valued enum,
perhaps called "timing", so as to force people to take another look
at any code that touches the field directly.

Although we already have macros TRIGGER_FIRED_AFTER/TRIGGER_FIRED_BEFORE
that seem to mask the details here, the changes you had to make in
contrib illustrate that the macros' callers could still be embedding this
basic mistake of testing "!before" when they mean "after" or vice versa.
I wonder whether we should intentionally rename the macros to force
people to take another look at their logic. Or is that going too far?
Comments anyone?

regards, tom lane


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bernd Helmle <mailings(at)oopsware(dot)de>, David Christensen <david(at)endpoint(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: Triggers on VIEWs
Date: 2010-10-08 16:49:49
Message-ID: AANLkTinDpg2VFNjJf+PQTaUeDOgUg5mbVXKw-p5=denf@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 8 October 2010 16:50, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Bernd Helmle <mailings(at)oopsware(dot)de> writes:
>> I would like to do some more tests/review, but going to mark this patch as
>> "Ready for Committer", so that someone more qualified on the executor part
>> can have a look on it during this commitfest, if that's okay for us?
>
> I've started looking at this patch now.  I think it would have been best
> submitted as two patches: one to add the SQL-spec "INSTEAD OF" trigger
> functionality, and a follow-on to extend INSTEAD OF triggers to views.

SQL-spec "INSTEAD OF" triggers *are* view triggers. I don't see how
you can separate the two.

> I'm thinking of breaking it apart into two separate commits along that
> line, mainly because I think INSTEAD OF is probably committable but
> I'm much less sure about the other part.
>
> In the INSTEAD OF part, the main gripe I've got is the data
> representation choice:
>
> ***************
> *** 1609,1614 ****
> --- 1609,1615 ----
>      List       *funcname;       /* qual. name of function to call */
>      List       *args;           /* list of (T_String) Values or NIL */
>      bool        before;         /* BEFORE/AFTER */
> +     bool        instead;        /* INSTEAD OF (overrides BEFORE/AFTER) */
>      bool        row;            /* ROW/STATEMENT */
>      /* events uses the TRIGGER_TYPE bits defined in catalog/pg_trigger.h */
>      int16       events;         /* INSERT/UPDATE/DELETE/TRUNCATE */
>
> This pretty much sucks, because not only is this a rather confusing
> definition, it's not really backwards compatible: any code that thinks
> "!before" must mean "after" is going to be silently broken.  Usually,
> when you do something that necessarily breaks old code, what you want
> is to make sure the breakage is obvious at compile time.  I think the
> right thing here is to replace "before" with a three-valued enum,
> perhaps called "timing", so as to force people to take another look
> at any code that touches the field directly.
>
> Although we already have macros TRIGGER_FIRED_AFTER/TRIGGER_FIRED_BEFORE
> that seem to mask the details here, the changes you had to make in
> contrib illustrate that the macros' callers could still be embedding this
> basic mistake of testing "!before" when they mean "after" or vice versa.
> I wonder whether we should intentionally rename the macros to force
> people to take another look at their logic.  Or is that going too far?
> Comments anyone?
>

I think that you're confusing 2 different parts of code here. The
TRIGGER_FIRED_AFTER/TRIGGER_FIRED_BEFORE macros operate on the bits
from the tg_event field of the TriggerData structure. This has a
completely different representation for the trigger timing compared to
the code you mention above, which is from the CreateTrigStmt
structure. That structure is only used internally in a couple of
places, by the parser and CreateTrigger().

I agree that perhaps it would be neater to represent it as an enum,
but I think the scope for code breakage is far smaller than you claim.
This structure is not being exposed to trigger code.

The changes I made in contrib are unrelated to the representation used
in CreateTrigStmt. It's just a matter of tidying up code in before
triggers to say "if (!fired_before) error" rather than "if
(fired_after) error". That's neater anyway, even in the case where
they're mutually exclusive (as they are on tables). The scope for user
code being broken is limited beause they'd have to put one of these
trigger functions in an INSTEAD OF trigger on a view.

Regards,
Dean

>                        regards, tom lane
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Bernd Helmle <mailings(at)oopsware(dot)de>, David Christensen <david(at)endpoint(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: Triggers on VIEWs
Date: 2010-10-08 17:03:36
Message-ID: 7529.1286557416@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> writes:
> On 8 October 2010 16:50, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I've started looking at this patch now. I think it would have been best
>> submitted as two patches: one to add the SQL-spec "INSTEAD OF" trigger
>> functionality, and a follow-on to extend INSTEAD OF triggers to views.

> SQL-spec "INSTEAD OF" triggers *are* view triggers. I don't see how
> you can separate the two.

Oh, they're not allowed on tables? Why not? AFAICS they'd be exactly
equivalent to a BEFORE trigger that always returns NULL.

> I think that you're confusing 2 different parts of code here. The
> TRIGGER_FIRED_AFTER/TRIGGER_FIRED_BEFORE macros operate on the bits
> from the tg_event field of the TriggerData structure.

Yeah, I'm aware of that. My point is that all code that deals with
trigger firing times now has to consider three possible states where
before there were two; and it's entirely likely that some places are
testing for the wrong condition once a third state is admitted as a
possibility.

> The scope for user
> code being broken is limited beause they'd have to put one of these
> trigger functions in an INSTEAD OF trigger on a view.

Well, the only reason those tests are being made at all is as
self-defense against being called via an incorrect trigger definition.
So they're worth nothing if they fail to treat the INSTEAD OF case
correctly.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Bernd Helmle <mailings(at)oopsware(dot)de>, David Christensen <david(at)endpoint(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: Triggers on VIEWs
Date: 2010-10-08 17:44:44
Message-ID: AANLkTimuz3s-bJQDx8PsR0a5ePXwbO3cvdgSBZvUZXOg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Oct 8, 2010 at 11:50 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I think the
> right thing here is to replace "before" with a three-valued enum,
> perhaps called "timing", so as to force people to take another look
> at any code that touches the field directly.

+1. That seems much nicer.

> Although we already have macros TRIGGER_FIRED_AFTER/TRIGGER_FIRED_BEFORE
> that seem to mask the details here, the changes you had to make in
> contrib illustrate that the macros' callers could still be embedding this
> basic mistake of testing "!before" when they mean "after" or vice versa.
> I wonder whether we should intentionally rename the macros to force
> people to take another look at their logic.  Or is that going too far?
> Comments anyone?

I'm less sold on this one.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Bernd Helmle <mailings(at)oopsware(dot)de>, David Christensen <david(at)endpoint(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: Triggers on VIEWs
Date: 2010-10-08 17:57:31
Message-ID: 14009.1286560651@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Fri, Oct 8, 2010 at 11:50 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Although we already have macros TRIGGER_FIRED_AFTER/TRIGGER_FIRED_BEFORE
>> that seem to mask the details here, the changes you had to make in
>> contrib illustrate that the macros' callers could still be embedding this
>> basic mistake of testing "!before" when they mean "after" or vice versa.
>> I wonder whether we should intentionally rename the macros to force
>> people to take another look at their logic. Or is that going too far?
>> Comments anyone?

> I'm less sold on this one.

I'm not sold on it either, just wanted to run it up the flagpole to see
if anyone would salute. For the moment I'm thinking that calling out
the point in the 9.1 release notes should be sufficient. I made an
extra commit to make sure the issue is salient in the commit log.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Bernd Helmle <mailings(at)oopsware(dot)de>, David Christensen <david(at)endpoint(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: Triggers on VIEWs
Date: 2010-10-08 19:49:24
Message-ID: 24848.1286567364@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

BTW, while I'm looking at this: it seems like the "index" arrays in
struct TrigDesc are really a lot more complication than they are worth.
It'd be far easier to dispense with them and instead iterate through
the main trigger array, skipping any triggers whose tgtype doesn't match
what we need. If you had a really huge number of triggers on a table,
it's possible that could be marginally slower, but I'm having a hard
time envisioning practical situations where anybody could tell the
difference.

regards, tom lane