Re: Triggers on foreign tables

From: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: Noah Misch <noah(at)leadboat(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Triggers on foreign tables
Date: 2014-03-18 08:31:06
Message-ID: 3161353.ByxmBBnHhi@ronan.dunklau.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Le mardi 18 mars 2014 03:54:19 Kouhei Kaigai a écrit :
> > I hacked on this for awhile, but there remain two matters on which I'm
> > uncertain about the right way forward.
> >
> > (1) To acquire the old tuple for UPDATE/DELETE operations, the patch
> > closely
parallels our handling for INSTEAD OF triggers on views. It
> > adds a wholerow resjunk attribute, from which it constructs a HeapTuple
> > before calling a trigger function. This loses the system columns, an
> > irrelevant
> > consideration for views but meaningful for foreign tables. postgres_fdw
> > maintains the "ctid" system column (t_self), but triggers will always see
> > an invalid t_self for the old tuple. One way to fix this is to pass
> > around
the old tuple data as ROW(ctid, oid, xmin, cmin, xmax, cmax,
> > tableoid, wholerow). That's fairly close to sufficient, but it doesn't
> > cover t_ctid. Frankly, I would like to find a principled excuse to not
> > worry about making foreign table system columns accessible from triggers.
> > Supporting system columns dramatically affects the mechanism, and what
> > trigger is likely to care? Unfortunately, that argument seems too weak.
> > Does anyone have a cleaner idea for keeping track of the system column
> > values or a stronger argument for not bothering?
> >
>
> Regarding to the first suggestion,
> I think, it is better not to care about system columns on foreign tables,
> because it fully depends on driver's implementation whether FDW fetches
> "ctid" from its data source, or not.
> Usually, system columns of foreign table, except for tableoid, are
> nonsense.
Because of implementation reason, postgres_fdw fetches "ctid" of
> remote tables on UPDATE / DELETE, it is not a common nature for all FDW
> drivers. For example, we can assume an implementation that uses primary key
> of remote table to identify the record to be updated or deleted. In this
> case, local "ctid" does not have meaningful value.
> So, fundamentally, we cannot guarantee FDW driver returns meaningful "ctid"
> or other system columns.
>

I agree, I think it is somewhat clunky to have postgres_fdw use a feature that
is basically meaningless for other FDWs. Looking at some threads in this list,
it confused many people.

This is off-topic, but maybe we could devise an API allowing for local "system
attributes" on foreign table. This would allow FDWs to carry attributes that
weren't declared as part of the table definition. This could then be used for
postgres_fdw ctid, as well as others foreign data wrappers equivalent of an
implicit "tuple id".

>
> > (2) When a foreign table has an AFTER ROW trigger, the FDW's
> > ExecForeign{Insert,Update,Delete} callbacks must return a slot covering
> > all columns. Current FDW API documentation says things like this:
> >
> >
> > The data in the returned slot is used only if the INSERT query has a
> > RETURNING clause. Hence, the FDW could choose to optimize away
> > returning
> > some or all columns depending on the contents of the RETURNING clause.
> >
> >
> > Consistent with that, postgres_fdw inspects the returningLists of the
> > ModifyTable node to decide which columns are necessary. This patch has
> > rewriteTargetListIU() add a resjunk wholerow var to the returningList of
> > any query that will fire an AFTER ROW trigger on a foreign table. That
> > avoids the need to change the FDW API contract or any postgres_fdw code.
> > I was pleased about that for a time, but on further review, I'm doubting
> > that the benefit for writable FDWs justifies muddying the definition of
> > returningList; until now, returningList has been free from resjunk TLEs.
> > For example, callers of
> > FetchStatementTargetList() may be unprepared to see an all-resjunk list,
> > instead of NIL, for a data modification statement that returns nothing.
> >
> > If we do keep the patch's approach, I'm inclined to rename returningList.
> > However, I more lean toward adding a separate flag to indicate the need
> > to return a complete tuple regardless of the RETURNING list. The
> > benefits
> > of overloading returningList are all short-term benefits. We know that
> > the FDW API is still converging, so changing it seems
> > eventually-preferable
to, and safer than, changing the name or meaning
> > of returningList. Thoughts?
> >
> > On Thu, Mar 06, 2014 at 09:11:19AM +0100, Ronan Dunklau wrote:
> >
> > > Le mercredi 5 mars 2014 22:36:44 Noah Misch a écrit :
> > >
> > > > Agreed. More specifically, I see only two scenarios for retrieving
> > > > tuples from the tuplestore. Scenario one is that we need the next
> > > > tuple (or pair of tuples, depending on the TriggerEvent). Scenario
> > > > two is that we need the tuple(s) most recently retrieved. If that's
> > > > correct, I'm inclined to rearrange afterTriggerInvokeEvents() and
> > > > AfterTriggerExecute() to remember the tuple or pair of tuples most
> > > > recently retrieved. They'll then never call tuplestore_advance()
> > > > just to reposition. Do you see a problem with that?
> > >
> > >
> > >
> > > I don't see any problem with that. I don't know how this would be
> > > implemented, but it would make sense to avoid those scans, as long as
> > > a fresh copy is passed to the trigger: modifications to a tuple
> > > performed in an after trigger should not be visible to the next one.
> >
> >
> > Trigger functions are not permitted to modify tg_trigtuple or
> > tg_newtuple;
> > notice that, for non-foreign triggers, we pass shared_buffers-backed
> > tuples
in those fields. Therefore, no copy is necessary.
> >
> >
> > > > I was again somewhat tempted to remove ate_tupleindex, perhaps by
> > > > defining the four flag bits this way:
> > > >
> > > >
> > > >
> > > > #define AFTER_TRIGGER_DONE 0x10000000
> > > > #define AFTER_TRIGGER_IN_PROGRESS 0x20000000
> > > > /* two bits describing the size of and tuple sources for this event
> >
> > */
> >
> > > > #define AFTER_TRIGGER_TUP_BITS 0xC0000000
> > > > #define AFTER_TRIGGER_FDW_REUSE 0x00000000
> > > > #define AFTER_TRIGGER_FDW_FETCH 0x40000000
> > > > #define AFTER_TRIGGER_1CTID 0x80000000
> > > > #define AFTER_TRIGGER_2CTID 0xC0000000
> > > >
> > > >
> > > >
> > > > AFTER_TRIGGER_FDW_FETCH and AFTER_TRIGGER_FDW_REUSE correspond to
> > > > the aforementioned scenarios one and two, respectively. I think,
> > > > though, I'll rate this as needless micro-optimization and not bother;
> >
> > opinions welcome.
> >
> > > > (The savings is four bytes per foreign table trigger event.)
> > >
> > >
> > >
> > > I was already happy with having a lower footprint for foreign table
> > > trigger events than for regular trigger events, but if we remove the
> > > need for seeking in the tuplestore entirely, it would make sense to get
> >
> > rid of the index.
> >
> > I'm pleased with how this turned out. Besides the memory savings in
> > question, this removed the INT_MAX limitation and simplified the code
> > overall. I did not observe a notable runtime improvement, though that's
> > unsurprising.
> >
> >
> > Other notable changes in the attached revision:
> >
> > 1. UPDATE/DELETE row-level triggers on foreign tables and INSTEAD OF
> > triggers on views have a similar requirement to generate a HeapTuple
> > representing the old row. View triggers did so in nodeModifyTable.c,
> > while
foreign table triggers did so in trigger.c. Both were valid
> > choices, but the code siting should not be relkind-dependent without good
> > reason. I centralized this in ExecModifyTable().
> >
> > 2. Made CREATE TRIGGER forbid INSTEAD OF and TRUNCATE triggers on foreign
> > tables. The documentation already claimed they were unavailable.
> >
> > 3. Fixed pointer arithmetic in AfterTriggerBeginQuery()'s MemSet() call.
> >
> > 4. Modified GetCurrentFDWTuplestore() to allocate the tuplestore in
> > TopTransactionContext. We explicitly put the events list there
> > (specifically, in a documentation-only child of that context), so it
> > seemed
more consistent to do the same for the associated foreign tuples.
> > I did not find any live bug from the previous coding, because
> > CurrentMemoryContext always happened to be one that survived past
> > AfterTriggerEndQuery().
> > 5. Updated comments and documentation that still reflected earlier
> > versions
of the patch, as well as older comments obsoleted by the patch.
> >
> > 6. Reverted cosmetic changes, like addition of braces and blank lines, to
> > passages of code not otherwise changing. Please see:
> > https://wiki.postgresql.org/wiki/Creating_Clean_Patches
> >
> > --
> > Noah Misch
> > EnterpriseDB
> > http://www.enterprisedb.com
>
>
> --
> NEC OSS Promotion Center / PG-Strom Project
> KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
>

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message KONDO Mitsumasa 2014-03-18 08:54:45 Re: gaussian distribution pgbench
Previous Message Jeff Janes 2014-03-18 06:04:24 Re: Planner hints in Postgresql