Re: Segfault with before triggers and after triggers with a WHEN clause.

Lists: pgsql-bugs
From: Yoran Heling <info(at)yorhel(dot)nl>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Segfault with before triggers and after triggers with a WHEN clause.
Date: 2011-08-21 12:00:55
Message-ID: CADL1CPghUq99wiKeAXja3oUBBma9ksiSQe3fR_b5fCa0nR3KXw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Hello,

After upgrading to PostgreSQL 9.0.4 (don't remember exactly where I
came from, but I believe it was an earlier 9.0.x), postgresql began to
segault on certain queries. I have managed to isolate the problem and
can reproduce the crash on a newly created and empty database with the
following queries:

CREATE TABLE some_t (some_col boolean NOT NULL);
CREATE OR REPLACE FUNCTION trig_before() RETURNS trigger AS $$
BEGIN RETURN NEW; END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION trig_after() RETURNS trigger AS $$
BEGIN RETURN NULL; END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trig_before BEFORE UPDATE ON some_t FOR EACH ROW
EXECUTE PROCEDURE trig_before();
CREATE TRIGGER trig_aftera AFTER UPDATE ON some_t FOR EACH ROW WHEN
(NOT OLD.some_col AND NEW.some_col) EXECUTE PROCEDURE trig_after();
CREATE TRIGGER trig_afterb AFTER UPDATE ON some_t FOR EACH ROW WHEN
(NOT NEW.some_col) EXECUTE PROCEDURE trig_after();
INSERT INTO some_t VALUES (TRUE);
UPDATE some_t SET some_col = TRUE;

This is on a 64bit Arch Linux system with the postgresql-9.0.4-4
package and a linux-3.0.1 kernel.

Here is a backtrace of the crash, although I suppose it might not be
very useful without debugging symbols:

#0 0x0000000000450dca in slot_getattr ()
#1 0x000000000054787c in ?? ()
#2 0x000000000054de11 in ExecQual ()
#3 0x000000000052dbf2 in ?? ()
#4 0x000000000052de9a in ?? ()
#5 0x0000000000532035 in ExecARUpdateTriggers ()
#6 0x000000000055b341 in ExecModifyTable ()
#7 0x0000000000547518 in ExecProcNode ()
#8 0x0000000000544d5a in standard_ExecutorRun ()
#9 0x00000000005fbf01 in ?? ()
#10 0x00000000005fc114 in ?? ()
#11 0x00000000005fccc2 in PortalRun ()
#12 0x00000000005f8f3c in PostgresMain ()
#13 0x00000000005c9ee8 in ?? ()
#14 0x00000000005ca7dc in PostmasterMain ()
#15 0x000000000044f397 in main ()


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Yoran Heling <info(at)yorhel(dot)nl>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: Segfault with before triggers and after triggers with a WHEN clause.
Date: 2011-08-21 18:51:41
Message-ID: 8536.1313952701@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Yoran Heling <info(at)yorhel(dot)nl> writes:
> After upgrading to PostgreSQL 9.0.4 (don't remember exactly where I
> came from, but I believe it was an earlier 9.0.x), postgresql began to
> segault on certain queries. I have managed to isolate the problem and
> can reproduce the crash on a newly created and empty database with the
> following queries:

Thanks, nice example! I traced through this and found that:

1. ExecBRUpdateTriggers returns the tuple-modified-by-the-before-trigger
in the estate->es_trig_tuple_slot slot.

2. ExecUpdate does ExecMaterializeSlot() on that slot. Now the slot
has a privately allocated copy of the tuple. (This is necessary since
we'll scribble on the tuple's header fields during heap_update.)

3. During ExecARUpdateTriggers, TriggerEnabled needs to put the new
tuple into a slot for execution of the WHEN condition. It thinks it
can use the estate->es_trig_tuple_slot slot for this, but it's passing
the same tuple *already* stored in that slot to ExecStoreTuple.
ExecStoreTuple sees it's clearing a slot with shouldFree = true, so
it pfree's the tuple, and then stores a dangling pointer back into
the slot. Ooops.

TriggerEnabled's apparently-similar use of estate->es_trig_oldtup_slot
is perfectly safe because that slot is actually dedicated for the use
of this function. The safest fix for this bug would be to make another
dedicated slot for the new tuple too. That will require adding a field
to EState, which is a bit risky in released branches, but I think we
can get away with it if we add the field at the end of the struct.
We did the same in a post-release 8.1 patch (in fact, that was adding
es_trig_tuple_slot itself) and did not get complaints.

The only alternative I can see that doesn't add another field to EState
is to hack the TriggerEnabled code so that it checks if the tuple is
already stored in the slot and skips ExecStoreTuple if so. That seems
like a modularity violation, though: it'd require more knowledge about
the detailed behavior of slots than I think this function ought to have.
And it's still fairly fragile, in that es_trig_tuple_slot is mainly
meant for the use of the layer of functions that are calling
TriggerEnabled --- it's not hard to foresee other bugs if we rearrange
the timing of the existing ExecStoreTuple calls in ExecBRUpdateTriggers
and friends.

regards, tom lane