Virtual tuple slots versus TOAST: big problem

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Cc: Alexey Beschiokov <proforg(at)maloletka(dot)ru>, Jan Wieck <JanWieck(at)Yahoo(dot)com>
Subject: Virtual tuple slots versus TOAST: big problem
Date: 2005-11-19 17:22:05
Message-ID: 20874.1132420925@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I looked into this 8.1 bug reported by Alexey Beschiokov:
http://archives.postgresql.org/pgsql-bugs/2005-11/msg00192.php
The executive summary is: it looks like a kluge solution isn't hard,
but solving it in a more reasonable fashion is going to require some
significant API changes inside the backend :-(

The problem is that ExecInsert() does ExecMaterializeSlot() to get a
physical tuple from the virtual tuple slot it's initially handed.
After this step, the TupleTableSlot contains that same physical tuple.
ExecInsert then passes the bare tuple to some subroutines and the Slot
to other subroutines. In the particular case shown by Alexey, the
sequence is:
ExecConstraints is called with the slot
heap_insert (and thence tuptoaster.c) is called with the bare tuple
ExecInsertIndexTuples is called with the slot

The problem is that ExecConstraints accesses some fields of the slot,
causing partial extraction of tuple fields and setup of the tts_values[]
array within the slot. After that, the tuptoaster proceeds to smash
down the tuple to a size it likes, which it does by scribbling on the
HeapTuple structure it's handed. That makes the TupleTableSlot
structure inconsistent --- it has tts_values fields pointing at memory
that's no longer part of the tuple actually stored in the slot. When
ExecInsertIndexTuples then tries to extract more fields from the slot,
a crash is not unlikely.

(It's annoying that we didn't find this during beta, but a failure
requires ExecInsertIndexTuples to try to access fields to the right of
the last one fetched by ExecConstraints, and it only matters if the
tuple actually got toasted in between, so it is a bit of a corner case.)

ExecUpdate has the same bug. I don't think there are any other places,
because ExecMaterializeSlot isn't used elsewhere ATM. The problem did
not exist before 8.1 because TupleTableSlots didn't contain extra info
beyond the bare tuple, so tuptoaster could hack that without rendering
the Slot inconsistent.

I think that a kluge fix is possible by setting tts_nvalid to zero after
invoking heap_insert or heap_update, so that ExecInsertIndexTuples will
be forced to recompute tts_values instead of reusing the previous data.
This is probably the right thing to do in the 8.1 branch, but it's
incredibly ugly and we need to fix it better going forward.

"A better fix" seems to require passing the TupleTableSlot, not just the
bare tuple, down to the toaster --- else there is no way for the toaster
to update the data structure that it's accidentally invalidating. This
seems like it might be a good idea anyway on performance grounds: we
could save one cycle of heap_deform_tuple and heap_formtuple in the case
where toasting is needed, if the toaster is invoked on the tuple while
it's still in virtual-slot format. The problem is that given the
current structure, that means changing the APIs of heap_insert and
heap_update, or else making near-duplicate versions that take a
TupleTableSlot instead of a bare tuple. Neither of these things seem
real attractive. If we wanted to avoid forming a physical tuple until
the last moment we'd also need to change the APIs associated with
triggers, ie make them work on Slots not tuples. This'd be even more
invasive. It would likely be cleaner and more efficient in the long
run, but there's a lot of code to touch, and breaking user-defined
triggers doesn't seem palatable at all.

Any thoughts on the best way to proceed?

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2005-11-19 17:43:15 Re: Returning multiple result sets
Previous Message Martijn van Oosterhout 2005-11-19 15:13:43 Returning multiple result sets