Re: Reducing overhead for repeat de-TOASTing

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Reducing overhead for repeat de-TOASTing
Date: 2008-06-18 16:01:35
Message-ID: 28020.1213804895@sss.pgh.pa.us
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> Agreed. Yet I'm thinking that a more coherent approach to optimising the
> tuple memory usage in the executor tree might be better than the special
> cases we seem to have in various places. I don't know what that is, or
> even if it's possible, though.

Yeah. I had tried to think of a way to manage the cached detoasted
value as part of the TupleTableSlot in which the toasted datum is
(normally) stored, but there seems no way to know which slot that is
at the point where pg_detoast_datum is invoked --- and as mentioned
earlier, speculatively detoasting things at the point of the slot access
seems a loser.

[ thinks a bit more ... ] But there's always more than one way to
skin a cat. Right now, when you fetch a toasted attribute value
out of a Slot, what you get is a pointer to a stored-on-disk TOAST
pointer, ie

    0x80 or 0x01
    length (18)
    struct varatt_external

Now the Slot knows which attributes are varlena (it has a tupdesc)
so it could easily check whether it's about to return one of these.
It could instead return a pointer to, say

    0x80 or 0x01
    length (more than 18)
    struct varatt_external
    pointer to Slot
    pointer to detoasted value, or NULL if not detoasted yet

and that pointer-to-Slot would give us the hook we need to manage
the detoasting when and if pg_detoast_datum gets called. Both
this struct and the ultimately decompressed value would be auxiliary
memory belonging to the Slot, and would go away at slot clear.
(This is certain to work since a not-toasted pass-by-ref datum
in the tuple would have that same lifetime.)
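
A minimal C sketch of the slot-owned scheme described above may make the lifetime rules easier to see. Everything here is hypothetical and is not PostgreSQL code: DemoToastPointer stands in for struct varatt_external, DemoSlot for a TupleTableSlot, and demo_fetch_external fakes the expensive out-of-line fetch. The "next" link is an added assumption so the slot can find its auxiliary memory at clear time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the on-disk TOAST pointer (struct varatt_external). */
typedef struct DemoToastPointer {
    unsigned int value_id;   /* identifies the out-of-line value */
    size_t       raw_size;   /* size of the value once detoasted */
} DemoToastPointer;

struct DemoSlot;

/* The extended in-memory form: on-disk pointer plus the two extra fields. */
typedef struct DemoSlotToastRef {
    DemoToastPointer         external;   /* copy of the on-disk pointer */
    struct DemoSlot         *slot;       /* pointer to Slot */
    char                    *detoasted;  /* detoasted value, or NULL if not yet */
    struct DemoSlotToastRef *next;       /* assumption: chain of refs owned by the slot */
} DemoSlotToastRef;

/* Hypothetical stand-in for a TupleTableSlot. */
typedef struct DemoSlot {
    DemoSlotToastRef *refs;         /* auxiliary memory belonging to this slot */
    int               fetch_count;  /* number of real detoast operations performed */
} DemoSlot;

/* Slot access: hand back the extended pointer instead of the bare on-disk one. */
DemoSlotToastRef *demo_slot_getattr(DemoSlot *slot, const DemoToastPointer *ondisk)
{
    DemoSlotToastRef *ref = malloc(sizeof(*ref));
    if (ref == NULL)
        abort();
    ref->external = *ondisk;
    ref->slot = slot;
    ref->detoasted = NULL;          /* nothing is detoasted speculatively */
    ref->next = slot->refs;
    slot->refs = ref;
    return ref;
}

/* Fake "expensive" fetch: builds a string of raw_size 'x' characters. */
static char *demo_fetch_external(const DemoToastPointer *ptr)
{
    char *value = malloc(ptr->raw_size + 1);
    if (value == NULL)
        abort();
    memset(value, 'x', ptr->raw_size);
    value[ptr->raw_size] = '\0';
    return value;
}

/* Plays the role of pg_detoast_datum: detoast on first use, reuse afterwards. */
const char *demo_detoast(DemoSlotToastRef *ref)
{
    assert(ref != NULL && ref->slot != NULL);
    if (ref->detoasted == NULL) {
        ref->detoasted = demo_fetch_external(&ref->external);
        ref->slot->fetch_count++;
    }
    return ref->detoasted;
}

/* Slot clear: both the extended pointers and the detoasted values go away. */
void demo_slot_clear(DemoSlot *slot)
{
    while (slot->refs != NULL) {
        DemoSlotToastRef *ref = slot->refs;
        slot->refs = ref->next;
        free(ref->detoasted);       /* free(NULL) is a no-op */
        free(ref);
    }
}
```

In this sketch, calling demo_detoast twice on the same reference returns the same buffer and bumps fetch_count only once; after demo_slot_clear neither the reference nor the buffer may be used.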

Come to think of it, if Slots are going to manage detoasted copies
of attributes, we could have them auto-detoast inline-compressed
Datums at the time of fetch. The argument that this might be
wasted work has a lot less force for that case.

I am not sure this is a better scheme than the backend-wide cache,
but it's worth thinking about. It would have a lot less management
overhead. On the other hand it couldn't amortize detoastings across
repeated tuple fetches (such as could happen in a join, or successive
queries on the same value).
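
The amortization point can be illustrated with a toy count of detoast operations. This is only a sketch under simplifying assumptions: value identifiers are plain integers, the backend-wide cache is a tiny fixed array that never evicts, and the management overhead that the backend-wide cache would carry is not modeled at all. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_CACHE_SIZE 16

/* Hypothetical backend-wide cache: remembers which value ids were detoasted. */
typedef struct DemoBackendCache {
    unsigned int ids[DEMO_CACHE_SIZE];
    int          used;
} DemoBackendCache;

/* Returns true on a cache hit; on a miss, records the id if there is room. */
static bool demo_cache_lookup_or_add(DemoBackendCache *cache, unsigned int id)
{
    for (int i = 0; i < cache->used; i++)
        if (cache->ids[i] == id)
            return true;
    if (cache->used < DEMO_CACHE_SIZE)
        cache->ids[cache->used++] = id;
    return false;
}

/*
 * Slot-scoped caching: the cached value is lost each time the slot is
 * cleared and refilled, so every tuple fetch pays for one detoast even if
 * the same value id was seen before.  Repeated uses within one fetch are free.
 */
int demo_count_slot_scoped(const unsigned int *ids, int nfetches, int uses_per_fetch)
{
    int detoasts = 0;
    (void) ids;                       /* the id does not matter in this model */
    for (int i = 0; i < nfetches; i++) {
        bool cached = false;          /* slot was just cleared and refilled */
        for (int u = 0; u < uses_per_fetch; u++) {
            if (!cached) {
                detoasts++;
                cached = true;
            }
        }
    }
    return detoasts;
}

/* Backend-wide caching: a value id is detoasted once, however often it is fetched. */
int demo_count_backend_wide(const unsigned int *ids, int nfetches, int uses_per_fetch)
{
    DemoBackendCache cache = { {0}, 0 };
    int detoasts = 0;
    for (int i = 0; i < nfetches; i++)
        for (int u = 0; u < uses_per_fetch; u++)
            if (!demo_cache_lookup_or_add(&cache, ids[i]))
                detoasts++;
    return detoasts;
}
```

With the fetch sequence {7, 7, 7, 9} and three uses per fetch, the slot-scoped model counts four detoasts and the backend-wide model counts two, which is the repeated-fetch case (a join, say) where the slot-owned scheme cannot help.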

regards, tom lane
