TupleTableSlot API problem

Lists: pgsql-hackers
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: TupleTableSlot API problem
Date: 2009-03-29 21:21:37
Message-ID: 26485.1238361697@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

In CVS HEAD, try this in an assert-enabled build:

CREATE TEMPORARY TABLE tree(
id INTEGER PRIMARY KEY,
parent_id INTEGER REFERENCES tree(id)
);

INSERT INTO tree
VALUES (1, NULL), (2, 1), (3,1), (4,2), (5,2), (6,2), (7,3), (8,3),
(9,4), (10,4), (11,7), (12,7), (13,7), (14, 9), (15,11), (16,11);

WITH RECURSIVE t(id, path) AS (
VALUES(1,ARRAY[]::integer[])
UNION ALL
SELECT tree.id, t.path || tree.id
FROM tree JOIN t ON (tree.parent_id = t.id)
)
SELECT t1.id, t2.path, t2 FROM t AS t1 JOIN t AS t2 ON
(t1.id=t2.id);

I get
ERROR: cache lookup failed for type 2139062143
(the error might be less predictable in a non-assert build).

What is happening is that ExecProject fetches the Datum value of t2.path
from a TupleTableSlot that contains a "minimal tuple" fetched from the
tuplestore associated with the CTE "t". Then, it fetches the value of
the whole-row variable t2. ExecEvalWholeRowVar calls
ExecFetchSlotTuple, which finds that the slot doesn't contain a regular
tuple, so it calls ExecMaterializeSlot, which replaces the minimal tuple
with a regular tuple and frees the former. Now the already-fetched
Datum for t2.path is pointing at freed storage.

In principle there ought to be a whole lot of bugs around this area,
because ExecFetchSlotTuple, ExecFetchSlotMinimalTuple, and
ExecFetchSlotTupleDatum all are potentially destructive of the original
slot contents; furthermore there ought be be visible bugs in 8.3 and
maybe before. However, in an hour or so of poking at it, I've been
unable to produce a failure without using CTE syntax; it's just really
hard to get the planner to generate a whole-row-var reference in a
context where the referenced slot might contain a minimal tuple.
Still, I suspect that merely shows I'm being insufficiently creative
today. I think we ought to assume there's a related bug in existing
branches unless someone can prove there's not.

One solution to this problem is to redefine these functions as always
returning locally palloc'd tuples, so that they can be guaranteed to not
modify the contents of the Slot. That's fairly unfortunate though since
in the vast majority of cases it will result in a palloc/pfree cycle
that is not necessary. It would also mean changing most of the callers
to pfree the results, unless we can tolerate memory leaks there.

Plan B is to agree that these functions should be documented as
potentially destructive of the slot contents, in which case this is just
a bug in ExecEvalWholeRowVar: it should be using a call that is
nondestructive (eg ExecCopySlotTuple). (So far as I can determine from
looking at the callers, only ExecEvalWholeRowVar and its sibling
ExecEvalWholeRowSlow actually pose a risk; no other call sites look like
there's any risk of having other existing references into the slot
contents.)

Plan C is to change the Slot logic so that a slot can store both regular
and minimal tuples, not just one or the other. It would only actually
do that if one format is stored into it and then the other is requested.
The eventual ExecClearTuple would then free both copies. Whichever
format is not the one originally stored is secondary and isn't used for
fetching individual datums from the Slot. This nets out at the same
performance we have now, it just postpones the pfree() that currently
happens immediately when we change the slot's format. It might result
in higher peak memory use but I'm not exceedingly worried about that.

Plan C is probably the most complicated to code, but I'm inclined to try
to fix it that way, because plan A loses performance and plan B exposes
us to a continuing risk of future bugs of the same type.

Comments, better ideas?

regards, tom lane


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us (Tom Lane), pgsql-hackers(at)postgresql(dot)org
Subject: Re: TupleTableSlot API problem
Date: 2009-03-29 23:38:49
Message-ID: 871vsfc31y.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

Tom> What is happening is that ExecProject fetches the Datum value of
Tom> t2.path from a TupleTableSlot that contains a "minimal tuple"
Tom> fetched from the tuplestore associated with the CTE "t". Then,
Tom> it fetches the value of the whole-row variable t2.
Tom> ExecEvalWholeRowVar calls ExecFetchSlotTuple, which finds that
Tom> the slot doesn't contain a regular tuple, so it calls
Tom> ExecMaterializeSlot, which replaces the minimal tuple with a
Tom> regular tuple and frees the former. Now the already-fetched
Tom> Datum for t2.path is pointing at freed storage.

Tom> In principle there ought to be a whole lot of bugs around this
Tom> area, because ExecFetchSlotTuple, ExecFetchSlotMinimalTuple, and
Tom> ExecFetchSlotTupleDatum all are potentially destructive of the
Tom> original slot contents; furthermore there ought be be visible
Tom> bugs in 8.3 and maybe before. However, in an hour or so of
Tom> poking at it, I've been unable to produce a failure without
Tom> using CTE syntax; it's just really hard to get the planner to
Tom> generate a whole-row-var reference in a context where the
Tom> referenced slot might contain a minimal tuple.

Generating the whole-row-var reference doesn't seem to be hard, it's
doing it in a context where slot->tts_shouldFree is _already_ set that
seems to be the issue.

For example, given some function foo(out a text, out b text) returning
setof record, the query select t.a, t from foo() t; follows the
sequence of events you describe, but it doesn't fail because
slot->tts_shouldFree is false, so the original minimaltuple isn't
slot->freed.

If there aren't any code paths in the back branches that have
tts_shouldFree set in this context, that would explain the lack of
previously visible bugs, no? Or am I completely misunderstanding it?

--
Andrew (irc:RhodiumToad)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: TupleTableSlot API problem
Date: 2009-03-30 00:24:05
Message-ID: 4824.1238372645@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Tom> In principle there ought to be a whole lot of bugs around this
> Tom> area, because ExecFetchSlotTuple, ExecFetchSlotMinimalTuple, and
> Tom> ExecFetchSlotTupleDatum all are potentially destructive of the
> Tom> original slot contents; furthermore there ought be be visible
> Tom> bugs in 8.3 and maybe before. However, in an hour or so of
> Tom> poking at it, I've been unable to produce a failure without
> Tom> using CTE syntax; it's just really hard to get the planner to
> Tom> generate a whole-row-var reference in a context where the
> Tom> referenced slot might contain a minimal tuple.

> Generating the whole-row-var reference doesn't seem to be hard, it's
> doing it in a context where slot->tts_shouldFree is _already_ set that
> seems to be the issue.

> For example, given some function foo(out a text, out b text) returning
> setof record, the query select t.a, t from foo() t; follows the
> sequence of events you describe, but it doesn't fail because
> slot->tts_shouldFree is false, so the original minimaltuple isn't
> slot->freed.

Yeah, good point. However I think that you could still get a failure.
The cases where a slot might contain a minimal tuple are generally where
we are reading out of a tuplestore or tuplesort object, and all you have
to do to get it to be a palloc'd mintuple is to make the test case big
enough so the tuplestore has dumped to disk. (Now that I think about
it, I failed to try scaling up the test cases I did try...)

regards, tom lane


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: TupleTableSlot API problem
Date: 2009-03-30 00:29:29
Message-ID: 87wsa7am52.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

>> For example, given some function foo(out a text, out b text) returning
>> setof record, the query select t.a, t from foo() t; follows the
>> sequence of events you describe, but it doesn't fail because
>> slot-> tts_shouldFree is false, so the original minimaltuple isn't
>> freed.

Tom> Yeah, good point. However I think that you could still get a
Tom> failure. The cases where a slot might contain a minimal tuple
Tom> are generally where we are reading out of a tuplestore or
Tom> tuplesort object, and all you have to do to get it to be a
Tom> palloc'd mintuple is to make the test case big enough so the
Tom> tuplestore has dumped to disk. (Now that I think about it, I
Tom> failed to try scaling up the test cases I did try...)

Aha; and indeed if you use select t.a, t from func() t; where the
function returns a set larger than work_mem, it does indeed fail
messily (against my -O0 --enable-cassert HEAD I just get corrupted
values for t.a, though, rather than an error). I'll try and reproduce
that on a back branch...

--
Andrew.


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: TupleTableSlot API problem
Date: 2009-03-30 00:58:18
Message-ID: 87prfzakt1.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

Tom> Yeah, good point. However I think that you could still get a
Tom> failure. The cases where a slot might contain a minimal tuple
Tom> are generally where we are reading out of a tuplestore or
Tom> tuplesort object, and all you have to do to get it to be a
Tom> palloc'd mintuple is to make the test case big enough so the
Tom> tuplestore has dumped to disk. (Now that I think about it, I
Tom> failed to try scaling up the test cases I did try...)

Andrew> Aha; and indeed if you use select t.a, t from func() t; where
Andrew> the function returns a set larger than work_mem, it does
Andrew> indeed fail messily (against my -O0 --enable-cassert HEAD I
Andrew> just get corrupted values for t.a, though, rather than an
Andrew> error). I'll try and reproduce that on a back branch...

Yup, fails the same way on an --enable-cassert build of 8.3.7.

Without --enable-cassert it _appears_ to work, but this is presumably
just because the freed memory happens not to be being clobbered
immediately.

--
Andrew.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: TupleTableSlot API problem
Date: 2009-03-30 01:55:36
Message-ID: 6572.1238378136@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> Yup, fails the same way on an --enable-cassert build of 8.3.7.

Do you have a quick test case? I just finished coding up my plan-C
fix, and I need some test cases ...

regards, tom lane


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: TupleTableSlot API problem
Date: 2009-03-30 01:59:39
Message-ID: 87ab73ahys.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

Tom> Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
>> Yup, fails the same way on an --enable-cassert build of 8.3.7.

Tom> Do you have a quick test case? I just finished coding up my
Tom> plan-C fix, and I need some test cases ...

This is the one I've been using:

create or replace function foo(n integer, out a text, out b text)
returns setof record language plpgsql as $f$
begin
return query(select 'foo '||i, 'bar '||i from generate_series(1,n) i);
end;
$f$;

set work_mem=64;

select t.a, t, t.a from foo(100000) t limit 1;
a | t | a
--------------------------------------------------------------+-------------------+-------
\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F | ("foo 1","bar 1") | foo 1
(1 row)

--
Andrew.


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: TupleTableSlot API problem
Date: 2009-03-30 03:15:03
Message-ID: 874oxbaeh4.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

>>> Yup, fails the same way on an --enable-cassert build of 8.3.7.

And on 8.2.13.

Tom> Do you have a quick test case? I just finished coding up my
Tom> plan-C fix, and I need some test cases ...

Andrew> This is the one I've been using:

This one is simpler and works on 8.2 as well:

create or replace function foo(n integer, out a text, out b text)
returns setof record language sql
as $f$ select 'foo '||i, 'bar '||i from generate_series(1,$1) i; $f$;

set work_mem=64;

select t.a, t, t.a from foo(100000) t limit 1;
ERROR: invalid memory alloc request size 2139062147

--
Andrew.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: TupleTableSlot API problem
Date: 2009-03-30 03:29:56
Message-ID: 13509.1238383796@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> This one is simpler and works on 8.2 as well:

Yeah, I had just finished making the same adjustment to get an
8.2-compatible test case. 8.1 and before should be okay because
they haven't got the MinimalTuple business.

regards, tom lane