Re: SRF memory mgmt patch (was [HACKERS] Concern about memory management

Lists: pgsql-hackerspgsql-patches
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Concern about memory management with SRFs
Date: 2002-08-28 23:08:45
Message-ID: 22310.1030576125@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

I've been looking at the sample SRFs (show_all_settings etc) and am not
happy about the way memory management is done. As the code is currently
set up, the functions essentially assume that they are executed in a
context that will never be reset until they're done returning tuples.
(This is true because tupledescs and so on are blithely constructed in
CurrentMemoryContext during the first call.)

This approach means that SRFs cannot afford to leak any memory per-call.
If they do, and the result set is large, they will run the backend out
of memory. I don't think that's acceptable.

The reason that the code fails to crash is that nodeFunctionscan.c
doesn't do a ResetExprContext(econtext) in the loop that collects rows
from the function and stashes them in the tuplestore. But I think it
must do so in the long run, and so it would be better to get this right
the first time.

I think we should document that any memory that is allocated in the
first call for use in subsequent calls must come from the memory context
saved in FuncCallContext (and let's choose a more meaningful name than
fmctx, please). This would mean adding code like

oldcontext = MemoryContextSwitchTo(funcctx->fmctx);

...

MemoryContextSwitchTo(oldcontext);

around the setup code that follows SRF_FIRSTCALL_INIT. Then it would be
safe for nodeFunctionscan.c to do a reset before each function call.

Comments?

regards, tom lane


From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Concern about memory management with SRFs
Date: 2002-08-29 01:19:54
Message-ID: 3D6D76BA.80600@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> I think we should document that any memory that is allocated in the
> first call for use in subsequent calls must come from the memory context
> saved in FuncCallContext (and let's choose a more meaningful name than
> fmctx, please). This would mean adding code like
>
> oldcontext = MemoryContextSwitchTo(funcctx->fmctx);
>
> ...
>
> MemoryContextSwitchTo(oldcontext);
>
> around the setup code that follows SRF_FIRSTCALL_INIT. Then it would be
> safe for nodeFunctionscan.c to do a reset before each function call.

That sounds like a good plan.

But can/should we wrap those calls in either existing or new macros? Or
is it better to have the function author keenly aware of the memory
management details? I tend to think the former is better.

Maybe SRF_FIRSTCALL_INIT()(init_MultiFuncCall()) should:
- save CurrentMemoryContext to funcctx->per_call_memory_ctx
(new member of the struct)
- save fcinfo->flinfo->fn_mcxt to funcctx->multi_call_memory_ctx
(nee funcctx->fmctx)
- leave fcinfo->flinfo->fn_mcxt as the current memory context when it
exits

Then SRF_PERCALL_SETUP() (per_MultiFuncCall()) can change back to
funcctx->per_call_memory_ctx.

Would this work?

Joe


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Concern about memory management with SRFs
Date: 2002-08-29 01:32:09
Message-ID: 6481.1030584729@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> Maybe SRF_FIRSTCALL_INIT()(init_MultiFuncCall()) should:
> - save CurrentMemoryContext to funcctx->per_call_memory_ctx
> (new member of the struct)
> - save fcinfo->flinfo->fn_mcxt to funcctx->multi_call_memory_ctx
> (nee funcctx->fmctx)
> - leave fcinfo->flinfo->fn_mcxt as the current memory context when it
> exits

> Then SRF_PERCALL_SETUP() (per_MultiFuncCall()) can change back to
> funcctx->per_call_memory_ctx.

I thought about that and didn't like it; it may simplify the simple case
but I think it actively gets in the way of less-simple cases. For
example, the FIRSTCALL code might generate some transient structures
along with ones that it wants to keep. Also, your recommended
pseudocode allows the author to write code between the end of the
FIRSTCALL branch and the PERCALL_SETUP call; that code will not execute
in a predictable context if we do it this way.

I'm also not happy with the implied assumption that every call to the
function executes in the same transient context. That is true at the
moment but I'd just as soon not see it as a wired-in assumption.

regards, tom lane


From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Concern about memory management with SRFs
Date: 2002-08-29 01:46:55
Message-ID: 3D6D7D0F.1080307@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> I thought about that and didn't like it; it may simplify the simple case
> but I think it actively gets in the way of less-simple cases. For
> example, the FIRSTCALL code might generate some transient structures
> along with ones that it wants to keep. Also, your recommended
> pseudocode allows the author to write code between the end of the
> FIRSTCALL branch and the PERCALL_SETUP call; that code will not execute
> in a predictable context if we do it this way.
>
> I'm also not happy with the implied assumption that every call to the
> function executes in the same transient context. That is true at the
> moment but I'd just as soon not see it as a wired-in assumption.

Fair enough. I'll take a shot at the necessary changes (if you want me
to). Is it OK to use fcinfo->flinfo->fn_mcxt as the long term memory
context or is there a better choice? Is funcctx->multi_call_memory_ctx a
suitable name in place of funcctx->fmctx?

Joe


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Concern about memory management with SRFs
Date: 2002-08-29 01:50:41
Message-ID: 6618.1030585841@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> Is it OK to use fcinfo->flinfo->fn_mcxt as the long term memory
> context or is there a better choice?

That is the correct choice.

> Is funcctx->multi_call_memory_ctx a
> suitable name in place of funcctx->fmctx?

No objection here.

regards, tom lane


From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: SRF memory mgmt patch (was [HACKERS] Concern about memory management with SRFs)
Date: 2002-08-29 06:53:55
Message-ID: 3D6DC503.3060802@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> Joe Conway <mail(at)joeconway(dot)com> writes:
>>Is it OK to use fcinfo->flinfo->fn_mcxt as the long term memory
>>context or is there a better choice?
>
> That is the correct choice.
>
>>Is funcctx->multi_call_memory_ctx a
>>suitable name in place of funcctx->fmctx?
>
> No objection here.

Here's a patch to address Tom's SRF API memory management concerns, as
discussed earlier today on HACKERS.

Please note that, although this should apply cleanly on cvs tip, it will
have (two) failed hunks (nodeFunctionscan.c) *if* applied after Neil's
plpgsql SRF patch. Or it will cause a failure in Neil's patch if it is
applied first (I think). The fix in either case is to wrap the loop that
collects rows from the function and stashes them in the tuplestore as
follows:

do until no more tuples
+ ExprContext *econtext = scanstate->csstate.cstate.cs_ExprContext;

get one tuple
put it in the tuplestore

+ ResetExprContext(econtext);
loop

Also note that contrib/dblink is intentionally missing, because I'm
still working on other aspects of it. I'll have an updated dblink in a
day or so.

If there are no objections, please apply.

Thanks,

Joe

Attachment Content-Type Size
srf-api-memctx.1.patch text/plain 19.5 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>, Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Subject: Re: SRF memory mgmt patch (was [HACKERS] Concern about memory management with SRFs)
Date: 2002-08-29 17:18:14
Message-ID: 19627.1030641494@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> Here's a patch to address Tom's SRF API memory management concerns, as
> discussed earlier today on HACKERS.

Patch committed.

It seemed to me that pgstattuple.c does not really want to be an SRF,
but only a function returning a single tuple. As such, it can provide
a fine example of using the funcapi.h tuple-building machinery *without*
the SRF machinery. I changed it accordingly, but am not able to update
README.pgstattuple.euc_jp; Tatsuo-san, would you handle that?

regards, tom lane


From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: SRF memory mgmt patch (was [HACKERS] Concern about memory management
Date: 2002-08-29 23:32:55
Message-ID: 3D6EAF27.4090604@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> Patch committed.

Given the change to SRF memory management, is the following still
necessary (or possibly even incorrect)?

in funcapi.c:
per_MultiFuncCall(PG_FUNCTION_ARGS)
{
FuncCallContext *retval = (FuncCallContext *)
fcinfo->flinfo->fn_extra;

/* make sure we start with a fresh slot */
if(retval->slot != NULL)
ExecClearTuple(retval->slot);

return retval;
}

All but one of the SRFs I've tried don't seem to care, but I have one
that is getting an assertion:

0x42029331 in kill () from /lib/i686/libc.so.6
(gdb) bt
#0 0x42029331 in kill () from /lib/i686/libc.so.6
#1 0x4202911a in raise () from /lib/i686/libc.so.6
#2 0x4202a8c2 in abort () from /lib/i686/libc.so.6
#3 0x08179ab9 in ExceptionalCondition () at assert.c:48
#4 0x0818416f in pfree (pointer=0x7f7f7f7f) at mcxt.c:470
#5 0x0806bd32 in heap_freetuple (htup=0x832bb80) at heaptuple.c:736
#6 0x080e47df in ExecClearTuple (slot=0x832b2cc) at execTuples.c:406
#7 0x0817cf49 in per_MultiFuncCall (fcinfo=0xbfffe8e0) at funcapi.c:88
#8 0x40017273 in dblink_get_pkey (fcinfo=0xbfffe8e0) at dblink.c:911

Not quite sure why yet, but I'm thinking the ExecClearTuple() is no
longer needed/desired anyway.

Joe


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: SRF memory mgmt patch (was [HACKERS] Concern about memory management with SRFs)
Date: 2002-08-29 23:40:11
Message-ID: 13672.1030664411@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> Given the change to SRF memory management, is the following still
> necessary (or possibly even incorrect)?

> /* make sure we start with a fresh slot */
> if(retval->slot != NULL)
> ExecClearTuple(retval->slot);

I don't think it was ever necessary... but there's nothing wrong with
it either.

> All but one of the SRFs I've tried don't seem to care, but I have one
> that is getting an assertion:

> 0x42029331 in kill () from /lib/i686/libc.so.6
> (gdb) bt
> #0 0x42029331 in kill () from /lib/i686/libc.so.6
> #1 0x4202911a in raise () from /lib/i686/libc.so.6
> #2 0x4202a8c2 in abort () from /lib/i686/libc.so.6
> #3 0x08179ab9 in ExceptionalCondition () at assert.c:48
> #4 0x0818416f in pfree (pointer=0x7f7f7f7f) at mcxt.c:470
> #5 0x0806bd32 in heap_freetuple (htup=0x832bb80) at heaptuple.c:736
> #6 0x080e47df in ExecClearTuple (slot=0x832b2cc) at execTuples.c:406
> #7 0x0817cf49 in per_MultiFuncCall (fcinfo=0xbfffe8e0) at funcapi.c:88
> #8 0x40017273 in dblink_get_pkey (fcinfo=0xbfffe8e0) at dblink.c:911

> Not quite sure why yet, but I'm thinking the ExecClearTuple() is no
> longer needed/desired anyway.

You'll need to fix that anyway because the next ExecStoreTuple will try
to do an ExecClearTuple. Looks like the same tuple is being freed
twice. Once you've handed a tuple to ExecStoreTuple it's not yours to
free anymore; perhaps some bit of code in dblink has that wrong?

regards, tom lane


From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: SRF memory mgmt patch (was [HACKERS] Concern about memory management
Date: 2002-08-29 23:48:54
Message-ID: 3D6EB2E6.1070902@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> You'll need to fix that anyway because the next ExecStoreTuple will try
> to do an ExecClearTuple. Looks like the same tuple is being freed
> twice. Once you've handed a tuple to ExecStoreTuple it's not yours to
> free anymore; perhaps some bit of code in dblink has that wrong?

That's just it:
0x40017273 in dblink_get_pkey (fcinfo=0xbfffe8e0) at dblink.c:911
*is*
funcctx = SRF_PERCALL_SETUP();
which is is a macro
#define SRF_PERCALL_SETUP() per_MultiFuncCall(fcinfo)

When I remove the call to ExecClearTuple() from per_MultiFuncCall(),
everything starts to work.

As you said, if the next ExecStoreTuple will try to do an
ExecClearTuple(), ISTM that it should be removed from
per_MultiFuncCall()/SRF_PERCALL_SETUP(). Or am I crazy?

Joe


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: SRF memory mgmt patch (was [HACKERS] Concern about memory management with SRFs)
Date: 2002-08-30 00:01:32
Message-ID: 14329.1030665692@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> As you said, if the next ExecStoreTuple will try to do an
> ExecClearTuple(), ISTM that it should be removed from
> per_MultiFuncCall()/SRF_PERCALL_SETUP().

No, it's not necessary: ExecClearTuple knows the difference between a
full and an empty TupleSlot.

I'm not sure where the excess free is coming from, but it ain't
ExecClearTuple's fault. You might try setting a breakpoint at
heap_freetuple to see if that helps.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: SRF memory mgmt patch (was [HACKERS] Concern about memory management with SRFs)
Date: 2002-08-30 00:07:35
Message-ID: 14452.1030666055@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> As you said, if the next ExecStoreTuple will try to do an
> ExecClearTuple(), ISTM that it should be removed from
> per_MultiFuncCall()/SRF_PERCALL_SETUP(). Or am I crazy?

Actually ... on second thought ...

I bet the real issue here is that we have a long-lived TupleTableSlot
pointing at a short-lived tuple. (I assume you're just forming the
tuple in the function's working context, no?)

When ExecClearTuple is called on the next time through, it tries to
pfree a tuple that has already been recycled along with the rest of
the short-term context. Result: coredump.

However, if that were the story then *none* of the SRFs returning
tuple should work, and they do. So I'm still confused.

But I suspect that what we want to do is take management of the tuples
away from the Slot: pass should_free = FALSE to ExecStoreTuple in the
TupleGetDatum macro. The ClearTuple call *is* appropriate when you do
that, because it will reset the Slot to empty rather than leaving it
containing a dangling pointer to a long-since-freed tuple.

regards, tom lane


From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: SRF memory mgmt patch (was [HACKERS] Concern about memory management
Date: 2002-08-30 00:21:44
Message-ID: 3D6EBA98.9050601@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> Joe Conway <mail(at)joeconway(dot)com> writes:
>
>>As you said, if the next ExecStoreTuple will try to do an
>>ExecClearTuple(), ISTM that it should be removed from
>>per_MultiFuncCall()/SRF_PERCALL_SETUP(). Or am I crazy?
>
>
> Actually ... on second thought ...
>
> I bet the real issue here is that we have a long-lived TupleTableSlot
> pointing at a short-lived tuple. (I assume you're just forming the
> tuple in the function's working context, no?)

yep

> When ExecClearTuple is called on the next time through, it tries to
> pfree a tuple that has already been recycled along with the rest of
> the short-term context. Result: coredump.
>
> However, if that were the story then *none* of the SRFs returning
> tuple should work, and they do. So I'm still confused.

That's what had me confused.

I have found the smoking gun, however. I had changed this function from
returning setof text, to returning setof
two_column_named_composite_type. *However*. I had not dropped and
recreated the function with the proper declaration. Once I redeclared
the function properly, the coredump went away, even with current
per_MultiFuncCall() code.

The way I found this was by removing ExecClearTuple() from
per_MultiFuncCall(). That allowed the function to return without core
dumping, but it gave me one column of garbage -- that finally clued me in.

> But I suspect that what we want to do is take management of the tuples
> away from the Slot: pass should_free = FALSE to ExecStoreTuple in the
> TupleGetDatum macro. The ClearTuple call *is* appropriate when you do
> that, because it will reset the Slot to empty rather than leaving it
> containing a dangling pointer to a long-since-freed tuple.

OK. I'll make that change and leave ExecClearTuple() in
per_MultiFuncCall(). Sound like a plan?

Joe


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: SRF memory mgmt patch (was [HACKERS] Concern about memory management with SRFs)
Date: 2002-08-30 00:46:24
Message-ID: 15535.1030668384@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> I have found the smoking gun, however. I had changed this function from
> returning setof text, to returning setof
> two_column_named_composite_type. *However*. I had not dropped and
> recreated the function with the proper declaration. Once I redeclared
> the function properly, the coredump went away, even with current
> per_MultiFuncCall() code.

Ah. I think the changes I just committed would have helped:
nodeFunctionscan.c now runs a tupledesc_mismatch() check regardless of
whether it thinks the function returns RECORD or not.

>> But I suspect that what we want to do is take management of the tuples
>> away from the Slot: pass should_free = FALSE to ExecStoreTuple in the
>> TupleGetDatum macro. The ClearTuple call *is* appropriate when you do
>> that, because it will reset the Slot to empty rather than leaving it
>> containing a dangling pointer to a long-since-freed tuple.

> OK. I'll make that change and leave ExecClearTuple() in
> per_MultiFuncCall(). Sound like a plan?

First let's see if we can figure out why the code is failing to fail
as it stands. The fact that it's not dumping core says there's
something we don't understand yet ...

regards, tom lane


From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: mail(at)joeconway(dot)com, pgsql-patches(at)postgresql(dot)org
Subject: Re: SRF memory mgmt patch
Date: 2002-08-30 01:25:48
Message-ID: 20020830.102548.26275686.t-ishii@sra.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

> Joe Conway <mail(at)joeconway(dot)com> writes:
> > Here's a patch to address Tom's SRF API memory management concerns, as
> > discussed earlier today on HACKERS.
>
> Patch committed.
>
> It seemed to me that pgstattuple.c does not really want to be an SRF,
> but only a function returning a single tuple.

Thank you for modifying pgstattuple.c. You are right, it does not want
to return more than 1 tuple.

> As such, it can provide
> a fine example of using the funcapi.h tuple-building machinery *without*
> the SRF machinery. I changed it accordingly, but am not able to update
> README.pgstattuple.euc_jp; Tatsuo-san, would you handle that?

Sure. I'll take care of that.
--
Tatsuo Ishii


From: Joe Conway <mail(at)joeconway(dot)com>
To: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-patches(at)postgresql(dot)org
Subject: Re: SRF memory mgmt patch
Date: 2002-08-30 01:29:45
Message-ID: 3D6ECA89.60800@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Tatsuo Ishii wrote:
>>It seemed to me that pgstattuple.c does not really want to be an SRF,
>>but only a function returning a single tuple.
>
> Thank you for modifying pgstattuple.c. You are right, it does not want
> to return more than 1 tuple.
>

I noticed that too, but it did occur to me that at some point you might
want to make the function return a row for every table in a database.
Perhaps even another system view (like pg_locks or pg_settings)?

Joe


From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: SRF memory mgmt patch (was [HACKERS] Concern about
Date: 2002-08-30 04:29:49
Message-ID: 3D6EF4BD.8060908@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> First let's see if we can figure out why the code is failing to fail
> as it stands. The fact that it's not dumping core says there's
> something we don't understand yet ...

I'm not sure if the attached will help figure it out, but at the very
least it was eye-opening for me. I ran a test on
dblink_get_pkey('foobar') that returns 5 rows. I had a breakpoint set in
ExecClearTuple. I found that ExecClearTuple was called a total of 32
times for 5 returned rows!

Relevant to this discussion was that ExecClearTuple was called three
times, with the same slot pointer, for each function call to
dblink_get_pkey. Once in SRF_PERCALL_SETUP (per_MultiFuncCall), once in
TupleGetDatum (ExecStoreTuple), and once in FunctionNext in the loop
that builds the tuplestore.

Unfortunately I have not been able to get back to a point where I see a
coredump :(. But, that did seem to be related to calling the function
with an inappropriate declaration (now it just gives me garbage instead
of dumping core, even though I reverted the per_MultiFuncCall changes I
made earlier). I'll keep messing with this for a while, but I was hoping
the attached info would lead to some more suggestions of where to be
looking.

Thanks,

Joe

Attachment Content-Type Size
srf-cleartuple.dbg text/plain 13.4 KB

From: Joe Conway <mail(at)joeconway(dot)com>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: SRF memory mgmt patch (was [HACKERS] Concern about
Date: 2002-08-30 17:34:32
Message-ID: 3D6FACA8.50008@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Joe Conway wrote:
> Unfortunately I have not been able to get back to a point where I see a
> coredump :(. But, that did seem to be related to calling the function
> with an inappropriate declaration (now it just gives me garbage instead
> of dumping core, even though I reverted the per_MultiFuncCall changes I
> made earlier). I'll keep messing with this for a while, but I was hoping
> the attached info would lead to some more suggestions of where to be
> looking.

It's back as of cvs tip. This time, it looks like all table functions
are failing in the same manner:

#4 0x081845fb in pfree (pointer=0x7f7f7f7f) at mcxt.c:470
#5 0x0806bdb2 in heap_freetuple (htup=0x82fc7b8) at heaptuple.c:736
#6 0x080e4cbf in ExecClearTuple (slot=0x82f92f4) at execTuples.c:406
#7 0x0817d3ad in per_MultiFuncCall (fcinfo=0xbfffe9e0) at funcapi.c:88
#8 0x0814af25 in pg_lock_status (fcinfo=0xbfffe9e0) at lockfuncs.c:69
#9 0x080e3990 in ExecMakeTableFunctionResult (funcexpr=0x82e9fa0,

#4 0x081845fb in pfree (pointer=0x7f7f7f7f) at mcxt.c:470
#5 0x0806bdb2 in heap_freetuple (htup=0x82f43a4) at heaptuple.c:736
#6 0x080e4cbf in ExecClearTuple (slot=0x82e9f2c) at execTuples.c:406
#7 0x0817d3ad in per_MultiFuncCall (fcinfo=0xbfffe9e0) at funcapi.c:88
#8 0x40016a4b in dblink_record (fcinfo=0xbfffe9e0) at dblink.c:518
#9 0x080e3990 in ExecMakeTableFunctionResult (funcexpr=0x82e8df8,

#4 0x081845fb in pfree (pointer=0x7f7f7f7f) at mcxt.c:470
#5 0x0806bdb2 in heap_freetuple (htup=0x83026f8) at heaptuple.c:736
#6 0x080e4cbf in ExecClearTuple (slot=0x82f71cc) at execTuples.c:406
#7 0x0817d3ad in per_MultiFuncCall (fcinfo=0xbfffe9e0) at funcapi.c:88
#8 0x08181635 in show_all_settings (fcinfo=0xbfffe9e0) at guc.c:2469
#9 0x080e3990 in ExecMakeTableFunctionResult (funcexpr=0x82f64a0,

Currently all C language table functions are broken :(, but all sql
language table functions seem to work -- which is why regression doesn't
fail (pointing out the need to add a select * from pg_settings to a
regression test somewhere).

I'm looking at this now. I suspect the easy fix is to remove
ExecClearTuple from per_MultiFuncCall, but I'll try to understand what's
going on first.

Joe


From: Joe Conway <mail(at)joeconway(dot)com>
To: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: SRF memory mgmt patch (was [HACKERS] Concern about
Date: 2002-08-30 17:51:35
Message-ID: 3D6FB0A7.1080409@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Joe Conway wrote:
> I'm looking at this now. I suspect the easy fix is to remove
> ExecClearTuple from per_MultiFuncCall, but I'll try to understand what's
> going on first.
>

On second thought, *all* functions failing is what you expected, right
Tom? I just changed TupleGetDatum() as we discussed:

#define TupleGetDatum(_slot, _tuple) \
PointerGetDatum(ExecStoreTuple(_tuple, _slot, InvalidBuffer, false))

and now everything works again. Is this the preferred fix and/or is it
worth spending more time to dig into this?

Joe


From: Joe Conway <mail(at)joeconway(dot)com>
To:
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: SRF memory mgmt patch (was [HACKERS] Concern about
Date: 2002-08-30 18:51:43
Message-ID: 3D6FBEBF.4000005@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Joe Conway wrote:
> Joe Conway wrote:
>
>> I'm looking at this now. I suspect the easy fix is to remove
>> ExecClearTuple from per_MultiFuncCall, but I'll try to understand
>> what's going on first.
>>
>
> On second thought, *all* functions failing is what you expected, right
> Tom? I just changed TupleGetDatum() as we discussed:
>
> #define TupleGetDatum(_slot, _tuple) \
> PointerGetDatum(ExecStoreTuple(_tuple, _slot, InvalidBuffer, false))
>
> and now everything works again. Is this the preferred fix and/or is it
> worth spending more time to dig into this?

Here is a patch with the above mentioned fix. It also has an addition to
rangefuncs.sql and rangefuncs.out to ensure a C language table function
gets tested. I did this by adding
SELECT * FROM pg_settings WHERE name LIKE 'enable%';
to the test. I think this should produce reasonably stable results, but
obviously will require some maintenance if we add/remove a GUC variable
matching this criteria. Alternative suggestions welcomed, but if there
are no objections, please commit.

Thanks,

Joe

Attachment Content-Type Size
tablefunc-fix.1.patch text/plain 2.2 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: SRF memory mgmt patch (was [HACKERS] Concern about
Date: 2002-08-30 19:38:29
Message-ID: 18712.1030736309@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
>> On second thought, *all* functions failing is what you expected, right
>> Tom?

Yeah. I coulda sworn I tested pg_settings yesterday after making those
other changes, but I must not have; it's sure failing for me today.

> Here is a patch with the above mentioned fix. It also has an addition to
> rangefuncs.sql and rangefuncs.out to ensure a C language table function
> gets tested.

Good idea. Will apply.

regards, tom lane


From: Joe Conway <mail(at)joeconway(dot)com>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: SRF memory mgmt patch (was [HACKERS] Concern about
Date: 2002-08-30 20:18:30
Message-ID: 3D6FD316.7000104@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> Joe Conway <mail(at)joeconway(dot)com> writes:
>>Here is a patch with the above mentioned fix. It also has an addition to
>>rangefuncs.sql and rangefuncs.out to ensure a C language table function
>>gets tested.
>
> Good idea. Will apply.

BTW, Neil, do you have a sample plpgsql table function that can be
included in the rangefuncs regression test?

Thanks,

Joe


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Neil Conway <neilc(at)samurai(dot)com>, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: SRF memory mgmt patch (was [HACKERS] Concern about
Date: 2002-08-30 21:09:23
Message-ID: 25479.1030741763@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> BTW, Neil, do you have a sample plpgsql table function that can be
> included in the rangefuncs regression test?

The plpgsql regression test has 'em, down at the end.

regards, tom lane