Re: SQL/MED - file_fdw

From: Noah Misch <noah(at)leadboat(dot)com>
To: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org, hanada(at)metrosystems(dot)co(dot)jp
Subject: Re: SQL/MED - file_fdw
Date: 2011-02-08 18:49:48
Message-ID: 20110208184948.GA960@tornado.gateway.2wire.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 08, 2011 at 09:25:29PM +0900, Itagaki Takahiro wrote:
> Here is a revised patch, that including jagged csv support.
> A new exported function is named as NextCopyFromRawFields.

Seems a bit incongruous to handle the OID column in that function. That part
fits with the other per-column code in NextCopyFrom().

> On Mon, Feb 7, 2011 at 21:16, Noah Misch <noah(at)leadboat(dot)com> wrote:
> > Perhaps I'm missing something. ??The new API does not expose a "processed" count
> > at all; that count is used for the command tag of a top-level COPY. ??This part
> > of the patch is just changing how we structure the code to maintain that tally,
> > and it has no implications for observers outside copy.c. ??Right?
>
> True, but the counter is tightly bound with INSERT-side of COPY FROM.
> | copy.c (1978)
> | * We count only tuples not suppressed by a BEFORE INSERT trigger;
>
> I think it is cleaner way to separate it from CopyState
> because it is used as a file reader rather than a writer.
> However, if there are objections, I'd revert it.

No significant objection. I just wished to be clear on whether it was pure
refactoring, or something more.

> > ExecEvalExpr(), used to acquire default values, will still use the
> > per-tuple context of CopyState->estate. ??That per-tuple context will never get
> > reset explicitly, so default value computations leak until EndCopyFrom().
>
> Ah, I see. I didn't see the leak because we rarely use per-tuple memory
> context in the estate. We just use CurrentMemoryContext in many cases.
> But the leak could occur, and the code is misleading.
> I moved ResetPerTupleExprContext() into NextCopyFrom(), but
> CurrentMemoryContext still used in it to the result values.

The code still violates the contract of ExecEvalExpr() by calling it with
CurrentMemoryContext != econtext->ecxt_per_tuple_memory.

> Another possible design might be passing EState as an argument of
> NextCopyFrom and remove estate from CopyState. It seems a much cleaner way
> in terms of control flow, but it requires more changes in file_fdw.
> Comments?

Seems sensible and more-consistent with the typical structure of executor code.
Do we place any constraints on sharing of EState among different layers like
this? I could not identify any, offhand.

The new API uses EState for two things. First, BeginCopyFrom() passes it to
ExecPrepareExpr(). Instead, let's use expression_planner() + ExecInitExpr() and
require that we've been called with a memory context of suitable longevity.
Things will break anyway if BeginCopyFrom()'s CurrentMemoryContext gets reset
too early. This way, we no longer need an EState in BeginCopyFrom().

Second, NextCopyFrom() sends the per-output-tuple ExprContext of the EState to
ExecEvalExpr(). It really needs a specific ExprContext, not an EState. A
top-level COPY has a bijection between input and output tuples, making the
distinction unimportant. GetPerTupleExprContext() is wrong for a file_fdw scan,
though. We need the ExprContext of the ForeignScanState or another of
equivalent lifecycle. NextCopyFrom() would then require that it be called with
CurrentMemoryContext == econtext->ecxt_per_tuple_memory.

Sorry, I missed a lot of these memory details on my first couple of reviews.

nm

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2011-02-08 18:52:03 Scheduled maintenance affecting gitmaster
Previous Message Peter Eisentraut 2011-02-08 18:47:48 Re: Add ENCODING option to COPY