Fixing target-column indirection in INSERT with multiple VALUES

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Fixing target-column indirection in INSERT with multiple VALUES
Date: 2016-07-27 18:47:25
Message-ID: 9578.1469645245@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I looked into the problem reported in bug #14265,
https://www.postgresql.org/message-id/20160727005725.7438.26021@wrigleys.postgresql.org

The core of the problem is a design decision that seems pretty bad in
hindsight: if we have array-subscripting or field-selection decoration
in the target column list of an INSERT with multiple VALUES rows, then
we apply transformAssignedExpr() to each element of each VALUES row
independently. So for example in
INSERT INTO foo (col[1]) VALUES (1), (2), (3)
there are going to be three identical ArrayRef nodes (with different
input expressions) inside the VALUES RTE, and then just a Var referencing
the VALUES RTE in the top-level targetlist. While that works, it's not
so good when we have
INSERT INTO foo (col[1], col[2]) VALUES (1, 2), (3, 4)
because the rewriter needs to merge the assignments to "col" by nesting
the ArrayRefs together, but it fails to find the ArrayRefs in the
top-level targetlist, leading to the complained-of error.

Other reasons not to like this design are the space taken by the redundant
copies of the ArrayRef nodes, and the ugly logic needed in ruleutils.c
to deal with this situation, which is unlike either the INSERT-with-
single-VALUES-row or the INSERT-SELECT cases.

So attached is a proposed patch that fixes this bug by moving the ArrayRef
and/or FieldStore nodes associated with the column target list into the
top-level targetlist. I still have it calling transformAssignedExpr
for each VALUES expression, which means that the parser still generates
(and then throws away) ArrayRef/FieldStore nodes for each VALUES row when
there is indirection decoration in the target list. It would probably be
possible to avoid that, but it would take very invasive refactoring of
transformAssignmentIndirection and allied routines, and I'm dubious that
the case comes up often enough to be worth complicating that code even
more.

Another issue is that this would require an initdb because it changes the
representation of stored rules for cases like this, which means we could
only fix it in HEAD and there wouldn't be a back-patch. Given that it's
been like this forever and we've not had complaints before, I think that's
acceptable.

Conceivably we could fix this without initdb-forcing changes in the
parser's output, if we taught the rewriter to apply rewriteTargetListIU
to each row of the VALUES RTE rather than to the top-level targetlist.
But that would be a complex mess for multiple reasons --- for instance,
it would change the output column set for the RTE. So I'm not at all
excited about pursuing that path.

In short, I propose applying the attached to HEAD but not attempting
to fix the issue in the back branches.

regards, tom lane

Attachment Content-Type Size
fix-multi-values-indirection.patch text/x-diff 19.6 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2016-07-27 20:05:31 Re: Rethinking TupleTableSlot deforming
Previous Message Andres Freund 2016-07-27 17:29:34 Re: Macro customizable hashtable / bitmapscan & aggregation perf