Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>
Subject: Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}
Date: 2014-11-10 23:33:07
Message-ID: CAM3SWZQz+jYkwfuZvcSf0qtpa2QiY+8NGNcHjfWgz3DDzRfzEg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Nov 5, 2014 at 1:09 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> Once I fix that (provided it doesn't take too long), I'll publish a
> V1.4. AFAICT, that'll close out all of the current open issues.

Attached is V1.4. As with V1.3, I continue to maintain both approaches
to value locking in parallel, believing this to be the most useful
direction for development to take for the time being. The consensus is
for approach #2 to value locking [1], but I see no reason to deny
reviewers the chance to compare both approaches. It's easy to maintain
the two, as the value locking implementation is well encapsulated -
The executor level stuff that has been altered in the last few
revisions tends to cause very few or no conflicts when rebasing.

Highlights
=======

* Costing of indexes for the purposes of determining which to have
arbitrate whether or not the executor takes the alternative path. So,
a list of expressions is created during parse analysis, and that list
is matched against existing indexes during optimization. It's usually
possible to avoid the work of generating paths, because (it seems
reasonable to suppose) there is usually 0 or 1 possible indexes in
representative cases. If it's 0, we get an error, originating from
where we now do this work -- the optimizer.

* EXCLUDED.* (and TARGET.*) pseudo-aliases (compare OLD.* and NEW.* in
the context of user-defined rules and conditional triggers) are
visible within auxiliary UPDATE (but not parent INSERT). See the
commit message for details on how that works. In short, we still have
a dedicated primnode expression, ExcludedExpr, but it is not ever
generated by the raw grammar (it can only be added by the during the
rewriting stage of query processing). It's just a facade, but a
perfectly convincing one. Note that this means that Vars can be
referenced from "another RTE" in what is actually a relation scan node
of the target:

postgres=# explain INSERT INTO upsert values(1, 'foo') on conflict
(key) update set val = excluded.val where excluded.val != 'bar';
QUERY PLAN
------------------------------------------------------------------------
Insert on upsert (cost=0.00..0.01 rows=1 width=0)
-> Result (cost=0.00..0.01 rows=1 width=0)
-> Conflict Update on upsert (cost=0.00..32.99 rows=1591 width=36)
Filter: ((excluded.val) <> 'bar'::text)
(4 rows)

Here, you're seeing a "Conflict Update" scan (actually, a quasi-hidden
sequential scan) on the upsert table that references a Var from the
facade excluded.* table/RTE. In fact, the Var is on the target table,
but read through our internal expression primnode (ExcludedExpr) so as
to get access to the excluded-from-insertion tuple slot during EPQ
expression evaluation for the UPDATE.

* postgres_fdw support for the IGNORE variant (provided there was no
unique index inference specification - just as with updatable views).

* Documentation clean-up - as I mentioned, I tried to address Simon's
concerns here. Also, as you'd expect, the documentation has been fixed
up to reflect the new syntax. I'll need to take a pass at updating the
UPSERT Wiki page soon, too.

Next steps
========

AFAICT, this revision addresses all open items bar one - the RLS bug,
which I could not decide on a fix for. I refer to the RLS issue
described on the Wiki [2]. As I mentioned before, I'd really like to
get some reviewer time on the executor level aspects of this, which
are relatively new, and have received no scrutiny from anyone else
that I'm aware of. This list of items is a good place to start, for
those that are interested:

https://wiki.postgresql.org/wiki/UPSERT#Miscellaneous_odd_properties_of_proposed_ON_CONFLICT_patch

My use of the EvalPlanQual() mechanism, and the structure of the plan
tree in general could really use some scrutiny too.

Thanks

[1] https://wiki.postgresql.org/wiki/Value_locking#.232._.22Promise.22_heap_tuples_.28Heikki_Linnakangas.29
[2] https://wiki.postgresql.org/wiki/UPSERT#RLS
--
Peter Geoghegan

Attachment Content-Type Size
v1.4.vallock2.tar.gz application/x-gzip 74.8 KB
v1.4.vallock1.tar.gz application/x-gzip 88.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Haribabu Kommi 2014-11-11 00:00:28 Re: using custom scan nodes to prototype parallel sequential scan
Previous Message Andres Freund 2014-11-10 23:21:05 Re: using custom scan nodes to prototype parallel sequential scan