Re: UPSERT wiki page, and SQL MERGE syntax

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: UPSERT wiki page, and SQL MERGE syntax
Date: 2014-10-19 23:26:09
Message-ID: CAM3SWZQhiXQi1osT14V7spjQrUpmcnRtbXJe846-EB1bC+9i1g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Oct 19, 2014 at 2:52 PM, Greg Stark <stark(at)mit(dot)edu> wrote:
> Well OLD and NEW are also not joins yet we expose them this way. It
> always seemed like a hack to me but better one hack than two different
> inconsistent hacks, no?

In my opinion, no. Those hacks do not appear in the parse analysis of
ordinary optimizable statements, only utility statements concerning
rules. ChangeVarNodes() does actual rewriting of magic Vars for stored
rules during rewriting, much later. Within the parser code's
directory, only transformRuleStmt() knows about the magic RTE entries
appearing in rules (PRS2_OLD_VARNO and PRS2_NEW_VARNO), and that's in
the file parse_utilcmd.c. OLD.* and NEW.* are dynamically expanded
placeholders that only exist for the purposes of stored rules. You can
see why at the very least it's a PITA to teach more or less ordinary
parse analysis - my new invocation of transformUpdateStmt() - to care
about that. Note that there are hardly any changes to
transformUpdateStmt() in the patch, and as I've said, I like that a
lot.

Even with stored rules, we only see ChangeVarNodes() changing those
OLD.*/NEW.* placeholders (that have dummy relation RTEs) to refer to
actual, existing RTEs (I think that these are typically added by rule
invocation, but I'm not an expert on the rewriter). We're not talking
about creating a new RTE artificially during ordinary parse analysis
of conventional/optimizable statements. In the patch, parse analysis
thinks that CONFLICTING(my_var) is actually an ordinary reference to
target.my_var that happens to involve ConflictingExpr, which I think
is appropriate. Making available the post-before-insert-row-triggers
tuple is driven by new code called by ExecInsert(). The executor
drives all this. All I require from the planner is a dead simple
sequential scan based update plan, that I can use the EvalPlanQual()
infrastructure with (no more, no less), that doesn't attempt to apply
any optimization (or refer to something external) that isn't
compatible with how the plan needs to be used here.

I want to ensure that the optimizer gives me only that, and magic RTEs
that need to last until execution are bad news, in my (admittedly
non-expert) opinion.

I also happen to think that the two situations (stored rules versus
CONFLICTING()) are fundamentally dissimilar from a user's perspective,
as I've said.

> Independent of all that though. CONFLICTING() isn't clear to me --
> conflicting is a relative property of two or more objects which are
> both (or all) conflicting with each other. Which one does
> "CONFLICTING" refer to here?

I think you're right about that. I thought about changing the name to
"EXCLUDED()", which reflects that the tuple is the actual tuple
excluded from insertion. In particular, a version that carries forward
all of the effects of before row insert triggers. What do you think of
that?

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabrízio de Royes Mello 2014-10-20 00:37:38 Re: Proposal : REINDEX SCHEMA
Previous Message Michael Paquier 2014-10-19 23:25:07 Re: pgcrypto: PGP signatures