Re: ask for review of MERGE

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, Greg Stark <gsstark(at)mit(dot)edu>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>, Boxuan Zhai <bxzhai2010(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: ask for review of MERGE
Date: 2010-10-24 17:43:35
Message-ID: AANLkTineR-rDFWENeddLg=GrkT+epMHk2j9X0YqpiTY8@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Oct 24, 2010 at 12:15 PM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> Robert Haas wrote:
>> I am also wondering exactly what the semantics are supposed to be
>> under concurrency.  We can't assume that it's OK to treat WHEN NOT
>> MATCHED as WHEN MATCHED if a unique-key violation is encountered.  The
>> join condition could be something else entirely.  I guess we could
>> scan the target table with a dirty snapshot and block on any in-doubt
>> tuples, similar to what EXCLUDE constraints do internally, but
>> throwing MVCC out the window doesn't seem right either.
>
> [discussion of EPQ behavior for UPDATE statements]
>
> What I think we can do with adding a key reservation is apply the same sort
> of logic to INSERTs too, given a way to lock an index entry before the row
> itself is complete.  If MERGE hits a WHERE condition that finds a key lock
> entry in the index, it has to sit and wait for that other command to finish.
>  There isn't any other sensible behavior I can see in that situation, just
> like there isn't one for UPDATE right now.

Well, there's no guarantee that any index at all exists on the target
table, so any solution based on index locks can't be fully general.

But let's back up and talk about MVCC for a minute. Suppose we have
three source tuples, (1), (2), and (3); and the target table contains
tuples (1) and (2), of which only (1) is visible to our MVCC snapshot;
suppose also an equijoin. Clearly, source tuple (1) should fire the
MATCHED rule and source tuple (3) should fire the NOT MATCHED rule,
but what in the world should source tuple (2) do? AFAICS, the only
sensible behavior is to throw a serialization error, because no matter
what you do the results won't be equivalent to a serial execution of
the transaction that committed target tuple (2) and the transaction
that contains the MERGE.

Even with predicate locks, it's not obvious how to me how solve this
problem. Target tuple (2) may already be there, and its transaction
already committed, by the time the MERGE statement gets around to
looking at the source data. In fact, even taking an
AccessExclusiveLock on the target table doesn't fix this, because the
snapshot would be taken before the lock.

> been done on this feature until now.  This one splits nicely into "get the
> implemenation working" and "improve the concurrency" parts, and I haven't
> heard a good reason yet why those need to coupled together.

Sounds like we're all on the same page.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2010-10-24 19:28:03 Re: WIP: extensible enums
Previous Message Dean Rasheed 2010-10-24 16:58:57 Re: WIP: extensible enums