Re: Exposing the Xact commit order to the user

Lists:	pgsql-hackers

From:	"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To:	<pgsql-hackers(at)postgresql(dot)org>,<JanWieck(at)Yahoo(dot)com>
Subject:	Re: Exposing the Xact commit order to the user
Date:	2010-05-24 15:24:07
Message-ID:	4BFA53C702000025000319B3@gw.wicourts.gov

Jan Wieck wrote:

> In some systems (data warehousing, replication), the order of
> commits is important, since that is the order in which changes
> have become visible.

This issue intersects with the serializable work I've been doing.
While in database transactions using S2PL the above is true, in
snapshot isolation and the SSI implementation of serializable
transactions, it's not. In particular, the snapshot anomalies which
can cause non-serializable behavior happen precisely because the
apparent order of execution doesn't match anything so linear as
order of commit.

I'll raise that receipting example again. You have transactions
which grab the current deposit date and insert it into receipts, as
payments are received. At some point in the afternoon, the deposit
date in a control table is changed to the next day, so that the
receipts up to that point can be deposited during banking hours with
the current date as their deposit date. A report is printed (and
likely a transfer transaction recorded to move "cash in drawer" to
"cash in checking", but I'll ignore that aspect for this example).
Some receipts may not be committed when the update to the date in
the control table is committed.

This is "eventually consistent" -- once all the receipts with the
old date commit or roll back the database is OK, but until then you
might be able to select the new date in the control table and the
set of receipts matching the old date without the database telling
you that you're missing data. The new serializable implementation
fixes this, but the related Wiki page still lists open R&D items
(open because the issues need discussion) concerning hot standby
and other replication. Will we be able to support transactional
integrity on slave machines?

What if the update to the control table and the insert of receipts
all happen on the master, but someone decides to move the (now
happily working correctly with serializable transactions) reporting
to a slave machine? (And by the way, don't get too hung up on this
particular example, I could generate dozens more on demand -- the
point is that order of commit doesn't always correspond to apparent
order of execution; in this case the receipts *appear* to have
executed first, because they are using a value "later" updated to
something else by a different transaction, even though that other
transaction *committed* first.)

Replicating or recreating the whole predicate locking and conflict
detection on slaves is not feasible for performance reasons. (I
won't elaborate unless someone feels that's not intuitively
obvious.) The only sane way I can see to have a slave database allow
serializable behavior is to WAL-log the acquisition of a snapshot by
a serializable transaction, and the rollback or commit, on the
master, and to have the serializable snapshot build on a slave
exclude any serializable transactions for which there are still
concurrent serializable transactions. Yes, that does mean WAL-
logging the snapshot acquisition even if the transaction doesn't yet
have an xid, and WAL-logging the commit or rollback even if it never
acquires an xid.

I think this solves the issue Jan raises as long as serializable
transactions are used; if they aren't there are no guarantees of
transactional integrity no matter how you track commit sequence,
unless it can be based on S2PL-type blocking locks. I'll have to
leave that to someone else to sort out.

-Kevin

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc:	pgsql-hackers(at)postgresql(dot)org, JanWieck(at)yahoo(dot)com
Subject:	Re: Exposing the Xact commit order to the user
Date:	2010-05-24 15:43:33
Message-ID:	AANLkTikQETkyM5NcJXp8eX7w2HeBSzujjN53ejsKzN2r@mail.gmail.com

On Mon, May 24, 2010 at 11:24 AM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> Jan Wieck wrote:
>
>> In some systems (data warehousing, replication), the order of
>> commits is important, since that is the order in which changes
>> have become visible.
>
> This issue intersects with the serializable work I've been doing.
> While in database transactions using S2PL the above is true, in
> snapshot isolation and the SSI implementation of serializable
> transactions, it's not.

I think you're confusing two subtly different things. The way to
prove that a set of transactions running under some implementation of
serializability is actually serializable is to construct a serial
order of execution consistent with the view of the database that each
transaction saw. This may or may not match the commit order, as you
say. But the commit order is still the order the effects of those
transactions have become visible - if we inserted a new read-only
transaction into the stream at some arbitrary point in time, it would
see all the transactions which committed before it and none of those
that committed afterward. So I think Jan's statement is correct.

Having said that, I think your concerns about how things will look
from a slave's point of view are possibly valid. A transaction
running on a slave is essentially a read-only transaction that the
master doesn't know about. It's not clear to me whether adding such a
transaction to the timeline could result in either (a) that
transaction being rolled back or (b) some impact on which other
transactions got rolled back. If it did, that would obviously be a
problem for serializability on slaves, though your proposed fix sounds
like it would be prohibitively expensive for many users. But can this
actually happen?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From:	Dan Ports <drkp(at)csail(dot)mit(dot)edu>
To:	Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc:	pgsql-hackers(at)postgresql(dot)org, JanWieck(at)Yahoo(dot)com
Subject:	Re: Exposing the Xact commit order to the user
Date:	2010-05-24 22:42:59
Message-ID:	20100524224258.GB53044@csail.mit.edu

On Mon, May 24, 2010 at 10:24:07AM -0500, Kevin Grittner wrote:
> Jan Wieck wrote:
>
> > In some systems (data warehousing, replication), the order of
> > commits is important, since that is the order in which changes
> > have become visible.
>
> This issue intersects with the serializable work I've been doing.
> While in database transactions using S2PL the above is true, in
> snapshot isolation and the SSI implementation of serializable
> transactions, it's not. In particular, the snapshot anomalies which
> can cause non-serializable behavior happen precisely because the
> apparent order of execution doesn't match anything so linear as
> order of commit.

All true, but this doesn't pose a problem in snapshot isolation. Maybe
this is obvious to everyone else, but just to be clear: a transaction's
snapshot is determined entirely by which transactions committed before
it snapshotted (and hence are visible to it). Thus, replaying update
transactions in the same order on a slave makes the same sequence of
states visible to it.
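
As a rough illustration of the point above (not part of the original message), the sketch below models a commit log as an ordered list of write sets and shows that replaying it in the same order yields the same sequence of visible states. The function name `states_after_replay`, the key names, and the dates are all made up for this example; it is a toy model, not how PostgreSQL builds snapshots.

```python
def states_after_replay(initial, commit_log):
    """Return every state a new snapshot could see: one per commit boundary."""
    state = dict(initial)
    visible = [dict(state)]            # state before any commit is applied
    for write_set in commit_log:
        state.update(write_set)        # apply the tuple-level effects of one commit
        visible.append(dict(state))    # what a snapshot taken right now would see
    return visible


# Hypothetical data: the control-table update commits before the receipt.
initial = {"deposit_date": "2010-05-24"}
master_log = [
    {"deposit_date": "2010-05-25"},    # update of the control table
    {"receipt_1": "2010-05-24"},       # receipt stamped with the old date
]

master_states = states_after_replay(initial, master_log)
slave_states = states_after_replay(initial, list(master_log))  # same order on the slave

print(master_states == slave_states)
```

Note that the intermediate state at index 1 (new date, no receipt yet) is visible on both sides, which is the kind of state that can expose a snapshot isolation anomaly.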

Of course (as in your example) some of these states could expose
snapshot isolation anomalies. But that's true on a single-replica
system too.

Now, stepping into the SSI world...

> Replicating or recreating the whole predicate locking and conflict
> detection on slaves is not feasible for performance reasons. (I
> won't elaborate unless someone feels that's not intuitively
> obvious.) The only sane way I can see to have a slave database allow
> serializable behavior is to WAL-log the acquisition of a snapshot by
> a serializable transaction, and the rollback or commit, on the
> master, and to have the serializable snapshot build on a slave
> exclude any serializable transactions for which there are still
> concurrent serializable transactions. Yes, that does mean WAL-
> logging the snapshot acquisition even if the transaction doesn't yet
> have an xid, and WAL-logging the commit or rollback even if it never
> acquires an xid.

One important observation is that any anomaly that occurs on the slave
can be resolved by aborting a local read-only transaction. This is a
good thing, because the alternatives are too horrible to consider.

You could possibly cut the costs of predicate locking by having the
master ship with each transaction the list of predicate locks it
acquired. But you'd still have to track locks for read-only
transactions, so maybe that's not a significant cost improvement. On
the other hand, if you're willing to pay the price of serializability
on the master, why not the slaves too?

Dan

--
Dan R. K. Ports MIT CSAIL http://drkp.net/

From:	Florian Pflug <fgp(at)phlo(dot)org>
To:	Dan Ports <drkp(at)csail(dot)mit(dot)edu>
Cc:	Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org, JanWieck(at)Yahoo(dot)com
Subject:	Re: Exposing the Xact commit order to the user
Date:	2010-05-25 01:09:05
Message-ID:	8A5A540C-9BDF-4EB2-9FF7-F1EFD87507D6@phlo.org

On May 25, 2010, at 0:42 , Dan Ports wrote:
> On Mon, May 24, 2010 at 10:24:07AM -0500, Kevin Grittner wrote:
>> Jan Wieck wrote:
>>
>>> In some systems (data warehousing, replication), the order of
>>> commits is important, since that is the order in which changes
>>> have become visible.
>>
>> This issue intersects with the serializable work I've been doing.
>> While in database transactions using S2PL the above is true, in
>> snapshot isolation and the SSI implementation of serializable
>> transactions, it's not. In particular, the snapshot anomalies which
>> can cause non-serializable behavior happen precisely because the
>> apparent order of execution doesn't match anything so linear as
>> order of commit.
>
> All true, but this doesn't pose a problem in snapshot isolation. Maybe
> this is obvious to everyone else, but just to be clear: a transaction's
> snapshot is determined entirely by which transactions committed before
> it snapshotted (and hence are visible to it). Thus, replaying update
> transactions in the same order on a slave makes the same sequence of
> states visible to it.

The subtle point here is whether you consider the view from the "outside" (in the sense of what a read-only transaction started at an arbitrary time can or cannot observe), or from the "inside" (what updating transactions can observe and might base their updates on).

The former case is completely determined by the commit ordering of the transactions, while the latter is not - otherwise serializability wouldn't be such a hard problem.

For some problems, like replication, the former ("outside") view is what matters - if the slave synthesizes transactions that insert/update/delete the very same tuples as the original transaction did, and commits them in the same order, no read-only transaction can observe the difference. But that is *not* a serial schedule of the original transactions, since the transactions are *not* the same - they merely touch the same tuples. In fact, if you try replaying the original SQL, you *will* get different results on the slave, and not only because of now() and the like.
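
To make the difference between replaying effects and replaying the original statements concrete, here is a small sketch (an illustration added to the thread, not code from any replication system). It reuses the receipts scenario: both transactions run against the same initial snapshot on the master, and the date update commits first. The names `t1_logic`, `t2_logic`, `replay_effects` and `replay_logic` are hypothetical, dates are plain integers, and a write set that replaces the whole receipts list is a simplification.

```python
def t1_logic(db):
    # Insert a receipt stamped with whatever deposit date this transaction reads.
    return {"receipts": db["receipts"] + [db["deposit_date"]]}


def t2_logic(db):
    # Advance the deposit date by one day.
    return {"deposit_date": db["deposit_date"] + 1}


initial = {"deposit_date": 1, "receipts": []}

# On the master both transactions saw the initial snapshot; T2 committed first.
effects_in_commit_order = [t2_logic(initial), t1_logic(initial)]


def replay_effects(db, effects):
    """Apply recorded write sets in commit order, ignoring the original logic."""
    db = dict(db)
    for write_set in effects:
        db.update(write_set)
    return db


def replay_logic(db, logics):
    """Re-execute the original logic serially in commit order."""
    db = dict(db)
    for logic in logics:
        db.update(logic(db))   # each transaction now reads the current state
    return db


by_effects = replay_effects(initial, effects_in_commit_order)
by_logic = replay_logic(initial, [t2_logic, t1_logic])

print(by_effects)
print(by_logic)
```

Replaying the effects reproduces the master's final state, while re-executing the logic in commit order stamps the receipt with the new date instead of the old one.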

best regards,
Florian Pflug

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Florian Pflug <fgp(at)phlo(dot)org>
Cc:	Dan Ports <drkp(at)csail(dot)mit(dot)edu>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org, JanWieck(at)Yahoo(dot)com
Subject:	Re: Exposing the Xact commit order to the user
Date:	2010-05-25 01:21:05
Message-ID:	15632.1274750465@sss.pgh.pa.us

Florian Pflug <fgp(at)phlo(dot)org> writes:
> The subtle point here is whether you consider the view from the "outside" (in the sense of what a read-only transaction started at an arbitrary time can or cannot observe), or from the "inside" (what updating transactions can observe and might base their updates on).

> The former case is completely determined by the commit ordering of the transactions, while the latter is not - otherwise serializability wouldn't be such a hard problem.

BTW, doesn't all this logic fall in a heap as soon as you consider
read-committed transactions?

regards, tom lane

From:	Florian Pflug <fgp(at)phlo(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Dan Ports <drkp(at)csail(dot)mit(dot)edu>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org, JanWieck(at)Yahoo(dot)com
Subject:	Re: Exposing the Xact commit order to the user
Date:	2010-05-25 11:02:40
Message-ID:	5E04E04F-FC0A-4726-821C-A19F5AD52035@phlo.org

On May 25, 2010, at 3:21 , Tom Lane wrote:
> Florian Pflug <fgp(at)phlo(dot)org> writes:
>> The subtle point here is whether you consider the view from the "outside" (in the sense of what a read-only transaction started at an arbitrary time can or cannot observe), or from the "inside" (what updating transactions can observe and might base their updates on).
>
>> The former case is completely determined by the commit ordering of the transactions, while the latter is not - otherwise serializability wouldn't be such a hard problem.
>
> BTW, doesn't all this logic fall in a heap as soon as you consider
> read-committed transactions?

Why would it? There's still a well defined point in time at which the transaction's effects become visible, and every other transaction commits either before that time or after that time. An observer started between two transactions sees the first's changes but not the second's. One can replace observing read committed transactions by a series of smaller repeatable read transactions, since the observers are read-only anyway.

This of course says nothing about what state the updating transactions themselves see as the current state. For e.g. replication that is adequate, since you'd not replay the original commands but rather the effects they had in terms of physical tuple updates. On replay, the effects of a transaction therefore do not depend on the state the transaction sees.

best regards,
Florian Pflug

From:	Nicolas Barbier <nicolas(dot)barbier(at)gmail(dot)com>
To:	Dan Ports <drkp(at)csail(dot)mit(dot)edu>
Cc:	Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org, JanWieck(at)yahoo(dot)com
Subject:	Re: Exposing the Xact commit order to the user
Date:	2010-05-25 12:00:42
Message-ID:	AANLkTinmTICbjAxIOWnjLtU1O7--QUFi3R9kDygPrAqr@mail.gmail.com

2010/5/25 Dan Ports <drkp(at)csail(dot)mit(dot)edu>:

> On Mon, May 24, 2010 at 10:24:07AM -0500, Kevin Grittner wrote:
>
>> Replicating or recreating the whole predicate locking and conflict
>> detection on slaves is not feasible for performance reasons. (I
>> won't elaborate unless someone feels that's not intuitively
>> obvious.) The only sane way I can see to have a slave database allow
>> serializable behavior is to WAL-log the acquisition of a snapshot by
>> a serializable transaction, and the rollback or commit, on the
>> master, and to have the serializable snapshot build on a slave
>> exclude any serializable transactions for which there are still
>> concurrent serializable transactions. Yes, that does mean WAL-
>> logging the snapshot acquisition even if the transaction doesn't yet
>> have an xid, and WAL-logging the commit or rollback even if it never
>> acquires an xid.
>
> One important observation is that any anomaly that occurs on the slave
> can be resolved by aborting a local read-only transaction. This is a
> good thing, because the alternatives are too horrible to consider.
>
> You could possibly cut the costs of predicate locking by having the
> master ship with each transaction the list of predicate locks it
> acquired. But you'd still have to track locks for read-only
> transactions, so maybe that's not a significant cost improvement. On
> the other hand, if you're willing to pay the price of serializability
> on the master, why not the slaves too?

I don't understand the problem. According to me, in the context of
SSI, a read-only slave can just map SERIALIZABLE to the technical
implementation of REPEATABLE READ (i.e., the currently-existing
"SERIALIZABLE"). The union of the transactions on the master and the
slave(s) will still exhibit SERIALIZABLE behavior because the
transactions on the slave cannot write anything and are therefore
irrelevant.

Is anything wrong with that reasoning?

Nicolas

From:	Dan Ports <drkp(at)csail(dot)mit(dot)edu>
To:	Nicolas Barbier <nicolas(dot)barbier(at)gmail(dot)com>
Cc:	Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org, JanWieck(at)yahoo(dot)com
Subject:	Re: Exposing the Xact commit order to the user
Date:	2010-05-25 18:18:56
Message-ID:	20100525181856.GC53044@csail.mit.edu

On Tue, May 25, 2010 at 02:00:42PM +0200, Nicolas Barbier wrote:
> I don't understand the problem. According to me, in the context of
> SSI, a read-only slave can just map SERIALIZABLE to the technical
> implementation of REPEATABLE READ (i.e., the currently-existing
> "SERIALIZABLE"). The union of the transactions on the master and the
> slave(s) will still exhibit SERIALIZABLE behavior because the
> transactions on the slave cannot write anything and are therefore
> irrelevant.

This, unfortunately, isn't true in SSI.

Consider read-only transactions on a single node SSI database -- the
situation is the same for read-only transactions that run on a slave.
These transactions can be part of anomalies, so they need to be checked
for conflicts and potentially aborted.

Consider Kevin's favorite example, where one table contains the current
date and the other is a list of receipts (initially empty).
T1 inserts (select current_date) into receipts, but doesn't commit
T2 increments current_date and commits
T3 reads both current_date and the receipt table
T1 commits

T3, which is a read-only transaction, sees the incremented date and an
empty list of receipts. But T1 later commits a new entry in the
receipts table with the old date. No serializable ordering allows this.
However, if T3 hadn't performed its read, there'd be no problem; we'd
just serialize T1 before T2 and no one would be the wiser.
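
The claim that no serial ordering explains what T1 and T3 saw can be checked by brute force. The sketch below is an added illustration under simplifying assumptions: the date is an integer starting at 1, and `run_serially` and `matching_orders` are hypothetical helper names. It runs the three transactions one at a time in every possible order and compares what each observed with what was observed in the example.

```python
from itertools import permutations


def run_serially(order):
    """Run the named transactions one at a time; return what each one read."""
    date, receipts = 1, []
    seen = {}
    for name in order:
        if name == "T1":
            seen["T1"] = date                 # T1 reads the date ...
            receipts = receipts + [date]      # ... and inserts a receipt with it
        elif name == "T2":
            date += 1                         # T2 increments the date
        elif name == "T3":
            seen["T3"] = (date, tuple(receipts))  # T3 reads both tables
    return seen


def matching_orders(names, observed):
    """All serial orders whose reads equal the observed reads."""
    return [order for order in permutations(names)
            if run_serially(order) == observed]


# T1 read the old date (1); T3 read the new date (2) and no receipts.
with_t3 = matching_orders(("T1", "T2", "T3"), {"T1": 1, "T3": (2, ())})
without_t3 = matching_orders(("T1", "T2"), {"T1": 1})

print(with_t3)
print(without_t3)
```

With T3 included no order matches; without T3 the only matching order is T1 before T2, as described above.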

SSI would detect a potential conflict here, which we could resolve by
aborting T3. (We could also abort T1, but if this is a replicated
system this isn't always an option -- T3 might be running on the
slave, so only the slave will know about the conflict, and it can't
very well abort an update transaction on the master.)

There's another example of a read-only transaction anomaly that could
cause similar problems at
http://portal.acm.org/citation.cfm?doid=1031570.1031573, but I think
this one is easier to follow.

Dan

--
Dan R. K. Ports MIT CSAIL http://drkp.net/

From:	Florian Pflug <fgp(at)phlo(dot)org>
To:	Dan Ports <drkp(at)csail(dot)mit(dot)edu>
Cc:	Nicolas Barbier <nicolas(dot)barbier(at)gmail(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org, JanWieck(at)yahoo(dot)com
Subject:	Re: Exposing the Xact commit order to the user
Date:	2010-05-25 18:35:44
Message-ID:	1B65D4DC-B0F5-4CD3-8718-B7CBC9243C74@phlo.org

On May 25, 2010, at 20:18 , Dan Ports wrote:
> On Tue, May 25, 2010 at 02:00:42PM +0200, Nicolas Barbier wrote:
>> I don't understand the problem. According to me, in the context of
>> SSI, a read-only slave can just map SERIALIZABLE to the technical
>> implementation of REPEATABLE READ (i.e., the currently-existing
>> "SERIALIZABLE"). The union of the transactions on the master and the
>> slave(s) will still exhibit SERIALIZABLE behavior because the
>> transactions on the slave cannot write anything and are therefore
>> irrelevant.
>
> This, unfortunately, isn't true in SSI.
>
> Consider read-only transactions on a single node SSI database -- the
> situation is the same for read-only transactions that run on a slave.
> These transactions can be part of anomalies, so they need to be checked
> for conflicts and potentially aborted.
>
> Consider Kevin's favorite example, where one table contains the current
> date and the other is a list of receipts (initially empty).
> T1 inserts (select current_date) into receipts, but doesn't commit
> T2 increments current_date and commits
> T3 reads both current_date and the receipt table
> T1 commits
>
> T3, which is a read-only transaction, sees the incremented date and an
> empty list of receipts. But T1 later commits a new entry in the
> receipts table with the old date. No serializable ordering allows this.
>
> However, if T3 hadn't performed its read, there'd be no problem; we'd
> just serialize T1 before T2 and no one would be the wiser.

Hm, so in fact SSI sometimes allows the database to be inconsistent, but only as long as nobody tries to observe it?

Btw, I still don't get how this follows from the Cahill paper. For a transaction to lie on a dangerous circle, it needs incoming and outgoing edges in the conflict graph, right? But I'd have thought that conflicts are always between a reader and a writer or between two writers. So how can a read-only transaction have incoming and outgoing edges?

best regards,
Florian Pflug

From:	"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To:	"Dan Ports" <drkp(at)csail(dot)mit(dot)edu>,"Florian Pflug" <fgp(at)phlo(dot)org>
Cc:	"Nicolas Barbier" <nicolas(dot)barbier(at)gmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org>,<JanWieck(at)yahoo(dot)com>
Subject:	Re: Exposing the Xact commit order to the user
Date:	2010-05-25 18:47:28
Message-ID:	4BFBD4F00200002500031A79@gw.wicourts.gov

Florian Pflug <fgp(at)phlo(dot)org> wrote:

> Hm, so in fact SSI sometimes allows the database to be
> inconsistent, but only as long as nobody tries to observe it?

Not exactly. The eventually-persisted state is always consistent,
but there can be a transitory committed state which would violate
user-defined constraints or business rules *if viewed*. This is
what I've been on about -- the commit sequence is not necessarily
the same as the apparent order of execution. A read-only
transaction, if run before the overlapping commits "settle", can
view a state which is not consistent with any serial order of
execution, and might therefore break the rules. SSI detects that
and rolls one of the transactions back if they're all running at
serializable transaction isolation in a single SSI database, but the
question is how to handle this when the read happens in a replica.

> Btw, I still don't get how this follows from the Cahill paper. For
> a transaction to lie on a dangerous circle, it needs incoming and
> outgoing edges in the conflict graph, right?

At least one of the transactions participating in the cycle does.
There's no requirement that they all do.

-Kevin

From:	Dan Ports <drkp(at)csail(dot)mit(dot)edu>
To:	Florian Pflug <fgp(at)phlo(dot)org>
Cc:	Nicolas Barbier <nicolas(dot)barbier(at)gmail(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org, JanWieck(at)yahoo(dot)com
Subject:	Re: Exposing the Xact commit order to the user
Date:	2010-05-25 18:48:44
Message-ID:	20100525184844.GD53044@csail.mit.edu

On Tue, May 25, 2010 at 08:35:44PM +0200, Florian Pflug wrote:
> Hm, so in fact SSI sometimes allows the database to be inconsistent, but only as long as nobody tries to observe it?

Yes. Note that even while it's in an inconsistent state, you can still
perform any query that doesn't observe the inconsistency -- hopefully
most queries fall into this category.

> Btw, I still don't get how this follows from the Cahill paper. For a transaction to lie on a dangerous circle, it needs incoming and outgoing edges in the conflict graph, right? But I'd have thought that conflicts are always between a reader and a writer or between two writers. So how can a read-only transaction have incoming and outgoing edges?

Right, the read-only transaction can't have incoming edges, but it can
have outgoing edges. So it can't be the "pivot" itself (the transaction
with both outgoing and incoming edges), but it can cause *another*
transaction to be.

In the example I gave, T3 (the r/o transaction) has an outgoing edge to
T1, because it didn't see T1's concurrent update. T1 already had an
outgoing edge to T2, so adding in this incoming edge from T3 creates
the dangerous structure.
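
A minimal sketch of the pivot check described above, added for illustration. It only looks for a transaction with both an incoming and an outgoing rw edge; the additional conditions a real SSI implementation applies (such as which transactions are concurrent and which have committed) are deliberately left out, and `find_pivots` is a hypothetical name.

```python
def find_pivots(rw_edges):
    """Return transactions with both an incoming and an outgoing rw edge.

    Each edge is (reader, writer): the reader did not see the writer's
    concurrent update.
    """
    has_outgoing = {reader for reader, _writer in rw_edges}
    has_incoming = {writer for _reader, writer in rw_edges}
    return sorted(has_outgoing & has_incoming)


# Before T3 reads anything, only T1 -> T2 exists.
edges_without_t3 = [("T1", "T2")]
# T3's read adds an edge T3 -> T1, giving T1 an incoming edge as well.
edges_with_t3 = [("T3", "T1"), ("T1", "T2")]

print(find_pivots(edges_without_t3))
print(find_pivots(edges_with_t3))
```

In this model T3 is never a pivot itself, since it has no incoming edge, but its read is what turns T1 into one.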

Dan

--
Dan R. K. Ports MIT CSAIL http://drkp.net/

From:	Nicolas Barbier <nicolas(dot)barbier(at)gmail(dot)com>
To:	Dan Ports <drkp(at)csail(dot)mit(dot)edu>
Cc:	Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org, JanWieck(at)yahoo(dot)com
Subject:	Re: Exposing the Xact commit order to the user
Date:	2010-05-25 19:31:26
Message-ID:	AANLkTikOdbHFZaBrrVookOhdncmpZp86C7nHhIWxdKTU@mail.gmail.com

2010/5/25 Dan Ports <drkp(at)csail(dot)mit(dot)edu>:

> On Tue, May 25, 2010 at 02:00:42PM +0200, Nicolas Barbier wrote:
>
>> I don't understand the problem. According to me, in the context of
>> SSI, a read-only slave can just map SERIALIZABLE to the technical
>> implementation of REPEATABLE READ (i.e., the currently-existing
>> "SERIALIZABLE"). The union of the transactions on the master and the
>> slave(s) will still exhibit SERIALIZABLE behavior because the
>> transactions on the slave cannot write anything and are therefore
>> irrelevant.
>
> This, unfortunately, isn't true in SSI.
>
> Consider read-only transactions on a single node SSI database -- the
> situation is the same for read-only transactions that run on a slave.
> These transactions can be part of anomalies, so they need to be checked
> for conflicts and potentially aborted.
>
> Consider Kevin's favorite example, where one table contains the current
> date and the other is a list of receipts (initially empty).
> T1 inserts (select current_date) into receipts, but doesn't commit
> T2 increments current_date and commits
> T3 reads both current_date and the receipt table
> T1 commits
>
> T3, which is a read-only transaction, sees the incremented date and an
> empty list of receipts. But T1 later commits a new entry in the
> receipts table with the old date. No serializable ordering allows this.
> However, if T3 hadn't performed its read, there'd be no problem; we'd
> just serialize T1 before T2 and no one would be the wiser.
>
> SSI would detect a potential conflict here, which we could resolve by
> aborting T3. (We could also abort T1, but if this is a replicated
> system this isn't always an option -- T3 might be running on the
> slave, so only the slave will know about the conflict, and it can't
> very well abort an update transaction on the master.)

Ah, indeed. I made the same reasoning mistake as Florian (presumably)
did: I didn't think of the fact that the read-only transaction doesn't
need to be the pivot.

Nicolas

From:	Florian Pflug <fgp(at)phlo(dot)org>
To:	Dan Ports <drkp(at)csail(dot)mit(dot)edu>
Cc:	Nicolas Barbier <nicolas(dot)barbier(at)gmail(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org, JanWieck(at)yahoo(dot)com
Subject:	Re: Exposing the Xact commit order to the user
Date:	2010-05-25 19:35:13
Message-ID:	AA80F291-4DBB-4FF9-9458-1AF19C1F4FB5@phlo.org

On May 25, 2010, at 20:48 , Dan Ports wrote:
> On Tue, May 25, 2010 at 08:35:44PM +0200, Florian Pflug wrote:
>> Hm, so in fact SSI sometimes allows the database to be inconsistent, but only as long as nobody tries to observe it?
>
> Yes. Note that even while it's in an inconsistent state, you can still
> perform any query that doesn't observe the inconsistency -- hopefully
> most queries fall into this category.

Yeah, as long as you just walk by without looking, the database is happy ;-)

>> Btw, I still don't get how this follows from the Cahill paper. For a transaction to lie on a dangerous circle, it needs incoming and outgoing edges in the conflict graph, right? But I'd have thought that conflicts are always between a reader and a writer or between two writers. So how can a read-only transaction have incoming and outgoing edges?
>
> Right, the read-only transaction can't have incoming edges, but it can
> have outgoing edges. So it can't be the "pivot" itself (the transaction
> with both outgoing and incoming edges), but it can cause *another*
> transaction to be.
>
> In the example I gave, T3 (the r/o transaction) has an outgoing edge to
> T1, because it didn't see T1's concurrent update. T1 already had an
> outgoing edge to T2, so adding in this incoming edge from T3 creates
> the dangerous structure.

Hm, but for there to be an actual problem (and not a false positive), an actual dangerous circle has to exist in the dependency graph. The existence of a dangerous structure is just a necessary (but not sufficient) and easily checked-for condition for that, right? Now, if a read-only transaction only ever has outgoing edges, it cannot be part of a (dangerous or not) circle, and hence any dangerous structure it is part of is a false positive.

I guess my line of reasoning is flawed somehow, but I cannot figure out why...

best regards,
Florian Pflug

From:	Nicolas Barbier <nicolas(dot)barbier(at)gmail(dot)com>
To:	Florian Pflug <fgp(at)phlo(dot)org>
Cc:	Dan Ports <drkp(at)csail(dot)mit(dot)edu>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org, JanWieck(at)yahoo(dot)com
Subject:	Re: Exposing the Xact commit order to the user
Date:	2010-05-25 19:35:36
Message-ID:	AANLkTik1w8IH91TXq7DcuVL2WDjW51x_1TUshggmUX93@mail.gmail.com

2010/5/25 Florian Pflug <fgp(at)phlo(dot)org>:

> On May 25, 2010, at 20:18 , Dan Ports wrote:
>
>> T3, which is a read-only transaction, sees the incremented date and an
>> empty list of receipts. But T1 later commits a new entry in the
>> receipts table with the old date. No serializable ordering allows this.
>>
>> However, if T3 hadn't performed its read, there'd be no problem; we'd
>> just serialize T1 before T2 and no one would be the wiser.
>
> Hm, so in fact SSI sometimes allows the database to be inconsistent, but only as long as nobody tries to observe it?

I would not call this an inconsistent state: it would become
inconsistent only after someone (e.g., T3) has observed it _and_ T1
commits.

Nicolas

From:	Nicolas Barbier <nicolas(dot)barbier(at)gmail(dot)com>
To:	Florian Pflug <fgp(at)phlo(dot)org>
Cc:	Dan Ports <drkp(at)csail(dot)mit(dot)edu>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org, JanWieck(at)yahoo(dot)com
Subject:	Re: Exposing the Xact commit order to the user
Date:	2010-05-25 19:53:30
Message-ID:	AANLkTikKP2fNCgt_wqiavkgwD1BpbMcC_swS4qig3wsm@mail.gmail.com
Lists:	pgsql-hackers

2010/5/25 Florian Pflug <fgp(at)phlo(dot)org>:

> Hm, but for there to be an actual problem (and not a false positive), an
> actual dangerous circle has to exist in the dependency graph. The
> existence of a dangerous structure is just a necessary (but not
> sufficient) and easily checked-for condition for that, right? Now, if a
> read-only transaction only ever has outgoing edges, it cannot be part
> of a (dangerous or not) circle, and hence any dangerous structure it is
> part of is a false positive.
>
> I guess my line of reasoning is flawed somehow, but I cannot figure out why...

In the general case, "wr" dependencies also create "must be serialized
before" edges. It seems that those edges can be discarded when finding
a pivot, but if you want to go "back to basics":

("<" means "must be serialized before".)

* T1 < T2, because T1 reads a version of a data element for which T2
later creates a newer version (rw between T1 and T2).
* T3 < T1, because T3 reads a version of a data element for which T1
later creates a newer version (rw between T3 and T1).
* T2 < T3, because T2 creates a version of a data element, which is
then read by T3 (wr between T2 and T3).

(As you can see, those 3 edges form a cycle.)
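
The three edges above can be checked mechanically. What follows is a minimal sketch, not anything taken from PostgreSQL: the names EDGES and find_cycle are hypothetical, and the graph contains only the three transactions from this example. It treats every edge as "must be serialized before" and runs a depth-first search for a cycle.

```python
# Each tuple is (before, after, kind): "before" must be serialized
# before "after". The edges are the three listed above; the names
# EDGES and find_cycle are illustrative assumptions.
EDGES = [
    ("T1", "T2", "rw"),  # T1 read a version that T2 later replaced
    ("T3", "T1", "rw"),  # T3 read a version that T1 later replaced
    ("T2", "T3", "wr"),  # T2 wrote a version that T3 then read
]


def find_cycle(edges):
    """Return one cycle as a list of nodes, or None if there is none."""
    graph = {}
    for before, after, _kind in edges:
        graph.setdefault(before, []).append(after)
        graph.setdefault(after, [])

    ON_STACK, DONE = 1, 2
    state = {}   # node -> ON_STACK or DONE; absent means unvisited
    path = []    # current depth-first search path

    def visit(node):
        state[node] = ON_STACK
        path.append(node)
        for nxt in graph[node]:
            if state.get(nxt) == ON_STACK:
                # Back edge: the cycle is the path from nxt onwards.
                return path[path.index(nxt):] + [nxt]
            if nxt not in state:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[node] = DONE
        return None

    for node in graph:
        if node not in state:
            found = visit(node)
            if found:
                return found
    return None


print(find_cycle(EDGES))
print(find_cycle([e for e in EDGES if e[2] == "rw"]))
```

With all three edges the search reports a cycle through T1, T2 and T3. With only the two rw edges it reports none, which is the situation the quoted reasoning describes: T3 has only outgoing rw edges, and the wr edge from T2 to T3 is what closes the loop.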

Nicolas

From:	"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To:	"Dan Ports" <drkp(at)csail(dot)mit(dot)edu>,"Florian Pflug" <fgp(at)phlo(dot)org>
Cc:	"Nicolas Barbier" <nicolas(dot)barbier(at)gmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org>,<JanWieck(at)yahoo(dot)com>
Subject:	Re: Exposing the Xact commit order to the user
Date:	2010-05-25 19:57:29
Message-ID:	4BFBE5590200002500031A8D@gw.wicourts.gov
Lists:	pgsql-hackers

Florian Pflug <fgp(at)phlo(dot)org> wrote:

> Hm, but for there to be an actual problem (and not a false
> positive), an actual dangerous circle has to exist in the
> dependency graph. The existence of a dangerous structure is just a
> necessary (but not sufficient) and easily checked-for condition
> for that, right? Now, if a read-only transaction only ever has
> outgoing edges, it cannot be part of a (dangerous or not) circle,
> and hence any dangerous structure it is part of is a false
> positive.
>
> I guess my line of reasoning is flawed somehow, but I cannot
> figure out why...

Here's why:

We're tracking rw-dependencies, where the "time-arrow" showing
effective order of execution points from the reader to the writer
(since the reader sees a state prior to the write, it effectively
executes before it). These are important because there have to be
two such dependencies, one in to the pivot and one out from the
pivot, for a problem to exist. (See various works by Dr. Alan
Fekete et al. for details.) But other dependencies can imply an
order of execution. In particular, a wr-dependency, where a
transaction *can* see data committed by another transaction, implies
that the *writer* came first in the order of execution. In this
example, the transaction which lists the receipts successfully reads
the control table update, but is not able to read the receipt
insert. This completes the cycle, making it a real anomaly and not
a false positive.
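
As a rough illustration of the "one in, one out" condition, the sketch below looks for a pivot using rw-dependencies only. It is a simplification under stated assumptions: the names RW_EDGES and find_pivots are hypothetical, the transaction labels T1, T2, T3 follow Dan's example (T1 inserts the receipt, T2 updates the control table, T3 lists the receipts), and nothing here reflects how the actual SSI code stores or checks conflicts.

```python
# Each pair is (reader, writer): the reader saw a state prior to the
# writer's change, so the reader effectively executes first.
# RW_EDGES and find_pivots are illustrative names, not PostgreSQL APIs.
RW_EDGES = [
    ("T3", "T1"),  # T3 listed receipts without seeing T1's insert
    ("T1", "T2"),  # T1 read the deposit date before T2 updated it
]


def find_pivots(rw_edges):
    """Return sorted (t_in, pivot, t_out) triples.

    A pivot has at least one rw-dependency coming in (it is the writer)
    and at least one going out (it is the reader).
    """
    incoming = {}  # writer -> set of readers
    outgoing = {}  # reader -> set of writers
    for reader, writer in rw_edges:
        incoming.setdefault(writer, set()).add(reader)
        outgoing.setdefault(reader, set()).add(writer)

    triples = []
    for pivot in incoming.keys() & outgoing.keys():
        for t_in in incoming[pivot]:
            for t_out in outgoing[pivot]:
                triples.append((t_in, pivot, t_out))
    return sorted(triples)


print(find_pivots(RW_EDGES))
```

Here T1 is the pivot, with T3 on the incoming side and T2 on the outgoing side. The check looks at rw-dependencies only, so by itself it cannot say whether a wr-dependency such as T2 to T3 exists to complete the cycle.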

Note that the wr-dependency can actually exist outside the database,
making it pretty much impossible to accurately tell a false positive
from a true anomaly when the pivot exists and the transaction
writing data which the pivot can't read commits first. For example,
let's say that the update to the control table is committed from an
application which, seeing that its update came back without error,
proceeds to list the receipts for the old date in a subsequent
transaction. You have a wr-dependency which is, in reality, quite
real and solid with no way to notice it within the database engine.
That's why the techniques used in SSI are pretty hard to improve
upon beyond more detailed and accurate tracking of rw-conflicts.

-Kevin