Serializable Isolation without blocking

Lists: pgsql-hackers
From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <pgsql-hackers(at)postgresql(dot)org>
Cc: Michael Cahill <mjc(at)it(dot)usyd(dot)edu(dot)au>
Subject: Serializable Isolation without blocking
Date: 2009-05-05 15:50:22
Message-ID: 4A0019EE.EE98.0025.0@wicourts.gov

While discussing potential changes to PostgreSQL documentation of
transaction isolation levels, Emmanuel Cecchet pointed out an
intriguing new paper[1] describing an algorithm that provides true
serializable behavior in an MVCC-based database with no additional
blocking; although, there being no such thing as a free lunch, there
is an increase in serialization failures under contention. I have
been hesitant to raise the issue while everyone was busy trying to
wrap up release 8.4; but since that is now in beta testing and PGCon
is fast approaching, I wanted to put the issue out there so that
people have a chance to review it and discuss it.

Michael Cahill has given me permission to quote from the paper. He
has also provided a URL to his personal version of the work[2], which
people may directly access for their personal use, although
redistribution is prohibited by the ACM copyright. He has asked to be
copied on the discussion here.

I know that some here have questioned why anyone would want
serializable transactions. Our environment has 21 programmers, 21
business process analysts, and four DBAs. A major part of the time
for this staff is enhancement of existing software and development of
new software. We have many distinct databases, the largest of which
has a schema of over 300 tables. (That's well normalized and not
partitioned -- the structure of the data really is that complex.) We
have over 8,700 queries against these various databases, including
OLTP, reporting, statistics, public access, web queries, etc. If one
were to go through the well-known techniques to identify all possible
interactions between these queries against these tables, it would not
only be a massive undertaking, but the results would immediately be
obsolete.

The nice thing about serializable transactions is that if you can show
that a transaction does the right thing when run by itself, you
automatically know that it will function correctly when run in any
mix, or it will fail with a serializable error and can be safely
retried. Our framework is designed so that serialization errors are
automatically detected and the transaction is retried without any user
interaction or application programming needed -- a serialization
failure appears to the application code and the user the same as
simple blocking.
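Such a retry wrapper can be sketched in a few lines (a minimal
illustration with hypothetical names, not the actual framework
described above; the driver-level serialization-failure exception is
stubbed out rather than taken from any real driver):

```python
class SerializationFailure(Exception):
    """Stand-in for a driver's SQLSTATE 40001 error class."""

def run_with_retry(transaction, max_attempts=5):
    """Run a transaction callable, retrying on serialization failure
    so that, to the caller, a conflict looks like simple blocking."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up after repeated conflicts

# Toy usage: a "transaction" that conflicts once, then succeeds.
attempts = {"n": 0}
def flaky_tx():
    attempts["n"] += 1
    if attempts["n"] < 2:
        raise SerializationFailure()
    return "committed"

result = run_with_retry(flaky_tx)
print(result)  # committed, on the second attempt
```

The point of the design is that neither application code nor the user
ever sees the failure; only repeated conflicts surface an error.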

Quoting the paper's abstract:

"Many popular database management systems offer snapshot isolation
rather than full serializability. There are well known anomalies
permitted by snapshot isolation that can lead to violations of data
consistency by interleaving transactions that individually maintain
consistency. Until now, the only way to prevent these anomalies was to
modify the applications by introducing artificial locking or update
conflicts, following careful analysis of conflicts between all pairs
of transactions.

"This paper describes a modification to the concurrency control
algorithm of a database management system that automatically detects
and prevents snapshot isolation anomalies at runtime for arbitrary
applications, thus providing serializable isolation. The new algorithm
preserves the properties that make snapshot isolation attractive,
including that readers do not block writers and vice versa. An
implementation and performance study of the algorithm are described,
showing that the throughput approaches that of snapshot isolation in
most cases."

Quoting a portion of the conclusions:

"A prototype of the algorithm has been implemented in Oracle Berkeley
DB and shown to perform significantly better than two-phase locking in
a variety of cases, and often comparably with snapshot isolation.

"One property of Berkeley DB that simplified our implementation was
working with page level locking and versioning. In databases that
version and lock at row-level granularity (or finer), additional
effort would be required to avoid phantoms, analogous to standard two
phase locking approaches such as multigranularity locking."

Quoting a snippet from the implementation section:

"Making these changes to Berkeley DB involved only modest changes to
the source code. In total, only 692 lines of code (LOC) were modified
out of a total of over 200,000 lines of code in Berkeley DB."

Michael J. Cahill has since implemented these techniques in InnoDB as
part of his PhD work. While Microsoft SQL Server does provide full
serializability in an MVCC implementation, I believe they do it with
blocking rather than this newer and faster technique.

The paper is a very interesting read, and I fear that if we don't
pursue these techniques, InnoDB users will have legitimate bragging
rights over PostgreSQL users in an important area.

Oh, and I know that these issues are well known, and I know that the
solution involves predicate locks; although these won't necessarily be
locks which cause blocking.

-Kevin

[1] Michael J. Cahill, Uwe Röhm, Alan D. Fekete. Serializable
Isolation for Snapshot Databases. In the Proceedings of the 2008 ACM
SIGMOD international conference on management of data, pages 729-738.
http://doi.acm.org/10.1145/1376616.1376690

[2]
http://cahill.net.au/wp-content/uploads/2009/01/real-serializable.pdf


From: Neil Conway <neil(dot)conway(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-hackers(at)postgresql(dot)org, Michael Cahill <mjc(at)it(dot)usyd(dot)edu(dot)au>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-05 22:01:43
Message-ID: b4e5ce320905051501y1ed193f8ye153b0afa167a3d4@mail.gmail.com
Lists: pgsql-hackers

On Tue, May 5, 2009 at 8:50 AM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> While discussing potential changes to PostgreSQL documentation of
> transaction isolation levels, Emmanuel Cecchet pointed out an
> intriguing new paper[1] on a new algorithm to provide true
> serializable behavior in a MVCC based database

I agree, this is very interesting work. I blogged about it a while ago[1].

> "Making these changes to Berkeley DB involved only modest changes to
> the source code. In total, only 692 lines of code (LOC) were modified
> out of a total of over 200,000 lines of code in Berkeley DB."

Tracking the read sets of each transaction would be very expensive.
Worse still, that information needs to be kept around after
end-of-transaction, which raises questions about where it should be
stored and how it should be cleaned up. Note that the benchmarks in
the paper involve transactions that perform "a small number of simple
read and update operations", which reduces the bookkeeping overhead.

Neil

[1] http://everythingisdata.wordpress.com/2009/02/25/february-25-2009/


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Neil Conway" <neil(dot)conway(at)gmail(dot)com>
Cc: Michael Cahill <mjc(at)it(dot)usyd(dot)edu(dot)au>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-05 23:08:54
Message-ID: 4A0080B6.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers

Neil Conway <neil(dot)conway(at)gmail(dot)com> wrote:

> Tracking the read sets of each transaction would be very expensive.
> Worse still, that information needs to be kept around after
> end-of-transaction, which raises questions about where it should be
> stored and how it should be cleaned up. Note that the benchmarks in
> the paper involve transactions that perform "a small number of
> simple read and update operations", which reduces the bookkeeping
> overhead.

I know that some of the simplifying assumptions listed in 3.1 do not
currently hold for PostgreSQL. A prerequisite for using the algorithm
would be to make them hold for PostgreSQL, or find some way to work
around their absence. I hope people will read the paper and mull it
over, but these assumptions are probably the crux of whether this
enhancement is feasible. I guess it would be best to throw that much
out to the list for discussion.

To quote:

> 1. For any data item x, we can efficiently get the list of
> locks held on x.
> 2. For any lock l in the system, we can efficiently get
> l.owner, the transaction object that requested the lock.
> 3. For any version xt of a data item in the system, we
> can efficiently get xt.creator, the transaction object
> that created that version.
> 4. When finding a version of item x valid at some given
> timestamp, we can efficiently get the list of other
> versions of x that have later timestamps.

Based on my limited understanding, I don't get the impression 2 or 3
are a problem.

I'm assuming that we would use the structures which back the pg_locks
view for the SIREAD locks implicit in 1, possibly with some scope
escalation as counts for a table rise (row to page to extent to
table). This may require a larger allocation for this information by
those wanting to use serializable transactions.

I'm not sure how 4 could be handled.
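To make concrete what assumption 4 asks for, here is a toy model
(hypothetical names, nothing taken from PostgreSQL internals) of
per-item version chains with access both to the version visible at a
snapshot and to the later versions:

```python
class VersionStore:
    """Toy model of assumption 4: per-item version chains ordered by
    commit timestamp, so both the version visible to a snapshot and
    any later versions are easy to fetch."""
    def __init__(self):
        self.chains = {}  # item -> sorted list of (commit_ts, creator)

    def install(self, item, commit_ts, creator):
        chain = self.chains.setdefault(item, [])
        chain.append((commit_ts, creator))
        chain.sort()

    def visible_at(self, item, snapshot_ts):
        """The version an MVCC read at snapshot_ts would see."""
        older = [v for v in self.chains.get(item, []) if v[0] <= snapshot_ts]
        return older[-1] if older else None

    def later_versions(self, item, snapshot_ts):
        """Assumption 4: all versions of item newer than the snapshot."""
        return [v for v in self.chains.get(item, []) if v[0] > snapshot_ts]

store = VersionStore()
store.install("x", 1, "T0")
store.install("x", 3, "T1")
print(store.visible_at("x", 2))      # (1, 'T0')
print(store.later_versions("x", 2))  # [(3, 'T1')]
```

The question for PostgreSQL is whether the heap and visibility
machinery can answer the `later_versions` query efficiently.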

I don't know how much work would be required to track the transaction
information listed in section 4.1 (or its functional equivalent).

I'd be happy to hear the opinion of those more familiar with the
internals than I.

-Kevin


From: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Kevin Grittner *EXTERN*" <Kevin(dot)Grittner(at)wicourts(dot)gov>, <pgsql-hackers(at)postgresql(dot)org>
Cc: <mjc(at)it(dot)usyd(dot)edu(dot)au>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 09:51:10
Message-ID: D960CB61B694CF459DCFB4B0128514C202FF65AF@exadv11.host.magwien.gv.at
Lists: pgsql-hackers

Kevin Grittner wrote:
> While discussing potential changes to PostgreSQL documentation of
> transaction isolation levels, Emmanuel Cecchet pointed out an
> intriguing new paper[1] on a new algorithm to provide true
> serializable behavior in a MVCC based database, with no additional
> blocking; although, there being no such things as a free lunch, there
> is an increase in serialization failures under contention.

I have read through the paper and will share my comments.

I hope I got it right:

To put it in a nutshell, the approach to true serializability presented
in the paper involves "intention locks" which do not block writers,
but cause transactions with conflict potential to be aborted.

The main cost incurred is the maintenance of these intention locks, which
need to be kept for a while even after a transaction has committed.
Moreover, there will potentially be many of these locks (think of
SELECT COUNT(*) FROM ...), so some lock escalation mechanism (to
page or table locks) will be necessary.

I don't know how hard that would be to implement, but I'd argue
that it is only worth considering if it guarantees true serializability.

The paper describes only intention locks for rows in the select list
of a statement and deliberately ignores rows which are examined in
the WHERE clause. The authors argue in section 3.4 that this is no
problem in their implementation since "Berkeley DB does all locking
and versioning on page granularity".

I don't buy that; maybe I misunderstood something.

Consider a function
"makehighlander(personid integer) RETURNS void"
defined like this:

SELECT ishighlander INTO b FROM scots WHERE id=personid;
IF b THEN
    RETURN; /* no need to do anything */
END IF;
UPDATE scots SET ishighlander=TRUE WHERE id=personid;
SELECT count(*) INTO n FROM scots WHERE ishighlander;
IF (n > 1) THEN
    RAISE EXCEPTION 'There can be only one';
END IF;

If we assume that "ishighlander" is false for all records in
the beginning, and there are two calls to the function with
two personid's of records *in different pages*, then there cannot be
any conflicts since all (write and intention) locks taken by each of
these calls should only affect the one page that contains the one
record that is updated and then found in the subsequent SELECT.

Yet if the two execute concurrently and the two first SELECTs are
executed before the two UPDATEs, then both functions have a snapshot
so that the final SELECT statements will return 1 and both functions will
succeed, leaving the table with two highlanders.
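The interleaving can be replayed with a toy snapshot-isolation model
(a sketch with hypothetical names; each transaction reads from its own
snapshot plus its own uncommitted writes):

```python
# Toy snapshot isolation: each transaction sees the database as of its
# start, overlaid with its own uncommitted writes.
def make_highlander(snapshot, local_writes, personid):
    def read(pid):
        return local_writes.get(pid, snapshot[pid])
    if read(personid):
        return  # already a highlander, nothing to do
    local_writes[personid] = True  # UPDATE ... SET ishighlander=TRUE
    n = sum(1 for pid in snapshot if read(pid))  # SELECT count(*)
    if n > 1:
        raise Exception("There can be only one")

db = {1: False, 2: False}      # committed state; rows in different pages
snap1, writes1 = dict(db), {}  # both transactions start here, so both
snap2, writes2 = dict(db), {}  # snapshots predate either UPDATE

make_highlander(snap1, writes1, 1)      # counts 1, succeeds
make_highlander(snap2, writes2, 2)      # also counts 1, succeeds
db.update(writes1); db.update(writes2)  # both commit

print(sum(db.values()))  # 2: two highlanders, the SI anomaly
```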

So I think one would have to add intention locks for rows considered
in the WHERE clause to guarantee true serializability.

It would be interesting to know how many lines of code would have
to be added to implement that and how performance would be affected.
I'm afraid that concurrency would suffer because more conflicting
transactions would be aborted.

Another thing I wonder is whether the authors have considered the
possibility that there are serializable and "cursor stability"
transactions at the same time, where the latter would not take
intention locks. Could that affect the considerations about
correctness of the algorithm?

Yours,
Laurenz Albe


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: "Kevin Grittner *EXTERN*" <Kevin(dot)Grittner(at)wicourts(dot)gov>, <pgsql-hackers(at)postgresql(dot)org>, <mjc(at)it(dot)usyd(dot)edu(dot)au>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 10:34:16
Message-ID: 87ocu5urpz.fsf@oxford.xeocode.com
Lists: pgsql-hackers

"Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at> writes:

> So I think one would have to add intention locks for rows considered
> in the WHERE clause to guarantee true serializability.

Does the paper explain how to deal with rows "considered" in the WHERE clause
which don't yet exist? I.e., "SELECT count(*) WHERE foo" needs to take out a
lock which would cause any transaction that inserts a new record where foo is
true to abort.

In MSSQL this requires locking the page of the index where such records would
be inserted (or the entire table if there's no index). In predicate locking
schemes this requires a separate storage structure for storing such predicates,
which can be arbitrarily complex expressions to check any new tuple being
inserted against.

Are these intention locks predicate locks, in that they're not associated with
actual pages or records but with potential records which might be inserted in
the future?

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 12:01:16
Message-ID: 1241697676.6109.111.camel@ebony.2ndQuadrant
Lists: pgsql-hackers


On Tue, 2009-05-05 at 10:50 -0500, Kevin Grittner wrote:

> "This paper describes a modification to the concurrency control
> algorithm of a database management system that automatically detects
> and prevents snapshot isolation anomalies at runtime for arbitrary
> applications, thus providing serializable isolation. The new algorithm
> preserves the properties that make snapshot isolation attractive,
> including that readers do not block writers and vice versa. An
> implementation and performance study of the algorithm are described,
> showing that the throughput approaches that of snapshot isolation in
> most cases."

I think this is important work in database theory and a good future
direction for us. It's the right thing to keep an eye on developments
and to consider implementation.

We would need to decide whether intention locks were indeed necessary
when we have MVCC also. Other DBMS without visibility may need to be
more restrictive than we would have to be to implement this. Not sure.

It wouldn't be 692 lines of code and even if it were the impact of that
code would be such that it would need to be optional, since it would
differ in definition from an existing SQL Standard isolation mode and it
would have additional performance implications.

If the use is optional, I would currently prefer the existing mechanism
for implementing serialization, which is to serialize access directly
using either a LOCK statement or an exclusive advisory lock. It's clear
that any new-theory solution will cost significantly more as the number
of users increases, at least O(N^2), whereas simply waiting is only
O(N), AFAICS. (Michael et al don't discuss the algorithmic complexity).
So it seems its use would require some thought and care and possibly
further research to uncover areas of applicability in real usage.

So for me, I would say we leave this be until the SQL Standard changes to
recognise the additional mode. I don't see much advantage for us in
breaking the ground on this feature, and it will be costly to implement,
so it is a good PhD project.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 12:26:31
Message-ID: 4A02D377.6080009@enterprisedb.com
Lists: pgsql-hackers

Simon Riggs wrote:
> It wouldn't be 692 lines of code and even if it were the impact of that
> code would be such that it would need to be optional, since it would
> differ in definition from an existing SQL Standard isolation mode and it
> would have additional performance implications.

I thought it would be equal to the SQL standard Serializable mode,
whereas what we currently call serializable is in fact not as strong as
the SQL standard Serializable mode.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Gregory Stark *EXTERN*" <stark(at)enterprisedb(dot)com>
Cc: "Kevin Grittner *EXTERN*" <Kevin(dot)Grittner(at)wicourts(dot)gov>, <pgsql-hackers(at)postgresql(dot)org>, <mjc(at)it(dot)usyd(dot)edu(dot)au>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 12:47:22
Message-ID: D960CB61B694CF459DCFB4B0128514C202FF65B0@exadv11.host.magwien.gv.at
Lists: pgsql-hackers

Greg Stark wrote:
> > So I think one would have to add intention locks for rows considered
> > in the WHERE clause to guarantee true serializability.
>
> Does the paper explain how to deal with rows "considered" in the WHERE clause
> which don't yet exist? Ie, "SELECT count(*) WHERE foo" needs to take out a
> lock which would cause any transaction which inserts a new record where foo is
> true to be abort.

Quote:
"To prevent phantoms in a system with row-level locking and versioning,
the algorithm described here would need to be extended to take SIREAD locks
on larger granules analogously to multigranularity intention locks in
traditional two-phase locking systems."

[...]

"We have not pursued the details in this paper because the phantom
issue does not arise in our prototype implementation, since Oracle
Berkeley DB does all locking and versioning at page granularity."

End quote.

> Are these intention locks predicate locks, in that they're not associated with
> actual pages or records but with potential records which might be inserted in
> the future?

No, they are associated with the page that contains the actual record.

I think that's also meant with the "larger granules" in the above quote:
Take an intention lock on every page which might affect the condition.

Yours,
Laurenz Albe


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 12:48:03
Message-ID: 1241700483.6109.151.camel@ebony.2ndQuadrant
Lists: pgsql-hackers


On Thu, 2009-05-07 at 15:26 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > It wouldn't be 692 lines of code and even if it were the impact of that
> > code would be such that it would need to be optional, since it would
> > differ in definition from an existing SQL Standard isolation mode and it
> > would have additional performance implications.
>
> I thought it would be equal to the SQL standard Serializable mode,
> whereas what we currently call serializable is in fact not as strong as
> the SQL standard Serializable mode.

Our serializable is the same as Oracle's implementation. I think it
would be confusing and non-useful to redefine ours, when it has already
been accepted that the Oracle definition implements the standard
reasonably closely. If that changes, we should also, however.

Perhaps the key point is the O(N^2) complexity of the new algorithm.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <pgsql-hackers(at)postgresql(dot)org>, "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: <mjc(at)it(dot)usyd(dot)edu(dot)au>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 13:55:19
Message-ID: 4A02A1F7.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers

"Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at> wrote:

> maybe I misunderstood something.
>
> Consider a function
> "makehighlander(personid integer) RETURNS void"
> defined like this:
>
> SELECT ishighlander INTO b FROM scots WHERE id=personid;
> IF b THEN
> RETURN; /* no need to do anything */
> END IF;
> UPDATE scots SET ishighlander=TRUE WHERE id=personid;
> SELECT count(*) INTO n FROM scots WHERE ishighlander;
> IF (n > 1) THEN
> RAISE EXCEPTION 'There can be only one';
> END IF;
>
> If we assume that "ishighlander" is false for all records in
> the beginning, and there are two calls to the function with
> two personid's of records *in different pages*, then there cannot be
> any conflicts since all (write and intention) locks taken by each of
> these calls should only affect the one page that contains the one
> record that is updated and then found in the subsequent SELECT.
>
> Yet if the two execute concurrently and the two first SELECTs are
> executed before the two UPDATEs, then both functions have a snapshot
> so that the final SELECT statements will return 1 and both functions
> will succeed, leaving the table with two highlanders.

I do think you misunderstood. If there are two concurrent executions
and each reads one row, there will be an SIREAD lock for each of those
rows. As an example, let's say that one of them (T0) updates its row
and does its count, finds everything looks fine, and commits. In
reading the row the other transaction (T1) modified it sets the
T0.outConflict flag to true and the T1.inConflict flag to true. No
blocking occurs. Now T1 updates its row. Still no problem, because
if it committed there, there would still be a sequence of transactions
(T0 followed by T1) which would be consistent with the results; but it
selects rows which include the one modified by T0, which causes
T0.inConflict and T1.outConflict to be set to true. These would both
be pivots in an unsafe pattern of updates. No mystery which one needs
to be rolled back -- T0 has already committed; so T1 is rolled back
with a serialization failure (probably indicating that it is an unsafe
update versus an update conflict or a deadlock, which are two other
forms of serialization failure). Assuming that the software
recognizes the serialization failure code and retries, it now finds
that there is already a highlander and fails for real.
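The flag bookkeeping in this walkthrough can be modelled in a few
lines (a sketch of the paper's inConflict/outConflict idea with
hypothetical names, not an implementation proposal):

```python
class Txn:
    def __init__(self, name):
        self.name = name
        self.in_conflict = False   # a concurrent txn read what we wrote
        self.out_conflict = False  # a concurrent txn wrote what we read
        self.committed = False
        self.aborted = False

def maybe_abort(t):
    # A transaction with both flags set is a pivot in a dangerous
    # structure; a committed transaction can no longer be the victim.
    if t.in_conflict and t.out_conflict and not t.committed:
        t.aborted = True

def mark_rw_conflict(reader, writer):
    """reader read a version that writer replaced (rw-dependency)."""
    reader.out_conflict = True
    writer.in_conflict = True
    maybe_abort(reader)
    maybe_abort(writer)

t0, t1 = Txn("T0"), Txn("T1")
mark_rw_conflict(t0, t1)  # T0's count reads the row T1 modified
t0.committed = True       # T0 commits; no blocking occurred
mark_rw_conflict(t1, t0)  # T1's count reads the row T0 updated
print(t1.aborted, t0.aborted)  # True False: T1 gets the failure
```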

-Kevin


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Gregory Stark" <stark(at)enterprisedb(dot)com>, "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: <mjc(at)it(dot)usyd(dot)edu(dot)au>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 14:07:01
Message-ID: 4A02A4B5.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers

Gregory Stark <stark(at)enterprisedb(dot)com> wrote:
> "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at> writes:
>
>> So I think one would have to add intention locks for rows
>> considered in the WHERE clause to guarantee true serializability.
>
> Does the paper explain how to deal with rows "considered" in the
> WHERE clause which don't yet exist? Ie, "SELECT count(*) WHERE foo"
> needs to take out a lock which would cause any transaction which
> inserts a new record where foo is true to be abort.

The issue is mentioned, along with the note, quoted in my original
post, of why they were able to dodge the issue in the Berkeley DB
implementation.

> In MSSQL this requires locking the page of the index where such
> records would be inserted (or the entire table if there's no index).

This is the only form of predicate locking I've seen in real-world
production databases which provide true serializable behavior.

> In Predicate locking schemes this requires a separate storage
> structure for storing such predicates which can be arbitrarily
> complex expressions to check any new tuple being inserted against.

I've never seen that done in real-world production databases, although
I'm sure it's pretty in theory.

> Are these intention locks predicate locks, in that they're not
> associated with actual pages or records but with potential records
> which might be inserted in the future?

They are predicate locks in the sense that they detect all conflicts
which could occur based on the actual predicate, though they tend to
indicate conflicts in some situations where a rigorous (and expensive)
analysis of the actual predicates might not; but please note that such
locks would be SIREAD locks, which don't block any data modification,
but are only used to detect dangerous update patterns.

-Kevin


From: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Kevin Grittner *EXTERN*" <Kevin(dot)Grittner(at)wicourts(dot)gov>, <pgsql-hackers(at)postgresql(dot)org>
Cc: <mjc(at)it(dot)usyd(dot)edu(dot)au>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 14:13:12
Message-ID: D960CB61B694CF459DCFB4B0128514C202FF65B2@exadv11.host.magwien.gv.at
Lists: pgsql-hackers

Kevin Grittner wrote:
> > maybe I misunderstood something.
> >
> > Consider a function
> > "makehighlander(personid integer) RETURNS void"
> > defined like this:
> >
> > SELECT ishighlander INTO b FROM scots WHERE id=personid;
> > IF b THEN
> > RETURN; /* no need to do anything */
> > END IF;
> > UPDATE scots SET ishighlander=TRUE WHERE id=personid;
> > SELECT count(*) INTO n FROM scots WHERE ishighlander;
> > IF (n > 1) THEN
> > RAISE EXCEPTION 'There can be only one';
> > END IF;
> >
> > If we assume that "ishighlander" is false for all records in
> > the beginning, and there are two calls to the function with
> > two personid's of records *in different pages*, then there cannot be
> > any conflicts since all (write and intention) locks taken by each of
> > these calls should only affect the one page that contains the one
> > record that is updated and then found in the subsequent SELECT.
> >
> > Yet if the two execute concurrently and the two first SELECTs are
> > executed before the two UPDATEs, then both functions have a snapshot
> > so that the final SELECT statements will return 1 and both functions
> > will succeed, leaving the table with two highlanders.
>
> I do think you misunderstood. If there are two concurrent executions
> and each reads one row, there will be an SIREAD lock for each of those
> rows. As an example, let's say that one of them (T0) updates its row
> and does its count, finds everything looks fine, and commits. In
> reading the row the other transaction (T1) modified it sets the
> T0.outConflict flag to true and the T1.inConflict flag to true.

Where does T0 read the row that T1 modified?

> No
> blocking occurs. Now T1 updates its row.

Wait a minute, I am confused. I thought T1 had already modified the row
before T0 committed? Or is "modify" not the update?

> Still no problem, because
> if it committed there, there would still be a sequence of transactions
> (T0 followed by T1) which would be consistent with the results; but it
> selects rows which include the one modified by T0, which causes
> T0.inConflict and T1.outConflict to be set to true.

Where does T1 select rows that were modified by T0? It selects only one
row, the one it modified itself, right?

> These would both
> be pivots in an unsafe pattern of updates. No mystery which one needs
> to be rolled back -- T0 has already committed; so T1 is rolled back
> with a serialization failure (probably indicating that it is an unsafe
> update versus an update conflict or a deadlock, which are two other
> forms of serialization failure). Assuming that the software
> recognizes the serialization failure code and retries, it now finds
> that there is already a highlander and fails for real.

You see, there must be something fundamental I am getting wrong.

Yours,
Laurenz Albe


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Simon Riggs" <simon(at)2ndQuadrant(dot)com>, "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 14:16:25
Message-ID: 4A02A6E9.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers

Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> Simon Riggs wrote:
>> It wouldn't be 692 lines of code and even if it were the impact of
>> that code would be such that it would need to be optional, since it
>> would differ in definition from an existing SQL Standard isolation
>> mode and it would have additional performance implications.
>
> I thought it would be equal to the SQL standard Serializable mode,
> whereas what we currently call serializable is in fact not as strong
> as the SQL standard Serializable mode.

Exactly. The standard probably *should* add SNAPSHOT between
REPEATABLE READ and SERIALIZABLE, but so far has not. As of the 2003
version of the SQL spec, they added explicit language that makes it
clear that what you get when you ask for SERIALIZABLE mode in
PostgreSQL is *not* compliant (although it is more than adequate for
REPEATABLE READ).

By the way, the other modes are all optional, as you're allowed to
escalate to a higher level whenever a lower level is requested;
SERIALIZABLE is required by the standard and is specified as the
default.

-Kevin


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <pgsql-hackers(at)postgresql(dot)org>, "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: <mjc(at)it(dot)usyd(dot)edu(dot)au>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 14:27:13
Message-ID: 4A02A971.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers

"Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at> wrote:
> Kevin Grittner wrote:

>> I do think you misunderstood. If there are two concurrent
>> executions and each reads one row, there will be an SIREAD lock for
>> each of those rows. As an example, let's say that one of them (T0)
>> updates its row and does its count, finds everything looks fine,
>> and commits. In reading the row the other transaction (T1)
>> modified it sets the T0.outConflict flag to true and the
>> T1.inConflict flag to true.
>
> Where does T0 read the row that T1 modified?

As I said in the original post, I think we would need to track SIREAD
locks in the structures which back the pg_locks view.

>> blocking occurs. Now T1 updates its row.
>
> Wait a minute, I am confused. I thought T1 had already modified the
> row before T0 committed? Or is "modify" not the update?

There are so many sequences that I didn't think it was worthwhile to
step through them all. I did say "As an example, let's say that one of
them (T0) updates its row and does its count, finds everything looks
fine, and commits." If you want to work through the case where they
both UPDATE their rows before either commits, OK; it's not that
different. Things are OK as far as the first select of a modified row
by the other transaction; you record inConflict for one and
outConflict for the other. At the point where it goes both
directions, it is clear that there is a dangerous interaction and one
or the other is rolled back.
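A minimal sketch of the bookkeeping described above may help; the
`in_conflict`/`out_conflict` flag names follow the paper's terminology,
but everything else here is illustrative pseudologic, not PostgreSQL
internals:

```python
class Txn:
    def __init__(self, name):
        self.name = name
        self.in_conflict = False   # a concurrent txn read data this txn wrote
        self.out_conflict = False  # this txn read data a concurrent txn wrote

def mark_rw_conflict(reader, writer):
    """Record a read-write antidependency: reader saw a row version that
    writer has since replaced, so reader must appear 'before' writer in
    any equivalent serial order."""
    reader.out_conflict = True
    writer.in_conflict = True

def must_abort(txn):
    # A txn with conflicts in both directions is the pivot of a
    # dangerous structure; one participant gets rolled back.
    return txn.in_conflict and txn.out_conflict

t0, t1 = Txn("T0"), Txn("T1")
mark_rw_conflict(t0, t1)   # T0 reads the row T1 updates: one edge, still safe
assert not must_abort(t0) and not must_abort(t1)
mark_rw_conflict(t1, t0)   # T1 also reads the row T0 updates: both directions
assert must_abort(t0) and must_abort(t1)
```

A single edge leaves both transactions untouched; only when the edges
run in both directions does a rollback become necessary, which is
exactly the point made above.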

> Where does T1 select rows that were modified by T0? It selects only
> one row, the one it modified itself, right?

You have to select it to know whether to count it, right?

> You see, there must be something fundamental I am getting wrong.

It is such a radical departure from traditional blocking approaches
that it can be hard to get your head around it. :)

-Kevin


From: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Kevin Grittner *EXTERN*" <Kevin(dot)Grittner(at)wicourts(dot)gov>, <pgsql-hackers(at)postgresql(dot)org>
Cc: <mjc(at)it(dot)usyd(dot)edu(dot)au>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 14:40:47
Message-ID: D960CB61B694CF459DCFB4B0128514C202FF65B3@exadv11.host.magwien.gv.at
Lists: pgsql-hackers

Kevin Grittner wrote:
> > Where does T1 select rows that were modified by T0? It selects only
> > one row, the one it modified itself, right?
>
> You have to select it to know whether to count it, right?

We are getting closer.

So an SIREAD lock is taken for every row that is examined during
the execution of a query plan?

Ah.

What if there is an index on the "ishighlander" row?
Then an index scan would find only one candidate to examine,
and the other rows would not even be touched by the execution plan.
Then how would they acquire an SIREAD lock?

Yours,
Laurenz Albe


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <pgsql-hackers(at)postgresql(dot)org>, "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: <mjc(at)it(dot)usyd(dot)edu(dot)au>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 15:16:00
Message-ID: 4A02B4E0.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers

"Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at> wrote:

> What if there is an index on the "ishighlander" row?
> Then an index scan would find only one candidate to examine,
> and the other rows would not even be touched by the execution plan.

I assume you're talking about this line of your function:

SELECT count(*) INTO n FROM scots WHERE ishighlander;

I'm further assuming that you meant an index on the ishighlander
*column*.

I can think of more than one way to handle that. Off the top of my
head, I think it would work to acquire an update lock on both old and
new index entries for a row when it is updated, and to lock the range
of an index used for a scan with the new SIREAD lock. Or perhaps,
since the row must be visited to test visibility, the update lock
could be on the old and new rows, and the index scan would find the
conflict in that way. Or it could keep track of the various tuples
which represent different mutations of a row, and link back from the
"not visible to me" row which has been updated to true, and find that
it is a mutation of a visible row.

These are important implementation details to be worked out (very
carefully!). I don't claim to have worked through all such details
yet, or even to be familiar enough with the PostgreSQL internals to do
so in a reasonable time. :-(

-Kevin


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Simon Riggs" <simon(at)2ndQuadrant(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 15:56:48
Message-ID: 4A02BE70.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:

> It wouldn't be 692 lines of code

Agreed. The original implementation was in an MVCC database which
already supported full serializability using strict two-phase locking
and used page-level locks. Both of these made the implementation
simpler than it would be in PostgreSQL. (And that's not even
mentioning sub-transactions and distributed transactions!)

> and even if it were the impact of that
> code would be such that it would need to be optional

I was thinking perhaps a GUC to allow "traditional" behavior when
SERIALIZABLE is requested versus using snapshot isolation for
REPEATABLE READ and this new technique for SERIALIZABLE. Would that
be sane?

> If the use is optional, I would currently prefer the existing
> mechanism for implementing serialization, which is to serialize
> access directly using either a LOCK statement or an exclusive
> advisory lock.

I'm sure many will, particularly where the number of tables is less
than 100 and the number of queries which can be run concurrently is
only a thousand or two. Picking out the potential conflicts and
hand-coding serialization techniques becomes more feasible on a small
scale like that.

That said, there's a lot less room for mistakes here, once this new
technique is implemented and settled in. When I was discussing the
receipting and deposit scenario while trying to clarify the
documentation of current behavior, I received several suggestions from
respected members of this community for how that could be handled with
existing techniques which didn't, in fact, correct the problem. That
just points out to me how tricky it is to solve on an ad hoc basis, as
opposed to a more rigorous technique like the one described in the
paper.

The only suggested fix which *did* work forced actual serialization of
all receipts as well as actual serialization of those with the deposit
report query. The beauty of this new technique is that there would
not be any blocking in the described scenario, and there would be a
rollback with serialization failure if (and only if) there was an
attempt to run the deposit report query while a transaction for a
receipt on the old date was still pending. I suspect that the
concurrency improvements of the new technique over existing safe
techniques would allow it to scale well, at least in our environment.

> It's clear that any new-theory solution will cost significantly more
> as the number of users increases, at least O(N^2), whereas simply
> waiting is only O(N), AFAICS.

I'm not following your reasoning on the O(N^2). Could you explain why
you think it would follow that curve?

> So it seems its use would require some thought and care and possibly
> further research to uncover areas of applicability in real usage.

Care -- of course. Real usage for serializable transactions -- well
known already. (Or are you just questioning performance here?)

> So for me, I would say we leave this be until the SQLStandard
> changes to recognise the additional mode.

It already recognizes this mode; it doesn't yet recognize snapshot
isolation (more's the pity).

> I don't see much advantage for us in breaking the ground on this
> feature and it will be costly to implement, so is a good PhD
> project.

Apparently it's already been done as a PhD project -- by Michael
Cahill, against InnoDB.

-Kevin


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Simon Riggs" <simon(at)2ndQuadrant(dot)com>
Cc: <Michael Cahill <mjc(at)it(dot)usyd(dot)edu(dot)au>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 16:15:57
Message-ID: 4A02C2ED.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers

Please keep Michael Cahill copied on this thread, per his request.

I just noticed the omission on a few messages and will forward them to
him.

-Kevin


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 17:13:08
Message-ID: 1241716388.6109.297.camel@ebony.2ndQuadrant
Lists: pgsql-hackers


On Thu, 2009-05-07 at 11:15 -0500, Kevin Grittner wrote:

> Please keep Michael Cahill copied on this thread, per his request.
>
> I just noticed the omission on a few messages and will forward them to
> him.

Apologies Michael, I see that my mail did remove you. That was an
unconscious error; I was particularly interested in your comments
regarding my assessment of the algorithmic complexity of the new theory
and existing serialization technique.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, "Michael Cahill mjc"(at)it(dot)usyd(dot)edu(dot)au
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 17:31:19
Message-ID: 1241717479.6109.314.camel@ebony.2ndQuadrant
Lists: pgsql-hackers


On Thu, 2009-05-07 at 10:56 -0500, Kevin Grittner wrote:
> > It's clear that any new-theory solution will cost significantly more
> > as the number of users increases, at least O(N^2), whereas simply
> > waiting is only O(N), AFAICS.
>
> I'm not following your reasoning on the O(N^2). Could you explain why
> you think it would follow that curve?

Each user must compare against work performed by all other users. O(N).

There are N users, so O(N^2).

With reasonable tuning we can make that work with 10 users each checking
the other's data, but with a 100 we'll end up spending more time
checking for aborts (and aborting) than we would if we had just queued
up for it.

If you want this, the simplest implementation is to quite literally
allow only a single SERIALIZABLE txn onto the system at any time. All
other SERIALIZABLEs queue. Note that simple serialization requires no
special handling for aborted transactions. Implementing that will be
fast, proving it works is trivial and it seems will work better in the
general case.

Yeh, it sucks for medium arrival rate transactions, but it's great for
low or high arrival rate transactions. The new model is good for medium
arrival rates only and will take a lot of work to implement, correctly
and sufficiently optimally to keep the applicability window wide enough
to justify the effort.

Optimising it would basically entail implementing the equivalent of
block-level locking.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Simon Riggs" <simon(at)2ndQuadrant(dot)com>, "Michael Cahill mjc"(at)it(dot)usyd(dot)edu(dot)au>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 17:39:25
Message-ID: 4A02D67D.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:

> Each user must compare against work performed by all other users.
> O(N).
>
> There are N users, so O(N^2).

Why does that apply here and not in the update conflict detection?

-Kevin


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 19:36:37
Message-ID: 1241724997.6109.377.camel@ebony.2ndQuadrant
Lists: pgsql-hackers


On Thu, 2009-05-07 at 12:39 -0500, Kevin Grittner wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:
>
> > Each user must compare against work performed by all other users.
> > O(N).
> >
> > There are N users, so O(N^2).
>
> Why does that apply here and not in the update conflict detection?

I think the shoe is on the other foot. :-)

Explain what you think the algorithmic complexity is, and why, if that's
not correct. Can you beat O(N), with Postgres?

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Simon Riggs" <simon(at)2ndQuadrant(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 20:10:41
Message-ID: 4A02F9F1.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:
> On Thu, 2009-05-07 at 12:39 -0500, Kevin Grittner wrote:
>> Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:
>>
>> > Each user must compare against work performed by all other users.
>> > O(N).
>> >
>> > There are N users, so O(N^2).
>>
>> Why does that apply here and not in the update conflict detection?
>
> I think the shoe is on the other foot. :-)

That's a question, and I think a fair one. As with update conflict
detection, you check whether there are any conflicting locks for what
you are currently accessing. For most usage patterns you won't find
conflicting access the vast majority of the time. The assertion that
there is some need for each session to wade through something for
every other session seems baseless to me. I'm wondering what I might
be missing.

If you throw a formula out there, I do think it's incumbent on you to
explain why you think it fits. If I threw a formula out there, then
it would be fair of you to ask me to explain how I got to it. I'm not
at a point where I think I can estimate performance impact. I guess I
would tend to start from the benchmarks published in the paper, some
of which were confirmed by the ACM SIGMOD repeatability committee.
Eyeballing that, it looks to me like the worst case they found was
about a 15% performance hit, with large stretches of some of the
graphs hanging within 1% of the performance of straight snapshot
isolation.

I think that given published benchmarks with confirmation from an
independent organization like ACM, it would be incumbent on anyone who
questions the benchmarks to explain why they think they're not
accurate or useful. The only concern I've seen so far has been that
these benchmarks lack long and complex database transactions, which
seems like a fair concern. Scalability with additional concurrent
sessions seems good as far as they took it, which was 50 sessions.
Even on a server with 16 CPUs backing a database with 3 million to 4
million hits per day, with tens of millions of database transactions
per day, we use a connection pool with fewer sessions than that.

-Kevin


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 20:28:54
Message-ID: 1241728134.6109.388.camel@ebony.2ndQuadrant
Lists: pgsql-hackers


On Thu, 2009-05-07 at 15:10 -0500, Kevin Grittner wrote:
> The assertion that
> there is some need for each session to wade through something for
> every other session seems baseless to me. I'm wondering what I might
> be missing.

That's Greg's point. Do we need full locking of everything we might
touch, or tracking of what we have touched? That question is still
unanswered.

If you need the "might touch" then you either need to implement locking
that will effect everybody (which ain't ever gonna fly round here), or
you implement a scheme that is harder work but avoids locking. That is
clearly O(N^2) for a non-locking design.

If you track "have touched" only then we can do that with a hash table
in shared memory. That would be O(k), if it is possible.
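The "have touched" variant can indeed be sketched with an ordinary hash
table: a conflict check on write is then one expected O(1) probe per
tuple touched, independent of how many sessions exist, rather than a
scan over all other sessions. The keying scheme below is purely
illustrative:

```python
from collections import defaultdict

# (relation, tuple_id) -> set of txn ids holding an SIREAD lock on it
siread_locks = defaultdict(set)

def record_read(txn_id, rel, tid):
    """Register an SIREAD lock when a txn reads a tuple."""
    siread_locks[(rel, tid)].add(txn_id)

def conflicting_readers(txn_id, rel, tid):
    """On write, one hash probe finds every concurrent reader of this
    tuple; cost does not grow with the total number of sessions."""
    return siread_locks[(rel, tid)] - {txn_id}

record_read(1, "scots", 42)
record_read(2, "scots", 42)
assert conflicting_readers(3, "scots", 42) == {1, 2}
assert conflicting_readers(3, "scots", 99) == set()
```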

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Greg Stark <stark(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, "Michael Cahill mjc"(at)it(dot)usyd(dot)edu(dot)au, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 21:32:12
Message-ID: 4136ffa0905071432p67ce9343h3d1e78702233cb36@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 7, 2009 at 6:31 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> Each user must compare against work performed by all other users. O(N).
>
> There are N users, so O(N^2).

I think this logic only works if you must scan every item for every
other user every time. If you have data structures like binary trees
or whatever to find any matching predicate locks or intent locks or
whatever we're calling them, then you can hopefully find them in faster
than O(N) time.

I'm not sure you can do better than a full linear search though. If I
do something like "SELECT count(*) FROM tab WHERE
complex_function(a,b) = 5"

And then you "INSERT INTO tab (a,b) VALUES (1,2)". How would you store
any record of the fact that there's a serialization failure iff
complex_function(1,2)=5 in any way that lets you look it up in any way
other than evaluating complex_function for every set of values
inserted?
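The worry above can be made concrete. An opaque predicate gives a lock
manager nothing to index on, so the only general fallback is to
re-evaluate every registered predicate against each inserted row. This
hypothetical sketch (with `complex_function` standing in for any opaque
expression) shows that cost growing with the number of predicate locks:

```python
# Hypothetical predicate-lock fallback: with no indexable structure,
# each registered predicate must be evaluated for every inserted row.
def complex_function(a, b):
    return a + b  # stand-in for any opaque, unindexable expression

predicate_locks = []  # (txn_id, predicate) pairs

def register_predicate(txn_id, pred):
    predicate_locks.append((txn_id, pred))

def check_insert(a, b):
    """Return txn ids whose predicate the new row satisfies.
    O(number of predicate locks) per insert -- the linear search
    Greg describes."""
    return [t for t, pred in predicate_locks if pred(a, b)]

register_predicate(1, lambda a, b: complex_function(a, b) == 5)
assert check_insert(1, 2) == []   # complex_function(1,2) != 5: no conflict
assert check_insert(2, 3) == [1]  # complex_function(2,3) == 5: rw-conflict
```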

--
greg


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Simon Riggs" <simon(at)2ndQuadrant(dot)com>
Cc: <Michael Cahill <mjc(at)it(dot)usyd(dot)edu(dot)au>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 21:37:56
Message-ID: 4A030E64.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:

> Do we need full locking of everything we might
> touch, or tracking of what we have touched?

> If you need the "might touch" then you either need to implement
> locking that will affect everybody (which ain't ever gonna fly round
> here), or you implement a scheme that is harder work but avoids
> locking. That is clearly O(N^2) for non-locking design.
>
> If you track "have touched" only then we can do that with a hash
> table in shared memory. That would be O(k), if it is possible.

To quote what I think is a relevant section from the paper:

> One property of Berkeley DB that simplified our implementation was
> working with page level locking and versioning. In databases that
> version and lock at row-level granularity (or finer), additional
> effort would be required to avoid phantoms, analogous to standard
> two phase locking approaches such as multigranularity locking.

Since these techniques are used in quite a few databases, I assume
that implementation is fairly well understood. The big difference is
that rather than traditional read locks which block updates, it would
be these new non-blocking SIREAD locks. As I understand it, the point
of this technique is to approximate "might touch" through locking
"have touched" on both rows and index ranges. I know that is
considered crude by some, but the O(N^2) cost of actual predicate lock
calculation would be insane in most real-world environments.

I do have to concede that the paper is silent on how transactions at
other isolation levels behave in this mix. On a first think-through,
it doesn't seem like they would need to obtain SILOCKs for their
reads, since there is no guarantee that they see things in a state
which would be consistent with some serial execution of the database
transactions. I don't think transactions at less stringent
transaction isolation levels need to look for SILOCKs, either. I
wouldn't consider my first pass thinking it through to be particularly
definitive, though.

That interpretation would mean, however, that while the serializable
transactions would satisfy the new, more stringent requirements of
recent versions of the SQL standard, they would still not provide
quite the same guarantees as traditional blocking serializable
transactions. In my receipting example, traditional techniques would
cause the attempt to update the control record to block until the
receipts on the old date committed or rolled back, and the attempt to
report the day's receipts couldn't proceed until the control record
update was committed, so as long as the transactions which modify data
were serializable, no select at READ COMMITTED or higher could see a
state inconsistent with some serial application of the serializable
transactions. With this interpretation, even a SELECT-only
transaction would need to be SERIALIZABLE to ensure that it did
not see the new deposit date when there were still pending receipts
for the old deposit date. I think I'm OK with that if everyone else
is.

-Kevin


From: Greg Stark <stark(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 21:47:27
Message-ID: 4136ffa0905071447y3cdd3143v896d95de1795f488@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 7, 2009 at 6:13 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> Apologies Michael, I see that my mail did remove you. That was an
> unconscious error; I was particularly interested in your comments
> regarding my assessment of the algorithmic complexity of the new theory
> and existing serialization technique.

Confusingly, you didn't CC him on this message either?

However on subsequent messages you attempted to re-add him but got his
email address wrong. I assume everyone else got a bounce like I got?

--
greg


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Greg Stark" <stark(at)enterprisedb(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 21:55:14
Message-ID: 4A031272.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers

Greg Stark <stark(at)enterprisedb(dot)com> wrote:

> However on subsequent messages you attempted to re-add him but got
> his email address wrong. I assume everyone else got a bounce like I
> got?

Some of my emails are getting through; some not. I haven't figured
out why. I'm calling it "best effort" for now, and will send him a
link to the thread in the archives.

-Kevin


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Greg Stark" <stark(at)enterprisedb(dot)com>
Cc: "Michael Cahill mjc"(at)it(dot)usyd(dot)edu(dot)au>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 22:08:21
Message-ID: 4A031585.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers

Greg Stark <stark(at)enterprisedb(dot)com> wrote:

> If I do something like "SELECT count(*) FROM tab WHERE
> complex_function(a,b) = 5"
>
> And then you "INSERT INTO tab (a,b) VALUES (1,2)". How would you
> store any record of the fact that there's a serialization failure
> iff complex_function(1,2)=5 in any way that lets you look it up in
> any way other than evaluating complex_function for every set of
> values inserted?

I'd be the last one to shoot down a brighter idea if someone has one,
but I would assume that SELECT shown above would either resolve to a
table scan, in which case you would have to have an SIREAD lock at the
table level, or there would be an index on that function, in which
case you could take out an SIREAD range lock on the appropriate part
of the index.

That said, the above would not cause a serialization failure. It
would not cause any blocking. Even if both queries were concurrent,
this would be fine in any order of the steps executing, and it would
meet the requirements of the standard because there is *some order of
serial execution* which would generate the same results as the
concurrent execution -- specifically, the SELECT would appear to have
run before the INSERT.

It would create an edge which would be *halfway* to a problem. If the
transaction doing the SELECT also modified data which was selected by
some other transaction, or the transaction doing the insert also
selected data which was modified by some other transaction, *then*
something would need to roll back.

-Kevin


From: Greg Stark <stark(at)enterprisedb(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 22:39:58
Message-ID: 4136ffa0905071539m298ff09fqa67da89abd204a1d@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 7, 2009 at 11:08 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> I would assume that SELECT shown above would either resolve to a
> table scan, in which case you would have to have an SIREAD lock at the
> table level

That sounds like we're back to the MSSQL/Sybase way of doing things
where you have to understand the query plan to understand why you're
getting spurious serialization failures. I don't think that's terribly
appealing. Consider, for example, that we might not *want* to do an
index scan just because there's an index. Or there could be multiple
indexes on the function, we definitely wouldn't want to have to check
for range locks on every index.

We have to think outside of the box and get away from the pre-existing
implementations which have solutions which aren't really applicable.

If we were to look at doing predicate locks in any way they would
probably be stored in a whole new data structure, not related to the
indexes on the table. We would need some way to index them so that we
can look for conflicting locks efficiently independently from the
query plan and table access methods.

I've removed the broken email address for now -- please re-add the
correct email address.

--
greg


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Greg Stark <stark(at)enterprisedb(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-08 04:17:42
Message-ID: 1241756262.6109.414.camel@ebony.2ndQuadrant
Lists: pgsql-hackers


On Thu, 2009-05-07 at 22:47 +0100, Greg Stark wrote:
> On Thu, May 7, 2009 at 6:13 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > Apologies Michael, I see that my mail did remove you. That was an
> > unconscious error; I was particularly interested in your comments
> > regarding my assessment of the algorithmic complexity of the new theory
> > and existing serialization technique.
>
> confusingly you didn't CC him on this message either?
>
> However on subsequent messages you attempted to re-add him but got his
> email address wrong. I assume everyone else got a bounce like I got?

Something wrong with the email address causes it to be removed after I
send. Not seen anything like that before; I'm not consciously removing
Michael anyway.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Kevin Grittner *EXTERN*" <Kevin(dot)Grittner(at)wicourts(dot)gov>, <pgsql-hackers(at)postgresql(dot)org>
Cc: <mjc(at)it(dot)usyd(dot)edu(dot)au>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-08 06:18:11
Message-ID: D960CB61B694CF459DCFB4B0128514C202FF65B4@exadv11.host.magwien.gv.at
Lists: pgsql-hackers

Kevin Grittner wrote:
> > What if there is an index on the "ishighlander" row?
> > Then an index scan would find only one candidate to examine,
> > and the other rows would not even be touched by the execution plan.
>
> I assume you're talking about this line of your function:
>
> SELECT count(*) INTO n FROM scots WHERE ishighlander;

Right.

> I'm further assuming that you meant an index on the ishighlander
> *column*.

Of course. Sorry for the sloppiness.

> I can think of more than one way to handle that. Off the top of my
> head, I think it would work to acquire an update lock on both old and
> new index entries for a row when it is updated, and to lock the range
> of an index used for a scan with the new SIREAD lock. Or perhaps,
> since the row must be visited to test visibility,

As far as I know, only the table rows that are found in the index scan
are examined for visibility. Which would be only one in my example.

> the update lock
> could be on the old and new rows, and the index scan would find the
> conflict in that way. Or it could keep track of the various tuples
> which represent different mutations of a row, and link back from the
> "not visible to me" row which has been updated to true, and find that
> it is a mutation of a visible row.
>
> These are important implementation details to be worked out (very
> carefully!). I don't claim to have worked through all such details
> yet, or even to be familiar enough with the PostgreSQL internals to do
> so in a reasonable time. :-(

Of course, and that is leading us too far. Thanks for your patience.

But in your attempt to sketch a way how true serializability could
be implemented, you went beyond the scope of the original paper,
which does not claim to tackle "phantoms".

I think the idea is promising, and it would be interesting to see
performance results for an implementation that covers predicates.

As you mentioned in your first post, there will not be a free lunch.
What hurts concurrency in an implementation with blocking read locks
might also hurt concurrency in an implementation where transactions
are frequently aborted and have to be retried.

Yours,
Laurenz Albe


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Greg Stark" <stark(at)enterprisedb(dot)com>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-08 14:00:15
Message-ID: 4A03F49F.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers

Greg Stark <stark(at)enterprisedb(dot)com> wrote:
> On Thu, May 7, 2009 at 11:08 PM, Kevin Grittner
> <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
>> I would assume that SELECT shown above would either resolve to a
>> table scan, in which case you would have to have an SIREAD lock at
>> the table level
>
> That sounds like we're back to the MSSQL/Sybase way of doing things

Other than not blocking, I suppose.

> where you have to understand the query plan to understand why you're
> getting spurious serialization failures.

You have to know a lot more than that to solve serialization problems
in PostgreSQL with current techniques.

> I don't think that's terribly appealing. Consider, for example, that
> we might not *want* to do an index scan just because there's an
> index.

Someone more familiar than I with S2PL would be better able to
respond, but I *think* you just need to track this on whatever path
is actually chosen, not on all possible paths.

> Or there could be multiple indexes on the function, we definitely
> wouldn't want to have to check for range locks on every index.

Agreed. And I don't think we have to.

> We have to think outside of the box and get away from the
> pre-existing implementations which have solutions which aren't
> really applicable.

Well, this paper is well outside the box in many respects, although it
still does use well-worn techniques for some portions of the
processing. If PostgreSQL developers can expand on that with their
own innovations, fantastic!

> If we were to look at doing predicate locks in any way they would
> probably be stored in a whole new data structure, not related to the
> indexes on the table. We would need some way to index them so that
> we can look for conflicting locks efficiently independently from the
> query plan and table access methods.

That might burden this with much worse performance. I think that the
reason most products *do* base it on the actual rows, blocks, or
tables referenced is that it gives the needed behaviors with good
performance. Like I said, if we want to combine the innovations in
the paper with something clever and new in the predicate locking area,
OK -- as long as that isn't the cause for sinking the part that can
already be shown to work.

> I've removed the broken email address for now -- please re-add the
> correct email address.

I'll see what I can find. When I last corresponded with him, he was
still at the University of Sydney, working on his PhD by applying
these techniques to InnoDB. He specified that email address for
copying him when I opened the dialog. I don't think either of us
expected it to take this long for PostgreSQL to get to beta so that I
could do so. He said that when that was complete, he would be working
full-time for Oracle, so he's probably moved on to a new location and
this email address has gone stale. I'll see what I can track down.

-Kevin


From: Greg Stark <stark(at)enterprisedb(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-08 14:05:17
Message-ID: 4136ffa0905080705q57dd90f3q45d1d551b09c65c4@mail.gmail.com
Lists: pgsql-hackers

On Fri, May 8, 2009 at 3:00 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
>
>> I've removed the broken email address for now -- please re-add the
>> correct email address.
>
> I'll see what I can find.  When I last corresponded with him, he was
> still at the University of Sydney, working on his PhD by applying
> these techniques to InnoDB.  He specified that email address for
> copying him when I opened the dialog.  I don't think either of us
> expected it to take this long for PostgreSQL to get to beta so that I
> could do so. He said that when that was complete, he would be working
> full-time for Oracle, so he's probably moved on to a new location and
> this email address has gone stale.  I'll see what I can track down.

For what it's worth, I don't think this address has gone stale; I
think someone just got the email address messed up when adding it
manually.
The address that was in the headers before was:

"Michael Cahill mjc"@it.usyd.edu.au

Whereas I suspect the right address was:

"Michael Cahill" mjc(at)it(dot)usyd(dot)edu(dot)au

I'll add that on my followups; if others could do the same when
they're responding to a message where he isn't copied, he should end
up back on all the responses.

--
greg


From: Greg Stark <stark(at)enterprisedb(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org, Michael Cahill <mjc(at)it(dot)usyd(dot)edu(dot)au>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-08 14:14:39
Message-ID: 4136ffa0905080714g70845e02if8b971ce24480120@mail.gmail.com
Lists: pgsql-hackers

On Fri, May 8, 2009 at 3:00 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> Greg Stark <stark(at)enterprisedb(dot)com> wrote:
>> On Thu, May 7, 2009 at 11:08 PM, Kevin Grittner
>> <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
>>> I would assume that SELECT shown above would either resolve to a
>>> table scan, in which case you would have to have an SIREAD lock at
>>> the table level
>>
>> That sounds like we're back to the MSSQL/Sybase way of doing things
>
> Other than not blocking, I suppose.
>
>> where you have to understand the query plan to understand why you're
>> getting spurious serialization failures.
>
> You have to know a lot more than that to solve serialization problems
> in PostgreSQL with current techniques.

Well you have to understand how Postgres locks work, but that's an
invariant. You don't have to know how Postgres is going to plan your
specific queries -- which you can't ever be sure of since it could
change as the data changes.

>> I don't think that's terribly appealing. Consider, for example, that
>> we might not *want* to do an index scan just because there's an
>> index.
>
> Someone more familiar than I with S2PL would be better able to
> respond, but I *think* you just need to track this on whatever path
> is actually chosen, not on all possible paths.

Well I don't understand what storing locks in an index can accomplish
if other queries might use other indexes or sequential scans to access
the records and never see those locks.

Or does this method only require that writers discover the locks and
therefore only writers can ever fail due to serialization failures
they cause?

I still haven't actually read the paper so I should probably bow out
from the conversation until I do. I was apparently already under one
misapprehension as Laurenz just claimed the paper does not show how to
prevent "phantoms" (phantom reads I assume?). Perhaps it's not as
ambitious as achieving true serializability after all?

--
greg


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <pgsql-hackers(at)postgresql(dot)org>, "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: <mjc(at)it(dot)usyd(dot)edu(dot)au>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-08 14:26:35
Message-ID: 4A03FACB.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers

"Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at> wrote:

> As far as I know, only the table rows that are found in the index
> scan are examined for visibility. Which would be only one in my
> example.

S2PL locking schemes routinely use index range locks.

> But in your attempt to sketch how true serializability could be
> implemented, you went beyond the scope of the original paper,
> which does not claim to tackle "phantoms".

It doesn't put forward techniques to tackle that, apparently
considering the issues to be so well-known and obvious as to not
require more than a brief mention. They have a section titled
"Phantoms":

>> Throughout the discussion so far, we have followed typi-
>> cal concurrency control theory and assumed that a transac-
>> tion is a sequence of reads and writes on named data items.
>> In general, a relational database engine must also deal with
>> predicate operations (such as SQL *where* clauses). These
>> mean that a concurrency control algorithm must also con-
>> sider phantoms, where an item created or deleted in one
>> transaction would change the result of a predicate operation
>> in a concurrent transaction if the two transactions executed
>> serially. The solution used in traditional two-phase locking
>> is multigranularity locking, where a predicate operation
>> takes intention locks on larger units, such as pages or ta-
>> bles, to prevent insertion of records that might match the
>> predicate.
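
To make the quoted passage concrete, here is a toy in-process
illustration of a phantom (plain Python, no database involved; the
row set and predicate are invented for the example): a predicate read
repeated within one "transaction" changes its result because a
concurrent insert created a matching row, and that row could not have
been locked since it did not exist at the time of the first read.

```python
# Toy illustration of a phantom; plain Python, not database code.
rows = [{"id": 1, "balance": 50}]  # committed data visible to T1

def predicate_read(pred):
    """T1's predicate operation, e.g. SELECT id ... WHERE balance < 100."""
    return [r["id"] for r in rows if pred(r)]

first = predicate_read(lambda r: r["balance"] < 100)   # T1 reads: [1]
rows.append({"id": 2, "balance": 10})                  # T2 inserts and commits
second = predicate_read(lambda r: r["balance"] < 100)  # T1 reads again: [1, 2]

# first != second, yet T1 could have locked every row it read the first
# time.  Row locks cannot cover the inserted row; hence the coarser
# multigranularity (page- or table-level) locks the paper mentions.
```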

> As you mentioned in your first post, there will not be a free lunch.
> What hurts concurrency in an implementation with blocking read locks
> might also hurt concurrency in an implementation where transactions
> are frequently aborted and have to be retried.

Possibly; although the benchmarks published in the paper show much
better throughput than traditional blocking techniques with long and
high-conflict transactions. Eyeballing the graph (based on 20
concurrent sessions), blocking techniques got a 2% deadlock rate in
this benchmark, snapshot isolation got a 2.6% update conflict rate,
and the technique published in the paper got a 0.1% update conflict
rate plus another 7.2% snapshot unsafe rate. Oddly, in spite of a
serialization failure rate of 7.3% versus snapshot isolation's 2.6%,
the performance of the new technique edged out snapshot isolation when
the number of connections was 35 or higher. I assume that is because
the rollbacks and restarts somehow reduced thrashing. The traditional
serializable techniques started to drop below the non-blocking
techniques at about 10 concurrent sessions, and performance kept
dropping from that point while the non-blocking performance continued
to rise all the way to 50.

-Kevin


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Greg Stark" <stark(at)enterprisedb(dot)com>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Michael Cahill" <mjc(at)it(dot)usyd(dot)edu(dot)au>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-08 14:49:17
Message-ID: 4A04001D.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers

Greg Stark <stark(at)enterprisedb(dot)com> wrote:

> Well I don't understand what storing locks in an index can
> accomplish if other queries might use other indexes or sequential
> scans to access the records and never see those locks.
>
> Or does this method only require that writers discover the locks and
> therefore only writers can ever fail due to serialization failures
> they cause?

Well, readers don't need to find the SIREAD locks which readers set.
Conflicts between writers are handled the same as current PostgreSQL
techniques. Readers need to look for write locks, and writers need to
look for SIREAD locks. Neither is blocked by the other, but finding a
conflict sets a directional "edge" boolean flag on both transactions.
(So we would need to track two booleans per transaction in
addition to the new SIREAD locks.) When a transaction reaches a state
where both "edge" booleans are set, one of the two transactions
involved in setting that must be rolled back.

The prototype implementation in the Berkeley DB preferred to roll back
a "pivot" transaction (one with both edges set) where possible, so the
failure would probably usually be on a transaction which modified
data, but not necessarily -- if the writers involved have committed
and the reader transaction might see an invalid database state, the
reader would be rolled back.
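
The bookkeeping described above can be sketched in a few lines of
Python. This is only an illustrative model of the idea, not PostgreSQL
code; the names (`Transaction`, `mark_conflict`) are invented for the
example. Each transaction carries the two directional "edge" booleans,
and a transaction that ends up with both set is a pivot candidate for
rollback:

```python
class Transaction:
    def __init__(self, name):
        self.name = name
        self.in_conflict = False   # a concurrent txn read data this txn wrote
        self.out_conflict = False  # this txn read data a concurrent txn wrote
        self.aborted = False

def mark_conflict(reader, writer):
    """Record a read-write dependency: reader saw a version that writer
    replaced.  Returns the transaction chosen for rollback, if any."""
    reader.out_conflict = True
    writer.in_conflict = True
    # A transaction with both edges set is a "pivot"; to prevent a
    # serialization anomaly, one transaction must be rolled back.
    for txn in (reader, writer):
        if txn.in_conflict and txn.out_conflict and not txn.aborted:
            txn.aborted = True
            return txn
    return None
```

For example, with three transactions T1, T2, T3: recording "T1 read
what T2 wrote" and then "T2 read what T3 wrote" makes T2 the pivot,
and it is the one rolled back.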

> I still haven't actually read the paper so I should probably bow out
> from the conversation until I do. I was apparently already under
> one misapprehension as Laurenz just claimed the paper does not show
> how to prevent "phantoms" (phantom reads I assume?). Perhaps it's
> not as ambitious as achieving true serializability after all?

It does achieve true serializability in terms of the definitions I've
read, although I've discovered at least one way in which its
guarantees aren't as strong as traditional blocking techniques -- it
doesn't guarantee that transactions at a level less strict than
serializable will see a state which would exist between some serial
execution of serializable transactions which modify the data, as the
blocking schemes do. As I said in an earlier post, I'm OK with that,
personally. We should probably document it as a difference, to alert
someone converting, but the standard doesn't seem to require the
behavior that traditional blocking approaches provide on this point.

-Kevin


From: Greg Stark <stark(at)enterprisedb(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Michael Cahill <mjc(at)it(dot)usyd(dot)edu(dot)au>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-08 15:12:20
Message-ID: 4136ffa0905080812k369554eci7b2e1d8930d2c637@mail.gmail.com
Lists: pgsql-hackers

On Fri, May 8, 2009 at 3:49 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> Greg Stark <stark(at)enterprisedb(dot)com> wrote:
>
>> Well I don't understand what storing locks in an index can
>> accomplish if other queries might use other indexes or sequential
>> scans to access the records and never see those locks.
>>
>> Or does this method only require that writers discover the locks and
>> therefore only writers can ever fail due to serialization failures
>> they cause?
>
> Well, readers don't need to find the SIREAD locks which readers set.
> Conflicts between writers are handled the same as current PostgreSQL
> techniques.  Readers need to look for write locks, and writers need to
> look for SIREAD locks.

Well this is where I'm failing to follow. If readers need to be sure
they'll find write locks then surely they can't be stored in indexes
without losing any flexibility.

You would need to be sure other readers will look at the same index
you put the lock in -- so you either need to put the lock in every
index, have other readers look in every index, or have a very limited
predictable way of ensuring everyone will use the same index for any
queries where that lock matters.

--
greg


From: Greg Stark <stark(at)enterprisedb(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Michael Cahill <mjc(at)it(dot)usyd(dot)edu(dot)au>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-08 15:15:10
Message-ID: 4136ffa0905080815n20fc047av91b93a9c217a041a@mail.gmail.com
Lists: pgsql-hackers

On Fri, May 8, 2009 at 4:12 PM, Greg Stark <stark(at)enterprisedb(dot)com> wrote:
> On Fri, May 8, 2009 at 3:49 PM, Kevin Grittner
> <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
>> Greg Stark <stark(at)enterprisedb(dot)com> wrote:
>>
>>> Well I don't understand what storing locks in an index can
>>> accomplish if other queries might use other indexes or sequential
>>> scans to access the records and never see those locks.
>>>
>>> Or does this method only require that writers discover the locks and
>>> therefore only writers can ever fail due to serialization failures
>>> they cause?
>>
>> Well, readers don't need to find the SIREAD locks which readers set.
>> Conflicts between writers are handled the same as current PostgreSQL
>> techniques.  Readers need to look for write locks, and writers need to
>> look for SIREAD locks.
>
>
> Well this is where I'm failing to follow. If readers need to be sure
> they'll find write locks then surely they can't be stored in indexes
> without losing any flexibility.

Argh, sorry, as soon as I hit send I realized this is wrong. Writers
already need to insert into every index, so that's not a problem. The
problem is if readers need to see other reader locks. If that's not
true then I guess I'm all wet and I will wait until I read the paper.

--
greg


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <stark(at)enterprisedb(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Michael Cahill <mjc(at)it(dot)usyd(dot)edu(dot)au>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-08 15:30:57
Message-ID: 3341.1241796657@sss.pgh.pa.us
Lists: pgsql-hackers

Greg Stark <stark(at)enterprisedb(dot)com> writes:
> ... Argh, sorry, as soon as I hit send I realized this is wrong. Writers
> already need to insert into every index, so that's not a problem.

What about HOT?

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Greg Stark" <stark(at)enterprisedb(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Michael Cahill" <mjc(at)it(dot)usyd(dot)edu(dot)au>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-08 15:39:43
Message-ID: 4A040BEF.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Greg Stark <stark(at)enterprisedb(dot)com> writes:
>> ... Argh, sorry, as soon as I hit send I realized this is wrong.
>> Writers already need to insert into every index, so that's not a
>> problem.
>
> What about HOT?

I think that a read would need to lock both the row or tuple (not sure
exactly how that would work) and any index used to find the row or
tuple (or detect its absence). If a table scan is used, the lock
would be at the table level (keeping in mind that this is not a lock
which ever blocks anything). An insert or an update which created a
new conflicting tuple would hit the table or index lock. A HOT update
would hit the row lock.
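
That sketch can be restated as a toy lookup from the access path a
reader actually used to the objects an SIREAD lock would cover.
Everything here (the function name, the path labels, the target
tuples) is invented for illustration and is not PostgreSQL internals:

```python
def siread_targets(table, access_path, index=None, tuple_ids=()):
    """Toy model: which objects would a reader take SIREAD locks on?"""
    if access_path == "seqscan":
        # A table scan locks the whole relation: that covers the rows
        # read and also detects later conflicting inserts.
        return {("table", table)}
    if access_path == "indexscan":
        # Lock each tuple actually read, plus the index used to find it
        # (or to detect its absence), so an insert or non-HOT update
        # that goes through the index hits the lock; a HOT update,
        # which touches no index, hits the tuple-level lock instead.
        targets = {("tuple", table, tid) for tid in tuple_ids}
        targets.add(("index", index))
        return targets
    raise ValueError(f"unknown access path: {access_path}")
```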

I think....

-Kevin


From: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Kevin Grittner *EXTERN*" <Kevin(dot)Grittner(at)wicourts(dot)gov>, "Greg Stark" <stark(at)enterprisedb(dot)com>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Michael Cahill" <mjc(at)it(dot)usyd(dot)edu(dot)au>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-11 07:25:06
Message-ID: D960CB61B694CF459DCFB4B0128514C202FF65B7@exadv11.host.magwien.gv.at
Lists: pgsql-hackers

Kevin Grittner wrote:
> > I still haven't actually read the paper so I should probably bow out
> > from the conversation until I do. I was apparently already under
> > one misapprehension as Laurenz just claimed the paper does not show
> > how to prevent "phantoms" (phantom reads I assume?). Perhaps it's
> > not as ambitious as achieving true serializability after all?
>
> It does achieve true serializability in terms of the definitions I've
> read, although I've discovered at least one way in which its
> guarantees aren't as strong as traditional blocking techniques -- it
> doesn't guarantee that transactions at a level less strict than
> serializable will see a state which would exist between some serial
> execution of serializable transactions which modify the data, as the
> blocking schemes do.

I still don't buy that this implementation guarantees serializability.

All the authors show with regard to predicate handling is handwaving,
and while you tried to come up with ideas for how that could be
improved, that is not what the implementation described in the paper does.

So this paper shows a performant implementation of something that is
closer to serializability than "snapshot isolation", but does not go
all the way.

As I said, I think it is promising, and it can only be hoped that
the authors pursue the path they have taken and share their experiences
with an implementation of full serializability with their technique.

Yours,
Laurenz Albe


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Greg Stark" <stark(at)enterprisedb(dot)com>, "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Michael Cahill" <mjc(at)it(dot)usyd(dot)edu(dot)au>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-11 13:49:52
Message-ID: 4A07E6B0.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers

"Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at> wrote:

> All the authors show with regard to predicate handling is
> handwaving,

That is because predicate locking is a mature technology with many
known implementations. The best technique for any database product
will depend on that product, and their technique doesn't depend on
which implementation is used. Assuming some form of predicate
locking, do you have any other qualms about the algorithm
presented in the paper?

-Kevin


From: Greg Stark <stark(at)enterprisedb(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Michael Cahill <mjc(at)it(dot)usyd(dot)edu(dot)au>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-11 13:59:48
Message-ID: 4136ffa0905110659q5823f78m7ae6338b291e87b1@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 11, 2009 at 2:49 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at> wrote:
>
>> All the authors show with regard to predicate handling is
>> handwaving,
>
> That is because predicate locking is a mature technology with many
> known implementations.  The best technique for any database product
> will depend on that product, and their technique doesn't depend on
> which implementation is used.  Assuming some form of predicate
> locking, do you have any other qualms about the algorithm
> presented in the paper?

I thought the big problem with providing true serializability was the
predicate locking. If it doesn't address that need then does this get
us any closer?

Is this like saying walls are a well understood technology so these
antilock brakes work great for stopping your car as long as you
combine them with a wall? :)

--
greg


From: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Kevin Grittner *EXTERN*" <Kevin(dot)Grittner(at)wicourts(dot)gov>, "Greg Stark" <stark(at)enterprisedb(dot)com>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Michael Cahill" <mjc(at)it(dot)usyd(dot)edu(dot)au>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-11 14:07:47
Message-ID: D960CB61B694CF459DCFB4B0128514C202FF65B8@exadv11.host.magwien.gv.at
Lists: pgsql-hackers

Kevin Grittner wrote:
> > All the authors show with regard to predicate handling is
> > handwaving,
>
> That is because predicate locking is a mature technology with many
> known implementations. The best technique for any database product
> will depend on that product, and their technique doesn't depend on
> which implementation is used. Assuming some form of predicate
> locking, do you have any other qualms about the algorithm
> presented in the paper?

No - given that the algorithm is correct (which the authors cite from
another paper which I cannot easily access).

In my first reply I wondered if the presence of concurrent "read committed"
transactions would somehow affect the correctness of the algorithm,
as the authors don't mention that.

Yours,
Laurenz Albe


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Greg Stark" <stark(at)enterprisedb(dot)com>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Michael Cahill" <mjc(at)it(dot)usyd(dot)edu(dot)au>, <pgsql-hackers(at)postgresql(dot)org>, "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-11 14:11:55
Message-ID: 4A07EBDB.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers

Greg Stark <stark(at)enterprisedb(dot)com> wrote:

> I thought the big problem with providing true serializability was
> the predicate locking. If it doesn't address that need then does
> this get us any closer?

I thought the big problem was the perception that performance would
suffer and that the level of blocking required would be unacceptable.
This technique (based on available benchmarks from the prototype
implementation) seems to give performance very close to snapshot
isolation with no additional blocking beyond what snapshot isolation
already has to support "first committer wins" update conflict
detection. Benchmarks showed much better performance than traditional
blocking techniques for achieving serializability.

Since it can markedly increase serialization failure rollbacks, the
software needs to be able to respond to those gracefully, but since
our framework automatically re-submits transactions which are
terminated with that SQLSTATE, this approach sounds very useful for us.

> Is this like saying walls are a well understood technology so these
> antilock brakes work great for stopping your car as long as you
> combine them with a wall? :)

I see it more like saying that walls are a well understood technology,
and this is a proposal for a way to use them in putting up a
particular useful building.

-Kevin


From: Greg Stark <stark(at)enterprisedb(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Michael Cahill <mjc(at)it(dot)usyd(dot)edu(dot)au>, pgsql-hackers(at)postgresql(dot)org, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-11 14:24:27
Message-ID: 4136ffa0905110724u28277c48n43a955273c463c1d@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 11, 2009 at 3:11 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> Greg Stark <stark(at)enterprisedb(dot)com> wrote:
>
>> I thought the big problem with providing true serializability was
>> the predicate locking. If it doesn't address that need then does
>> this get us any closer?
>
> I thought the big problem was the perception that performance would
> suffer and that the level of blocking required would be unacceptable.

This thread has really been one of those cases where everyone thought
they were having a different kind of discussion.

If predicate locking is so well understood and if someone who
understands it and understands what kind of implementation would work
well in Postgres steps forward with an implementation which doesn't
cause major downsides then I suspect we might revisit our prejudices
against it. But as it stands I think the assumption is that having to
maintain locks on hypothetical records which don't exist would be an
expensive cost to impose on every query which would unduly impact
performance.

I, for one, certainly assumed if we did anything like that it would
work like our existing locks in that it wouldn't impose any additional
blocking. If there was any question of that then it sounds like this
paper might be a step forward in that you're on-side at least on that
question now?

--
greg


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Greg Stark" <stark(at)enterprisedb(dot)com>, "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Michael Cahill" <mjc(at)it(dot)usyd(dot)edu(dot)au>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-11 14:29:47
Message-ID: 4A07F00B.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers

"Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at> wrote:

> In my first reply I wondered if the presence of concurrent "read
> committed" transactions would somehow affect the correctness of the
> algorithm, as the authors don't mention that.

Yeah, I was concerned about that, too. In thinking it through I've
convinced myself that there is a choice in implementation, which seems
to have a pretty obvious winner.

(1) If READ COMMITTED and SNAPSHOT isolation levels don't change at
all, there would be a behavioral difference between this technique and
strict two phase locking (S2PL) implementations of serializable
transactions. With S2PL, even READ COMMITTED transactions can only
view the database in a state which is consistent with some serial
application of SERIALIZABLE transactions. Under the algorithm from
this paper, without changes to other isolation levels, if you want to
view the database in a coherent state relative to SERIALIZABLE
transactions, you must use a SERIALIZABLE transaction.

(2) Promote everything to SERIALIZABLE by having all transactions,
regardless of isolation level, take out SIREAD locks and check for
unsafe access patterns. This would, strictly speaking, conform to the
SQL standard, because an implementation is free to promote requests
for any level of isolation to a more strict level; however, it hardly
seems useful.

So, I think the only sane thing to do in this regard would be to
document that there is a difference from blocking implementations of
SERIALIZABLE in the guarantees provided for non-serializable
transactions.

-Kevin


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Greg Stark" <stark(at)enterprisedb(dot)com>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Michael Cahill" <mjc(at)it(dot)usyd(dot)edu(dot)au>, <pgsql-hackers(at)postgresql(dot)org>, "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-11 15:08:28
Message-ID: 4A07F91C.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers

Greg Stark <stark(at)enterprisedb(dot)com> wrote:
> Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
>> Greg Stark <stark(at)enterprisedb(dot)com> wrote:
>>
>>> I thought the big problem with providing true serializability was
>>> the predicate locking. If it doesn't address that need then does
>>> this get us any closer?
>>
>> I thought the big problem was the perception that performance would
>> suffer and that the level of blocking required would be
>> unacceptable.
>
> This thread has really been one of those cases where everyone
> thought they were having a different kind of discussion.

Apparently so.

> If predicate locking is so well understood and if someone who
> understands it and understands what kind of implementation would
> work well in Postgres steps forward with an implementation which
> doesn't cause major downsides then I suspect we might revisit our
> prejudices against it. But as it stands I think the assumption is
> that having to maintain locks on hypothetical records which don't
> exist would be an expensive cost to impose on every query which
> would unduly impact performance.

It would only impact transactions running at the full serializable
isolation level, and I'm guessing that the performance would be
reasonable if an implementation similar to that in DB2, Sybase,
Microsoft SQL Server, etc. is used. Some here have derided that
approach as crude and implied that only something more aesthetically
pleasing would be considered, and that such implementations would be
prohibitively slow (which, of course, is exactly why they are not used
in these other products).

> I, for one, certainly assumed if we did anything like that it would
> work like our existing locks in that it wouldn't impose any
> additional blocking.

Until this paper, implementation of serializable transactions, even in
an MVCC database, required S2PL techniques which caused a lot of
blocking, including readers blocking on writes and vice versa. The
advance of this paper isn't any novel implementation of predicate
locking, but the reduction of the locks generated by reads to a new
SIREAD lock level which would not introduce any blocking, but would
instead assist in the detection of unsafe patterns of reads and writes,
allowing rollbacks to prevent serialization anomalies.
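As an illustration of the flag-based detection just described, here is a rough Python sketch (my own, not code from the paper or any patch; the names `MiniSSI`, `Txn`, and the abort policy are simplifications of what the paper calls inConflict/outConflict tracking):

```python
# Hedged sketch of SIREAD-based conflict detection in the spirit of the
# paper: reads take non-blocking SIREAD locks, writes check them, and a
# transaction with rw-antidependencies both in and out (a "pivot") is
# aborted. All names here are illustrative, not PostgreSQL internals.

class SerializationFailure(Exception):
    pass

class Txn:
    def __init__(self, name):
        self.name = name
        self.in_conflict = False   # a concurrent txn has a rw-edge into us
        self.out_conflict = False  # we have a rw-edge out to a concurrent txn

class MiniSSI:
    """Tracks SIREAD locks and write locks per data item, with no blocking."""
    def __init__(self):
        self.sireads = {}  # item -> set of reading txns
        self.writes = {}   # item -> set of writing txns

    def _mark(self, reader, writer):
        # rw-antidependency: reader saw a version the writer replaces.
        reader.out_conflict = True
        writer.in_conflict = True
        # A txn with both flags set may be part of a dangerous structure;
        # the paper rolls back one participant at this point.
        for t in (reader, writer):
            if t.in_conflict and t.out_conflict:
                raise SerializationFailure(t.name)

    def read(self, txn, item):
        self.sireads.setdefault(item, set()).add(txn)
        for w in self.writes.get(item, set()):
            if w is not txn:
                self._mark(txn, w)

    def write(self, txn, item):
        self.writes.setdefault(item, set()).add(txn)
        for r in self.sireads.get(item, set()):
            if r is not txn:
                self._mark(r, txn)
```

Running the classic write-skew pattern through this (T0 reads x and writes y, T1 reads y and writes x) raises `SerializationFailure` on the final write, where snapshot isolation alone would let both commit.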

> If there was any question of that then it sounds like this
> paper might be a step forward in that you're on-side at least on
> that question now?

I was never on the other side of that. I know that some apparently
thought that my attempts to document PostgreSQL's deviation from
current standards in this regard, and to provide more real-world
examples of where people might run into trouble, were really sly
attempts to persuade people to implement full support for serializable
transactions. That really wasn't the case.

We had been slightly burned by the differences in spite of my having
read the current documentation, because the example given is so
far-fetched and bizarre that I rather naively thought "Well, if
that's how far out you have to go to hit a problem, the risk is quite
low." I was trying to find something which gave people a clearer
picture of the issue, so others didn't make the same mistake. I
wasn't advocating for full serializable support at that point, and
probably would have been reluctant to use it if available because of
the performance issues (assuming a traditional implementation).

In the course of discussing the issue, this paper, published by ACM
earlier in the year, was brought to my attention. I see it as the
best of both worlds -- MVCC performance with the protections of
serializable transactions. Back when I first read the paper, though,
it looked to be a struggle to get 8.4 to beta testing, so I sat on it
until now.

-Kevin


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <pgsql-hackers(at)postgresql(dot)org>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Greg Stark" <stark(at)enterprisedb(dot)com>, "Michael Cahill" <mjc(at)it(dot)usyd(dot)edu(dot)au>, "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-27 18:34:00
Message-ID: 4A1D4148.EE98.0025.1@wicourts.gov

I wrote:
> Greg Stark <stark(at)enterprisedb(dot)com> wrote:

>> This thread has really been one of those cases where everyone
>> thought they were having a different kind of discussion.
>
> Apparently so.

In light of discussions at PGCon, I'm going to abandon this thread in
favor of a discussion of what this feature would look like from a user
perspective. Only after there is consensus (if any) on the
desirability of the behavior will I move on to a discussion of
implementation techniques -- probably on a fresh thread.

For the record, it became clear that I did a bad job of communicating
on this thread, so I'll finish it with some clarifications and
observations "just for the record."

First, the paper didn't propose any new techniques for predicate
locking, other than using a lock level which doesn't block anything
else. It did reference existing papers on locking techniques.

Second, Robert Haas had an interesting insight: if the techniques from
the paper were implemented in the absence of predicate locks, that
would still reduce the frequency of serialization anomalies from
current snapshot isolation techniques. Since the PostgreSQL community
generally prefers incremental enhancement patches on the way to a
final result, he suggested that this would be a good path -- do
everything except the predicate locks, show the incremental benefits,
then add the predicate locks. We could probably even add the
predicate locks in stages: first implement table level predicate
locks, since that has to exist and would provide a complete, if
somewhat clumsy, serializable solution; then move on to more
fine-grained locks. It would probably be workable, and possibly
optimal, to have just table and page locks; although it seems likely
that index range locks and row locks would also be worth it,
eventually.
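To make the granularity idea concrete, here is a toy sketch (illustrative only; the `Target` layout and `covers()` helper are my own names, not anything proposed in this thread) of how a coarse predicate lock subsumes finer-grained ones, so a table-level-only first version stays correct as page and row locks are added later:

```python
# Multi-granularity predicate-lock coverage: a table-level SIREAD lock
# conflicts with any write to that table, a page-level lock only with
# writes landing on that page, a tuple-level lock only with that tuple.
# Purely a sketch of the containment rule, not a real lock manager.

from collections import namedtuple

# A lock target; finer fields are None at coarser granularities.
Target = namedtuple("Target", "table page tuple")

def covers(lock, write):
    """True if a predicate lock on `lock` covers a write to `write`."""
    if lock.table != write.table:
        return False
    if lock.page is None:          # table-level lock: whole table
        return True
    if lock.page != write.page:
        return False
    if lock.tuple is None:         # page-level lock: whole page
        return True
    return lock.tuple == write.tuple
```

The point of starting coarse is visible here: replacing a table-level lock with locks on exactly the pages or tuples a scan touched only narrows what `covers()` returns true for, so it can never miss a conflict the coarse version would have caught.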

But all that is just to get it on the record; I'm getting ahead of
myself here. I'll be posting on the user-facing aspects of this soon.

-Kevin


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-hackers(at)postgresql(dot)org, Michael Cahill <mjc(at)it(dot)usyd(dot)edu(dot)au>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-27 19:00:24
Message-ID: 1243450824.24860.352.camel@ebony.2ndQuadrant


On Wed, 2009-05-27 at 13:34 -0500, Kevin Grittner wrote:

> For the record, it became clear that I did a bad job of communicating
> on this thread...

You did good work, IMHO. Not everything will reach consensus and that's
not your fault.

> first implement table level predicate
> locks, since that has to exist and would provide a complete, if
> somewhat clumsy, serializable solution; then move on to more
> fine-grained locks. It would probably be workable, and possibly
> optimal, to have just table and page locks; although it seems likely
> that index range locks and row locks would also be worth it,
> eventually.

Do we need table-level predicate locks at all? What would they give us?
Why not just go straight for fine-grained page-level locks?

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Simon Riggs" <simon(at)2ndQuadrant(dot)com>
Cc: "Michael Cahill" <mjc(at)it(dot)usyd(dot)edu(dot)au>, <pgsql-hackers(at)postgresql(dot)org>, "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-27 19:07:24
Message-ID: 4A1D491B.EE98.0025.1@wicourts.gov

Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:

> Do we need table-level predicate locks at all? What would they give
> us? Why not just go straight for fine-grained page-level locks?

I don't want to get too far into implementation discussions at this
phase (see Tom's slides ;-)), but suffice it to say that a table scan
can cover more pages than we'd want to track individually....

The coarsest possible resolution allows proof of concept. Tests can
be written that work at that level which should not break as
finer-grained locks are implemented. (See how I'm drawing from
another presentation? ;-))

-Kevin


From: Nicolas Barbier <nicolas(dot)barbier(at)gmail(dot)com>
To: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: "Kevin Grittner *EXTERN*" <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Serializable Isolation without blocking
Date: 2009-12-31 12:45:49
Message-ID: b0f3f5a10912310445y5b581178u80e3b3f098ff26d6@mail.gmail.com

[ Reviving this old thread because a recent one referred to it. ]

2009/5/7 Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>:

> Kevin Grittner wrote:
>
>> > maybe I misunderstood something.
>> >
>> > Consider a function
>> > "makehighlander(personid integer) RETURNS void"
>> > defined like this:
>> >
>> >    SELECT ishighlander INTO b FROM scots WHERE id=personid;
>> >    IF b THEN
>> >       RETURN; /* no need to do anything */
>> >    END IF;
>> >    UPDATE scots SET ishighlander=TRUE WHERE id=personid;
>> >    SELECT count(*) INTO n FROM scots WHERE ishighlander;
>> >    IF (n > 1) THEN
>> >       RAISE EXCEPTION 'There can be only one';
>> >    END IF;
>> >
>> > If we assume that "ishighlander" is false for all records in
>> > the beginning, and there are two calls to the function with
>> > two personid's of records *in different pages*, then there cannot be
>> > any conflicts since all (write and intention) locks taken by each of
>> > these calls should only affect the one page that contains the one
>> > record that is updated and then found in the subsequent SELECT.
>> >
>> > Yet if the two execute concurrently and the two first SELECTs are
>> > executed before the two UPDATEs, then both functions have a snapshot
>> > so that the final SELECT statements will return 1 and both functions
>> > will succeed, leaving the table with two highlanders.
>>
>> I do think you misunderstood.  If there are two concurrent executions
>> and each reads one row, there will be an SIREAD lock for each of those
>> rows.  As an example, let's say that one of them (T0) updates its row
>> and does its count, finds everything looks fine, and commits.  In
>> reading the row the other transaction (T1) modified it sets the
>> T0.outConflict flag to true and the T1.inConflict flag to true.
>
> Where does T0 read the row that T1 modified?

* Typically, concurrency theory doesn't care about the specifics of
relational databases: it works on a (possibly countably infinite)
number of data items (sometimes called "variables").
* If a certain concurrency control technique works for such data items
(i.e., can only result in serializable executions or whatever), then
it must necessarily also work for relational databases which map their
data in "pages", if those pages are treated the same way the data
items are. Indexes and any other structures that can be used to *find
out* which other pages to read/write must then also be treated this
way.
* To answer your specific question: T0 might not read that specific
row, but the COUNT(..) definitely must read *something* that must be
modified by T1 when it updates the ishighlander field: either the row
itself (which I would expect if no index on ishighlander exists), or
some page in an index that it used to find out that it didn't need to
inspect the row itself. Otherwise, the update wasn't effective because
re-executing the COUNT(..) later on would not result in any change in
the result (which leads to a contradiction: changing the ishighlander
field of one row must result in a change in the number of
highlanders).
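A toy replay of the scenario (my own illustration, not code from the thread) makes the point mechanical: once reads take SIREAD locks and writes check them, each transaction's COUNT collides with the other's update, producing rw-antidependency edges in both directions:

```python
# Each txn updates its own row, then its COUNT(*) seq-scans all rows
# under its own snapshot. Reads record SIREAD locks and also check
# concurrent writes; writes check earlier SIREAD locks. The resulting
# edge set is a cycle, so under SSI one transaction must be rolled back.

writes = {}    # row -> set of writing txns
reads = {}     # row -> set of reading txns (SIREAD)
edges = set()  # (reader, writer): reader saw a version writer replaced

def w(txn, row):
    writes.setdefault(row, set()).add(txn)
    for r in reads.get(row, ()):    # write hits earlier SIREAD locks
        if r != txn:
            edges.add((r, txn))

def rd(txn, row):
    reads.setdefault(row, set()).add(txn)
    for wr in writes.get(row, ()):  # read hits concurrent writes too
        if wr != txn:
            edges.add((txn, wr))

# T0 flips person 1, T1 flips person 2 (different pages, no w-w conflict)
w("T0", 1)
w("T1", 2)
# each COUNT(*) then scans both rows under its own snapshot
for row in (1, 2):
    rd("T0", row)
for row in (1, 2):
    rd("T1", row)

# edges == {("T0", "T1"), ("T1", "T0")}: a cycle, so one txn must abort
```

This matches Nicolas's argument above: the COUNT must read *something* the other transaction wrote, so the edge appears regardless of which physical structure (row, page, or index entry) carries the lock.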

Nicolas


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Nicolas Barbier <nicolas(dot)barbier(at)gmail(dot)com>
Cc: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, "Kevin Grittner *EXTERN*" <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Serializable Isolation without blocking
Date: 2009-12-31 14:08:45
Message-ID: 407d949e0912310608w68a5aaf4s4f9b7286e0e9fc9c@mail.gmail.com

On Thu, Dec 31, 2009 at 12:45 PM, Nicolas Barbier
<nicolas(dot)barbier(at)gmail(dot)com> wrote:
> * To answer your specific question: T0 might not read that specific
> row, but the COUNT(..) definitely must read *something* that must be
> modified by T1 when it updates the ishighlander field:

The problem occurs when the update happens. It currently has no way to
know that a SELECT has already looked at that record, and that the
same transaction has performed an update which this transaction has
already ignored when performing the count(*).

The unsolved problems that have been raised are:

- How and where do we have SELECTs note all the rows they read -- and
all the rows they *would* have read that don't exist already. Note
that something like select count(*) where id=? needs to be able to
detect new rows being inserted with the specified value, not
merely lock the existing rows.

- Can we do it without requiring major code changes in every index am
and destroying modularity between the transaction management and the
index api.

- How do we do that without causing SELECTs to perform tons of write
i/o they don't have to do now. People already complain about the hint
bit updates the first time you do selects; doing write i/o on every
select would be disastrous.

- Can we do that without destroying concurrency with coarse "locks" a
la MySQL ISAM tables.

- Can we do it without introducing unexpected serialization failures
between transactions updating unrelated rows. Ideally, can we do it in
a way that serialization errors are predictable rather than depending
on access paths the planner chooses so they don't just randomly start
happening when plans change.
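For the phantom-detection point in the first item, one common approach (sketched here with invented names; nothing in this thread commits to it) is index-range or "next-key" locking: a scan for a key locks the index gap the key falls into, so a later insert into that gap collides even though the row never existed:

```python
# Toy next-key/gap locking: SELECT ... WHERE id = k takes a range lock
# on the gap in the sorted index that k falls into. An insert then
# checks whether its key lands inside another txn's locked range.
# Illustrative only; real index-range locking lives inside the index am.

import bisect

index_keys = [10, 20, 30]  # committed index entries, kept sorted
gap_locks = []             # (lo, hi, txn): locked key ranges

def lock_gap(txn, key):
    """Scan for `key`: lock the enclosing gap between index entries."""
    i = bisect.bisect_left(index_keys, key)
    lo = index_keys[i - 1] if i > 0 else float("-inf")
    hi = index_keys[i] if i < len(index_keys) else float("inf")
    gap_locks.append((lo, hi, txn))

def insert_conflicts(txn, key):
    """Which other txns' range locks would inserting `key` violate?"""
    return [t for lo, hi, t in gap_locks if lo <= key <= hi and t != txn]
```

For example, a scan for id = 25 locks the (20, 30) gap, so a concurrent insert of 25 is detected, while an insert of 5 is not; this is exactly the "rows that *would* have been read" problem that plain row locks cannot cover.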

--
greg