serializable read only deferrable

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <pgsql-hackers(at)postgresql(dot)org>
Cc: <drkp(at)csail(dot)mit(dot)edu>
Subject: serializable read only deferrable
Date: 2010-12-05 15:11:35
Message-ID: 4CFB574702000025000382FD@gw.wicourts.gov
Lists: pgsql-hackers

I'm reviving the discussion on the subject topic because I just had
an epiphany which makes it seem simple to implement. The concept of
this is that if you start a SERIALIZABLE READ ONLY transaction in an
SSI environment when certain conditions are true, it doesn't need to
acquire predicate locks or test for rw-conflicts. This would be
particularly useful for pg_dump or large reports, as it would allow
them to read data which was guaranteed to be consistent with later
states of the database without risking serialization failure or
contributing to the failure of other transactions. They should also
run a bit faster without the overhead of locking and checking.

Having completed the switch from a pair of rw-conflict pointers per
serializable transaction to a list of rw-conflicts, I'm working
through the more aggressive transaction clean-up strategies thereby
allowed in preparation for the graceful degradation code. Along the
way, I noticed how easy it is to allow a READ ONLY transaction to opt
out of predicate locking and conflict detection when it starts with
no concurrent non-READ ONLY transactions active, or even to remove
READ ONLY transactions from those activities when such a state is
reached during their execution. Properly recognizing the
*additional* conditions under which this would be valid, on the
other hand, is rather painful. (Those additional conditions being
that no concurrent non-read-only transaction may overlap a committed
non-read-only transaction which wrote data and committed before the
read-only transaction acquired its snapshot.)

The simple way to implement SERIALIZABLE READ ONLY DEFERRABLE under
SSI would be to have each non-read-only serializable transaction
acquire a heavyweight lock which can coexist with other locks at the
same level (SHARE looks good) on some common object and hold that for
the duration of the transaction, while a SERIALIZABLE READ ONLY
DEFERRABLE transaction would need to acquire a conflicting lock
(EXCLUSIVE looks good) before it could acquire a snapshot, and
release the lock immediately after acquiring the snapshot.

For these purposes, it appears that advisory locks could work, as
long as the lock release does not wait for the end of the transaction
(which it doesn't, if I'm reading the docs right) and as long as I
can pick a lock ID which won't conflict with other uses. That latter
part is the only iffy aspect of the whole thing that I can see. Of
course, I could add a third lock method, but that seems like overkill
to be able to get one single lock.
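
Very roughly, and glossing over where the calls would actually hook
in, I'm picturing something like the sketch below. The advisory key
is an arbitrary, made-up placeholder -- finding one that can't
collide with user advisory locks is exactly the iffy part mentioned
above:

#include "postgres.h"
#include "miscadmin.h"
#include "storage/lock.h"

/* Made-up advisory key standing in for "some common object". */
#define SSI_DEFERRABLE_LOCK_KEY 42

static void
set_deferrable_locktag(LOCKTAG *tag)
{
    SET_LOCKTAG_ADVISORY(*tag, MyDatabaseId, SSI_DEFERRABLE_LOCK_KEY, 0, 1);
}

/*
 * Each non-read-only serializable transaction would do this at start;
 * the transaction-level lock goes away by itself at commit or abort.
 */
static void
RegisterSerializableWriter(void)
{
    LOCKTAG     tag;

    set_deferrable_locktag(&tag);
    (void) LockAcquire(&tag, ShareLock, false, false);
}

/*
 * A SERIALIZABLE READ ONLY DEFERRABLE transaction would do this just
 * before acquiring its snapshot: block until no SHARE holders (i.e.
 * no concurrent writers) remain, then let go right away.
 */
static void
WaitForDeferrableSnapshot(void)
{
    LOCKTAG     tag;

    set_deferrable_locktag(&tag);
    (void) LockAcquire(&tag, ExclusiveLock, false, false);
    /* ... acquire the snapshot here ... */
    LockRelease(&tag, ExclusiveLock, false);
}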

Since I'm already allowing a transaction to opt out of predicate
locking and conflict detection if there are no non-read-only
transactions active when it acquires its snapshot, the work needed
within the SSI code is pretty trivial; it's all in adding the
DEFERRABLE word as a non-standard extension to SET TRANSACTION et al,
and finding a heavyweight lock to use.

Thoughts?

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-hackers(at)postgresql(dot)org, drkp(at)csail(dot)mit(dot)edu
Subject: Re: serializable read only deferrable
Date: 2010-12-05 17:13:32
Message-ID: 26546.1291569212@sss.pgh.pa.us
Lists: pgsql-hackers

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> I'm reviving the discussion on the subject topic because I just had
> an epiphany which makes it seem simple to implement. The concept of
> this is that if you start a SERIALIZABLE READ ONLY transaction in an
> SSI environment when certain conditions are true, it doesn't need to
> acquire predicate locks or test for rw-conflicts.

I assume this would have to be a "hard" definition of READ ONLY, not
the rather squishy definition we use now? How would we manage the
compatibility implications?

regards, tom lane


From: Florian Pflug <fgp(at)phlo(dot)org>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: <pgsql-hackers(at)postgresql(dot)org>, <drkp(at)csail(dot)mit(dot)edu>
Subject: Re: serializable read only deferrable
Date: 2010-12-06 20:41:11
Message-ID: 94C154B6-0F9E-43B0-A245-1CFA8630064A@phlo.org
Lists: pgsql-hackers

On Dec5, 2010, at 16:11 , Kevin Grittner wrote:
> The simple way to implement SERIALIZABLE READ ONLY DEFERRABLE under
> SSI would be to have each non-read-only serializable transaction
> acquire a heavyweight lock which can coexist with other locks at the
> same level (SHARE looks good) on some common object and hold that for
> the duration of the transaction, while a SERIALIZABLE READ ONLY
> DEFERRABLE transaction would need to acquire a conflicting lock
> (EXCLUSIVE looks good) before it could acquire a snapshot, and
> release the lock immediately after acquiring the snapshot.

Hm, so once a SERIALIZABLE READ ONLY DEFERRABLE is waiting to acquire the lock, no other transaction would be allowed to start until the SERIALIZABLE READ ONLY DEFERRABLE transaction has been able to acquire its snapshot. For pg_dump's purposes at least, that seems undesirable, since a single long-running transaction at the time you start pg_dump would effectively DoS your system until the long-running transaction finishes.

The alternative seems to be to drop the guarantee that a SERIALIZABLE READ ONLY DEFERRABLE won't be starved forever by a stream of overlapping non-READ ONLY transactions. Then a flag in the proc array that marks non-READ ONLY transactions should be sufficient, plus a wait-and-retry loop to take snapshots for SERIALIZABLE READ ONLY DEFERRABLE transactions.

best regards,
Florian Pflug


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Florian Pflug" <fgp(at)phlo(dot)org>
Cc: <drkp(at)csail(dot)mit(dot)edu>,<pgsql-hackers(at)postgresql(dot)org>
Subject: Re: serializable read only deferrable
Date: 2010-12-06 21:53:24
Message-ID: 4CFD06F4020000250003837F@gw.wicourts.gov
Lists: pgsql-hackers

Florian Pflug <fgp(at)phlo(dot)org> wrote:
> On Dec5, 2010, at 16:11 , Kevin Grittner wrote:
>> The simple way to implement SERIALIZABLE READ ONLY DEFERRABLE
>> under SSI would be to have each non-read-only serializable
>> transaction acquire a heavyweight lock which can coexist with
>> other locks at the same level (SHARE looks good) on some common
>> object and hold that for the duration of the transaction, while a
>> SERIALIZABLE READ ONLY DEFERRABLE transaction would need to
>> acquire a conflicting lock (EXCLUSIVE looks good) before it could
>> acquire a snapshot, and release the lock immediately after
>> acquiring the snapshot.
>
> Hm, so once a SERIALIZABLE READ ONLY DEFERRABLE is waiting to
> acquire the lock, no other transaction would be allowed to start
> until the SERIALIZABLE READ ONLY DEFERRABLE transaction has been
> able to acquire its snapshot. For pg_dump's purposes at least,
> that seems undesirable, since a single long-running transaction at
> the time you start pg_dump would effectively DoS your system until
> the long-running transaction finishes.

Well, when you put it that way, it sounds pretty grim. :-( Since
one of the bragging points of SSI is that it doesn't introduce any
blocking beyond current snapshot isolation, I don't want to do
something here which blocks anything except the transaction which
has explicitly requested the DEFERRABLE property. I guess that,
simple as that technique might be, it just isn't a good idea.

> The alternative seems to be to drop the guarantee that a
> SERIALIZABLE READ ONLY DEFERRABLE won't be starved forever by a
> stream of overlapping non-READ ONLY transactions. Then a flag in
> the proc array that marks non-READ ONLY transactions should be
> sufficient, plus a wait-and-retry loop to take snapshots for
> SERIALIZABLE READ ONLY DEFERRABLE transactions.

If I can find a way to pause an active process I already have
functions in which I maintain the count of active SERIALIZABLE READ
WRITE transactions as they begin and end -- I could release pending
DEFERRABLE transactions when the count hits zero without any
separate loop. That has the added attraction of being a path to the
more complex checking which could allow the deferrable process to
start sooner in some circumstances. The "simple" solution with the
heavyweight lock would not have been a good path to that.

What would be the correct way for a process to put itself to sleep,
and for another process to later wake it up?

-Kevin


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Florian Pflug <fgp(at)phlo(dot)org>, drkp(at)csail(dot)mit(dot)edu, pgsql-hackers(at)postgresql(dot)org
Subject: Re: serializable read only deferrable
Date: 2010-12-06 22:02:59
Message-ID: 4CFD5D93.7040009@enterprisedb.com
Lists: pgsql-hackers

On 06.12.2010 22:53, Kevin Grittner wrote:
> What would be the correct way for a process to put itself to sleep,
> and for another process to later wake it up?

See ProcWaitForSignal/ProcSendSignal. Or the new 'latch' code.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: <drkp(at)csail(dot)mit(dot)edu>,"Florian Pflug" <fgp(at)phlo(dot)org>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: serializable read only deferrable
Date: 2010-12-06 22:20:53
Message-ID: 4CFD0D650200002500038385@gw.wicourts.gov
Lists: pgsql-hackers

Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> On 06.12.2010 22:53, Kevin Grittner wrote:
>> What would be the correct way for a process to put itself to
>> sleep, and for another process to later wake it up?
>
> See ProcWaitForSignal/ProcSendSignal. Or the new 'latch' code.

Is there a reason to prefer one over the other?

-Kevin


From: Florian Pflug <fgp(at)phlo(dot)org>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: <drkp(at)csail(dot)mit(dot)edu>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: serializable read only deferrable
Date: 2010-12-07 13:50:38
Message-ID: 103259ED-DAA2-48F6-9341-8FC620A41EF9@phlo.org
Lists: pgsql-hackers

On Dec6, 2010, at 22:53 , Kevin Grittner wrote:
>> The alternative seems to be to drop the guarantee that a
>> SERIALIZABLE READ ONLY DEFERRABLE won't be starved forever by a
>> stream of overlapping non-READ ONLY transactions. Then a flag in
>> the proc array that marks non-READ ONLY transactions should be
>> sufficient, plus a wait-and-retry loop to take snapshots for
>> SERIALIZABLE READ ONLY DEFERRABLE transactions.
>
> If I can find a way to pause an active process I already have
> functions in which I maintain the count of active SERIALIZABLE READ
> WRITE transactions as they begin and end -- I could release pending
> DEFERRABLE transactions when the count hits zero without any
> separate loop. That has the added attraction of being a path to the
> more complex checking which could allow the deferrable process to
> start sooner in some circumstances. The "simple" solution with the
> heavyweight lock would not have been a good path to that.

I'm starting to wonder if you couldn't get a weaker form of the non-starvation guarantee back by doing the waiting *after* you acquire the snapshot of a SERIALIZABLE READ ONLY transaction instead of before. AFAICS, the main reason for a SERIALIZABLE READ ONLY transaction's snapshot to be inconsistent is that it sees some transaction A as committed and B as uncommitted when on the other hand B must happen before A in any serial schedule. In other words, if there is no dangerous structure even if you add an rw-dependency edge from the SERIALIZABLE READ ONLY transaction to every concurrent transaction, the SERIALIZABLE READ ONLY transaction's snapshot is consistent. I'm thus envisioning something along the line of

1) Take a snapshot, flag the transaction as SERIALIZABLE READ ONLY DEFERRED, and add an rw-dependency to every other running READ WRITE transaction
2) Wait for all these concurrent transactions to either COMMIT or ABORT
3) Check if the transaction has been marked INCONSISTENT. If not, let the transaction proceed. If it was, start over with (1)

*) During conflict detection, you'd check if one of the participating transactions is flagged as SERIALIZABLE READ ONLY DEFERRED and mark it INCONSISTENT if it is.

Essentially, instead of adding dependencies as you go along and abort once you hit a conflict, SERIALIZABLE READ ONLY DEFERRED transactions would assume the worst case from the start and thus be able to bypass the more detailed checks later on.

With this scheme, you'd at least stand some chance of eventually acquiring a consistent snapshot, even in the case of an endless stream of overlapping READ WRITE transactions.

I have to admit though that I didn't really think this through thoroughly yet, it was more of a quick idea I got after pondering this for a bit before I went to bed yesterday.

best regards,
Florian Pflug


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Florian Pflug" <fgp(at)phlo(dot)org>
Cc: <drkp(at)csail(dot)mit(dot)edu>,<pgsql-hackers(at)postgresql(dot)org>
Subject: Re: serializable read only deferrable
Date: 2010-12-07 16:14:24
Message-ID: 4CFE090002000025000383A4@gw.wicourts.gov
Lists: pgsql-hackers

Florian Pflug <fgp(at)phlo(dot)org> wrote:

> reason for a SERIALIZABLE READ ONLY transaction's snapshot to be
> inconsistent is that it sees some transaction A as committed and B as
> uncommitted when on the other hand B must happen before A in any
> serial schedule.

Precisely right, and very well stated.

> I'm thus envisioning something along the line of
>
> 1) Take a snapshot, flag the transaction as SERIALIZABLE READ ONLY
> DEFERRED, and add an rw-dependency to every other running READ
> WRITE transaction
> 2) Wait for all these concurrent transactions to either COMMIT or
> ABORT
> 3) Check if the transaction has been marked INCONSISTENT. If not,
> let the transaction proceed. If it was, start over with (1)
>
> *) During conflict detection, you'd check if one of the
> participating transactions is flagged as SERIALIZABLE READ ONLY
> DEFERRED and mark it INCONSISTENT if it is.

That is brilliant.

> Essentially, instead of adding dependencies as you go along and
> abort once you hit a conflict, SERIALIZABLE READ ONLY DEFERRED
> transactions would assume the worst case from the start and thus
> be able to bypass the more detailed checks later on.

Right -- such a transaction, having acquired a good snapshot, could
release all SSI resources and run without any of the SSI overhead.

> With this scheme, you'd at least stand some chance of eventually
> acquiring a consistent snapshot, even in the case of an endless
> stream of overlapping READ WRITE transactions.

Yeah, I'd been twisting ideas around trying to find a good way to do
this; you've got it right at the conceptual level, I think.

> I have to admit though that I didn't really think this through
> thoroughly yet, it was more of a quick idea I got after pondering
> this for a bit before I went to bed yesterday.

[reads through it a few more times, sips caffeine, and thinks]

Really, what you care about is whether any of the READ WRITE
transactions active at the time the snapshot was acquired commit
after developing an rw-conflict with a transaction which committed
before the READ ONLY DEFERRABLE snapshot was acquired. (The reader
would have to appear first in any serial schedule, yet the READ ONLY
transaction can see the effects of the writer but not the reader.)
Which brings up another point: that reader must also write to a
permanent table before it commits in order to become the pivot in
the dangerous structure.

Pseudo-code of idea (conveniently ignoring locking issues and
non-serializable transactions):

// serializable read only deferrable xact
do
{
    get a snapshot
    clear inconsistent flag
    if (no concurrent read write xacts)
        break;  // we got it the easy way
    associate all active read write xacts with this xact
    block until told to wake
} while (inconsistent);
clear xact from any SSI structures it's in
run with the snapshot

// each xact associated with the above
on transaction completion
    if (commit
        and has written
        and has conflict out
            to xact committed before deferrable snapshot)
    {
        flag deferrable as inconsistent
        unblock deferrable xact
    }
    else
        if this is termination of last associated read write xact
            unblock deferrable xact
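
For the "block until told to wake" and "unblock deferrable xact"
parts I'm picturing the ProcWaitForSignal()/ProcSendSignal() pattern
Heikki pointed at. Very rough sketch; the shared variables below are
stand-ins for whatever bookkeeping ends up in SSI shared memory
(suitably locked), not existing structures:

#include "postgres.h"
#include "miscadmin.h"
#include "storage/proc.h"

/* Stand-ins; these would really live in shared memory under a lock. */
static volatile int  DeferrableWaiterPid = 0;
static volatile bool DeferrableInconsistent = false;
static volatile int  AssociatedWriterCount = 0;

/* deferrable xact: sleep until flagged inconsistent or no writers remain */
static void
DeferrableWait(void)
{
    DeferrableWaiterPid = MyProcPid;
    while (AssociatedWriterCount > 0 && !DeferrableInconsistent)
        ProcWaitForSignal();
    DeferrableWaiterPid = 0;
}

/* associated read-write xact at completion: per the pseudo-code above */
static void
AssociatedWriterCompleted(bool makesSnapshotInconsistent)
{
    if (makesSnapshotInconsistent)
        DeferrableInconsistent = true;
    if (--AssociatedWriterCount == 0 || makesSnapshotInconsistent)
    {
        if (DeferrableWaiterPid != 0)
            ProcSendSignal(DeferrableWaiterPid);
    }
}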

Seem sane?

-Kevin


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <drkp(at)csail(dot)mit(dot)edu>,<pgsql-hackers(at)postgresql(dot)org>
Subject: Re: serializable read only deferrable
Date: 2010-12-08 00:45:48
Message-ID: 4CFE80DC02000025000383E9@gw.wicourts.gov
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I assume this would have to be a "hard" definition of READ ONLY,
> not the rather squishy definition we use now?

Oh, I just went through the code on setting READ ONLY and discovered
that contrary to the standard *and* the PostgreSQL documentation,
you can change the status of a transaction between READ ONLY and
READ WRITE at will. Yeah, that's a problem for my intended use.
Many optimizations would need to go right out the window, and the
false positive rate under SSI would be high.

> How would we manage the compatibility implications?

Comply with the standard. The bright side of this is that it
wouldn't require any change to our user docs.

http://www.postgresql.org/docs/current/interactive/sql-start-transaction.html

| This command begins a new transaction block. If the isolation
| level or read/write mode is specified, the new transaction has
| those characteristics, as if SET TRANSACTION was executed. This is
| the same as the BEGIN command.

and on the same page:

| Compatibility
|
| In the standard, it is not necessary to issue START TRANSACTION to
| start a transaction block: any SQL command implicitly begins a
| block. PostgreSQL's behavior can be seen as implicitly issuing a
| COMMIT after each command that does not follow START TRANSACTION
| (or BEGIN), and it is therefore often called "autocommit". Other
| relational database systems might offer an autocommit feature as a
| convenience.

No mention of "and you can change back and forth between READ ONLY
and READ WRITE any time during the transaction, including between
reads and writes, as many times as you like."

Was there a justification for this behavior, or was it just not
implemented carefully? Does anyone depend on the current
behavior?

test=# create table asdf (id int not null primary key);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index
"asdf_pkey" for table "asdf"
CREATE TABLE
test=# set default_transaction_isolation = serializable;
SET
test=# set transaction read only;
SET
test=# begin;
BEGIN
test=# set transaction read only;
SET
test=# select 1;
 ?column?
----------
        1
(1 row)

test=# set transaction read write;
SET
test=# insert into asdf values (1);
INSERT 0 1
test=# set transaction read only;
SET
test=# select * from asdf;
 id
----
  1
(1 row)

test=# set transaction read write;
SET
test=# insert into asdf values (2);
INSERT 0 1
test=# commit;
COMMIT

I find that to be a huge POLA violation. I will happily prepare a
patch to fix this if there is agreement that we want it. I really
need READ ONLY *transactions*, not READ ONLY *moments* within
transactions to do any optimization based on the property.

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: drkp(at)csail(dot)mit(dot)edu, pgsql-hackers(at)postgresql(dot)org
Subject: Re: serializable read only deferrable
Date: 2010-12-08 01:36:13
Message-ID: 2844.1291772173@sss.pgh.pa.us
Lists: pgsql-hackers

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Oh, I just went through the code on setting READ ONLY and discovered
> that contrary to the standard *and* the PostgreSQL documentation,
> you can change the status of a transaction between READ ONLY and
> READ WRITE at will. Yeah, that's a problem for my intended use.
> Many optimizations would need to go right out the window, and the
> false positive rate under SSI would be high.

I believe you had better support the locution

begin;
set transaction read only;
...

I agree that letting it be changed back to read/write after that is
surprising and unnecessary. Perhaps locking down the setting at the
time of first grabbing a snapshot would be appropriate. IIRC that's
how it works for transaction isolation level, and this seems like it
ought to work the same.

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <drkp(at)csail(dot)mit(dot)edu>,<pgsql-hackers(at)postgresql(dot)org>
Subject: Re: serializable read only deferrable
Date: 2010-12-08 14:48:52
Message-ID: 4CFF46740200002500038420@gw.wicourts.gov
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I agree that letting it be changed back to read/write after that
> is surprising and unnecessary. Perhaps locking down the setting
> at the time of first grabbing a snapshot would be appropriate.
> IIRC that's how it works for transaction isolation level, and this
> seems like it ought to work the same.

Agreed. I can create a patch today to implement this. The thing
which jumps out first is that assign_transaction_read_only probably
needs to move to variable.c so that it can reference
FirstSnapshotSet as the transaction isolation code does. The
alternative would be to include snapmgr.h in guc.c, which seems less
appealing. Agreed? Other ideas?
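
Just to make sure we're talking about the same test, the core of it
would be something on the order of the sketch below -- hook plumbing
aside, and with the function name and message wording only
placeholders:

#include "postgres.h"
#include "access/xact.h"
#include "utils/snapmgr.h"

/*
 * Sketch only: reject a switch from READ ONLY back to READ WRITE once
 * the first snapshot has been taken, mirroring the isolation-level
 * behavior.
 */
static void
check_read_only_not_weakened(bool newval)
{
    if (!newval && XactReadOnly && FirstSnapshotSet)
        ereport(ERROR,
                (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
                 errmsg("transaction read-write mode must be set before any query")));
}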

-Kevin


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <drkp(at)csail(dot)mit(dot)edu>,<pgsql-hackers(at)postgresql(dot)org>
Subject: Re: serializable read only deferrable
Date: 2010-12-08 16:56:43
Message-ID: 4CFF646B020000250003844D@gw.wicourts.gov
Lists: pgsql-hackers

I wrote:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> I agree that letting it be changed back to read/write after that
>> is surprising and unnecessary. Perhaps locking down the setting
>> at the time of first grabbing a snapshot would be appropriate.
>> IIRC that's how it works for transaction isolation level, and
>> this seems like it ought to work the same.
>
> Agreed. I can create a patch today to implement this.

Attached.

Accomplished more through mimicry (based on setting transaction
isolation level) than profound understanding of the code involved;
but it passes all regression tests on both `make check` and `make
installcheck-world`. This includes a new regression test that an
attempt to change it after a query fails. I've poked at it with
various ad hoc tests, and it is behaving as expected in those.

I wasn't too confident how to word the new failure messages.

-Kevin

Attachment Content-Type Size
read-only-1.patch text/plain 4.6 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: drkp(at)csail(dot)mit(dot)edu, pgsql-hackers(at)postgresql(dot)org
Subject: Re: serializable read only deferrable
Date: 2010-12-08 18:32:16
Message-ID: 18498.1291833136@sss.pgh.pa.us
Lists: pgsql-hackers

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Attached.

> Accomplished more through mimicry (based on setting transaction
> isolation level) than profound understanding of the code involved;
> but it passes all regression tests on both `make check` and `make
> installcheck-world`. This includes a new regression test that an
> attempt to change it after a query fails. I've poked at it with
> various ad hoc tests, and it is behaving as expected in those.

Hmm. This patch disallows the case of creating a read-only
subtransaction of a read-write parent. That's a step backwards.
I'm not sure how we could enforce that the property not change
after the first query of a subxact, but maybe we don't care that much?
Do your optimizations pay attention to local read-only in a subxact?

regards, tom lane


From: Dan Ports <drkp(at)csail(dot)mit(dot)edu>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Florian Pflug <fgp(at)phlo(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: serializable read only deferrable
Date: 2010-12-10 05:42:01
Message-ID: 20101210054201.GB25308@csail.mit.edu
Lists: pgsql-hackers

On Tue, Dec 07, 2010 at 10:14:24AM -0600, Kevin Grittner wrote:
> > Essentially, instead of adding dependencies as you go along and
> > abort once you hit a conflict, SERIALIZABLE READ ONLY DEFERRED
> > transactions would assume the worst case from the start and thus
> > be able to bypass the more detailed checks later on.
>
> Right -- such a transaction, having acquired a good snapshot, could
> release all SSI resources and run without any of the SSI overhead.

Yes, this makes sense. If no running transaction has ever read, and
will never read before COMMIT, any value that's modified by a
concurrent transaction, then they will not create snapshot anomalies,
and the current snapshot has a place in the serial ordering.

> > With this scheme, you'd at least stand some chance of eventually
> > acquiring a consistent snapshot, even in the case of an endless
> > stream of overlapping READ WRITE transactions.
>
> Yeah, I'd been twisting ideas around trying to find a good way to do
> this; you've got it right at the conceptual level, I think.

The only thing I'm worried about here is how much risk of starvation
remains. You'd need to wait until there are no running r/w transactions
accessing overlapping data sets; for some applications that might not
be any better than waiting for the system to be idle. But I think
there's no way around that; it's just the price you have to pay to get
a snapshot that can never see an anomaly.

> Pseudo-code of idea (conveniently ignoring locking issues and
> non-serializable transactions):

This seems reasonable to me. Let me know if you need help implementing
it; I have some spare cycles right now.

Dan

--
Dan R. K. Ports MIT CSAIL http://drkp.net/