Transaction Snapshot Cloning

Lists: pgsql-hackers
From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Transaction Snapshot Cloning
Date: 2008-01-11 16:15:06
Message-ID: 1200068106.4266.1131.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Now that we have a txid_snapshot datatype, it seems easy to imagine that
we might use that to pass snapshot information around between sessions.

If we had a function
replace_serializable_snapshot(master_xid, txid_snapshot)

this would allow us to use the txid_snapshot values to replace our
transaction's serializable snapshot. We would only allow this to execute
when IsXactIsoLevelSerializable and XactReadOnly are true. The
master_xid parameter would be used as a crosscheck to ensure we were
being passed snapshot information about a current transaction on an
active backend, using TransactionIdIsActive(master_xid).

What I'm thinking about is how we might use this to have multiple
sessions working simultaneously on tasks like unloading data, plus its
the first step on the way to running other kinds of parallel operations
as well.

Anything wrong with that idea?

Thanks,

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-11 20:05:34
Message-ID: 24020.1200081934@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> If we had a function
> replace_serializable_snapshot(master_xid, txid_snapshot)
> this would allow us to use the txid_snapshot values to replace our
> transaction's serializable snapshot.

... whereupon we'd get wrong answers. Certainly you could not allow
transaction xmin to go backwards, and I'm not sure what other
restrictions there would be, but the whole thing gives me the willies.

> What I'm thinking about is how we might use this to have multiple
> sessions working simultaneously on tasks like unloading data,

Then what you want is a function that says "clone the snapshot of that
specified other transaction". Not a function that lets the user
substitute random snapshot data and tell you he thinks it's valid.
The user isn't going to have any legal way to transfer the data between
backends anyway, since no transaction can see results of an uncommitted
other transaction. There *has* to be some backdoor channel involved
there, and you might as well make it carry the data without the user
touching it.

The whole thing seems a bit backwards anyway. What you'd really want
for ease of use is some kind of "fork this session" operation, that
is push the info to a new process not pull it.

regards, tom lane


From: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-11 20:31:08
Message-ID: 9362e74e0801111231i491b7126od4601c142d14c067@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>
> The user isn't going to have any legal way to transfer the data between
> backends anyway, since no transaction can see results of an uncommitted
> other transaction. There *has* to be some backdoor channel involved
> there, and you might as well make it carry the data without the user
> touching it.
>
> The whole thing seems a bit backwards anyway. What you'd really want
> for ease of use is some kind of "fork this session" operation, that
> is push the info to a new process not pull it.

Is it a good idea to fork this new process under the same transaction id?.
In that way the backends will be seeing the same versions of data among
themselves.... Are you mentioning the same here?

Thanks,
Gokul.


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-11 20:39:04
Message-ID: 1200083944.4266.1249.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 2008-01-11 at 15:05 -0500, Tom Lane wrote:

> the whole thing gives me the willies.

Me too :-)

> > What I'm thinking about is how we might use this to have multiple
> > sessions working simultaneously on tasks like unloading data,
>
> Then what you want is a function that says "clone the snapshot of that
> specified other transaction".

That's exactly what I want.

I was thinking of a few use cases:

1. parallel query, where multiple backends work on parts of one query

2. parallel unload, where multiple backends work on different tables
that form part of the same set of tables to be unloaded

> Not a function that lets the user
> substitute random snapshot data and tell you he thinks it's valid.
> The user isn't going to have any legal way to transfer the data between
> backends anyway, since no transaction can see results of an uncommitted
> other transaction. There *has* to be some backdoor channel involved
> there, and you might as well make it carry the data without the user
> touching it.
>
> The whole thing seems a bit backwards anyway. What you'd really want
> for ease of use is some kind of "fork this session" operation, that
> is push the info to a new process not pull it.

For (1) I think a "fork this session" operation sounds right.

For (2) I definitely want to connect multiple times and yet have all
sessions see the same snapshot. Yes we want multiple backends, but we
also want multiple paths to the client.

For (2) there's a very simple way of transferring the data between
sessions:
a) we connect on session 1 as a serializable transaction
b) we ask session 1 for its snapshot
c) we then connect on session 2 as a serializable transaction
d) we then execute "select replace_serializable_snapshot(...)"

We already have everything in place to do a), b) and c)

So yes, its a backdoor channel, via a single client with multiple
sessions and the xact datatype.

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > If we had a function
> > replace_serializable_snapshot(master_xid, txid_snapshot)
> > this would allow us to use the txid_snapshot values to replace our
> > transaction's serializable snapshot.
>
> ... whereupon we'd get wrong answers. Certainly you could not allow
> transaction xmin to go backwards, and I'm not sure what other
> restrictions there would be, but the whole thing gives me the willies.

So sure it gives me the willies, but I don't see a wrong answer there.
We're not looking for a general time-travel utility, I just want to
connect and run a COPY TO operation that sees the same data that another
session sees. Nothing fancy, so caveats can be as long as your arm as
long as we can run COPY TO on a naked table.

There are uses of that for parallel pg_dump, parallel slony etc..,
helping us upgrade faster to new releases.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-11 20:42:42
Message-ID: 1200084162.4266.1252.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 2008-01-11 at 20:39 +0000, Simon Riggs wrote:
> On Fri, 2008-01-11 at 15:05 -0500, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > > If we had a function
> > > replace_serializable_snapshot(master_xid, txid_snapshot)
> > > this would allow us to use the txid_snapshot values to replace our
> > > transaction's serializable snapshot.
> >
> > ... whereupon we'd get wrong answers. Certainly you could not allow
> > transaction xmin to go backwards, and I'm not sure what other
> > restrictions there would be, but the whole thing gives me the willies.

Sorry, forgot to add
- global xmin isn't going backwards
- neither is latest completed xid

The xmin of the transaction will go backwards, but as long as we don't
do anything prior to the setting of the cloned snapshot, what can go
wrong? :-)

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Chris Browne <cbbrowne(at)acm(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-11 22:53:21
Message-ID: 60odbsug7y.fsf@dba2.int.libertyrms.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

simon(at)2ndquadrant(dot)com (Simon Riggs) writes:
> On Fri, 2008-01-11 at 20:39 +0000, Simon Riggs wrote:
>> On Fri, 2008-01-11 at 15:05 -0500, Tom Lane wrote:
>> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
>> > > If we had a function
>> > > replace_serializable_snapshot(master_xid, txid_snapshot)
>> > > this would allow us to use the txid_snapshot values to replace our
>> > > transaction's serializable snapshot.
>> >
>> > ... whereupon we'd get wrong answers. Certainly you could not allow
>> > transaction xmin to go backwards, and I'm not sure what other
>> > restrictions there would be, but the whole thing gives me the willies.
>
> Sorry, forgot to add
> - global xmin isn't going backwards
> - neither is latest completed xid
>
> The xmin of the transaction will go backwards, but as long as we don't
> do anything prior to the setting of the cloned snapshot, what can go
> wrong? :-)

Note that we required that the "provider transaction" have the
attributes IsXactIsoLevelSerializable and XactReadOnly both being
true, so we have the mandates that the resultant backend process:

a) Is in read only mode, and
b) Is in serializable mode.

That's a pair of (possibly stretching-wide!) suspenders worth of
support from Evil...
--
(format nil "~S(at)~S" "cbbrowne" "cbbrowne.com")
http://linuxfinances.info/info/internet.html
Trying to be happy is like trying to build a machine for which the
only specification is that it should run noiselessly.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-12 00:23:10
Message-ID: 28701.1200097390@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> For (2) there's a very simple way of transferring the data between
> sessions:
> a) we connect on session 1 as a serializable transaction
> b) we ask session 1 for its snapshot
> c) we then connect on session 2 as a serializable transaction
> d) we then execute "select replace_serializable_snapshot(...)"

[ blanches... ] Can you say "security hole"?

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Chris Browne <cbbrowne(at)acm(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-12 00:57:01
Message-ID: 29072.1200099421@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Chris Browne <cbbrowne(at)acm(dot)org> writes:
> Note that we required that the "provider transaction" have the
> attributes IsXactIsoLevelSerializable and XactReadOnly both being
> true, so we have the mandates that the resultant backend process:

> a) Is in read only mode, and
> b) Is in serializable mode.

If XactReadOnly were a "hard" read only constraint, that argument
might be worth the electrons it's written on. I quote TFM:

: When a transaction is read-only, the following SQL commands are
: disallowed: INSERT, UPDATE, DELETE, and COPY FROM if the table they
: would write to is not a temporary table; all CREATE, ALTER, and DROP
: commands; COMMENT, GRANT, REVOKE, TRUNCATE; and EXPLAIN ANALYZE and
: EXECUTE if the command they would execute is among those listed. This is
: a high-level notion of read-only that does not prevent all writes to
: disk.

... and it doesn't prevent the need for a separate XID, either.

Now I think someone was looking into a "hard" read only mode for
use in doing read-only queries against a PITR slave; if that
ever happens it might be adaptable to serve this purpose too.
But we haven't got it today.

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-12 09:32:26
Message-ID: 1200130346.4266.1263.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 2008-01-11 at 19:23 -0500, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > For (2) there's a very simple way of transferring the data between
> > sessions:
> > a) we connect on session 1 as a serializable transaction
> > b) we ask session 1 for its snapshot
> > c) we then connect on session 2 as a serializable transaction
> > d) we then execute "select replace_serializable_snapshot(...)"
>
> [ blanches... ] Can you say "security hole"?

Static on the line, sorry.

I'm hearing "useful superuser-only capability". ;-)

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-12 16:57:25
Message-ID: 7120.1200157045@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> On Fri, 2008-01-11 at 19:23 -0500, Tom Lane wrote:
>> [ blanches... ] Can you say "security hole"?

> Static on the line, sorry.
> I'm hearing "useful superuser-only capability". ;-)

It would be far *more* useful if it didn't have to be superuser-only.
And since the actual details of the snapshot content are really of
zero interest to the user, I think making it pass through his hands
is simply misdesign.

regards, tom lane


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-12 18:46:16
Message-ID: 87prw6g9vr.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
>> On Fri, 2008-01-11 at 19:23 -0500, Tom Lane wrote:
>>> [ blanches... ] Can you say "security hole"?
>
>> Static on the line, sorry.
>> I'm hearing "useful superuser-only capability". ;-)
>
> It would be far *more* useful if it didn't have to be superuser-only.
> And since the actual details of the snapshot content are really of
> zero interest to the user, I think making it pass through his hands
> is simply misdesign.

Well we already have the snapshot appearing in txid_current_snapshot(). It
wouldn't be too hard to go through that and verify that it satisfies the
current minimum requirements (xmin >= globalxmin and all visible xids are
committed).

The risk is that it would impose restrictions on what we can do in the future.
We were talking about doing snapshot bookkeeping which would allow us to
vacuum transactions which are known invisible to everyone even if they haven't
passed globalxmin yet. This might make it harder to do that.

If we instead pass in an xid or virtualxid for a live transaction to adopt the
serializable snapshot of that snapshot then we can be sure it doesn't change
any invariants about what snapshots can appear in the future. So vacuuming
strategy wouldn't have to change at all.

The flip side is that that limits the use cases the feature could be used for.
You wouldn't be able to store snapshots in a table somewhere so you can
generate old reports or something like that.

To do something like that the user would have to create a prepared transaction
to save the snapshot. I think that makes sense though since effectively it's
just requiring that the user explicitly do what would otherwise be a hidden
implicit requirement -- that the user do something to hold globalxmin back to
avoid having the snapshots expire.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's Slony Replication support!


From: "Marko Kreen" <markokr(at)gmail(dot)com>
To: "Gregory Stark" <stark(at)enterprisedb(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-12 19:41:36
Message-ID: e51f66da0801121141n42fcdcado6c136868bb3ca7b2@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/12/08, Gregory Stark <stark(at)enterprisedb(dot)com> wrote:
> "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> > Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> >> On Fri, 2008-01-11 at 19:23 -0500, Tom Lane wrote:
> >>> [ blanches... ] Can you say "security hole"?
> >
> >> Static on the line, sorry.
> >> I'm hearing "useful superuser-only capability". ;-)
> >
> > It would be far *more* useful if it didn't have to be superuser-only.
> > And since the actual details of the snapshot content are really of
> > zero interest to the user, I think making it pass through his hands
> > is simply misdesign.
>
> Well we already have the snapshot appearing in txid_current_snapshot(). It
> wouldn't be too hard to go through that and verify that it satisfies the
> current minimum requirements (xmin >= globalxmin and all visible xids are
> committed).

IMHO the txid_snapshot is inappropriate for such use. It is really
not meant to be fed back to core code. The whole 8byteness would
be totally pointless. The data passing user hands creates new
error situations.

Better would be something like this:

pg_publish_transaction_state()

returns unique random unguessable cookie. Internally it stores
the xid, snapshot and potentially any other interesting state.
The state will be valid until the transaction end.

Then later user can call in some other connection:

pg_use_transaction_state(<cookie>)

that will then use the state. Main plus of such scheme is that
user can freely decide how it wants to use the cookie, without
any security problems. Also it hides actual state saved from
user which means it can be changed easily.

--
marko


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-12 20:25:10
Message-ID: 1200169510.4266.1378.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, 2008-01-12 at 11:57 -0500, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > On Fri, 2008-01-11 at 19:23 -0500, Tom Lane wrote:
> >> [ blanches... ] Can you say "security hole"?
>
> > Static on the line, sorry.
> > I'm hearing "useful superuser-only capability". ;-)
>
> It would be far *more* useful if it didn't have to be superuser-only.
> And since the actual details of the snapshot content are really of
> zero interest to the user, I think making it pass through his hands
> is simply misdesign.

No, its deliberate, because I want this to work on prior releases. I
didn't clearly explain that, sorry.

For 8.4+ I will design the internal route for this feature.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-12 20:42:27
Message-ID: 21791.1200170547@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> On Sat, 2008-01-12 at 11:57 -0500, Tom Lane wrote:
>> It would be far *more* useful if it didn't have to be superuser-only.
>> And since the actual details of the snapshot content are really of
>> zero interest to the user, I think making it pass through his hands
>> is simply misdesign.

> No, its deliberate, because I want this to work on prior releases. I
> didn't clearly explain that, sorry.

Why are you trying to get the community to do design for things that
are certainly not going to appear in any community release?

In any case, even if you were trying to back-port this to prior
releases, I don't see how exposing the innards to clients will make
it any easier. All the hard parts will be the same.

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-12 21:08:32
Message-ID: 1200172112.4266.1403.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, 2008-01-12 at 15:42 -0500, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > On Sat, 2008-01-12 at 11:57 -0500, Tom Lane wrote:
> >> It would be far *more* useful if it didn't have to be superuser-only.
> >> And since the actual details of the snapshot content are really of
> >> zero interest to the user, I think making it pass through his hands
> >> is simply misdesign.
>
> > No, its deliberate, because I want this to work on prior releases. I
> > didn't clearly explain that, sorry.
>
> Why are you trying to get the community to do design for things that
> are certainly not going to appear in any community release?

Reasonable question, but its not really my way to do such things.

I started looking at this for an 8.4 only solution and will be
submitting a patch for the internal route, eventually.

After looking at this for a while, I realised the snapshots aren't
easily accessible, so would require a specific publishing function. My
main focus for this was Slony, so I realised that Slony already has this
facility. So if we did this via an installable module approach then we
would be able to use it much sooner for upgrading to 8.3, rather than
waiting for 8.4

It's too late to get anything into 8.3, but its not too late to get into
a Slony version that can help upgrade to 8.3.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-12 21:22:12
Message-ID: 22219.1200172932@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> ... So if we did this via an installable module approach then we
> would be able to use it much sooner for upgrading to 8.3, rather than
> waiting for 8.4

I think it borders on ludicrous to imagine making this work with just an
installable module and no core-code changes. So no, I refuse to spend
any time helping to design an implementation for 8.3. It'll be a hard
enough problem to make this work for 8.4.

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-14 09:34:45
Message-ID: 1200303285.4266.1487.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, 2008-01-12 at 16:22 -0500, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > ... So if we did this via an installable module approach then we
> > would be able to use it much sooner for upgrading to 8.3, rather than
> > waiting for 8.4
>
> I think it borders on ludicrous to imagine making this work with just an
> installable module and no core-code changes. So no, I refuse to spend
> any time helping to design an implementation for 8.3.

Refuse is a strong word, so apologies if I've offended.

I already know how to write it, the main question was how dangerous is
it and I think we answered that. Misuse seems to be the issue you seem
to be worried about. In the wrong hands it could prove to be a Foot Gun
with a larger than normal kill zone.

Dangerous enough that you actively want me to not write it? Or just
saying you're not sure it can be done?

> It'll be a hard enough problem to make this work for 8.4.

Agreed.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Chris Browne <cbbrowne(at)acm(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-16 16:07:04
Message-ID: 478E2BA8.8060004@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Chris Browne <cbbrowne(at)acm(dot)org> writes:
>> Note that we required that the "provider transaction" have the
>> attributes IsXactIsoLevelSerializable and XactReadOnly both being
>> true, so we have the mandates that the resultant backend process:
>
>> a) Is in read only mode, and
>> b) Is in serializable mode.
>
> If XactReadOnly were a "hard" read only constraint, that argument
> might be worth the electrons it's written on. I quote TFM:
>
> Now I think someone was looking into a "hard" read only mode for
> use in doing read-only queries against a PITR slave; if that
> ever happens it might be adaptable to serve this purpose too.
> But we haven't got it today.

That would haven been me then ;-)

I think that lazy xid assignment actually got us 90% of the way towards
a "hard" transaction read-only constraint - nearly all data-modfying
operation surely depend on the xact having an xid assigned, no? (The
only exception might be nextval() and friends).

I seem to remember there being some pushback to the idea of changing the
semantics of "set transaction isolation read only" from "soft" to "hard"
semantics though - on the basis that it might break existing
applications. If that has changed (or my memory tricks me ;-) ) I'd
volunteer to create a patch for 8.4 to make "set transaction read only"
a hard constraint.

regards, Florian Pflug


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
Cc: Chris Browne <cbbrowne(at)acm(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-16 16:49:28
Message-ID: 24795.1200502168@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Florian G. Pflug" <fgp(at)phlo(dot)org> writes:
> I seem to remember there being some pushback to the idea of changing the
> semantics of "set transaction isolation read only" from "soft" to "hard"
> semantics though - on the basis that it might break existing
> applications. If that has changed (or my memory tricks me ;-) ) I'd
> volunteer to create a patch for 8.4 to make "set transaction read only"
> a hard constraint.

AFAICT it would violate the SQL spec as well as breaking backward
compatibility. SQL99 4.32 saith

An SQL-transaction has an access mode that is either read-only
or read-write. The access mode may be explicitly set by a <set
transaction statement> before the start of an SQL-transaction or
by the use of a <start transaction statement> to start an SQL-
transaction; otherwise, it is implicitly set to the default access
mode for the SQL-session before each SQL-transaction begins. If no
<set session characteristics statement> has set the default access
mode for the SQL-session, then the default access mode for the SQL-
session is read-write. The term read-only applies only to viewed
tables and persistent base tables.

That last sentence is the basis for exempting temp tables from the
read-only restriction.

I'm not sure what the most convenient user API would be for an on-demand
hard-read-only mode, but we can't use SET TRANSACTION READ ONLY for it.
It'd have to be some other syntax. Maybe just use a GUC variable
instead of bespoke syntax? SET TRANSACTION is really just syntactic
sugar for GUC SET operations anyway ...

regards, tom lane


From: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Chris Browne <cbbrowne(at)acm(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-17 10:56:48
Message-ID: 478F3470.1020206@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> I'm not sure what the most convenient user API would be for an on-demand
> hard-read-only mode, but we can't use SET TRANSACTION READ ONLY for it.
> It'd have to be some other syntax. Maybe just use a GUC variable
> instead of bespoke syntax? SET TRANSACTION is really just syntactic
> sugar for GUC SET operations anyway ...

We could reuse the transaction_read_only GUC, adding "strict" as a 3rd
allowed value beside "on" and "off". And maybe make "ansi" an alias for
"on" to emphasize that one behavior is what the standard wants, and the
other is a postgres extension.

regards, Florian Pflug


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-20 12:13:54
Message-ID: 1200831234.4255.476.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, 2008-01-12 at 18:46 +0000, Gregory Stark wrote:

> To do something like that the user would have to create a prepared transaction
> to save the snapshot. I think that makes sense though since effectively it's
> just requiring that the user explicitly do what would otherwise be a hidden
> implicit requirement -- that the user do something to hold globalxmin back to
> avoid having the snapshots expire.

This is a good idea which I will want to develop in the future, not yet
though.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Chris Browne <cbbrowne(at)acm(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-20 12:16:02
Message-ID: 1200831362.4255.479.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 2008-01-17 at 11:56 +0100, Florian G. Pflug wrote:
> Tom Lane wrote:
> > I'm not sure what the most convenient user API would be for an on-demand
> > hard-read-only mode, but we can't use SET TRANSACTION READ ONLY for it.
> > It'd have to be some other syntax. Maybe just use a GUC variable
> > instead of bespoke syntax? SET TRANSACTION is really just syntactic
> > sugar for GUC SET operations anyway ...
>
> We could reuse the transaction_read_only GUC, adding "strict" as a 3rd
> allowed value beside "on" and "off". And maybe make "ansi" an alias for
> "on" to emphasize that one behavior is what the standard wants, and the
> other is a postgres extension.

Sounds OK to me. We need this to be enforced for Hot Standby, though it
seems useful of itself. If we can break down the Hot Standby stuff into
smaller chunks, it will make it easier for everybody to agree.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: "Gregory Stark" <stark(at)enterprisedb(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-20 12:37:23
Message-ID: 47934083.5050904@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs wrote:
> On Sat, 2008-01-12 at 18:46 +0000, Gregory Stark wrote:
>
>> To do something like that the user would have to create a prepared transaction
>> to save the snapshot. I think that makes sense though since effectively it's
>> just requiring that the user explicitly do what would otherwise be a hidden
>> implicit requirement -- that the user do something to hold globalxmin back to
>> avoid having the snapshots expire.
>
> This is a good idea which I will want to develop in the future, not yet
> though.

I haven't been following this thread in detail, but I'd just like to
point out that there's a couple features in the XA spec that we don't
currently support:

- ability to "stop" a transaction, and resume it later, executing other
transactions in between.
- ability to stop a transaction, and resume it later in another connection.

Neither of these are essential for two-phase commit, which is what the
spec is for, but if they happened to fall out of some other work, it
would be nice...

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-20 13:18:04
Message-ID: 1200835084.4255.514.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, 2008-01-20 at 12:37 +0000, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Sat, 2008-01-12 at 18:46 +0000, Gregory Stark wrote:
> >
> >> To do something like that the user would have to create a prepared transaction
> >> to save the snapshot. I think that makes sense though since effectively it's
> >> just requiring that the user explicitly do what would otherwise be a hidden
> >> implicit requirement -- that the user do something to hold globalxmin back to
> >> avoid having the snapshots expire.
> >
> > This is a good idea which I will want to develop in the future, not yet
> > though.
>
> I haven't been following this thread in detail, but I'd just like to
> point out that there's a couple features in the XA spec that we don't
> currently support:
>
> - ability to "stop" a transaction, and resume it later, executing other
> transactions in between.
> - ability to stop a transaction, and resume it later in another connection.
>
> Neither of these are essential for two-phase commit, which is what the
> spec is for, but if they happened to fall out of some other work, it
> would be nice...

Thanks for the info.

Sounds like this idea would provide some of the groundwork for the
latter concept.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-20 15:11:28
Message-ID: 87ve5obkgv.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Simon Riggs" <simon(at)2ndquadrant(dot)com> writes:

> On Sat, 2008-01-12 at 18:46 +0000, Gregory Stark wrote:
>
>> To do something like that the user would have to create a prepared transaction
>> to save the snapshot. I think that makes sense though since effectively it's
>> just requiring that the user explicitly do what would otherwise be a hidden
>> implicit requirement -- that the user do something to hold globalxmin back to
>> avoid having the snapshots expire.
>
> This is a good idea which I will want to develop in the future, not yet
> though.

I didn't mean this as an additional feature. I'm talking about how users would
use the two very different proposed interfaces.

In your version the user can save the actual snapshot somewhere and then use
it later. He'll presumably get an error if the snapshot is no longer usable
and there's no way for him to protect it and guarantee it's still usable.

In Tom's version the user can only copy the snapshot from some other running
session. It's necessarily still valid because the session is using it. But if
the user wants to save it for later he'll have to create a session (or
prepared transaction) to hold the snapshot.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's RemoteDBA services!


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-01-20 19:31:41
Message-ID: 1200857501.4255.549.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, 2008-01-20 at 15:11 +0000, Gregory Stark wrote:
> "Simon Riggs" <simon(at)2ndquadrant(dot)com> writes:
>
> > On Sat, 2008-01-12 at 18:46 +0000, Gregory Stark wrote:
> >
> >> To do something like that the user would have to create a prepared transaction
> >> to save the snapshot. I think that makes sense though since effectively it's
> >> just requiring that the user explicitly do what would otherwise be a hidden
> >> implicit requirement -- that the user do something to hold globalxmin back to
> >> avoid having the snapshots expire.
> >
> > This is a good idea which I will want to develop in the future, not yet
> > though.
>
> I didn't mean this as an additional feature. I'm talking about how users would
> use the two very different proposed interfaces.
>
> In your version the user can save the actual snapshot somewhere and then use
> it later. He'll presumably get an error if the snapshot is no longer usable
> and there's no way for him to protect it and guarantee it's still usable.
>
> In Tom's version the user can only copy the snapshot from some other running
> session. It's necessarily still valid because the session is using it. But if
> the user wants to save it for later he'll have to create a session (or
> prepared transaction) to hold the snapshot.

OK, misunderstanding. "My version" is being done now, so we can use it
now; it will be published as BSD licenced open source software, but as
yet seems unlikely to ever be part of a main distribution of Postgres.
It will probably be published on pgfoundry, though possibly elsewhere
also. I don't take credit for the general idea, but I am responsible for
the idea to do this now as an external function.

I prefer this done in the backend in the long term, much safer, which we
are agreed upon. I'll come back to that so we get it into 8.4.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Chris Browne <cbbrowne(at)acm(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-03-28 02:36:34
Message-ID: 200803280236.m2S2aY816209@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Added to TODO:

* Allow one transaction to see tuples using the snapshot of another
transaction

This would assist multiple backends in working together.
http://archives.postgresql.org/pgsql-hackers/2008-01/msg00400.php

---------------------------------------------------------------------------

Chris Browne wrote:
> simon(at)2ndquadrant(dot)com (Simon Riggs) writes:
> > On Fri, 2008-01-11 at 20:39 +0000, Simon Riggs wrote:
> >> On Fri, 2008-01-11 at 15:05 -0500, Tom Lane wrote:
> >> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> >> > > If we had a function
> >> > > replace_serializable_snapshot(master_xid, txid_snapshot)
> >> > > this would allow us to use the txid_snapshot values to replace our
> >> > > transaction's serializable snapshot.
> >> >
> >> > ... whereupon we'd get wrong answers. Certainly you could not allow
> >> > transaction xmin to go backwards, and I'm not sure what other
> >> > restrictions there would be, but the whole thing gives me the willies.
> >
> > Sorry, forgot to add
> > - global xmin isn't going backwards
> > - neither is latest completed xid
> >
> > The xmin of the transaction will go backwards, but as long as we don't
> > do anything prior to the setting of the cloned snapshot, what can go
> > wrong? :-)
>
> Note that we required that the "provider transaction" have the
> attributes IsXactIsoLevelSerializable and XactReadOnly both being
> true, so we have the mandates that the resultant backend process:
>
> a) Is in read only mode, and
> b) Is in serializable mode.
>
> That's a pair of (possibly stretching-wide!) suspenders worth of
> support from Evil...
> --
> (format nil "~S(at)~S" "cbbrowne" "cbbrowne.com")
> http://linuxfinances.info/info/internet.html
> Trying to be happy is like trying to build a machine for which the
> only specification is that it should run noiselessly.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Chris Browne <cbbrowne(at)acm(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-03-28 15:33:42
Message-ID: 60abki6ey1.fsf@dba2.int.libertyrms.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

bruce(at)momjian(dot)us (Bruce Momjian) writes:
> Added to TODO:
>
> * Allow one transaction to see tuples using the snapshot of another
> transaction
>
> This would assist multiple backends in working together.
> http://archives.postgresql.org/pgsql-hackers/2008-01/msg00400.php

FYI, code for this is presently available on pgFoundry, at:
<http://pgfoundry.org/projects/snapclone/>.

I have some benchmarking scripts to commit to it, once I get added to
the project. (Simon? :-))

FYI, preliminary testing, on a machine hooked up to an EMC CX700 disk
array is showing that, if I do concurrent dumps of 4 tables, it gives
a bit better than a 2x speedup over dumping the four tables serially,
so there's definitely some fruit here.
--
let name="cbbrowne" and tld="cbbrowne.com" in String.concat "@" [name;tld];;
http://www3.sympatico.ca/cbbrowne/rdbms.html
Talk a lot, don't you?


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Chris Browne <cbbrowne(at)acm(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshot Cloning
Date: 2008-03-28 18:58:14
Message-ID: 1206730694.4285.1771.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 2008-03-28 at 11:33 -0400, Chris Browne wrote:
> bruce(at)momjian(dot)us (Bruce Momjian) writes:
> > Added to TODO:
> >
> > * Allow one transaction to see tuples using the snapshot of another
> > transaction
> >
> > This would assist multiple backends in working together.
> > http://archives.postgresql.org/pgsql-hackers/2008-01/msg00400.php
>
> FYI, code for this is presently available on pgFoundry, at:
> <http://pgfoundry.org/projects/snapclone/>.
>
> I have some benchmarking scripts to commit to it, once I get added to
> the project. (Simon? :-))
>
> FYI, preliminary testing, on a machine hooked up to an EMC CX700 disk
> array is showing that, if I do concurrent dumps of 4 tables, it gives
> a bit better than a 2x speedup over dumping the four tables serially,
> so there's definitely some fruit here.

Oh, very cool. Thanks for testing.

I'll add you now - must have missed that, sorry.

My view is that the TODO item is still needed because we want to work
this into the backend more fully.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com

PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk