Re: Nested transactions

Lists: pgsql-jdbcpgsql-patches
From: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
To: Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Nested transactions
Date: 2004-06-08 22:23:56
Message-ID: 20040608222356.GC13820@dcc.uchile.cl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-patches

Hackers,

Here is the latest installment of the nested transactions patch.

What's in the current patch:

- transaction block support. The user can start subtransactions inside
each transaction block and rollback each one independently. The
nesting level has no practical limit, though each transaction tree can
have at most 2^32 commands (subtransaction start counts as one
command).

- Most subsystems have a cleanup function called at subtransaction
abort: deferred triggers, relcache refcount, catcache refcount, smgr
pending deletes, gist/rtree/hash scan lists, bufmgr refcount,
on-commit actions.

- SPI closes connections opened during the aborting subtransaction.

- The lock manager releases locks taken by the aborting subtransaction,
but _holds_ locks taken by a committing subtransaction. This is
needed by MVCC semantics.

- A special memory context, CommitContext is created for each
subtransaction. It is destroyed if the subtransaction aborts, but
if it commits then it will be kept until main transaction commit
(or its parent subtransaction aborts).

- The pg_subtrans module is created to keep the parent Xid of each
subtransaction. It's similar (and based on) the pg_clog code.

- The pg_clog Xlog code was moved to Slru and merged with pg_subtrans
Xlog code.

- When we wait for a transaction to finish (XactLockTableWait), we
actually wait for the topmost transaction.

- SET TRANSACTION ISOLATION LEVEL and SET READ ONLY are not allowed
inside a subtransaction.

What's missing to complete the nested transactions work:

- Make GUC vars work correctly.
- Lots of testing.
- Documentation
- Optimizations:
* the phantom Xid idea is not implemented here.
* optimize TransactionIdIsInProgress
* profile
- More regression testing. Need additional concurrent infrastructure.

Many thanks to:

- Manfred Koizar for making the pg_subtrans code possible and discussion
on several issues.
- Stephan Szabo for the idea on which the deferred trigger code was
based.
- Tom Lane for asking the right questions.
- Bruce Momjian for useful ideas.

IMHO this patch should be reviewed and applied to achieve wide testing
as soon as possible before feature freeze. More changes can be applied
later as available/needed. (The patch is already huge and contains
several important changes that need careful review. Adding more to the
mix will only make things more difficult.)

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"No single strategy is always right (Unless the boss says so)"
(Larry Wall)

Attachment Content-Type Size
nested-all-13.patch.gz application/octet-stream 44.6 KB

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
Cc: Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Nested transactions
Date: 2004-06-10 04:11:26
Message-ID: 200406100411.i5A4BQL14526@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-patches


Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

I will try to apply it after review.

---------------------------------------------------------------------------
Alvaro Herrera wrote:
> Hackers,
>
> Here is the latest installment of the nested transactions patch.
>
> What's in the current patch:
>
> - transaction block support. The user can start subtransactions inside
> each transaction block and rollback each one independently. The
> nesting level has no practical limit, though each transaction tree can
> have at most 2^32 commands (subtransaction start counts as one
> command).
>
> - Most subsystems have a cleanup function called at subtransaction
> abort: deferred triggers, relcache refcount, catcache refcount, smgr
> pending deletes, gist/rtree/hash scan lists, bufmgr refcount,
> on-commit actions.
>
> - SPI closes connections opened during the aborting subtransaction.
>
> - The lock manager releases locks taken by the aborting subtransaction,
> but _holds_ locks taken by a committing subtransaction. This is
> needed by MVCC semantics.
>
> - A special memory context, CommitContext is created for each
> subtransaction. It is destroyed if the subtransaction aborts, but
> if it commits then it will be kept until main transaction commit
> (or its parent subtransaction aborts).
>
> - The pg_subtrans module is created to keep the parent Xid of each
> subtransaction. It's similar (and based on) the pg_clog code.
>
> - The pg_clog Xlog code was moved to Slru and merged with pg_subtrans
> Xlog code.
>
> - When we wait for a transaction to finish (XactLockTableWait), we
> actually wait for the topmost transaction.
>
> - SET TRANSACTION ISOLATION LEVEL and SET READ ONLY are not allowed
> inside a subtransaction.
>
> What's missing to complete the nested transactions work:
>
> - Make GUC vars work correctly.
> - Lots of testing.
> - Documentation
> - Optimizations:
> * the phantom Xid idea is not implemented here.
> * optimize TransactionIdIsInProgress
> * profile
> - More regression testing. Need additional concurrent infrastructure.
>
> Many thanks to:
>
> - Manfred Koizar for making the pg_subtrans code possible and discussion
> on several issues.
> - Stephan Szabo for the idea on which the deferred trigger code was
> based.
> - Tom Lane for asking the right questions.
> - Bruce Momjian for useful ideas.
>
>
> IMHO this patch should be reviewed and applied to achieve wide testing
> as soon as possible before feature freeze. More changes can be applied
> later as available/needed. (The patch is already huge and contains
> several important changes that need careful review. Adding more to the
> mix will only make things more difficult.)
>
> --
> Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
> "No single strategy is always right (Unless the boss says so)"
> (Larry Wall)

[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
Cc: Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Nested transactions
Date: 2004-06-16 22:45:36
Message-ID: 1087425935.12015.1154.camel@stromboli
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-patches

On Tue, 2004-06-08 at 23:23, Alvaro Herrera wrote:
> Hackers,
>
> Here is the latest installment of the nested transactions patch.
>
> What's in the current patch:
>

First of all, thank you for all your helpful comments recently.

The patch looks impressively technical, but overall I'm not exactly sure
what it does...I guess I'm just not clear why I would want it, except as
the main technical pre-work to later syntax changes. I'm sure some short
explanations would clear that up for me very quickly... :)

The Todo items were:
-Allow savepoints / nested transactions
-Use nested transactions to prevent syntax errors from aborting a
transaction

both of which I thought I understood:

The first one provides the SQL commands SAVEPOINT and ROLLBACK TO
SAVEPOINT as with Oracle/DB2, and also now ANSI SQL if I recall...

The second one again provides Oracle/DB2 support by conforming to their
interpretation of the ANSI transactional semantics definition. i.e. one
statement failure doesn't roll back the transaction, just the statement
that failed.

Being able to issue multiple BEGIN/END pairs isn't really (to me) the
same thing as the above, nor do I understand why I'd ever want to do
that - especially down to N levels....

Perhaps what I've just asked about is trivial icing on the cake you've
just baked, so forgive me, but could you explain the outward form of
your work and what that gives me? (or at least...what you think it gives
you...which I accept may be different)

Best regards, Simon Riggs


From: Barry Lind <blind(at)xythos(dot)com>
To: simon(at)2ndquadrant(dot)com
Cc: Patches <pgsql-patches(at)postgresql(dot)org>, alvherre(at)dcc(dot)uchile(dot)cl
Subject: Re: Nested transactions
Date: 2004-06-17 00:59:43
Message-ID: 40D0ECFF.2070501@xythos.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-patches

I agree with Simon's comments. And to them I would add: I had assumed
that the requirements for 'nested transactions' was following some
standard definition or specification (i.e. the ANSI SQL spec). But from
what I can tell, we are rolling our own definition here, not following a
specification or standard, and not following any of the other major
commercial databases lead.

I think venturing out on our own and inventing new symantics for
transactions and sql syntax to support them without giving this a lot of
thought is bound to lead to problems.

Perhaps I am completely wrong here and there is a clear standard or spec
that is being implemented, if so, please let me know what that is as it
would help me in better understanding this patch.

I have been reviewing what Oracle does in this area and it doesn't at
all resemble what this patch is exposing (especially as far as syntax
goes). I plan to look at DB2 and MSSQL next.

thanks,
--Barry

Simon Riggs wrote:
> On Tue, 2004-06-08 at 23:23, Alvaro Herrera wrote:
>
>>Hackers,
>>
>>Here is the latest installment of the nested transactions patch.
>>
>>What's in the current patch:
>>
>
>
> First of all, thank you for all your helpful comments recently.
>
> The patch looks impressively technical, but overall I'm not exactly sure
> what it does...I guess I'm just not clear why I would want it, except as
> the main technical pre-work to later syntax changes. I'm sure some short
> explanations would clear that up for me very quickly... :)
>
> The Todo items were:
> -Allow savepoints / nested transactions
> -Use nested transactions to prevent syntax errors from aborting a
> transaction
>
> both of which I thought I understood:
>
> The first one provides the SQL commands SAVEPOINT and ROLLBACK TO
> SAVEPOINT as with Oracle/DB2, and also now ANSI SQL if I recall...
>
> The second one again provides Oracle/DB2 support by conforming to their
> interpretation of the ANSI transactional semantics definition. i.e. one
> statement failure doesn't roll back the transaction, just the statement
> that failed.
>
> Being able to issue multiple BEGIN/END pairs isn't really (to me) the
> same thing as the above, nor do I understand why I'd ever want to do
> that - especially down to N levels....
>
> Perhaps what I've just asked about is trivial icing on the cake you've
> just baked, so forgive me, but could you explain the outward form of
> your work and what that gives me? (or at least...what you think it gives
> you...which I accept may be different)
>
> Best regards, Simon Riggs
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Barry Lind <blind(at)xythos(dot)com>
Cc: simon(at)2ndquadrant(dot)com, Patches <pgsql-patches(at)postgresql(dot)org>, alvherre(at)dcc(dot)uchile(dot)cl
Subject: Re: Nested transactions
Date: 2004-06-17 01:36:33
Message-ID: 200406170136.i5H1aX727064@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-patches

Barry Lind wrote:
> I agree with Simon's comments. And to them I would add: I had assumed
> that the requirements for 'nested transactions' was following some
> standard definition or specification (i.e. the ANSI SQL spec). But from
> what I can tell, we are rolling our own definition here, not following a
> specification or standard, and not following any of the other major
> commercial databases lead.
>
> I think venturing out on our own and inventing new symantics for
> transactions and sql syntax to support them without giving this a lot of
> thought is bound to lead to problems.
>
> Perhaps I am completely wrong here and there is a clear standard or spec
> that is being implemented, if so, please let me know what that is as it
> would help me in better understanding this patch.
>
> I have been reviewing what Oracle does in this area and it doesn't at
> all resemble what this patch is exposing (especially as far as syntax
> goes). I plan to look at DB2 and MSSQL next.

I realized about three weeks ago that our syntax for nested transactions
doesn't follow anyone else's method. I think I might be the person who
suggested the idea because it seemed more logical to me to allow
BEGIN;BEGIN;COMMIT;COMMIT rather than naming arbitrary locations as
savepoints and doing a rollback to that savepoint name.

And consider this case:

BEGIN;
...
SAVEPOINT x;
SELECT func_call();
SELECT func_call();
COMMIT;

Now if func_call has a savepoint, it is really nested because it can't
know whether the savepoint X will be used to roll back, so its status is
dependent on the status of X. Now, if we used savepoints in func_call,
what happens in the second function call when we define a savepoint with
the same name? I assume we overwrite the original, but using nested
transaction syntax seems much clearer.

Basically, we have to implement this in a nested way. Once it is done,
we can add the window dressing to support the ANSI syntax.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Nested transactions
Date: 2004-06-17 01:44:00
Message-ID: 20040617014359.GA5925@dcc.uchile.cl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-patches

On Wed, Jun 16, 2004 at 11:45:36PM +0100, Simon Riggs wrote:

> The patch looks impressively technical, but overall I'm not exactly sure
> what it does...I guess I'm just not clear why I would want it, except as
> the main technical pre-work to later syntax changes. I'm sure some short
> explanations would clear that up for me very quickly... :)

Right. I have never intended to be implementing a known SQL standard
feature. What I'm doing is allowing the whole backend to go back to a
know state after an error is encountered.

With this in place, implementing SAVEPOINTs the way SQL expects them to
work appears to be a very trivial exercise.

> Perhaps what I've just asked about is trivial icing on the cake you've
> just baked,

I think this phrase very precisely describes it. At least, that's what
I expect.

You may not see it, but a savepoint is just the start of a nested
transaction in disguise. Consider:

begin;
insert into foo values (1);
savepoint dammit;
insert into foo values (2);
select foo; -- fails
rollback to dammit;
insert into foo values (3);
commit;

You expect the transaction to finish with tuples 1 and 3 in table foo,
right? Well, this is exactly the same as

begin;
insert into foo values (1);
begin; -- dammit
insert into foo values (2);
select foo; -- fails, goes to aborted state
rollback;
insert into foo values (3);
commit;

So all that's needed for the former to work is to be able to define a
"name" for a transaction (using a cute syntax) and being able to
rollback to it. Definitely trivial, after all the work I have put into
making the latter work.

In extant releases you can only do this:
begin;
insert into foo values (1);
insert into foo values (2);
select foo; -- oops, can't go back!
rollback;
begin;
insert into foo values (1);
insert into foo values (3);
commit;

You are forced to send all the commands before the aborting one to the
server again. And there's no way to "undo" a command in the
transaction, short of aborting it completely.

I don't know what Oracle or other DBMSs expect in this area. Anyone
care to give me a few pointers? If I'm missing something, I want to
know as soon as possible.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Before you were born your parents weren't as boring as they are now. They
got that way paying your bills, cleaning up your room and listening to you
tell them how idealistic you are." -- Charles J. Sykes' advice to teenagers


From: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Barry Lind <blind(at)xythos(dot)com>, simon(at)2ndquadrant(dot)com, Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Nested transactions
Date: 2004-06-17 01:58:42
Message-ID: 20040617015842.GB5925@dcc.uchile.cl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-patches

On Wed, Jun 16, 2004 at 09:36:33PM -0400, Bruce Momjian wrote:

> And consider this case:
>
> BEGIN;
> ...
> SAVEPOINT x;
> SELECT func_call();
> SELECT func_call();
> COMMIT;
>
> Now if func_call has a savepoint, it is really nested because it can't
> know whether the savepoint X will be used to roll back, so its status is
> dependent on the status of X. Now, if we used savepoints in func_call,
> what happens in the second function call when we define a savepoint with
> the same name?

Hm, that's a good question. What happens if you define two savepoints
with the same name? According to SQL2003, the previous savepoint "is
destroyed", but it's not clear to me whether this means rolling back all
of its changes or just forgetting it. What's clear is that you can roll
back only to the latest one.

Also, in SQL2003 there can be multiple "savepoint levels". I think for
a first implementation it would be fine if we had only one level. It
would, wouldn't it?

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Estoy de acuerdo contigo en que la verdad absoluta no existe...
El problema es que la mentira sí existe y tu estás mintiendo" (G. Lama)


From: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
To: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Barry Lind <blind(at)xythos(dot)com>, simon(at)2ndquadrant(dot)com, Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Nested transactions
Date: 2004-06-17 02:01:17
Message-ID: 20040617020117.GC5925@dcc.uchile.cl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-patches

On Thu, Jun 17, 2004 at 10:01:32AM +0800, Christopher Kings-Lynne wrote:
> >And consider this case:
> >
> > BEGIN;
> > ...
> > SAVEPOINT x;
> > SELECT func_call();
> > SELECT func_call();
> > COMMIT;
> >
> >Now if func_call has a savepoint, it is really nested because it can't
> >know whether the savepoint X will be used to roll back, so its status is
> >dependent on the status of X. Now, if we used savepoints in func_call,
> >what happens in the second function call when we define a savepoint with
> >the same name? I assume we overwrite the original, but using nested
> >transaction syntax seems much clearer.
>
> It also seems in this example that func_call() probably shouldn't have
> permission to rollback to savepoint x? Otherwise it would get...weird.

I don't think we should explicitly forbid it. I think it should be
forbidden to close the outermost transaction inside a function (else the
function would not be able to terminate correctly), but for levels
before that one it'd be OK.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Cuando mañana llegue pelearemos segun lo que mañana exija" (Mowgli)


From: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Barry Lind <blind(at)xythos(dot)com>, simon(at)2ndquadrant(dot)com, Patches <pgsql-patches(at)postgresql(dot)org>, alvherre(at)dcc(dot)uchile(dot)cl
Subject: Re: Nested transactions
Date: 2004-06-17 02:01:32
Message-ID: 40D0FB7C.8030309@familyhealth.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-patches

> And consider this case:
>
> BEGIN;
> ...
> SAVEPOINT x;
> SELECT func_call();
> SELECT func_call();
> COMMIT;
>
> Now if func_call has a savepoint, it is really nested because it can't
> know whether the savepoint X will be used to roll back, so its status is
> dependent on the status of X. Now, if we used savepoints in func_call,
> what happens in the second function call when we define a savepoint with
> the same name? I assume we overwrite the original, but using nested
> transaction syntax seems much clearer.

It also seems in this example that func_call() probably shouldn't have
permission to rollback to savepoint x? Otherwise it would get...weird.

Chris


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
Cc: Barry Lind <blind(at)xythos(dot)com>, simon(at)2ndquadrant(dot)com, Patches <pgsql-patches(at)postgresql(dot)org>, alvherre(at)dcc(dot)uchile(dot)cl
Subject: Re: Nested transactions
Date: 2004-06-17 02:03:15
Message-ID: 200406170203.i5H23F801739@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-patches

Christopher Kings-Lynne wrote:
> > And consider this case:
> >
> > BEGIN;
> > ...
> > SAVEPOINT x;
> > SELECT func_call();
> > SELECT func_call();
> > COMMIT;
> >
> > Now if func_call has a savepoint, it is really nested because it can't
> > know whether the savepoint X will be used to roll back, so its status is
> > dependent on the status of X. Now, if we used savepoints in func_call,
> > what happens in the second function call when we define a savepoint with
> > the same name? I assume we overwrite the original, but using nested
> > transaction syntax seems much clearer.
>
> It also seems in this example that func_call() probably shouldn't have
> permission to rollback to savepoint x? Otherwise it would get...weird.

Yes, weird.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
Cc: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, Barry Lind <blind(at)xythos(dot)com>, simon(at)2ndquadrant(dot)com, Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Nested transactions
Date: 2004-06-17 02:04:20
Message-ID: 200406170204.i5H24KR01850@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-patches

Alvaro Herrera wrote:
> On Thu, Jun 17, 2004 at 10:01:32AM +0800, Christopher Kings-Lynne wrote:
> > >And consider this case:
> > >
> > > BEGIN;
> > > ...
> > > SAVEPOINT x;
> > > SELECT func_call();
> > > SELECT func_call();
> > > COMMIT;
> > >
> > >Now if func_call has a savepoint, it is really nested because it can't
> > >know whether the savepoint X will be used to roll back, so its status is
> > >dependent on the status of X. Now, if we used savepoints in func_call,
> > >what happens in the second function call when we define a savepoint with
> > >the same name? I assume we overwrite the original, but using nested
> > >transaction syntax seems much clearer.
> >
> > It also seems in this example that func_call() probably shouldn't have
> > permission to rollback to savepoint x? Otherwise it would get...weird.
>
> I don't think we should explicitly forbid it. I think it should be
> forbidden to close the outermost transaction inside a function (else the
> function would not be able to terminate correctly), but for levels
> before that one it'd be OK.

True. I see no reason to disallow it. Alvaro, you mentioned savepoint
levels, and I assume this to work around cases where they would need
the nested transactions that we are implementing.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Barry Lind <blind(at)xythos(dot)com>
To: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Nested transactions
Date: 2004-06-17 15:56:03
Message-ID: 40D1BF13.3050208@xythos.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-patches

Alvaro Herrera wrote:

> With this in place, implementing SAVEPOINTs the way SQL expects them to
> work appears to be a very trivial exercise.
>
> You may not see it, but a savepoint is just the start of a nested
> transaction in disguise. Consider:
>
> begin;
> insert into foo values (1);
> savepoint dammit;
> insert into foo values (2);
> select foo; -- fails
> rollback to dammit;
> insert into foo values (3);
> commit;
>
> You expect the transaction to finish with tuples 1 and 3 in table foo,
> right? Well, this is exactly the same as
>
> begin;
> insert into foo values (1);
> begin; -- dammit
> insert into foo values (2);
> select foo; -- fails, goes to aborted state
> rollback;
> insert into foo values (3);
> commit;
>
> So all that's needed for the former to work is to be able to define a
> "name" for a transaction (using a cute syntax) and being able to
> rollback to it. Definitely trivial, after all the work I have put into
> making the latter work.
>

The problem I see with moving towards supporting savepoints with the
current proposal is with how commit works:

Consider:

begin;
insert into foo values (1);
savepoint dammit;
insert into foo values (2);
select foo;
insert into foo values (3);
commit;

This one commit needs to commit the top level transaction. But if the
savepoint command is really starting a sub transaction then that commit
would only commit the subtransaction not the top level transaction. I
don't see how you can use COMMIT to sometimes mean commit the
subtransaction and other times have it mean commit the top level
transaction.

I don't have a problem with the under the covers functionality in this
patch, it is how begin/commit are changed to support the underlying
functionality that concerns me. IMHO we should not change the behavior
of begin/commit for nested transactions (leave them do what they have
always done - i.e. control the top level transaction state), but
introduce new syntax for subtransactions control.

thanks,
--Barry


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
Cc: Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Nested transactions
Date: 2004-06-17 21:19:24
Message-ID: 1087507163.12015.5100.camel@stromboli
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-patches

On Thu, 2004-06-17 at 02:44, Alvaro Herrera wrote:
> I don't know what Oracle or other DBMSs expect in this area. Anyone
> care to give me a few pointers? If I'm missing something, I want to
> know as soon as possible.

Without ignoring your other responses, I remain massively impressed....

SAVEPOINTs allow you to split a transaction into multiple related parts.

You can issue:
(implicit start)
statement1
statement2
statement3 - causes error - note does not rollback txn
reissue statement3
SAVEPOINT X
statement 4 - succeeds
ROLLBACK to X (undoes statement 4, but not statement 3)
statement 5
COMMIT

statements 1,2,3,5 are committed

Best Regards, Simon Riggs


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Barry Lind <blind(at)xythos(dot)com>
Cc: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Nested transactions
Date: 2004-06-17 22:01:49
Message-ID: 200406172201.i5HM1ns22285@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-patches

> The problem I see with moving towards supporting savepoints with the
> current proposal is with how commit works:
>
> Consider:
>
> begin;
> insert into foo values (1);
> savepoint dammit;
> insert into foo values (2);
> select foo;
> insert into foo values (3);
> commit;
>
> This one commit needs to commit the top level transaction. But if the
> savepoint command is really starting a sub transaction then that commit
> would only commit the subtransaction not the top level transaction. I
> don't see how you can use COMMIT to sometimes mean commit the
> subtransaction and other times have it mean commit the top level
> transaction.
>
> I don't have a problem with the under the covers functionality in this
> patch, it is how begin/commit are changed to support the underlying
> functionality that concerns me. IMHO we should not change the behavior
> of begin/commit for nested transactions (leave them do what they have
> always done - i.e. control the top level transaction state), but
> introduce new syntax for subtransactions control.

Well, because their was only one BEGIN, the commit commits all open
subtransactions. The code will have to track the number of BEGIN's used
and will have to roll all savepoints into the next commit. However, it
is only the commit that matches the outermost begin that has this
behavior. Consider this:

> begin;
> insert into foo values (1);
> savepoint aa;
> begin;
> savepoint dammit;
> insert into foo values (2);
> commit;
> select foo;
> rollback dammit;
> rollback aa;
> insert into foo values (3);
> commit;

OK, the inner commit does not close the aa subtransaction. One big
question is whether it closes the dammit subtransaction. And is
rollback to aa valid (I think so), and what about rollback dammit, which
was defined in a subtransaction (I think we have to disallow that).

Did I make a mistake by promoting subtransactions rather than
savepoints?

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Barry Lind <blind(at)xythos(dot)com>
To: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Nested transactions
Date: 2004-06-18 02:13:02
Message-ID: 40D24FAE.3030903@xythos.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-patches

The other thing that I have been meaning to say in this thread is that I
don't like using COMMIT to mean subtransaction commit (vs. introducing a
new command for it) because of the following situation.

Lets say that I have a java method that takes a jdbc connection and this
code starts a transaction and does some work then issues a commit to
ensure the changes have been committed to the database, and then does
some other work outside the database based on the fact that the commit
was sucessfull and it therefore knows the data is saved to disk (i.e.
send out an email notification, or any number of other non-database tasks).

Now lets suppose that someone calls this method with a database
connection that already has a transaction started, so that this method
really is beginning and working with a sub-transaction. Now when it
commits it doesn't know if the changes will ever get to disk since its
commit could be rolled back later. So this code gets broken.

I like the functionality of nested transactions, I just think that there
needs to be different commands other than BEGIN/COMMIT to work with
them. So that there is no possiblity for misunderstanding what COMMIT
really means.

thanks,
--Barry

Alvaro Herrera wrote:

> On Wed, Jun 16, 2004 at 11:45:36PM +0100, Simon Riggs wrote:
>
>
>>The patch looks impressively technical, but overall I'm not exactly sure
>>what it does...I guess I'm just not clear why I would want it, except as
>>the main technical pre-work to later syntax changes. I'm sure some short
>>explanations would clear that up for me very quickly... :)
>
>
> Right. I have never intended to be implementing a known SQL standard
> feature. What I'm doing is allowing the whole backend to go back to a
> know state after an error is encountered.
>
> With this in place, implementing SAVEPOINTs the way SQL expects them to
> work appears to be a very trivial exercise.
>
>
>>Perhaps what I've just asked about is trivial icing on the cake you've
>>just baked,
>
>
> I think this phrase very precisely describes it. At least, that's what
> I expect.
>
> You may not see it, but a savepoint is just the start of a nested
> transaction in disguise. Consider:
>
> begin;
> insert into foo values (1);
> savepoint dammit;
> insert into foo values (2);
> select foo; -- fails
> rollback to dammit;
> insert into foo values (3);
> commit;
>
> You expect the transaction to finish with tuples 1 and 3 in table foo,
> right? Well, this is exactly the same as
>
> begin;
> insert into foo values (1);
> begin; -- dammit
> insert into foo values (2);
> select foo; -- fails, goes to aborted state
> rollback;
> insert into foo values (3);
> commit;
>
> So all that's needed for the former to work is to be able to define a
> "name" for a transaction (using a cute syntax) and being able to
> rollback to it. Definitely trivial, after all the work I have put into
> making the latter work.
>
> In extant releases you can only do this:
> begin;
> insert into foo values (1);
> insert into foo values (2);
> select foo; -- oops, can't go back!
> rollback;
> begin;
> insert into foo values (1);
> insert into foo values (3);
> commit;
>
> You are forced to send all the commands before the aborting one to the
> server again. And there's no way to "undo" a command in the
> transaction, short of aborting it completely.
>
>
> I don't know what Oracle or other DBMSs expect in this area. Anyone
> care to give me a few pointers? If I'm missing something, I want to
> know as soon as possible.
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
Cc: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Barry Lind <blind(at)xythos(dot)com>, simon(at)2ndquadrant(dot)com, Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Nested transactions
Date: 2004-06-18 15:38:10
Message-ID: 20920.1087573090@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-patches

Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl> writes:
> I don't think we should explicitly forbid it. I think it should be
> forbidden to close the outermost transaction inside a function (else the
> function would not be able to terminate correctly), but for levels
> before that one it'd be OK.

More specifically, a function cannot close any xact that was open when
it was called. It can only close xacts that it started (or
equivalently, roll back to savepoints that it established). Otherwise
the behavior is nonsensical. Rollback of an outer transaction would
mean making like the function call never even happened, so the function
certainly couldn't expect to keep control.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Barry Lind <blind(at)xythos(dot)com>, Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Nested transactions
Date: 2004-06-18 16:32:52
Message-ID: 21670.1087576372@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-patches

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> Did I make a mistake by promoting subtransactions rather than
> savepoints?

No. We can implement savepoints on top of subtransactions, but not
vice versa. AFAICS the savepoint syntax is just a shorthand for
a constrained form of subtransaction --- essentially one where you
can't explicitly commit a subtransaction, it's implicitly committed
when you commit the parent. I don't see anything but syntactic
sugar here ...

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Barry Lind <blind(at)xythos(dot)com>
Cc: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Nested transactions
Date: 2004-06-18 22:47:10
Message-ID: 4213.1087598830@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-patches

Barry Lind <blind(at)xythos(dot)com> writes:
> I like the functionality of nested transactions, I just think that there
> needs to be different commands other than BEGIN/COMMIT to work with
> them. So that there is no possiblity for misunderstanding what COMMIT
> really means.

There's something to be said for that view. Another thing in its favor
is that if we choose names like SUBBEGIN and SUBCOMMIT, then we get rid
of the syntax conflict with plpgsql's BEGIN/END. A function cannot
legally issue a true COMMIT, as it has to be inside an outer transaction
--- so it only needs to be able to say SUBBEGIN and SUBCOMMIT.

I'm not at all wedded to those particular names, of course. Just
thinking that it'd simplify life if they were spelled differently than
BEGIN and END.

regards, tom lane


From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Barry Lind <blind(at)xythos(dot)com>
Cc: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Patches <pgsql-patches(at)postgresql(dot)org>, "pgsql-jdbc(at)postgresql(dot)org" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: [PATCHES] Nested transactions
Date: 2004-06-18 23:30:49
Message-ID: 40D37B29.8020307@opencloud.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-patches

Barry Lind wrote:
> The other thing that I have been meaning to say in this thread is that I
> don't like using COMMIT to mean subtransaction commit (vs. introducing a
> new command for it) because of the following situation.
>
> Lets say that I have a java method that takes a jdbc connection and this
> code starts a transaction and does some work then issues a commit to
> ensure the changes have been committed to the database, and then does
> some other work outside the database based on the fact that the commit
> was sucessfull and it therefore knows the data is saved to disk (i.e.
> send out an email notification, or any number of other non-database tasks).
>
> Now lets suppose that someone calls this method with a database
> connection that already has a transaction started, so that this method
> really is beginning and working with a sub-transaction. Now when it
> commits it doesn't know if the changes will ever get to disk since its
> commit could be rolled back later. So this code gets broken.

Note that there is no standard way in JDBC to enter a subtransaction
unless you issue the BEGIN explicitly or invoke the savepoint API. There
are lots of ways to confuse the driver's transaction handling already if
you issue arbitary transaction control instructions (althugh I'm working
on making the driver recover better -- the v3 transaction status
indicator in ReadyForQuery helps).

And, well, how do you expect the code to ever not be broken? If commit()
really does commit the whole transaction, the caller code that expects
to still be in a transaction is going to be unhappy (if it doesn't
expect to still be in a transaction, why is it opening a subtransaction
at all?). The callee should be using JTA's registerSynchronization() to
get callbacks after transaction commit, or using 2PC (yes I know it's
not supported yet), depending on the guarantees it needs.

Earlier Alvaro was looking at ways to provide the transaction nesting
level via the client protocol; I suggested doing it as a parameter (so
you get ParameterStatus on nesting change) but I'm not sure what
happened with it after that. Assuming something does get done here, the
driver can track where it is in subtransactions quite easily, and so if
you want Connection.commit() to really mean "commit this transaction and
all subtransactions" even in the face of the user messing around with
BEGIN themselves, we can do that even if multiple COMMITs are needed --
we just look at the current nesting level to work out how many to issue.

This behaviour (of commit()/rollback()) actually makes sense as things
like connection pools will expect Connection.commit() or
Connection.rollback to produce a reasonably "vanilla" connection state,
and transaction monitors are likely to want those methods to affect the
entire transaction too.

> I like the functionality of nested transactions, I just think that there
> needs to be different commands other than BEGIN/COMMIT to work with
> them. So that there is no possiblity for misunderstanding what COMMIT
> really means.

"BEGIN NESTED WORK" / "COMMIT NESTED WORK" / "END NESTED WORK" or
something? And make plain BEGIN inside a transaction a warning (as it
currently is) and plain COMMIT/END inside a subtransaction an error? (or
should they affect all subtransactions?) I can see this having some
value for dealing with existing applications that issue redundant
BEGIN/COMMIT/ROLLBACK statements (and get warnings, but ignore them).

-O