Re: Two weeks to feature freeze

From:	Sailesh Krishnamurthy <sailesh(at)cs(dot)berkeley(dot)edu>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Mike Mascari <mascarm(at)mascari(dot)com>, Rod Taylor <rbt(at)rbt(dot)ca>, Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Two weeks to feature freeze
Date:	2003-06-23 05:42:41
Message-ID:	bxyel1lz6we.fsf@datafix.CS.Berkeley.EDU
Lists:	pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

Tom> Sailesh Krishnamurthy <sailesh(at)cs(dot)berkeley(dot)edu> writes:
>> I'm not sure if I understand Tom's beef - I think he is
>> concerned about what happens if a subordinate does not respond
>> to a prepare message. I would assume that the co-ordinator
>> would not let the commit go through until it has received
>> confirmations from every subordinate.

Tom> No. I want to know what the subordinate does when it's
Tom> promised to commit and the co-ordinator never responds.
Tom> AFAICS the subordinate is screwed --- it can't commit, and it
Tom> can't abort, and it can't expect to make progress
Tom> indefinitely on other work while it's holding locks for the
Tom> not-quite-committed transaction.

Okay I understand what you mean now.

AFAIK the general way things happen is that each site has a "recovery
procedure" that kicks in after a crash. If the co-ordinator crashes
(which could be before or after it sends out COMMIT messages to some
of the subordinates), its recovery manager will bring the system up,
read the log, and load information about all uncommitted transactions
into virtual storage.
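
To make the recovery step concrete, here is a rough sketch in Python of
what "read the log and rebuild the table of uncommitted transactions"
could look like. It is only an illustration: the record names (PREPARE,
COMMIT, ABORT, END) and the list-of-tuples log format are assumptions
made up for this example, not the format of any real system.

```python
def recover(log):
    """Rebuild the in-memory table of unfinished transactions.

    `log` is a hypothetical list of (xid, record_type) tuples in the
    order they were written. A transaction is considered finished once
    an END record for it appears; everything else stays in the table
    with its last logged state.
    """
    unfinished = {}
    for xid, record_type in log:
        if record_type == "END":
            # Transaction fully finished: nothing left to track.
            unfinished.pop(xid, None)
        else:
            # Remember the most recent state seen for this transaction.
            unfinished[xid] = record_type
    return unfinished


log = [
    (1, "PREPARE"), (2, "PREPARE"),
    (1, "COMMIT"), (1, "END"),
    (3, "PREPARE"), (3, "COMMIT"),
]
table = recover(log)
```

In this example transaction 1 is done, transaction 2 is still waiting
for a decision, and transaction 3 has a logged COMMIT that still has to
be delivered to the subordinates.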

If a Xact is in the PREPARE stage at a subordinate, that subordinate
will periodically send a message to the co-ordinator asking what
happened to the transaction in question. Once the co-ordinator has come
back online it can respond to the query.
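
The polling behaviour described above can be sketched roughly as
follows. This is a toy model under stated assumptions, not how any
particular DBMS does it: the Coordinator and Subordinate classes, the
`online` flag, and the rule that an unknown transaction is answered
with ABORT (in the spirit of presumed abort) are all hypothetical.

```python
import enum


class Outcome(enum.Enum):
    COMMIT = "commit"
    ABORT = "abort"


class Coordinator:
    """Hypothetical co-ordinator that may be offline for a while."""

    def __init__(self):
        self.online = False
        self.decisions = {}  # xid -> Outcome, rebuilt from the log

    def query(self, xid):
        if not self.online:
            raise ConnectionError("co-ordinator unreachable")
        # Assumption: no record of the transaction means it aborted.
        return self.decisions.get(xid, Outcome.ABORT)


class Subordinate:
    """Hypothetical subordinate holding prepared (in-doubt) Xacts."""

    def __init__(self, coordinator):
        self.coordinator = coordinator
        self.in_doubt = set()  # prepared, outcome not yet known
        self.resolved = {}     # xid -> Outcome

    def prepare(self, xid):
        self.in_doubt.add(xid)

    def poll_once(self):
        """Ask about every in-doubt Xact; keep waiting on failure."""
        for xid in sorted(self.in_doubt):
            try:
                outcome = self.coordinator.query(xid)
            except ConnectionError:
                continue  # still in doubt, locks stay held
            self.in_doubt.discard(xid)
            self.resolved[xid] = outcome


coord = Coordinator()
sub = Subordinate(coord)
sub.prepare(42)
sub.poll_once()                 # co-ordinator is down
still_waiting = set(sub.in_doubt)

coord.decisions[42] = Outcome.COMMIT
coord.online = True             # co-ordinator has recovered
sub.poll_once()
```

A real subordinate would call something like poll_once() on a timer.
The sketch also shows Tom's point: until the co-ordinator answers, the
transaction stays in `in_doubt` and nothing can be released.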

Of course in the case of a co-ordinator going out of action totally
and remaining unconnected this is not a viable solution.

If you're making the case that 2PC is not viable on very wide area
networks with intermittent connectivity, I agree whole-heartedly.

That said, 2PC (and its children, PA and PC) have their place, and are
indeed used in many systems.

For instance, if you are rigging up a message queueing infrastructure
(like MQ-series) to your database (say with NOTIFY), you'd at least
like to have the db act as a co-ordinator with the MQ.

Or take the parallel cluster example I gave earlier. Clustered Linux
boxes are definitely here, although no open-source DBMS offers a
parallel solution.

--
Pip-pip
Sailesh
http://www.cs.berkeley.edu/~sailesh

In response to

Re: Two weeks to feature freeze at 2003-06-23 04:06:36 from Tom Lane