Re: Proposal: Commit timestamp

From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Markus Schiltknecht <markus(at)bluegap(dot)ch>
Cc: Zeugswetter Andreas ADI SD <ZeugswetterA(at)spardat(dot)at>, Theo Schlossnagle <jesus(at)omniti(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Jim Nasby <decibel(at)decibel(dot)org>
Subject: Re: Proposal: Commit timestamp
Date: 2007-02-07 00:29:11
Message-ID: 45C91D57.3020403@Yahoo.com
Lists: pgsql-hackers

On 2/6/2007 11:44 AM, Markus Schiltknecht wrote:
> Hi,
>
> Zeugswetter Andreas ADI SD wrote:
>> And "time based"
>> is surely one of the important conflict resolution methods for async MM
>> replication.
>
> That's what I'm questioning. Wouldn't any other deterministic, but
> seemingly random abort decision be as clever as time based conflict
> resolution? It would then be clear to the user that it's random and not
> some "in most cases time based, but not in others and only if..." thing.
>
>> Sure there are others, like "rule based" "priority based" but I think
>> you don't need additional backend functionality for those.
>
> Got the point, yes. I'm impatient, sorry.
>
> Nevertheless, I'm questioning whether it is worth adding backend
> functionality for that. And given this probably is the most wanted
> resolution method, this question might be "heretical". You could also
> see it as a sort of user-educating question: don't favor time based
> resolution if that's the one resolution method with the most traps.

These are all very good suggestions for additional conflict
resolution mechanisms, each of which solves one problem or another. As we
have said for years now, one size will not fit all. What I am after for
the moment is a system that by default supports last-update-wins at the
row level, where "last update" is admittedly a little fuzzy, but not by
minutes. Plus balance type columns. A balance column is not propagated
as a new value, but as a delta between the old and the new value. All
replicas will apply the delta to that column regardless of whether the
replication info is newer or older than the existing row. That way,
literal value type columns (like an address) will maintain, cluster-wide,
the value of the last update to the row, while balance type columns will
maintain, cluster-wide, the sum of all changes.
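
To make the two column types concrete, here is a minimal sketch in
Python. It is an illustration under stated assumptions, not the design
of the replication system discussed here: the names Row, Change,
capture_change, apply_change and BALANCE_COLUMNS are hypothetical, and
the "stamp" is assumed to be a (commit timestamp, node id) pair so that
ties between nodes are broken deterministically.

```python
from dataclasses import dataclass, field

# Hypothetical classification of columns; everything else is a literal column.
BALANCE_COLUMNS = frozenset({"balance"})


@dataclass
class Row:
    values: dict
    # (commit timestamp, node id) of the last literal update applied.
    # The node id is an assumed tie-breaker for equal timestamps.
    stamp: tuple = (0, 0)


@dataclass
class Change:
    stamp: tuple
    literals: dict = field(default_factory=dict)  # full new values
    deltas: dict = field(default_factory=dict)    # new value minus old value


def capture_change(old, new, stamp):
    """Describe an update: balance columns as deltas, the rest as new values."""
    change = Change(stamp=stamp)
    for col, value in new.items():
        if col in BALANCE_COLUMNS:
            change.deltas[col] = value - old.get(col, 0)
        else:
            change.literals[col] = value
    return change


def apply_change(row, change):
    """Apply a change on any replica, in any order."""
    # Deltas are always applied, whether the change is newer or older.
    for col, delta in change.deltas.items():
        row.values[col] = row.values.get(col, 0) + delta
    # Literal columns follow last-update-wins at the row level.
    if change.stamp > row.stamp:
        row.values.update(change.literals)
        row.stamp = change.stamp


def demo():
    """Two nodes update the same row concurrently, then exchange changes."""
    start = {"address": "A", "balance": 100}
    node1, node2 = Row(dict(start)), Row(dict(start))
    c1 = capture_change(node1.values, {"address": "B", "balance": 120}, (10, 1))
    c2 = capture_change(node2.values, {"address": "C", "balance": 95}, (12, 2))
    apply_change(node1, c1)  # local commit on node 1
    apply_change(node2, c2)  # local commit on node 2
    apply_change(node1, c2)  # replication in both directions
    apply_change(node2, c1)
    return node1, node2
```

In this sketch both nodes end up with the address from the later update
and a balance that reflects both deltas (+20 and -5), even though each
node saw the two changes in a different order.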

Whatever strategy one uses, in an async multimaster there are always
cases that can be resolved by rules (last update being one of them), and
some that I can't even imagine solving so far. I guess some of the cases
will simply boil down to "the application has to make sure that ...
never occurs". Think of a multi-item order, created on one node, while
another node is deleting the long-unused item (which would have to be
backordered). Now while those two nodes figure out what to do to make
this consistent again, a third node does a partial shipment of that
order. The solution is simple: reinsert the deleted item ... except that
there were rather nasty ON DELETE CASCADEs on that item that removed
all the consumer reviews, product descriptions, data sheets and what
not. It's going to be an awful lot of undo.

I haven't really made up my mind about a user-defined, rule-based
conflict resolution interface yet. I do plan to have a synchronous
advisory locking system, based on unique and foreign key constraints, on
top of my system in a later version (advisory key locks would stay in
place until the transaction that placed them has replicated).

I guess you see by now why I wanted to keep the discussion about the
individual, rather generic support features in the backend separate from
the particular features I plan to implement in the replication system.
Everyone has different needs and consequently an async multi-master
"must" do a whole range of mutually exclusive things all at once ...
because Postgres can never accept a partial solution. We want the
egg-laying milk-wool-pig or nothing.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck(at)Yahoo(dot)com #
