Re: [RFC][PATCH] wal decoding, attempt #2 - Design Documents (really attached)

From: Steve Singer <steve(at)ssinger(dot)info>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Christopher Browne <cbbrowne(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: [RFC][PATCH] wal decoding, attempt #2 - Design Documents (really attached)
Date: 2012-10-16 00:20:44
Message-ID: BLU0-SMTP71DC1A10859E0BEF09C9D6DC700@phx.gbl
Lists: pgsql-hackers

On 12-10-15 04:51 PM, Andres Freund wrote:
>
> Well, as a crosscheck, could you list your requirements?
>
> Do you need anything more than outputting data in a format compatible to whats
> stored in sl_log_*? You wouldn't have sl_actionseq, everything else should be
> there (Well, you would need to do lookups to get the tableid, but thats not
> really much of a problem). The results would be ordered in complete
> transactions, in commit order.
>
> I guess the other tables would stay as they are as they contain the "added
> value" of slony?
>
> Greetings,

I actually spent some time a few weeks ago looking over the
documents and code. I never did get around to writing a review as
elegant as Peter's. I have not seen any red flags that make me think
that what you're proposing wouldn't be suitable for slony, but sometimes
you don't see details until you start implementing something.

My initial approach to modifying slony to work with this might be
something like:

* Leave sl_event as is for non SYNC events, slon would still generate
SYNC events in sl_event
* We would modify the remote_worker thread in slon so that, instead of
selecting from sl_event, it would get the next 'committed'
transaction from your apply cache. For each ApplyChange record we
would check to see if it is an insert into sl_event; if so, we would
trigger our existing event processing logic based on the contents of the
ev_type column.
* If the change involves an insert/update/delete/truncate to a replicated
table we would translate that change into SQL and apply it on the
replica. We would not commit changes on the replica until we encounter
a SYNC being added to sl_event for the current origin.
* SQL will be applied in a slightly different order than slony does
today. Today if two concurrent transactions are inserting into the same
replicated table and they commit one after the other there is a good
chance that the apply order on the replica will also be intermixed
(assuming both commits were in between two SYNC events). My thinking is
that we would just replay them one after the other on the replica in
commit order. (Slony doesn't use commit order because we don't have it,
not because we don't like it.) This would mean we do away with tracking
the action id.

* If a node is configured as a 'forwarder' node it would store the
processed output of each ApplyChange record in a table on the replica.
If a slon is pulling data from a non-origin (i.e. if remoteWorkerThread_1
is pulling data from node 2) then it would need to query this table
instead of calling the functions that process the ApplyCache contents.

* To subscribe a node we would generate a SYNC event on the provider and
do the copy_set. We would keep track of that SYNC event. The remote
worker would then ignore any data that comes before that SYNC event
when it starts pulling data from the apply cache.
* DDL events in 2.2+ go into sl_ddl_script (or something like that). When
we see INSERT commands to that table we would know to then apply the DDL
on the node.

* We would need to continue to populate sl_confirm because knowing what
SYNC events have already been processed by a node is pretty important in
a MOVE SET or FAILOVER. It is possible that we might need to still
track the xip lists of each SYNC for MOVE SET/FAILOVER but I'm not sure
why/why not.
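
To make the remote worker part of the above a bit more concrete, here is a
rough sketch in Python, purely for illustration. Everything in it is an
assumption: the ApplyChange fields, the Replica class, the to_sql helper and
the ev_origin check are invented for the example and are not the actual
apply cache or slony interfaces. It only shows the intended control flow:
replay transactions in commit order, hand non-SYNC sl_event inserts to the
event logic, and commit on the replica only when a SYNC for the current
origin shows up.

```python
from dataclasses import dataclass, field


@dataclass
class ApplyChange:
    # Hypothetical shape of one decoded change; not the real struct.
    table: str
    action: str                      # 'insert', 'update', 'delete', 'truncate'
    row: dict = field(default_factory=dict)


class Replica:
    """Stand-in for the replica connection (hypothetical)."""

    def __init__(self):
        self.pending = []            # SQL applied but not yet committed
        self.committed = []          # SQL made durable by a SYNC
        self.events = []             # non-SYNC events handed to event logic

    def apply(self, sql):
        self.pending.append(sql)

    def commit(self):
        self.committed.extend(self.pending)
        self.pending.clear()


def to_sql(change):
    # Only INSERT is spelled out; other actions are placeholders.
    if change.action == "insert":
        cols = ", ".join(change.row)
        vals = ", ".join(repr(v) for v in change.row.values())
        return f"INSERT INTO {change.table} ({cols}) VALUES ({vals})"
    return f"-- {change.action} on {change.table}"


def process_transaction(changes, replica, replicated_tables, origin):
    """Handle one committed transaction, in the order it was committed."""
    for change in changes:
        if change.table == "sl_event" and change.action == "insert":
            ev_type = change.row.get("ev_type")
            if ev_type == "SYNC" and change.row.get("ev_origin") == origin:
                # A SYNC for the current origin is the commit point.
                replica.commit()
            else:
                # Everything else goes to the existing event processing.
                replica.events.append(ev_type)
        elif change.table in replicated_tables:
            replica.apply(to_sql(change))
```

Calling process_transaction once per transaction, in the order the apply
cache returns them, gives the "one after the other in commit order" replay
without needing an action id.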

This is all easier said than implemented.

Steve

> Andres
