Re: dbmirror revisions

From: "Ed L(dot)" <pgsql(at)bluepolka(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: dbmirror revisions
Date: 2003-04-03 23:37:03
Message-ID: 200304031637.03517.pgsql@bluepolka.net
Lists: pgsql-general pgsql-hackers

I've been modifying dbmirror and wanted to offer my changes to anyone that
cared to experiment, FWIW. My effort is ongoing, the docs aren't perfect,
I make no claims of production readiness, and testing of this latest
version has been minimal, so I strongly advise you to conduct your own
thorough testing before considering a production deployment. That said,
it's a significantly improved solution for our async master-slave needs,
with a few caveats below, and shouldn't be too hard to set up.

There are enough changes that I would hardly consider this a patch; it is
closer to an overhaul, since I've removed files, renamed others, and added
new files. Among the changes I've made so far:

* Added script for easier setup of many tables/dbs/slaves;
* Added initial support for multiple masters replicating distinct data to a
single slave;
* Added batching to minimize load on master and net traffic. You can grab
a configurable number of updates to replicate before hitting the master
again.

* Added port specification;
* Wrapped all replication in transactions;
* Bulletproofed against downed master or slave;
* Started modularization of DB access layer, added some error
handling;
* Added a number of config vars for sync delays, etc;
* Eliminated bug in transaction ordering for replay. Updates cannot
be replicated in the order of the transactions (see archives for discussion
of why).

* Eliminated need for clear_pending.pl by making dbmirror.pl
self-clearing;
* Collapsed schema into 1 queue table for performance;
* Changed sequence ID column types to BIGINT for 64-bit sequence;
* Added reconnection handling for robustness;
* Added local tracking of last seq_id to help with recovery
robustness;
* Added master/slave compatibility checking;
* Enabled slave setup during production service so master does not
have to stop serving.
* Renamed tables to minimize namespace conflicts;
* Added lots of logging/debug messages;

* Maybe a few other things I've forgotten...

AFAICS, there are still at least a few major drawbacks to this approach:

* DML statements are not replicated (same for eRServer, AFAIK).

* SEQUENCE objects are not handled; nextval() will not be replicated, so
sequence objects (and serial columns) between master and slave can easily
get out of sync. I wonder if eRServer has this same issue?

* Mass updates/deletes/inserts of 5000 rows with a single SQL command on
the master will result in 5000 individual trigger-firings, and 5000
individual replication inserts on the slave. Rumor has it eRServer's
snapshot gets around this problem.
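
The per-row cost of trigger-based queueing can be seen in a small experiment. This is only a sketch: SQLite stands in for PostgreSQL, and the `items` table, `mirror_queue` table, and `items_update` trigger are made-up names, not part of dbmirror.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, price INTEGER);
    -- Hypothetical queue table standing in for the replication queue.
    CREATE TABLE mirror_queue (seq_id INTEGER PRIMARY KEY, item_id INTEGER, op TEXT);
    -- Row-level trigger: fires once for every row an UPDATE touches.
    CREATE TRIGGER items_update AFTER UPDATE ON items
    FOR EACH ROW BEGIN
        INSERT INTO mirror_queue (item_id, op) VALUES (NEW.id, 'u');
    END;
""")
conn.executemany(
    "INSERT INTO items (id, price) VALUES (?, ?)",
    ((i, 100) for i in range(5000)),
)
conn.commit()

# One SQL command on the "master"...
conn.execute("UPDATE items SET price = price + 1")
conn.commit()

# ...produces one queue entry per affected row.
queued = conn.execute("SELECT COUNT(*) FROM mirror_queue").fetchone()[0]
print(queued)  # 5000
```

Each of those queue entries then becomes its own statement on the slave, which is why a single mass update is expensive under this design.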

The code is here:

http://bluepolka.net/dbmirror/dbmirror-20030403-1605.tar.gz

Ed


From: "Ed L(dot)" <pgsql(at)bluepolka(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: dbmirror revisions
Date: 2003-04-03 23:57:20
Message-ID: 200304031657.20395.pgsql@bluepolka.net
Lists: pgsql-general pgsql-hackers

On Thursday April 3 2003 4:37, Ed L. wrote:
>
> AFAICS, there are still at least a few major drawbacks to this approach:
>
> * SEQUENCE objects are not handled; nextval() will not be replicated,
> so sequence objects (and serial columns) between master and slave can
> easily get out of sync. I wonder if eRServer has this same issue?

Clarification: sequence objects for serial columns get out of sync easily,
but the replication maintains correct serial column values.

Ed


From: "Ed L(dot)" <pgsql(at)bluepolka(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: dbmirror revisions
Date: 2003-04-04 16:24:20
Message-ID: 200304040924.20578.pgsql@bluepolka.net
Lists: pgsql-general pgsql-hackers

On Thursday April 3 2003 4:37, Ed L. wrote:
>
> AFAICS, there are still at least a few major drawbacks to this approach:
>
> * DML statements are not replicated (same for eRServer, AFAIK).

D'oh. I meant DDL, not DML.

Ed


From: "Ed L(dot)" <pgsql(at)bluepolka(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: dbmirror revisions
Date: 2003-04-04 22:11:58
Message-ID: 200304041511.58260.pgsql@bluepolka.net
Lists: pgsql-general pgsql-hackers

On Thursday April 3 2003 4:37, Ed L. wrote:
> I've been modifying dbmirror and wanted to offer my changes to anyone
> that cared to experiment, FWIW. My effort is ongoing, the docs aren't
> perfect, I make no claims of production readiness, and testing of this
> latest version has been minimal, so I strongly advise you to conduct your
> own thorough testing before considering a production deployment. That
> said, it's a significantly improved solution for our async master-slave
> needs, with a few caveats below, and shouldn't be too hard to set up.
> ...
> AFAICS, there are still at least a few major drawbacks to this approach:
>
> * SEQUENCE objects are not handled; nextval() will not be replicated,
> so sequence objects (and serial columns) between master and slave can
> easily get out of sync. I wonder if eRServer has this same issue?

I've added code for brute-force replication of sequences in the tgz ball
below. At each sync, the replicator contacts both master and slave and
compares every important aspect of every sequence object on the master with
that of the slave. It then replicates any new sequence object or sequence
object change. This causes dbmirror to hit both master and slave at least
N times on each sync, where N is the number of sequence objects, though the
queries are quick. I hate to hit the master like that, but I haven't
thought of a better option short of WAL-log replays. It'd be a nice boost
to query for all sequence values in one SQL query, but I don't know how to
do it in a generalized manner.
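
As a rough sketch of the comparison step, and of one possible way to read all sequences in a single round trip, the following Python builds a `UNION ALL` query from a list of sequence names and diffs the resulting states. It assumes each sequence can be read with `SELECT last_value, is_called FROM <sequence>`, which I have not verified against every server version. The function names and the choice of compared fields are hypothetical, not what the tarball does.

```python
from typing import Dict, List, Tuple

# (last_value, is_called); which fields count as "important" is an assumption.
SeqState = Tuple[int, bool]


def build_sequence_query(names: List[str]) -> str:
    """Build one UNION ALL query that reads every named sequence at once."""
    parts = []
    for name in names:
        ident = '"' + name.replace('"', '""') + '"'    # quoted identifier
        literal = "'" + name.replace("'", "''") + "'"  # quoted string literal
        parts.append(
            f"SELECT {literal} AS seq_name, last_value, is_called FROM {ident}"
        )
    return "\nUNION ALL\n".join(parts)


def diff_sequences(
    master: Dict[str, SeqState], slave: Dict[str, SeqState]
) -> Tuple[List[str], List[str]]:
    """Return (to_create, to_update) as sorted lists of sequence names."""
    to_create = sorted(n for n in master if n not in slave)
    to_update = sorted(
        n for n in master if n in slave and master[n] != slave[n]
    )
    return to_create, to_update
```

The list of names would have to come from the catalog first, so this trades N small queries for one catalog lookup plus one generated query per server.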

http://bluepolka.net/dbmirror/dbmirror-20030404-1446.tar.gz

I think a consistent view on the slave during active replication is not
quite guaranteed with this approach. Sequence updates are not
transactional, so we really don't know how to order them with respect to tuple
updates. So someone reading the slave DB might possibly not see sequence
changes appear in the order in which they occurred on the master. For our
warm spare/slave needs, it appears adequate.

Re DDL statement detection, I am thinking about incorporating a schema-only
pg_dump from both master and slave to compare schemas to alert to DDL
changes that could foul replication. Maybe run it only every so often in
dbmirror. None too elegant, but maybe better than nothing...
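
A minimal sketch of that schema comparison, in Python for illustration: `dump_schema` shells out to `pg_dump --schema-only`, and `schema_diff` compares two dumps with the standard `difflib` module. The function names are hypothetical, and skipping comment lines is my own assumption about what should not count as a DDL change.

```python
import difflib
import subprocess
from typing import List


def dump_schema(host: str, port: int, dbname: str) -> str:
    """Run a schema-only pg_dump and return its text (needs pg_dump on PATH)."""
    result = subprocess.run(
        ["pg_dump", "--schema-only", "-h", host, "-p", str(port), dbname],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def schema_diff(master_schema: str, slave_schema: str) -> List[str]:
    """Return unified-diff lines; an empty list means the dumps match."""

    def clean(text: str) -> List[str]:
        # Ignore blank lines and SQL comment lines (an assumption: they can
        # differ between dumps without any real schema change).
        return [
            ln for ln in text.splitlines()
            if ln.strip() and not ln.startswith("--")
        ]

    return list(
        difflib.unified_diff(
            clean(master_schema), clean(slave_schema),
            "master", "slave", lineterm="",
        )
    )
```

A replicator could call `schema_diff(dump_schema(...), dump_schema(...))` every so often and log a warning whenever the result is non-empty.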

Ed


From: "Ed L(dot)" <pgsql(at)bluepolka(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: dbmirror revisions
Date: 2003-04-05 07:40:43
Message-ID: 200304050040.43856.pgsql@bluepolka.net
Lists: pgsql-general pgsql-hackers

On Thursday April 3 2003 4:37, some yahoo wrote:
> * Eliminated bug in transaction ordering for replay. Updates
> cannot be replicated in the order of the transactions (see archives for
> discussion of why).

Upon further review, this bug report was the result of a misunderstanding of
the replication ordering. Both replication orderings, old and new, seem to
work fine so far.

Ed


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: "Ed L(dot)" <pgsql(at)bluepolka(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: [GENERAL] dbmirror revisions
Date: 2003-05-27 19:00:03
Message-ID: 200305271900.h4RJ03608271@candle.pha.pa.us
Lists: pgsql-general pgsql-hackers


Are any of these changes ready for CVS for 7.4?


--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: "Ed L(dot)" <pgsql(at)bluepolka(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: dbmirror revisions
Date: 2003-08-16 23:20:23
Message-ID: 200308162320.h7GNKNP10902@candle.pha.pa.us
Lists: pgsql-general pgsql-hackers


Ed, where are you on these dbmirror improvements?




From: "Ed L(dot)" <pgsql(at)bluepolka(dot)net>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: dbmirror revisions
Date: 2003-08-18 17:17:20
Message-ID: 200308181117.20730.pgsql@bluepolka.net
Lists: pgsql-general pgsql-hackers

On Saturday August 16 2003 5:20, Bruce Momjian wrote:
> Ed, where are you on these dbmirror improvements?

My changes have been complete since late April, and I have not worked much
on them since. The changes are not in your CVS. I need to work with the
dbmirror author to see if that makes sense at this point.

Ed
