Postgres replication: dump/restore, PITR, Slony,...?

Lists: pgsql-performance
From: Shaul Dar <shauldar(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Postgres replication: dump/restore, PITR, Slony,...?
Date: 2009-06-11 13:12:32
Message-ID: 234efe30906110612p46bfe848x2b790489c1999d55@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Hi,

Our configuration is as follows:

1. A staging server, which receives new data and updates the DB
2. Two web servers that have copies of the DB (essentially read-only) and
answer user queries (with load balancer)

Currently we use dump (to SQL file, i.e. pg_dump with no args) + file copy
to replicate the DB daily between the staging and web servers, and then
restore (via psql) the servers one at the time. In our application we expect
that average daily change is only to 3% of the records. My question is what
would be the best way to do this replication?

I read about continuous archiving and
PITR<http://www.postgresql.org/docs/8.3/interactive/continuous-archiving.html>.
My understanding however (e.g. from
this<http://archives.postgresql.org/pgsql-admin/2006-07/msg00279.php>)
is that I cannot do a base backup once and then e.g. apply WAL files on a
daily basis, starting from yesterday's DB, but must instead redo the full
base backup before starting recovery?

Then there are Slony-I, Buchardo, Mamoth Replicator from CMO, simple
replication in Postgres 8.4 and other projects...

Suggestions?
Thanks,

-- Shaul


From: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To: Shaul Dar <shauldar(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Postgres replication: dump/restore, PITR, Slony,...?
Date: 2009-06-11 14:33:04
Message-ID: 877hzizvq7.fsf@hi-media-techno.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Hi,

Shaul Dar <shauldar(at)gmail(dot)com> writes:
> 1. A staging server, which receives new data and updates the DB
> 2. Two web servers that have copies of the DB (essentially read-only)
> and answer user queries (with load balancer)

[...]

> Suggestions?

I'd consider WAL Shipping for the staging server and some trigger based
asynchronous replication for feeding the web servers.

More specifically, I'd have a try at Skytools, using walmgr.py for WAL
Shipping and Londiste for replication.
http://wiki.postgresql.org/wiki/Skytools
http://wiki.postgresql.org/wiki/Londiste_Tutorial

Regards,
--
dim


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Shaul Dar <shauldar(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Postgres replication: dump/restore, PITR, Slony,...?
Date: 2009-06-11 16:23:29
Message-ID: 1244737409.22641.4.camel@jd-laptop.pragmaticzealot.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance


>
> Then there are Slony-I, Buchardo, Mamoth Replicator from CMO, simple
> replication in Postgres 8.4 and other projects...

CMO? :)

Joshua D. Drake
>
> Suggestions?
> Thanks,
>
> -- Shaul
>
--
PostgreSQL - XMPP: jdrake(at)jabber(dot)postgresql(dot)org
Consulting, Development, Support, Training
503-667-4564 - http://www.commandprompt.com/
The PostgreSQL Company, serving since 1997


From: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
To: jd(at)commandprompt(dot)com
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Postgres replication: dump/restore, PITR, Slony,...?
Date: 2009-06-11 16:30:06
Message-ID: 308aebb82ee2b2cca09da26ab63b82ca@biglumber.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance


-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

>> Then there are Slony-I, Buchardo, Mamoth Replicator from CMO, simple
>> replication in Postgres 8.4 and other projects...

> CMO? :)

Buchardo? :)

- --
Greg Sabino Mullane greg(at)turnstep(dot)com
End Point Corporation
PGP Key: 0x14964AC8 200906111229
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAkoxMPkACgkQvJuQZxSWSsizywCbBtuo7cbCwmlHzvbi1kak9leF
XwYAnA5dXlZqyyUOQrymXZf4yGJSMSq6
=UPhb
-----END PGP SIGNATURE-----


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Greg Sabino Mullane <greg(at)turnstep(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Postgres replication: dump/restore, PITR, Slony,...?
Date: 2009-06-11 16:32:14
Message-ID: 1244737934.22641.7.camel@jd-laptop.pragmaticzealot.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On Thu, 2009-06-11 at 16:30 +0000, Greg Sabino Mullane wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: RIPEMD160
>
> >> Then there are Slony-I, Buchardo, Mamoth Replicator from CMO, simple
> >> replication in Postgres 8.4 and other projects...
>
> > CMO? :)
>
> Buchardo? :)

A new desert, Buchardo CMO:

Two shots of brandy
One shot of rum
Vanilla Ice cream
Cherries

Blend to perfection.

Joshua D. Drake

--
PostgreSQL - XMPP: jdrake(at)jabber(dot)postgresql(dot)org
Consulting, Development, Support, Training
503-667-4564 - http://www.commandprompt.com/
The PostgreSQL Company, serving since 1997


From: Shaul Dar <shauldar(at)gmail(dot)com>
To: jd(at)commandprompt(dot)com
Cc: Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Postgres replication: dump/restore, PITR, Slony,...?
Date: 2009-06-12 11:27:10
Message-ID: 234efe30906120427t289141d2j61d084edd9f108ae@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

All right, so I misspelled Bucardo (also Mammoth...), and the company's name
is Command Prompt (please get someone to work on that incomprehensible logo
- I went back and looked at it and still have no clue what it means :-).

Now how about some serious answers relating to my questions?

Dimitri, thanks for your answer. I don't need to replicate TO the staging
server (this is where the changes happen) but rather FROM the staging server
TO the Web (query) servers. I think my description wasn't clear enough.
Currently the staging DB changes daily as new records are inserted to it
(would have liked to use COPY instead, but I believe that's only useful for
bulk loading the whole DB, not appending to it?). Those changes need to be
reflected on the Web servers. Today this is done via dump-copy files-restore
of the whole DB (we shut down each Web server DB while restoring it,
obviously), and I we are looking for a better way.

I would truly appreciate specific suggestions and pointers/references ("some
trigger based asynchronous replication" doesn't help much...).

Also is my understanding of PITR limitations correct?

Thanks,

-- Shaul

On Thu, Jun 11, 2009 at 7:32 PM, Joshua D. Drake <jd(at)commandprompt(dot)com>wrote:

> On Thu, 2009-06-11 at 16:30 +0000, Greg Sabino Mullane wrote:
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: RIPEMD160
> >
> > >> Then there are Slony-I, Buchardo, Mamoth Replicator from CMO, simple
> > >> replication in Postgres 8.4 and other projects...
> >
> > > CMO? :)
> >
> > Buchardo? :)
>
> A new desert, Buchardo CMO:
>
> Two shots of brandy
> One shot of rum
> Vanilla Ice cream
> Cherries
>
> Blend to perfection.
>
> Joshua D. Drake
>


From: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Cc: pgsql-performance(at)postgresql(dot)org Content-Type: multipart/alternative; boundary=001636c5b2f03dc599046c12586b X-Virus-Scanned: Maia Mailguard 1(dot)0(dot)1 X-Mailing-List: pgsql-performance List-Archive: <http://archives(dot)postgresql(dot)org/pgsql-performance> List-Help: <mailto:majordomo(at)postgresql(dot)org?body=help> List-ID: <pgsql-performance(dot)postgresql(dot)org> List-Owner: <mailto:pgsql-performance-owner(at)postgresql(dot)org> List-Post: <mailto:pgsql-performance(at)postgresql(dot)org> List-Subscribe: <mailto:majordomo(at)postgresql(dot)org?body=sub%20pgsql-performance> List-Unsubscribe: <mailto:majordomo(at)postgresql(dot)org?body=unsub%20pgsql-performance>
Subject: Re: Postgres replication: dump/restore, PITR, Slony,...?
Date: 2009-06-12 19:11:42
Message-ID: 81ad7080454fa9b71eb7d7a159b1c636@biglumber.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance


-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

> Currently we use dump (to SQL file, i.e. pg_dump with no args) + file copy
> to replicate the DB daily between the staging and web servers, and then
> restore (via psql) the servers one at the time. In our application we expect
> that average daily change is only to 3% of the records. My question is what
> would be the best way to do this replication?

Bucardo should handle this easy enough. Just install Bucardo, tell it about the
databases, tell it which tables to replicate, and start it up. If the tables
have unique indexes (e.g. PKs) you can use the 'pushdelta' type of sync, which
will copy rows as they change from the staging server to the web servers.
If the tables don't have unique indexes, you'll have to use the 'fullcopy'
sync type, which, as you might imagine, copies the entire table each time.

You can further control both of these to fire automatically when the data
on the staging server changes, or to only fire when you tell it to, e.g.
every X minutes, or based on some other criteria. You can also configure
how many of the web servers get pushed to at one time, from 1 up to
all of them.

- --
Greg Sabino Mullane greg(at)turnstep(dot)com
End Point Corporation
PGP Key: 0x14964AC8 200906121509
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAkoyqFkACgkQvJuQZxSWSsjB8ACffcQRD+Vb7SV0RZnoo70hkpwB
nycAn0QDiogs3EuCrc9+h4rMoToTFopz
=Sltu
-----END PGP SIGNATURE-----