Schemas, databases, and backups

From: Thomas Harold <tgh(at)tgharold(dot)com>
To: pgsql-novice(at)postgresql(dot)org
Subject: Schemas, databases, and backups
Date: 2005-11-24 12:51:03
Message-ID: 4385B737.3070304@tgharold.com
Lists: pgsql-novice

According to my reading of the pgsql documentation, the two basic backup
scripts are pg_dump and pg_dumpall. pg_dump allows you to dump a single
database to a file, while pg_dumpall dumps all of the databases to a
single file.

Currently, we use MSSQL's built-in backup facility. That allows us,
with a single command, to dump every database to separate files on a
daily basis (and we keep 14 days online). That makes recovering from a
glitch in one of the databases very easy, and it's rather simple to go
back to a particular day.

Also, schemas are new to us, so I'm still thinking about how they will
affect our processes and databases.

(I'm betting that the ultimate answer is going to be to look for some
3rd party tool in pgFoundry.)

So, now for the questions:

1) Is there a tool (or is this easily scripted in bash?) that would
iterate through the databases in pgsql and dump them to individual
files? I'm guessing that we would query pg_databases and dump the
database names to a file (how?) and then parse that to feed to pg_dump
(I can figure this bit out myself).

2) What if I wanted to dump individual schemas? Is this dangerous / not
recommended? (Probably not... if I can have relationships between
tables in different schemas?)


From: Jaime Casanova <systemguards(at)gmail(dot)com>
To: Thomas Harold <tgh(at)tgharold(dot)com>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: Schemas, databases, and backups
Date: 2005-11-24 16:23:18
Message-ID: c2d9e70e0511240823n8fc18e0k2245e3b4fa751616@mail.gmail.com
Lists: pgsql-novice

On 11/24/05, Thomas Harold <tgh(at)tgharold(dot)com> wrote:
> According to my reading of the pgsql documentation, the two basic backup
> scripts are pg_dump and pg_dumpall. pg_dump allows you to dump a single
> database to a file, while pg_dumpall dumps all of the databases to a
> single file.
>
> Currently, we use MSSQL's built-in backup facility. That allows us,
> with a single command, to dump every database to separate files on a
> daily basis (and we keep 14 days online). That makes recovering from a
> glitch in one of the databases very easy, and it's rather simple to go
> back to a particular day.
>
> Also, schemas are new to us, so I'm still thinking about how they will
> affect our processes and databases.
>

If you want to run queries that retrieve information from more than one
database at the same time, think about using schemas instead of separate
databases.
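
For example (a sketch with made-up schema and table names), two schemas
in the same database can be joined in a single query, which two separate
databases cannot:

psql -d mydb -c "select o.id, i.amount
                 from sales.orders o
                 join billing.invoices i on i.order_id = o.id;"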

> (I'm betting that the ultimate answer is going to be to look for some
> 3rd party tool in pgFoundry.)
>
> So, now for the questions:
>
> 1) Is there a tool (or is this easily scripted in bash?) that would
> iterate through the databases in pgsql and dump them to individual
> files? I'm guessing that we would query pg_databases and dump the
> database names to a file (how?) and then parse that to feed to pg_dump
> (I can figure this bit out myself).
>

psql -d template1 -U postgres -At -c "select datname from pg_database
where datname not in ('template1', 'template0', 'postgres');" | while
read D;

or something like that in a shell script, and then simply pg_dump $D...
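
A fleshed-out version of that sketch might be (the dump directory is an
assumption, adjust to taste):

psql -d template1 -U postgres -At -c "select datname from pg_database
    where datname not in ('template0', 'template1', 'postgres');" \
| while read D; do
    pg_dump -U postgres "$D" > "/backup/postgresql/$D.sql"
done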

> 2) What if I wanted to dump individual schemas? Is this dangerous / not
> recommended? (Probably not... if I can have relationships between
> tables in different schemas?)
>

dunno

--
Sincerely,
Jaime Casanova
(DBA: DataBase Aniquilator ;)


From: Martin Foster <martin(at)ethereal-realms(dot)org>
To: Thomas Harold <tgh(at)tgharold(dot)com>, PostgreSQL Novice List <pgsql-novice(at)postgresql(dot)org>
Subject: Re: Schemas, databases, and backups
Date: 2005-11-24 17:25:13
Message-ID: 4385F779.1010805@ethereal-realms.org
Lists: pgsql-novice

Thomas Harold wrote:
> According to my reading of the pgsql documentation, the two basic backup
> scripts are pg_dump and pg_dumpall. pg_dump allows you to dump a single
> database to a file, while pg_dumpall dumps all of the databases to a
> single file.
>
> Currently, we use MSSQL's built-in backup facility. That allows us,
> with a single command, to dump every database to separate files on a
> daily basis (and we keep 14 days online). That makes recovering from a
> glitch in one of the databases very easy, and it's rather simple to go
> back to a particular day.
>
> Also, schemas are new to us, so I'm still thinking about how they will
> affect our processes and databases.
>
> (I'm betting that the ultimate answer is going to be to look for some
> 3rd party tool in pgFoundry.)
>
> So, now for the questions:
>
> 1) Is there a tool (or is this easily scripted in bash?) that would
> iterate through the databases in pgsql and dump them to individual
> files? I'm guessing that we would query pg_databases and dump the
> database names to a file (how?) and then parse that to feed to pg_dump
> (I can figure this bit out myself).
>
> 2) What if I wanted to dump individual schemas? Is this dangerous / not
> recommended? (Probably not... if I can have relationships between
> tables in different schemas?)

It's pretty trivial to code in a bash script, and I've done it for a
while now. Since these are primarily command-line tools, you can easily
pipe the output to compress the files, sort them by date, and make your
life a bit easier overall.

I've included a copy of mine, which was used to back up from another
server. It also automatically stores the dumps in monthly folders, with
a folder for each day inside, for ease of retrieval. It's commented, and
I make use of variables to speed things along; to my knowledge it is
still set up to back up the databases listed.

Please note, however, that this script has not been used in a while; the
servers it was intended to work on are no longer operational, so any
passwords or addresses in it are of no consequence!

Martin Foster
Creator/Designer Ethereal Realms
martin(at)ethereal-realms(dot)org

Attachment: pg_backup.shl (text/plain, 3.9 KB)
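
(The attached script isn't reproduced in the archive. A minimal sketch
of the approach described above, with made-up host, paths, and database
list, might look like this:)

#!/bin/bash
# Dump each listed database from a remote server into a
# month/day folder tree. All values below are placeholders.
PGHOST=db.example.com
PGUSER=postgres
DATABASES="realms forums accounting"
BASE=/backup/postgresql
DEST="$BASE/`date +%Y-%m`/`date +%d`"

mkdir -p "$DEST"
for DB in $DATABASES; do
    pg_dump -h "$PGHOST" -U "$PGUSER" "$DB" | gzip -c > "$DEST/$DB.sql.gz"
done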

From: Thomas Harold <tgh(at)tgharold(dot)com>
To: pgsql-novice(at)postgresql(dot)org
Subject: Re: Schemas, databases, and backups
Date: 2005-11-24 18:47:35
Message-ID: 43860AC7.9030204@tgharold.com
Lists: pgsql-novice

Jaime Casanova wrote:

> psql -d template1 -U postgres -At -c "select datname from pg_database
> where datname not in ('template1', 'template0', 'postgres');" | while
> read D;
>
> or something like that in a shell script, and then simply pg_dump $D...
>

I found the following snippet of code, which roughly matches yours. It
was over in the Red Hat mailing lists, where it was used to vacuum
databases.

# su postgres -c 'psql -t -c "select datname from pg_database order by
datname;" template1' | xargs -n 1 echo

template0
template1
test1
test2

After some mucking about, I came up with the following shell command,
suitable for adding to root's crontab (as one long line, with the %
signs in the date format escaped as \% so cron doesn't eat them):

# su postgres -c 'psql -t -c "select datname from pg_database where not
datistemplate and datallowconn order by datname;" template1' \
  | xargs -n 1 -i pg_dump -Ft -b -U postgres \
      -f /backup/postgresql/pgsql.`date +%Y%m%d.%H%M`.{}.tar {}

I couldn't figure out how to add the "not in ('template0', 'postgres',
'template1')" clause to the single-line shell command. It seemed to
confuse the shell.

Issues with the above command:

1) The date gets reevaluated for each new execution of pg_dump, which
is not necessarily ideal if you want filenames that group easily.
Converting this to a shell script would allow finer control.

2) The output is not compressed. I guess I could switch to using "-Fc"
in conjunction with "-Z 9".
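
That would look something like this per database ("mydb" is a
placeholder):

pg_dump -Fc -Z 9 -U postgres -f /backup/postgresql/mydb.dump mydb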

Additional questions and notes:

A) pg_dump takes an argument "--schema=schema", which looks like it
allows me to dump just the contents of a particular schema within a
database. So if I wanted, I could iterate through the list of schemas
and go that route (see the sketch after point D below).

B) There's also a "--table=table" argument, which dumps a single table.
The man page for pg_dump warns me that pg_dump will not output any
objects that the table depends on, so it may not be possible to restore
the dump. (The same warning applies to --schema=schema.)

C) I'm not sure whether I can get away with using "where not
datistemplate and datallowconn". For backing up user databases, does it
matter? I can't figure out how to quote the commands properly to keep
bash from getting confused. (Doubled-up quotes? Escaped quotes?)

D) After more mucking, I figured out how to set a static datestamp value
for the entire command and compress the tar files using gzip. I'm not
sure whether I should use "export" or "set" (both worked).

# export DTSTAMP=`date +%Y%m%d.%H%M` ; \
  su postgres -c 'psql -t -c "select datname from pg_database where not
datistemplate and datallowconn order by datname;" template1' \
  | xargs -n 1 -i bash -c "pg_dump -Ft -b -U postgres {} | gzip -c \
      > /backup/postgresql/pgsql.${DTSTAMP}.{}.tgz"
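
A sketch of the schema iteration mentioned in point A ("mydb" is a
placeholder; system schemas are filtered out via pg_namespace):

psql -d mydb -U postgres -At -c "select nspname from pg_namespace
    where substr(nspname, 1, 3) <> 'pg_'
      and nspname <> 'information_schema';" \
| while read S; do
    pg_dump -U postgres --schema="$S" -f "/backup/postgresql/mydb.$S.sql" mydb
done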

Links:
http://archives.postgresql.org/pgsql-general/2000-01/msg00593.php
http://postgis.refractions.net/pipermail/postgis-users/2005-November/009925.html


From: Jaime Casanova <systemguards(at)gmail(dot)com>
To: Thomas Harold <tgh(at)tgharold(dot)com>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: Schemas, databases, and backups
Date: 2005-11-24 19:49:24
Message-ID: c2d9e70e0511241149h1c45a41bvd9c8ea3bce67c9b9@mail.gmail.com
Lists: pgsql-novice

>
> # su postgres -c 'psql -t -c "select datname from pg_database where not
> datistemplate and datallowconn order by datname;" template1' \
>   | xargs -n 1 -i pg_dump -Ft -b -U postgres \
>       -f /backup/postgresql/pgsql.`date +%Y%m%d.%H%M`.{}.tar {}
>
> I couldn't figure out how to add the "not in ('template0', 'postgres',
> 'template1')" clause to the single-line shell command. It seemed to
> confuse the shell.
>
> Issues with the above command:
>
> 1) The date gets reevaluated for each new execution of pg_dump, which
> is not necessarily ideal if you want filenames that group easily.
> Converting this to a shell script would allow finer control.
>
> 2) The output is not compressed. I guess I could switch to using "-Fc"
> in conjunction with "-Z 9".
>
> Additional questions and notes:
>
> A) pg_dump takes an argument "--schema=schema", which looks like it
> allows me to just dump the contents of a particular schema within a
> database. So if I wanted, I could iterate through the list of schemas
> and go that route.
>
> B) There's also a "--table=table" argument, which dumps a single table.
> The man page for pg_dump warns me that pg_dump will not output any
> objects that the table depends on, so it may not be possible to restore
> the dump. (The same warning applies to --schema=schema.)
>

Sure you can, just like with pg_database; it's pg_namespace for schemas
and pg_tables for tables...
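
For example ("mydb" is a placeholder):

psql -d mydb -At -c "select nspname from pg_namespace;"
psql -d mydb -At -c "select tablename from pg_tables where schemaname = 'public';"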

> C) I'm not sure whether I can get away with using "where not
> datistemplate and datallowconn". For backing up user databases, does it
> matter? I can't figure out how to quote the commands properly to keep
> bash from getting confused. (Doubled-up quotes? Escaped quotes?)
>

"where not datistemplate and datallowconn" is better than "datname not
in (values)": it doesn't need updating when databases are added, and it
skips any database that doesn't accept connections.
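
If you do want a "not in" list, one way to get the quotes past bash (a
sketch) is to close and reopen the single-quoted string with '\''
wherever the SQL needs a literal quote:

su postgres -c 'psql -t -c "select datname from pg_database where datname
not in ('\''template0'\'', '\''template1'\'', '\''postgres'\'');" template1'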

> D) After more mucking, I figured out how to set a static datestamp value
> for the entire command and compress the tar files using gzip. I'm not
> sure whether I should use "export" or "set" (both worked).
>
> # export DTSTAMP=`date +%Y%m%d.%H%M` ; \
>   su postgres -c 'psql -t -c "select datname from pg_database where not
> datistemplate and datallowconn order by datname;" template1' \
>   | xargs -n 1 -i bash -c "pg_dump -Ft -b -U postgres {} | gzip -c \
>       > /backup/postgresql/pgsql.${DTSTAMP}.{}.tgz"
>

Maybe it's a good idea to put all of that in a script... it's getting
bigger and bigger (and uglier and uglier).

> Links:
> http://archives.postgresql.org/pgsql-general/2000-01/msg00593.php
> http://postgis.refractions.net/pipermail/postgis-users/2005-November/009925.html
>

--
regards,
Jaime Casanova
(DBA: DataBase Aniquilator ;)


From: John Koller <johnckoller(at)yahoo(dot)com>
To: pgsql-novice(at)postgresql(dot)org
Subject: Re: Schemas, databases, and backups
Date: 2005-11-25 16:38:37
Message-ID: pan.2005.11.25.16.38.36.760253@yahoo.com
Lists: pgsql-novice

On Thu, 24 Nov 2005 07:51:03 -0500, Thomas Harold wrote:

> According to my reading of the pgsql documentation, the two basic backup
> scripts are pg_dump and pg_dumpall. pg_dump allows you to dump a single
> database to a file, while pg_dumpall dumps all of the databases to a
> single file.
>
> Currently, we use MSSQL's built-in backup facility. That allows us,
> with a single command, to dump every database to separate files on a
> daily basis (and we keep 14 days online). That makes recovering from a
> glitch in one of the databases very easy, and it's rather simple to go
> back to a particular day.
[SNIP]
> So, now for the questions:
>
> 1) Is there a tool (or is this easily scripted in bash?) that would
> iterate through the databases in pgsql and dump them to individual
> files? I'm guessing that we would query pg_databases and dump the
> database names to a file (how?) and then parse that to feed to pg_dump
> (I can figure this bit out myself).
[SNIP]

The FreeBSD port installs a script called 502.pgsql that dumps each
database to a separate file. It looks like it is close to what you
are looking for.

http://www.freebsd.org/cgi/cvsweb.cgi/ports/databases/postgresql81-server/files/502.pgsql?rev=1.7&content-type=text/x-cvsweb-markup