Re: Long term database archival

From: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
To: "Karl O(dot) Pinc" <kop(at)meme(dot)com>
Cc: pgsql-general(at)postgresql(dot)org, thm(at)duke(dot)edu
Subject: Re: Long term database archival
Date: 2006-07-06 23:14:39
Message-ID: 44AD995F.9060803@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Karl O. Pinc wrote:
> Hi,
>
> What is the best pg_dump format for long-term database
> archival? That is, what format is most likely to
> be able to be restored into a future PostgreSQL
> cluster.
>
> Mostly, we're interested in dumps done with
> --data-only, and have preferred the
> default (-F c) format. But this form is somewhat more
> opaque than a plain text SQL dump, which is bound
> to be supported forever "out of the box".
> Should we want to restore a 20 year old backup
> nobody's going to want to be messing around with
> decoding a "custom" format dump if it does not
> just load all by itself.
For schema dumps the custom format has advantages IMHO,
mainly because it adds flexibility. When creating text-formatted
dumps, you have to specify options like "--no-owner, ..."
at _dumping_ time, while custom-format dumps allow you to
specify them at _restoration_ time.

For data-dumps this is less relevant, since the amount of
available options is much smaller. But even there, restoring
with "insert-statements" as opposed to "copy from stdin" could
be usefull in some situations.

Anyway, 20 years is a _long_, _long_ time. If you _really_
need to keep your data that long, I'd suggest you create
text-only schema dumps, and text-only data dumps. The postgres
developers are very concerned about backward compatibility in
my experience, but probably _not_ for versions from 20 years ago ;-)

But since the probability of the need to restore your backup
in 6 months is _much_ larger than the one of needing to restore
it in 20 years, I'd create customer-format dumps too.
For the near future, they're the better choice IMHO.

> Is the answer different if we're dumping the
> schema as well as the data?
The above holds true for the schema as well as for the data.

greetings, Florian Pflug

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Florian G. Pflug 2006-07-06 23:18:22 Re: Best way to deal with quote_literal issue?
Previous Message Ryan Gran 2006-07-06 23:00:35 PANIC: XX000: right sibling is not next child in "pg_depend_reference_index"