Re: Detach/attach database

Lists: pgsql-hackers
From: Thom Brown <thom(at)linux(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Detach/attach database
Date: 2011-11-13 13:13:11
Message-ID: CAA-aLv5cQf09zvFRCB1XXUQLSp-ouX0S_Hq6ryScd6CtamipFQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

I don't know if this has been discussed before, but would it be
feasible to introduce the ability to detach and attach databases? (if
you're thinking "stop right there" skip to the end) What I had in
mind would be to do something like the following:

SELECT pg_detach_database('my_database')

pg_detach_database
--------------------
base/16384
(1 row)

Then this database would no longer be accessible. The system tables
would be updated to reflect the removal of this database, and some
kind of manifest file would be produced in that database directory.
The database represented by that returned directory could then be
moved elsewhere (or deleted if the user so wished, but isn't a real
benefit).

But then if another cluster were running with the same version and
architecture of the cluster we just detached our database from, we
could copy that directory to the base directory of that other cluster
(assuming the OID of the database didn't already conflict with an
existing one), then run:

SELECT pg_attach_database('16384', optional_tablespace_name_here);

Then the usual version/architecture/platform checks would happen, and
a reading of the manifest to populate system tables, checking for
conflicts, then the database brought online.

And if I haven't already asked too much, something like this could be run:

SELECT pg_start_clone('my_database');

pg_start_clone
----------------
base/16384
(1 row)

You may now be able to infer where this notion came from, when someone
asked if you can clone databases without kicking users off. However,
this isn't a schema-only copy, but naturally contains data as well.

So that function above would be like pg_start_backup() except that we
wouldn't prevent writes to other stable databases, just the candidate
one. I personally don't see how this could work with the way we
currently replay WAL files.

But the idea is that it would create the manifest file like
attach_database() except it wouldn't detach the database, and users
could continue to use the database.

Then the user could copy away the database directory elsewhere, then call:

SELECT pg_stop_clone('my_database');

And in theory the user could rename the copy of the directory and move
it back and reattach it as an identical copy (or identical at the time
of copying it).

The most obvious problems I see are related to permissions. All the
object level permissions would have to exist in the destination
database (which is fine if you're copying it to the same cluster), but
even if they do exist, I suspect the OIDs of those users would need to
be the same. Then there's extensions and collations. If either of
those are unavailable on the destination cluster, it couldn't work
either. But then some kind of error message communicating the missing
component could be returned when attempting to attach it.

Also I don't know how you could pause WAL replay for an individual database.

But use cases for this would be:
- Backing up select databases for quick re-attachment without lengthy restores
- User running out of disk space and wants to move large databases to
another server quickly
- A large static database is wanted on another server
- A copy of a large database is wanted on the same cluster

It's just a vague idea, and I'm kind of expecting responses to begin
with "Well for a start, this couldn't possible begin to work
because..." but that's par for the course. ;)

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Thom Brown <thom(at)linux(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Detach/attach database
Date: 2011-11-13 15:26:32
Message-ID: CA+U5nMJt=ndva-Rx5u8SjJhZfmX_qwhtfChDB34K1005t6aO_Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Nov 13, 2011 at 1:13 PM, Thom Brown <thom(at)linux(dot)com> wrote:

> I don't know if this has been discussed before, but would it be
> feasible to introduce the ability to detach and attach databases? (if
> you're thinking "stop right there" skip to the end)  What I had in
> mind would be to do something like the following:

That would be better done at the tablespace level, and then the
feature becomes "transportable tablespaces". Which seems like a good
and useful idea to me.

> You may now be able to infer where this notion came from, when someone
> asked if you can clone databases without kicking users off.  However,
> this isn't a schema-only copy, but naturally contains data as well.

The OP wanted to do this without freezing activity on the database,
which is not easy...

OTOH we can do a backup of just a single database and then filter
recovery at database level to produce just a single copy of another
database on its own server, if anyone wanted that.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Thom Brown <thom(at)linux(dot)com>
Subject: Re: Detach/attach database
Date: 2011-11-13 15:57:35
Message-ID: 2918196.bD61AKbugE@alap2
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sunday, November 13, 2011 13:13:11 Thom Brown wrote:
> Hi,
>
> I don't know if this has been discussed before, but would it be
> feasible to introduce the ability to detach and attach databases? (if
> you're thinking "stop right there" skip to the end)
> It's just a vague idea, and I'm kind of expecting responses to begin
> with "Well for a start, this couldn't possible begin to work
> because..." but that's par for the course. ;)
You would have to do quite some surgery because of oids from shared tables. I
don't think thats easily dooable.

Andres


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thom Brown <thom(at)linux(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Detach/attach database
Date: 2011-11-13 16:42:35
Message-ID: 6870.1321202555@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thom Brown <thom(at)linux(dot)com> writes:
> I don't know if this has been discussed before, but would it be
> feasible to introduce the ability to detach and attach databases? (if
> you're thinking "stop right there" skip to the end)

... skipping ...

> It's just a vague idea, and I'm kind of expecting responses to begin
> with "Well for a start, this couldn't possible begin to work
> because..." but that's par for the course. ;)

The main reason this doesn't work is XID management.

It's barely possible you could make it work if you first locked all
other sessions out of the DB and then froze every XID in the database,
but that's a sufficiently heavyweight operation to make it of dubious
value.

You'd also have to think of some way to ensure that page LSNs in the
database are lower than the current WAL endpoint in the receiver.

The other thing I'd be concerned about is inconsistency with the global
system catalogs in the receiving installation. Consider roles for
example: the receiver might not have the same set of roles, probably
wouldn't have the same OIDs for those roles, and definitely would be
missing the pg_shdepend entries that describe which objects in the
transported database are owned by which roles.

regards, tom lane


From: Thom Brown <thom(at)linux(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Detach/attach database
Date: 2011-11-13 17:22:17
Message-ID: CAA-aLv6+cYUtJ8kZ7oTg5KVi8pGp7vr0B44Meih0awugW5gbQA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 13 November 2011 16:42, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Thom Brown <thom(at)linux(dot)com> writes:
>> I don't know if this has been discussed before, but would it be
>> feasible to introduce the ability to detach and attach databases? (if
>> you're thinking "stop right there" skip to the end)
>
> ... skipping ...
>
>> It's just a vague idea, and I'm kind of expecting responses to begin
>> with "Well for a start, this couldn't possible begin to work
>> because..." but that's par for the course. ;)
>
> The main reason this doesn't work is XID management.
>
> It's barely possible you could make it work if you first locked all
> other sessions out of the DB and then froze every XID in the database,
> but that's a sufficiently heavyweight operation to make it of dubious
> value.
>
> You'd also have to think of some way to ensure that page LSNs in the
> database are lower than the current WAL endpoint in the receiver.
>
> The other thing I'd be concerned about is inconsistency with the global
> system catalogs in the receiving installation.  Consider roles for
> example: the receiver might not have the same set of roles, probably
> wouldn't have the same OIDs for those roles, and definitely would be
> missing the pg_shdepend entries that describe which objects in the
> transported database are owned by which roles.

I feared such a non-traversable terrain would prevent it being
possible. Oh well. Thanks for the explanation though.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Thom Brown <thom(at)linux(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Detach/attach database
Date: 2011-11-14 09:55:33
Message-ID: CAA-aLv7G96LwbL27vqdh-hsNHjdSsM2WWb_ahy2EKnRbDck3TA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 13 November 2011 15:26, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Sun, Nov 13, 2011 at 1:13 PM, Thom Brown <thom(at)linux(dot)com> wrote:
>
>> I don't know if this has been discussed before, but would it be
>> feasible to introduce the ability to detach and attach databases? (if
>> you're thinking "stop right there" skip to the end)  What I had in
>> mind would be to do something like the following:
>
> That would be better done at the tablespace level, and then the
> feature becomes "transportable tablespaces". Which seems like a good
> and useful idea to me.

I've been trying to think why the tablespace equivalent would be
better but can't see it. The reason for detaching a database would be
that you want do so something with an entire related set of data. A
tablespace can contain just indexes, or a few tables from several
databases.

>> You may now be able to infer where this notion came from, when someone
>> asked if you can clone databases without kicking users off.  However,
>> this isn't a schema-only copy, but naturally contains data as well.
>
> The OP wanted to do this without freezing activity on the database,
> which is not easy...
>
> OTOH we can do a backup of just a single database and then filter
> recovery at database level to produce just a single copy of another
> database on its own server, if anyone wanted that.

Filtering recovery sounds very tricky to me. And it's the global
objects part which makes things extra difficult. But the whole idea I
was seeking sounds riddled with holes anyway, but glad I can at least
put it from my mind.

So can I humbly request we completely re-architect the whole of
PostgreSQL to fit this feature? Thanks.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Thom Brown <thom(at)linux(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Detach/attach database
Date: 2011-11-14 13:32:17
Message-ID: CA+TgmoapeErUmPqeWrPhwr83VG+DKTC6rkg7D9GKz1Npjs+Bqw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Nov 14, 2011 at 4:55 AM, Thom Brown <thom(at)linux(dot)com> wrote:
> So can I humbly request we completely re-architect the whole of
> PostgreSQL to fit this feature?  Thanks.

Heh.

I have to admit I've thought about this from time to time, and it
would be pretty cool. I was initially thinking that it wouldn't be
that difficult to do this on a per-database level, because if you
slurp up a whole database then by definition you're also including the
system catalogs, which means that you have the pg_class, pg_attribute,
and pg_type entries that are necessary to interpret the table
contents. If you do anything more fine-grained (per-tablespace,
per-table, or whatever) then things get much more complex, but at the
database level you only need to worry about interactions with other
globals: tablespace and role definitions. And we could probably write
code to grovel through the system catalogs for a newly "mounted"
database and do search and replace on the appropriate columns, to map
from the old OIDs to the new ones. It wouldn't be simple, but I think
it could be done.

But Tom's point about XIDs and LSNs seems like it kind of puts a
bullet through the heart of the whole idea. Now, before you can move
the database (or table, or whatever) between clusters, you've got to
rewrite all the data files to freeze XIDs and, I don't know, zero out
LSNs, or something. And if you're going to rewrite all the data, then
you've pretty much lost all the benefit of doing this in the first
place. In fact, it might end up being *slower* than a dump and
restore; even an uncompressed dump will be smaller than the on-disk
footprint of the original database, and many dumps compress quite
well.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Thom Brown <thom(at)linux(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Detach/attach database
Date: 2011-11-14 15:05:28
Message-ID: CAA-aLv5QbJZjegLYLUnkEvLZoj3Q5PQB9sWS42+KEVLKjj6TvQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 14 November 2011 13:32, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> But Tom's point about XIDs and LSNs seems like it kind of puts a
> bullet through the heart of the whole idea.
>

What about having database-level XIDs rather than cluster-level? Is that
remotely feasible?

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Thom Brown <thom(at)linux(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Detach/attach database
Date: 2011-11-14 15:07:28
Message-ID: CA+TgmoZ4SiVm7f8MSSvC6RvkeTXTSUPLLvz3FJWOyySdUkDpng@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Nov 14, 2011 at 10:05 AM, Thom Brown <thom(at)linux(dot)com> wrote:
> On 14 November 2011 13:32, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>
>> But Tom's point about XIDs and LSNs seems like it kind of puts a
>> bullet through the heart of the whole idea.
>
> What about having database-level XIDs rather than cluster-level?  Is that
> remotely feasible?

Maybe. You'd need a set separate set for shared catalogs, too. It
seems like a heck of a lot of work, though, especially since (IME,
anyway) most people only really one run one database per cluster.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Thom Brown <thom(at)linux(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Detach/attach database
Date: 2011-11-14 15:12:53
Message-ID: CAA-aLv6kpWtQCViwpmZ45mYFZSRCURTGJt48zLkb+4k+YmSvTg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 14 November 2011 15:07, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Mon, Nov 14, 2011 at 10:05 AM, Thom Brown <thom(at)linux(dot)com> wrote:
> > On 14 November 2011 13:32, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >>
> >> But Tom's point about XIDs and LSNs seems like it kind of puts a
> >> bullet through the heart of the whole idea.
> >
> > What about having database-level XIDs rather than cluster-level? Is that
> > remotely feasible?
>
> Maybe. You'd need a set separate set for shared catalogs, too. It
> seems like a heck of a lot of work, though, especially since (IME,
> anyway) most people only really one run one database per cluster.
>

Thought it would be a lot of work. Well one benefit I could potentially
see is paving the way for per-database replication. But I'll let this
dream go as it's clearly not something to realistically pursue.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Thom Brown <thom(at)linux(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Detach/attach database
Date: 2011-11-14 22:46:18
Message-ID: 201111142246.pAEMkIB25158@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas wrote:
> But Tom's point about XIDs and LSNs seems like it kind of puts a
> bullet through the heart of the whole idea. Now, before you can move
> the database (or table, or whatever) between clusters, you've got to
> rewrite all the data files to freeze XIDs and, I don't know, zero out
> LSNs, or something. And if you're going to rewrite all the data, then
> you've pretty much lost all the benefit of doing this in the first
> place. In fact, it might end up being *slower* than a dump and
> restore; even an uncompressed dump will be smaller than the on-disk
> footprint of the original database, and many dumps compress quite
> well.

These are the same limitations pg_upgrade has, except it freezes the
system tables of the new cluster (very small) and then moves the clog
files from the old cluster to the new cluster to match the user files.
No way to really merge two different cluster clog files.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +