Re: [PATCH] pg_upgrade: support for btrfs copy-on-write clones

From: Oskari Saarenmaa <os(at)ohmu(dot)fi>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Larry Rosenman <ler(at)lerctr(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] pg_upgrade: support for btrfs copy-on-write clones
Date: 2013-10-05 13:57:15
Message-ID: 52501ABB.2080407@ohmu.fi
Lists: pgsql-hackers

05.10.2013 16:38, Bruce Momjian wrote:
> On Fri, Oct 4, 2013 at 10:42:46PM +0300, Oskari Saarenmaa wrote:
>> Thanks for the offers, but it looks like ZFS doesn't actually implement
>> a similar file level clone operation. See
>> https://github.com/zfsonlinux/zfs/issues/405 for discussion on a feature
>> request for it.
>>
>> ZFS does support cloning entire datasets which seem to be similar to
>> btrfs subvolume snapshots and could be used to set up a new data
>> directory for a new $PGDATA. This would require the original $PGDATA
>> to be a dataset/subvolume of its own and quite a bit different logic
>> (than just another file copy method in pg_upgrade) to initialize the new
>> version's $PGDATA as a snapshot/clone of the original. The way this
>> would work is that the original $PGDATA dataset/subvolume gets cloned to
>> a new location after which we move the files out of the way of the new
>> PG installation and run pg_upgrade in link mode. I'm not sure if
>> there's a good way to integrate this into pg_upgrade or if it's just
>> something that could be documented as a fast way to run pg_upgrade
>> without touching original files.
>>
>> With btrfs tooling the sequence would be something like this:
>>
>> btrfs subvolume snapshot /srv/pg92 /srv/pg93
>> mv /srv/pg93/data /srv/pg93/data92
>> initdb /srv/pg93/data
>> pg_upgrade --link \
>> --old-datadir=/srv/pg93/data92 \
>> --new-datadir=/srv/pg93/data
>
> Does btrfs support file system snapshots? If so, shouldn't people just
> take a snapshot of the old data directory before the upgrade, rather
> than using cloning?

Yeah, it's possible to clone an existing subvolume, but this requires
$PGDATA to be a subvolume of its own, and that approach would be a bit
difficult to integrate into existing pg_upgrade scripts.

The BTRFS_IOC_CLONE ioctl operates at the file level and can be used to
clone files anywhere in a btrfs filesystem.
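
As a rough illustration of the file-level call (a sketch only, not code
from the patch): the ioctl is issued on the destination file descriptor
with the source file descriptor as its argument. The helper name
clone_file, the hard-coded request number (which assumes the usual Linux
ioctl encoding on x86/ARM), and the fallback to a plain copy are all
assumptions made for this example.

```python
import fcntl
import shutil

# BTRFS_IOC_CLONE is defined as _IOW(0x94, 9, int). The numeric value below
# assumes the common Linux ioctl encoding (x86, ARM); other architectures
# may encode the direction bits differently.
BTRFS_IOC_CLONE = 0x40049409


def clone_file(src, dst):
    """Clone src to dst with the clone ioctl.

    Returns True if the file was cloned, or False if the ioctl failed
    (for example on a filesystem without clone support) and an ordinary
    copy was made instead. The fallback is an assumption of this sketch.
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            # The ioctl is called on the destination fd; the source fd is
            # passed as the integer argument.
            fcntl.ioctl(fdst.fileno(), BTRFS_IOC_CLONE, fsrc.fileno())
            return True
        except OSError:
            pass
    # Clone not possible here: fall back to copying the data.
    shutil.copyfile(src, dst)
    return False
```

Calling clone_file("/path/to/old/file", "/path/to/new/file") with both
paths on the same btrfs filesystem should share the data blocks until one
of the files is modified; elsewhere it simply copies.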

/ Oskari
