Re: [RFC] Incremental backup v2: add backup profile to base backup

From: Marco Nenciarini <marco(dot)nenciarini(at)2ndquadrant(dot)it>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [RFC] Incremental backup v2: add backup profile to base backup
Date: 2014-10-06 16:18:43
Message-ID: 5432C0E3.9000201@2ndquadrant.it
Lists: pgsql-hackers

On 06/10/14 17:50, Robert Haas wrote:
> On Mon, Oct 6, 2014 at 11:33 AM, Marco Nenciarini
> <marco(dot)nenciarini(at)2ndquadrant(dot)it> wrote:
>>> 2. Take a differential backup. In the backup label file, note the LSN
>>> of the full backup to which the differential backup is relative, and the
>>> newest LSN guaranteed to be present in the differential backup. The
>>> actual backup can consist of a series of 20-byte buffer tags, those
>>> being the exact set of blocks newer than the base-backup's
>>> latest-guaranteed-to-be-present LSN. Each buffer tag is followed by
>>> an 8kB block of data. If a relfilenode is truncated or removed, you
>>> need some way to indicate that in the backup; e.g. include a buffertag
>>> with forknum = -(forknum + 1) and blocknum = the new number of blocks,
>>> or InvalidBlockNumber if removed entirely.
>>
>> To have a working backup you need to ship each block which is newer than
>> latest-guaranteed-to-be-present in full backup and not newer than
>> latest-guaranteed-to-be-present in the current backup. Also, as a
>> further optimization, you can think about not sending the empty space in
>> the middle of each page.
>
> Right. Or compressing the data.

If we want to introduce compression on server side, I think that
compressing the whole tar stream would be more effective.
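
A minimal sketch of why compressing the stream as a whole can beat compressing each block separately. It is only an illustration: zlib stands in for whatever compressor would actually be used, a plain concatenation of blocks stands in for the tar stream, and the data is synthetic and deliberately favourable to stream compression (every block carries the same payload). Real results depend on the data.

```python
import random
import zlib

BLCKSZ = 8192  # PostgreSQL's default block size

# Synthetic data (an assumption for illustration): 16 blocks that all
# carry the same incompressible payload.
rng = random.Random(0)
payload = bytes(rng.getrandbits(8) for _ in range(BLCKSZ))
blocks = [payload] * 16

# Option A: compress every block on its own.
per_block = sum(len(zlib.compress(b)) for b in blocks)

# Option B: compress the concatenated stream in one pass, so that
# redundancy across blocks can be exploited.
whole_stream = len(zlib.compress(b"".join(blocks)))

print(per_block, whole_stream)
```

Per-block compression cannot see that the blocks repeat, and it pays the compressor's framing overhead once per block; the single-pass variant avoids both.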

>
>> My main concern here is about how postgres can remember that a
>> relfilenode has been deleted, in order to send the appropriate "deletion
>> tag".
>
> You also need to handle truncation.

Yes, of course. The current backup profile contains the file size, and
it can be used to truncate the file to the right size.
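
As a rough sketch of what I mean, assuming the profile has already been parsed into a mapping from relative path to size in bytes (the function name and the mapping are hypothetical, not the actual profile format):

```python
import os

def truncate_to_profile(data_dir, profile):
    """Shrink files that are longer than the size recorded in the profile.

    `profile` is a hypothetical mapping {relative_path: size_in_bytes}.
    Returns the list of relative paths that were truncated.
    """
    truncated = []
    for rel_path, size in profile.items():
        path = os.path.join(data_dir, rel_path)
        # Only touch files that exist and are longer than recorded.
        if os.path.exists(path) and os.path.getsize(path) > size:
            os.truncate(path, size)
            truncated.append(rel_path)
    return truncated
```

Files that are already at or below the recorded size are left alone, so running it twice is harmless.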

>> IMHO the easiest way is to send the full list of files along with the
>> backup and leave to the client the task of deleting unneeded files. The
>> backup profile has this purpose.
>>
>> Moreover, I do not like the idea of using only a stream of blocks as the
>> actual differential backup, for the following reasons:
>>
>> * AFAIK, with the current infrastructure, you cannot do a backup with a
>> block stream only. To have a valid backup you need many files for which
>> the concept of LSN doesn't apply.
>>
>> * I don't like to have all the data from the various
>> tablespace/db/whatever all mixed in the same stream. I'd prefer to have
>> the blocks saved on a per file basis.
>
> OK, that makes sense. But you still only need the file list when
> sending a differential backup, not when sending a full backup. So
> maybe a differential backup looks like this:
>
> - Ship a table-of-contents file with a list of relation files currently
> present and the length of each in blocks.

Having the size in bytes allows you to use the same format for non-block
files. Am I missing any advantage of having the size in blocks over
having the size in bytes?
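
To illustrate: with sizes in bytes, the length in blocks can still be derived for block-structured files, while other files simply keep their byte size. A small sketch, assuming the default 8kB block size; the paths and sizes below are made up for the example:

```python
BLCKSZ = 8192  # PostgreSQL's default block size

def size_in_blocks(size_in_bytes):
    """Return the length in blocks, or None if the size is not block-aligned."""
    blocks, remainder = divmod(size_in_bytes, BLCKSZ)
    return blocks if remainder == 0 else None

# Hypothetical profile entries: path -> size in bytes.
profile = {
    "base/16384/16385": 3 * BLCKSZ,  # relation file, block-structured
    "backup_label": 210,             # plain file, no block structure
}

for path, size in profile.items():
    print(path, size, size_in_blocks(size))
```

Note that a non-block file may happen to have a block-aligned size, so the conversion only makes sense for files already known to be relation files.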

Regards,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco(dot)nenciarini(at)2ndQuadrant(dot)it | www.2ndQuadrant.it
