Re: pg_dump directory archive format / parallel pg_dump

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Joachim Wieland <joe(at)mcknight(dot)de>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-01-19 12:47:12
Message-ID: 4D36DD50.7090804@enterprisedb.com
Lists: pgsql-hackers

On 19.01.2011 07:45, Joachim Wieland wrote:
> On Mon, Jan 17, 2011 at 5:38 PM, Jaime Casanova<jaime(at)2ndquadrant(dot)com> wrote:
>> This one is the last version of this patch? if so, commitfest app
>> should be updated to reflect that
>
> Here are the latest patches all of them also rebased to current HEAD.
> Will update the commitfest app as well.

What's the idea of storing the file sizes in the toc file? It looks like
it's not used for anything.

It would be nice to have this format match the tar format. At the
moment, there are a few cosmetic differences:

* TOC file is called "TOC", instead of "toc.dat"

* blobs TOC file is called "BLOBS.TOC" instead of "blobs.toc"

* each blob is stored as "blobs/<oid>.dat", instead of "blob_<oid>.dat"
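
To make the three naming differences concrete, here is a minimal sketch of
the renaming, written in Python purely for illustration. It assumes the
first name in each item above is the directory-format name and the second
is the tar-format name; tar_name_for is a hypothetical helper, not anything
in the patch.

```python
import re

# Directory-format name -> tar-format name, taken from the list above.
_FIXED_NAMES = {"TOC": "toc.dat", "BLOBS.TOC": "blobs.toc"}

# Blobs live in a "blobs/" subdirectory in the directory format.
_BLOB_RE = re.compile(r"^blobs/(\d+)\.dat$")


def tar_name_for(dir_name):
    """Return the tar-format name for a directory-format file name.

    Hypothetical helper: names that are not covered by the three
    differences listed above are returned unchanged.
    """
    if dir_name in _FIXED_NAMES:
        return _FIXED_NAMES[dir_name]
    match = _BLOB_RE.match(dir_name)
    if match:
        return "blob_" + match.group(1) + ".dat"
    return dir_name
```

Any file name that is not one of the three cases passes through untouched,
which is why these differences are only cosmetic.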

The only significant difference is that in the directory archive format,
each data file has a header at the beginning.

What are the benefits of the data file header? Would it be better to
leave it out, so that the format would be identical to the tar format?
You could then just tar up the directory to get a tar archive, or vice
versa.
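
As a rough illustration of "just tar up the directory", here is a minimal
Python sketch. It assumes the data files carry no extra header and that the
file names already match the tar format; tar_up_directory is a hypothetical
name, and the sketch does not address member ordering or anything else the
tar-format reader in pg_restore might expect.

```python
import os
import tarfile


def tar_up_directory(dump_dir, tar_path):
    """Pack every regular file directly inside dump_dir into a plain tar.

    Members are stored under their bare file names (no directory prefix).
    Returns the list of member names in the order they were added.
    """
    names = sorted(
        name
        for name in os.listdir(dump_dir)
        if os.path.isfile(os.path.join(dump_dir, name))
    )
    # Mode "w" writes an uncompressed tar archive.
    with tarfile.open(tar_path, "w") as tar:
        for name in names:
            tar.add(os.path.join(dump_dir, name), arcname=name)
    return names
```

The reverse direction would be an ordinary extraction of the tar archive
into an empty directory.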

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
