Re: directory archive format for pg_dump

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Joachim Wieland <joe(at)mcknight(dot)de>, Greg Smith <greg(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, José Arthur Benetasso Villanova <jose(dot)arthur(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: directory archive format for pg_dump
Date: 2010-12-16 18:04:54
Message-ID: 4D0A54C6.3090502@enterprisedb.com
Lists: pgsql-hackers

On 16.12.2010 19:58, Robert Haas wrote:
> On Thu, Dec 16, 2010 at 12:48 PM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> One more thing: the motivation behind this patch is to allow parallel
>> pg_dump in the future, so we should make sure this patch caters well for
>> that.
>>
>> As soon as we have parallel pg_dump, the next big thing is going to be
>> parallel dump of the same table using multiple processes. Perhaps we should
>> prepare for that in the directory archive format, by allowing the data of a
>> single table to be split into multiple files. That way parallel pg_dump is
>> simple: you just split the table into chunks of roughly the same size, say
>> 10GB each, and launch a process for each chunk, writing to a separate file.
>>
>> It should be quite a simple add-on to the current patch, but will make life
>> so much easier for parallel pg_dump. It would also be helpful to work around
>> file size limitations on some filesystems.
>
> Sounds reasonable. Are you planning to do this and commit?

I'll defer to Joachim, assuming he has the time & energy.
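
To make the chunking idea concrete, here is a minimal sketch of how the split could be planned. Nothing in it comes from the patch: splitting by block range, the 8192-byte block size, the function name plan_chunks, and the "<table_id>.<n>.dat" file naming are all assumptions made for illustration. Each planned chunk would then be handed to its own worker process, which writes its own file.

```python
from dataclasses import dataclass

BLOCK_SIZE = 8192  # assumed block size in bytes (the PostgreSQL default)


@dataclass
class Chunk:
    index: int
    first_block: int
    last_block: int  # inclusive
    filename: str    # hypothetical naming scheme, not from the patch


def plan_chunks(table_id, total_blocks, target_bytes=10 * 1024**3):
    """Split a table of total_blocks blocks into block ranges of about
    target_bytes each; only the last chunk may be smaller."""
    blocks_per_chunk = max(1, target_bytes // BLOCK_SIZE)
    chunks = []
    first = 0
    while first < total_blocks:
        last = min(first + blocks_per_chunk, total_blocks) - 1
        index = len(chunks)
        chunks.append(Chunk(index, first, last, f"{table_id}.{index}.dat"))
        first = last + 1
    return chunks


if __name__ == "__main__":
    # A 25-block table with a 10-block target yields three chunks.
    for chunk in plan_chunks(1234, 25, target_bytes=10 * BLOCK_SIZE):
        print(chunk)
```

Planning by block range keeps the chunks roughly equal on disk without scanning the table first, but whether a dump worker can actually be restricted to a block range is a separate question that this sketch does not address.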

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
