pg_dump directory archive format / parallel pg_dump

From: Joachim Wieland <joe(at)mcknight(dot)de>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: pg_dump directory archive format / parallel pg_dump
Date: 2011-01-07 20:18:26
Message-ID: AANLkTint+8edwviyF5HBExYsg0cjeQ3oU_of41WG+z7w@mail.gmail.com
Lists: pgsql-hackers

Here's a new series of patches for the parallel dump/restore. They need to be
applied on top of each other.

The parallel pg_dump patch does not yet use the synchronized snapshot
functionality from my other patch, so as not to create more dependencies
than necessary.

(1) pg_dump directory archive format (without checks as requested by Heikki)
(2) parallel pg_dump
(3) checks for the directory archive format

Joachim

Attachment Content-Type Size
pg_dump-directory.diff.gz application/x-gzip 10.6 KB
pg_dump-directory-parallel.diff.gz application/x-gzip 27.2 KB
pg_dump-directory-parallel-checks.diff.gz application/x-gzip 7.4 KB

From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Joachim Wieland <joe(at)mcknight(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-01-17 22:38:10
Message-ID: AANLkTi=KQkzr4RD7TGwWQRwSYrcZGXEqi9sU3LCZFh=N@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jan 7, 2011 at 3:18 PM, Joachim Wieland <joe(at)mcknight(dot)de> wrote:
> Here's a new series of patches for the parallel dump/restore. They need to be
> applied on top of each other.
>

Is this the latest version of this patch? If so, the commitfest app
should be updated to reflect that

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: PostgreSQL support and training


From: Joachim Wieland <joe(at)mcknight(dot)de>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-01-19 05:45:00
Message-ID: AANLkTikDUdoSfNChk7D5WJFV=99wjsE5s-aK0O=nskWQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jan 17, 2011 at 5:38 PM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
> Is this the latest version of this patch? If so, the commitfest app
> should be updated to reflect that

Here are the latest patches, all of them rebased to current HEAD.
Will update the commitfest app as well.

Joachim

Attachment Content-Type Size
pg_dump-directory.diff.gz application/x-gzip 10.5 KB
pg_dump-directory-parallel.diff.gz application/x-gzip 28.6 KB
pg_dump-directory-parallel-checks.diff.gz application/x-gzip 7.3 KB

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Joachim Wieland <joe(at)mcknight(dot)de>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-01-19 12:47:12
Message-ID: 4D36DD50.7090804@enterprisedb.com
Lists: pgsql-hackers

On 19.01.2011 07:45, Joachim Wieland wrote:
> On Mon, Jan 17, 2011 at 5:38 PM, Jaime Casanova<jaime(at)2ndquadrant(dot)com> wrote:
>> Is this the latest version of this patch? If so, the commitfest app
>> should be updated to reflect that
>
> Here are the latest patches, all of them rebased to current HEAD.
> Will update the commitfest app as well.

What's the idea of storing the file sizes in the toc file? It looks like
it's not used for anything.

It would be nice to have this format match the tar format. At the
moment, there's a couple of cosmetic differences:

* TOC file is called "TOC", instead of "toc.dat"

* blobs TOC file is called "BLOBS.TOC" instead of "blobs.toc"

* each blob is stored as "blobs/<oid>.dat", instead of "blob_<oid>.dat"

The only significant difference is that in the directory archive format,
each data file has a header in the beginning.

What are the benefits of the data file header? Would it be better to
leave it out, so that the format would be identical to the tar format?
You could then just tar up the directory to get a tar archive, or vice
versa.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Joachim Wieland <joe(at)mcknight(dot)de>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-01-19 14:01:46
Message-ID: AANLkTikqrGJ0zq9Vw34-4T+70EeRpVOZt=cwAe-mzDX-@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 19, 2011 at 7:47 AM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> Here are the latest patches, all of them rebased to current HEAD.
>> Will update the commitfest app as well.
>
> What's the idea of storing the file sizes in the toc file? It looks like
> it's not used for anything.

It's part of the overall idea to make sure files are not inadvertently
exchanged between different backups and that a file is not truncated.
In the future I'd also like to add a checksum to the TOC so that a
backup can be checked for integrity. This will cost performance but
with the parallel backup it can be distributed to several processors.
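
Just to illustrate the idea (a sketch only, nothing of this is in the
patch yet), such a per-file check could be as simple as a CRC32
computed with zlib:

#include <stdio.h>
#include <zlib.h>

/* Illustrative sketch: checksum one data file so the result could be
 * compared against a value stored in the TOC. */
static uLong
file_crc32(const char *path)
{
    unsigned char buf[65536];
    size_t      n;
    uLong       crc = crc32(0L, Z_NULL, 0);
    FILE       *fp = fopen(path, "rb");

    if (fp == NULL)
        return 0;               /* caller would treat this as an error */
    while ((n = fread(buf, 1, sizeof(buf), fp)) > 0)
        crc = crc32(crc, buf, (uInt) n);
    fclose(fp);
    return crc;
}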

> It would be nice to have this format match the tar format. At the moment,
> there's a couple of cosmetic differences:
>
> * TOC file is called "TOC", instead of "toc.dat"
>
> * blobs TOC file is called "BLOBS.TOC" instead of "blobs.toc"
>
> * each blob is stored as "blobs/<oid>.dat", instead of "blob_<oid>.dat"

That can be done easily...

> The only significant difference is that in the directory archive format,
> each data file has a header in the beginning.

> What are the benefits of the data file header? Would it be better to leave
> it out, so that the format would be identical to the tar format? You could
> then just tar up the directory to get a tar archive, or vice versa.

The header is there to identify a file; it contains the same header that
every other pg_dump file contains, including the internal version
number and the unique backup id.

The tar format doesn't support compression, so going from one to the
other would only work for an uncompressed archive, and special care
must be taken to get the order of the files in the tar archive right.

If you want to drop the header altogether, that's fine with me, but if
it's just for the tar <-> directory conversion, then I fail to see
what the use case of that would be.

A tar archive has the advantage that you can postprocess the dump data
with other tools but for this we could also add an option that gives
you only the data part of a dump file (and uncompresses it at the same
time if compressed). Once we have that however, the question is what
anybody would then still want to use the tar format for...

Joachim


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Joachim Wieland <joe(at)mcknight(dot)de>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-01-20 11:07:32
Message-ID: 4D381774.8010407@enterprisedb.com
Lists: pgsql-hackers

On 19.01.2011 16:01, Joachim Wieland wrote:
> On Wed, Jan 19, 2011 at 7:47 AM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>>> Here are the latest patches, all of them rebased to current HEAD.
>>> Will update the commitfest app as well.
>>
>> What's the idea of storing the file sizes in the toc file? It looks like
>> it's not used for anything.
>
> It's part of the overall idea to make sure files are not inadvertently
> exchanged between different backups and that a file is not truncated.
> In the future I'd also like to add a checksum to the TOC so that a
> backup can be checked for integrity. This will cost performance but
> with the parallel backup it can be distributed to several processors.

Ok. I'm going to leave out the filesize. I can see some value in that,
and the CRC, but I don't want to add stuff that's not used at this point.

>> It would be nice to have this format match the tar format. At the moment,
>> there's a couple of cosmetic differences:
>>
>> * TOC file is called "TOC", instead of "toc.dat"
>>
>> * blobs TOC file is called "BLOBS.TOC" instead of "blobs.toc"
>>
>> * each blob is stored as "blobs/<oid>.dat", instead of "blob_<oid>.dat"
>
> That can be done easily...
>
>> The only significant difference is that in the directory archive format,
>> each data file has a header in the beginning.
>
>> What are the benefits of the data file header? Would it be better to leave
>> it out, so that the format would be identical to the tar format? You could
>> then just tar up the directory to get a tar archive, or vice versa.
>
> The header is there to identify a file; it contains the same header that
> every other pg_dump file contains, including the internal version
> number and the unique backup id.
>
> The tar format doesn't support compression, so going from one to the
> other would only work for an uncompressed archive, and special care
> must be taken to get the order of the files in the tar archive right.

Hmm, tar format doesn't support compression, but looks like the file
format issue has been thought of already: there's still code there to
add .gz suffix for compressed files. How about adopting that convention
in the directory format too? That would make an uncompressed directory
format compatible with the tar format.

That seems pretty attractive anyway, because you can then dump to a
directory, and manually gzip the data files later.

Now that we have an API for compression in compress_io.c, it probably
wouldn't be very hard to implement the missing compression support to
tar format either.

> If you want to drop the header altogether, that's fine with me, but if
> it's just for the tar <-> directory conversion, then I fail to see
> what the use case of that would be.
>
> A tar archive has the advantage that you can postprocess the dump data
> with other tools but for this we could also add an option that gives
> you only the data part of a dump file (and uncompresses it at the same
> time if compressed). Once we have that however, the question is what
> anybody would then still want to use the tar format for...

I don't know how popular it'll be in practice, but it seems very nice to
me if you can do things like parallel pg_dump in directory format first,
and then tar it up to a file for archival.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Joachim Wieland <joe(at)mcknight(dot)de>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-01-20 13:46:28
Message-ID: AANLkTi=JnfTh5STJiSsJUiQBcF-4T_qXmKZpxgwx4yVg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 20, 2011 at 6:07 AM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> It's part of the overall idea to make sure files are not inadvertently
>> exchanged between different backups and that a file is not truncated.
>> In the future I'd also like to add a checksum to the TOC so that a
>> backup can be checked for integrity. This will cost performance but
>> with the parallel backup it can be distributed to several processors.
>
> Ok. I'm going to leave out the filesize. I can see some value in that, and
> the CRC, but I don't want to add stuff that's not used at this point.

Okay.

>> The header is there to identify a file; it contains the same header that
>> every other pg_dump file contains, including the internal version
>> number and the unique backup id.
>>
>> The tar format doesn't support compression, so going from one to the
>> other would only work for an uncompressed archive, and special care
>> must be taken to get the order of the files in the tar archive right.
>
> Hmm, tar format doesn't support compression, but looks like the file format
> issue has been thought of already: there's still code there to add .gz
> suffix for compressed files. How about adopting that convention in the
> directory format too? That would make an uncompressed directory format
> compatible with the tar format.

So what you could do is dump in the tar format, untar it and restore in
the directory format. I see that this sounds nice, but I am still not
sure why someone would dump to the tar format in the first place.

But you still cannot go back from the directory archive to the tar
archive because the standard command line tar will not respect the
order of the objects that pg_restore expects in a tar format, right?

> That seems pretty attractive anyway, because you can then dump to a
> directory, and manually gzip the data files later.

The command line gzip will probably add its own header to the file
that pg_restore would need to strip off...

This is a valid use case for people who are concerned with a fast
dump; usually they would dump uncompressed and compress the archive
later. However, once we have parallel pg_dump, this advantage
vanishes.

> Now that we have an API for compression in compress_io.c, it probably
> wouldn't be very hard to implement the missing compression support to tar
> format either.

True, but the question of what advantage the tar format still offers remains :-)

>> A tar archive has the advantage that you can postprocess the dump data
>> with other tools  but for this we could also add an option that gives
>> you only the data part of a dump file (and uncompresses it at the same
>> time if compressed). Once we have that however, the question is what
>> anybody would then still want to use the tar format for...
>
> I don't know how popular it'll be in practice, but it seems very nice to me
> if you can do things like parallel pg_dump in directory format first, and
> then tar it up to a file for archival.

Yes, but you cannot pg_restore the archive then if it was created with
standard tar, right?

Joachim


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Joachim Wieland <joe(at)mcknight(dot)de>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-01-20 15:22:01
Message-ID: 4D385319.1060005@enterprisedb.com
Lists: pgsql-hackers

On 20.01.2011 15:46, Joachim Wieland wrote:
> On Thu, Jan 20, 2011 at 6:07 AM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>>> The header is there to identify a file; it contains the same header that
>>> every other pg_dump file contains, including the internal version
>>> number and the unique backup id.
>>>
>>> The tar format doesn't support compression, so going from one to the
>>> other would only work for an uncompressed archive, and special care
>>> must be taken to get the order of the files in the tar archive right.
>>
>> Hmm, tar format doesn't support compression, but looks like the file format
>> issue has been thought of already: there's still code there to add .gz
>> suffix for compressed files. How about adopting that convention in the
>> directory format too? That would make an uncompressed directory format
>> compatible with the tar format.
>
> So what you could do is dump in the tar format, untar it and restore in
> the directory format. I see that this sounds nice, but I am still not
> sure why someone would dump to the tar format in the first place.

I'm not sure either. Maybe you want to pipe the output of "pg_dump -F t"
via an ssh tunnel to another host, where you untar it, producing a
directory format dump. You can then edit the directory format dump, and
restore it back to the database without having to tar it again.

It gives you a lot of flexibility if the formats are compatible, which
is generally good.

> But you still cannot go back from the directory archive to the tar
> archive because the standard command line tar will not respect the
> order of the objects that pg_restore expects in a tar format, right?

Hmm, I didn't realize pg_restore requires the files to be in a certain
order in the tar file. There's no mention of that in the docs either; we
should add that. It doesn't actually require that if you read from a
file, but from stdin it does.

You can put files in the archive in a certain order if you list them
explicitly in the tar command line, like "tar cf backup.tar toc.dat
...". It's hard to know the right order, though. In practice you would
need to do "tar tf backup.tar >files" before untarring, and use "files"
to tar them again in the right order.

>> That seems pretty attractive anyway, because you can then dump to a
>> directory, and manually gzip the data files later.
>
> The command line gzip will probably add its own header to the file
> that pg_restore would need to strip off...

Yeah, we should write the header too. That's not hard; e.g. gzopen will
do that automatically, or you can pass a flag to deflateInit2.
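
For illustration (a sketch, not the patch's code), both routes give
you a standard gzip header:

#include <string.h>
#include <zlib.h>

/* (1) gzopen()/gzwrite() emit the gzip header and trailer for you. */
static void
write_gz_file(const char *path, const void *buf, unsigned len)
{
    gzFile      fp = gzopen(path, "wb");

    if (fp != NULL)
    {
        gzwrite(fp, buf, len);
        gzclose(fp);
    }
}

/* (2) With the raw deflate API, windowBits = 15 + 16 in deflateInit2()
 * tells zlib to use the gzip format instead of the zlib format. */
static int
init_gzip_stream(z_stream *zs)
{
    memset(zs, 0, sizeof(*zs));
    return deflateInit2(zs, Z_DEFAULT_COMPRESSION, Z_DEFLATED,
                        15 + 16,        /* 15-bit window, +16 = gzip wrapper */
                        8, Z_DEFAULT_STRATEGY);
}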

>>> A tar archive has the advantage that you can postprocess the dump data
>>> with other tools but for this we could also add an option that gives
>>> you only the data part of a dump file (and uncompresses it at the same
>>> time if compressed). Once we have that however, the question is what
>>> anybody would then still want to use the tar format for...
>>
>> I don't know how popular it'll be in practice, but it seems very nice to me
>> if you can do things like parallel pg_dump in directory format first, and
>> then tar it up to a file for archival.
>
> Yes, but you cannot pg_restore the archive then if it was created with
> standard tar, right?

See above: you can, unless you try to pipe it to pg_restore. In fact,
that's listed as an advantage of the tar format over other formats in
the pg_dump documentation.

(I'm working on this, no need to submit a new patch)

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Florian Pflug <fgp(at)phlo(dot)org>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Joachim Wieland <joe(at)mcknight(dot)de>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-01-20 15:34:16
Message-ID: 82D5BF05-4FB0-437F-8025-8697538CBE92@phlo.org
Lists: pgsql-hackers

On Jan20, 2011, at 16:22 , Heikki Linnakangas wrote:
> You can put files in the archive in a certain order if you list them explicitly in the tar command line, like "tar cf backup.tar toc.dat ...". It's hard to know the right order, though. In practice you would need to do "tar tf backup.tar >files" before untarring, and use "files" to tar them again in the right order.

Hm, could we create a file in the backup directory which lists the files in the right order?

best regards,
Florian Pflug


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Joachim Wieland <joe(at)mcknight(dot)de>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-01-21 09:41:27
Message-ID: 4D3954C7.9060503@enterprisedb.com
Lists: pgsql-hackers

On 20.01.2011 17:22, Heikki Linnakangas wrote:
> (I'm working on this, no need to submit a new patch)

Ok, here's a heavily refactored version of this (also available at
git://git.postgresql.org/git/users/heikki/postgres.git, branch
pg_dump_directory). The directory format is now identical to the tar
format, except that in the directory format the files can be compressed.
Also we don't write the restore.sql file - it would be nice to have, but
pg_restore doesn't require it. We can leave that as a TODO.

I ended up writing another compression abstraction layer in
compress_io.c. It wraps fopen / gzopen etc. in a common API, so that the
caller doesn't need to care if the file is compressed or not. In
hindsight, the compression API we put in earlier didn't suit us very
well. But I guess it wasn't a complete waste, as it moved the gory
details of zlib out of the custom format code.
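
To sketch the idea (the names here are made up, this is not the
actual compress_io.c interface): one handle type that hides whether
the underlying file is gzipped.

#include <stdio.h>
#include <zlib.h>

typedef struct
{
    FILE       *uncompressed;   /* used when compression is off */
    gzFile      compressed;     /* used when compression is on */
} cfp;

static cfp
cfopen_write(const char *path, int compression)
{
    cfp         fp = {NULL, NULL};

    if (compression > 0)
        fp.compressed = gzopen(path, "wb");
    else
        fp.uncompressed = fopen(path, "wb");
    return fp;
}

static int
cfwrite(const void *buf, int size, cfp *fp)
{
    if (fp->compressed)
        return gzwrite(fp->compressed, buf, size);
    return (int) fwrite(buf, 1, size, fp->uncompressed);
}

static void
cfclose(cfp *fp)
{
    if (fp->compressed)
        gzclose(fp->compressed);
    else if (fp->uncompressed)
        fclose(fp->uncompressed);
}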

If compression is used, the files are created with the .gz suffix, and
include the gzip header so that you can manipulate them easily with
gzip/gunzip utilities. When reading, we accept files with or without the
.gz suffix, and you can have some files compressed and others uncompressed.
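
The reading rule then looks roughly like this (again an illustrative
sketch, not the committed code). gzread() passes non-gzip data through
unchanged, so one read path can serve both kinds of file:

#include <stdio.h>
#include <zlib.h>

/* Open a data file whether or not it carries a .gz suffix. */
static gzFile
cfopen_read(const char *path)
{
    char        gzpath[4096];
    gzFile      fp = gzopen(path, "rb");

    if (fp == NULL)
    {
        snprintf(gzpath, sizeof(gzpath), "%s.gz", path);
        fp = gzopen(gzpath, "rb");
    }
    return fp;
}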

I haven't updated the documentation yet.

There's one UI thing that bothers me. The option to specify the target
directory is called --file. But it's clearly not a file. OTOH, I'd hate
to introduce a parallel --dir option just for this. Any thoughts on this?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Attachment Content-Type Size
pg_dump_directory-2.patch text/x-diff 39.4 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Joachim Wieland <joe(at)mcknight(dot)de>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-01-21 13:35:37
Message-ID: AANLkTi=pFPJO8E5EkdoJET9jc+yq5c2d3VDqf2d9e6Qt@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jan 21, 2011 at 4:41 AM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> There's one UI thing that bothers me. The option to specify the target
> directory is called --file. But it's clearly not a file. OTOH, I'd hate to
> introduce a parallel --dir option just for this. Any thoughts on this?

If we were starting over, I'd probably suggest calling the option -o,
--output. But since -o is already taken (for --oids) I'd be inclined
to just make the help text read:

-f, --file=FILENAME output file (or directory) name
-F, --format=c|t|p|d output file format (custom, tar, text, dir)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Joachim Wieland <joe(at)mcknight(dot)de>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-01-21 15:34:42
Message-ID: 4D39A792.80306@enterprisedb.com
Lists: pgsql-hackers

On 21.01.2011 15:35, Robert Haas wrote:
> On Fri, Jan 21, 2011 at 4:41 AM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> There's one UI thing that bothers me. The option to specify the target
>> directory is called --file. But it's clearly not a file. OTOH, I'd hate to
>> introduce a parallel --dir option just for this. Any thoughts on this?
>
> If we were starting over, I'd probably suggest calling the option -o,
> --output. But since -o is already taken (for --oids) I'd be inclined
> to just make the help text read:
>
> -f, --file=FILENAME output file (or directory) name
> -F, --format=c|t|p|d output file format (custom, tar, text, dir)

Ok, that's exactly what the patch does now. I guess it's fine then.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-01-21 15:47:07
Message-ID: 4D39AA7B.6050001@dunslane.net
Lists: pgsql-hackers

On 01/21/2011 10:34 AM, Heikki Linnakangas wrote:
> On 21.01.2011 15:35, Robert Haas wrote:
>> On Fri, Jan 21, 2011 at 4:41 AM, Heikki Linnakangas
>> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>>> There's one UI thing that bothers me. The option to specify the target
>>> directory is called --file. But it's clearly not a file. OTOH, I'd
>>> hate to
>>> introduce a parallel --dir option just for this. Any thoughts on this?
>>
>> If we were starting over, I'd probably suggest calling the option -o,
>> --output. But since -o is already taken (for --oids) I'd be inclined
>> to just make the help text read:
>>
>> -f, --file=FILENAME output file (or directory) name
>> -F, --format=c|t|p|d output file format (custom, tar, text,
>> dir)
>
> Ok, that's exactly what the patch does now. I guess it's fine then.
>

Maybe we could change the hint to say "--file=DESTINATION" or
"--file=FILENAME|DIRNAME" ?

Just a thought.

cheers

andrew


From: Euler Taveira de Oliveira <euler(at)timbira(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-01-21 17:11:14
Message-ID: 4D39BE32.2080801@timbira.com
Lists: pgsql-hackers

On 21-01-2011 12:47, Andrew Dunstan wrote:
> Maybe we could change the hint to say "--file=DESTINATION" or
> "--file=FILENAME|DIRNAME" ?
>
... "--file=OUTPUT" or "--file=OUTPUTNAME".

--
Euler Taveira de Oliveira
http://www.timbira.com/


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Euler Taveira de Oliveira <euler(at)timbira(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers(at)postgresql(dot)org, Joachim Wieland <joe(at)mcknight(dot)de>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-01-23 21:20:07
Message-ID: 4D3C9B87.1040208@enterprisedb.com
Lists: pgsql-hackers

On 21.01.2011 19:11, Euler Taveira de Oliveira wrote:
> On 21-01-2011 12:47, Andrew Dunstan wrote:
>> Maybe we could change the hint to say "--file=DESTINATION" or
>> "--file=FILENAME|DIRNAME" ?
>>
> ... "--file=OUTPUT" or "--file=OUTPUTNAME".

Ok, works for me.

I've committed this patch now, with a whole bunch of further fixes.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Joachim Wieland <joe(at)mcknight(dot)de>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-01-30 22:26:12
Message-ID: AANLkTikQni7ya7zpkU1YCuW8+AjC1GOtitbHwsvN3imv@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 19, 2011 at 12:45 AM, Joachim Wieland <joe(at)mcknight(dot)de> wrote:
> On Mon, Jan 17, 2011 at 5:38 PM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>> Is this the latest version of this patch? If so, the commitfest app
>> should be updated to reflect that
>
> Here are the latest patches, all of them rebased to current HEAD.
> Will update the commitfest app as well.

The parallel pg_dump portion of this patch (i.e. the still-uncommitted
part) no longer applies. Please rebase.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Joachim Wieland <joe(at)mcknight(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-02-02 04:32:22
Message-ID: AANLkTi=rLALqouUcibs5+R-kjkW+FJxXujZF4Ow2nu3X@mail.gmail.com
Lists: pgsql-hackers

On Sun, Jan 30, 2011 at 5:26 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> The parallel pg_dump portion of this patch (i.e. the still-uncommitted
> part) no longer applies.  Please rebase.

Here is a rebased version with some minor changes as well. I haven't
tested it on Windows yet but will do so as soon as the Unix part has
been reviewed.

Joachim

Attachment Content-Type Size
parallel_pg_dump.patch.gz application/x-gzip 29.6 KB

From: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>
To: Joachim Wieland <joe(at)mcknight(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-02-04 04:46:12
Message-ID: AANLkTin2ce85MeFLT558rrSCVjTDATDuVJrf1XJA-2+2@mail.gmail.com
Lists: pgsql-hackers

On Wed, Feb 2, 2011 at 13:32, Joachim Wieland <joe(at)mcknight(dot)de> wrote:
> Here is a rebased version with some minor changes as well.

I read the patch as working as below. Am I understanding it correctly?
1. Open all connections in a parent process.
2. Start transactions for each connection in the parent.
3. Spawn child processes with fork().
4. Each child process uses one of the inherited connections.

I think we have 2 important technical issues here:
* The consistency is not perfect. Each transaction is started
with small delays in step 1, but we cannot guarantee that there is
no other transaction between them.
* Can we inherit connections to child processes with fork()?
Moreover, we also need to pass the running transactions to the children.
I wonder whether libpq is designed for such usage.

To solve both issues, we might want a way to control visibility
in the database server instead of in the client programs. Don't we need
server-side support like [1] before developing parallel dump?
[1] http://wiki.postgresql.org/wiki/ClusterFeatures#Export_snapshots_to_other_sessions

> I haven't
> tested it on Windows yet but will do so as soon as the Unix part has
> been reviewed.

It might be better to remove the Windows-specific code from the first try.
I doubt the Windows message queue is the best API in such a console-based
application. I hope we can use the same implementation on all
platforms for inter-process/thread communication.

--
Itagaki Takahiro


From: Joachim Wieland <joe(at)mcknight(dot)de>
To: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-02-05 03:50:16
Message-ID: AANLkTimvk1PLbpj63QVSq7zhQyT80ssDVg2TOn577oDR@mail.gmail.com
Lists: pgsql-hackers

On Thu, Feb 3, 2011 at 11:46 PM, Itagaki Takahiro
<itagaki(dot)takahiro(at)gmail(dot)com> wrote:
> I think we have 2 important technical issues here:
>  * The consistency is not perfect. Each transaction is started
>   with small delays in step 1, but we cannot guarantee that there is
>   no other transaction between them.

This is exactly where the patch for synchronized snapshots comes into
play. See https://commitfest.postgresql.org/action/patch_view?id=480

>  * Can we inherit connections to child processes with fork()?
>   Moreover, we also need to pass the running transactions to the children.
>   I wonder whether libpq is designed for such usage.

As far as I know you can inherit sockets to a child process, as long
as you make sure that after the fork only one of them, parent or child,
uses the socket; the other one should close it. But this wouldn't be an
issue with the above-mentioned patch anyway.
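
Just to illustrate the point (a minimal sketch under the assumption
that exactly one process touches the connection after the fork; this
is not code from the patch):

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn     *conn = PQconnectdb("dbname=postgres");
    pid_t       pid;

    if (PQstatus(conn) != CONNECTION_OK)
        return 1;

    pid = fork();
    if (pid == 0)
    {
        /* child: sole user of the inherited connection */
        PGresult   *res = PQexec(conn, "SELECT count(*) FROM pg_class");

        if (res && PQresultStatus(res) == PGRES_TUPLES_OK)
            printf("child sees %s entries in pg_class\n", PQgetvalue(res, 0, 0));
        PQclear(res);
        PQfinish(conn);
        exit(0);
    }

    /* parent: never touches conn again (not even PQfinish(), which would
     * send a termination message over the socket the child is using) */
    waitpid(pid, NULL, 0);
    return 0;
}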

> It might be better to remove the Windows-specific code from the first try.
> I doubt the Windows message queue is the best API in such a console-based
> application. I hope we can use the same implementation on all
> platforms for inter-process/thread communication.

Windows doesn't support pipes, but offers message queues to
exchange messages. Parallel pg_dump only exchanges messages in the
form of "DUMP 39209" or "RESTORE OK 48 23 93"; it doesn't exchange any
large chunks of binary data, just these small textual messages. The
messages also stay within the same process; they are just sent between
the different threads. The Windows part worked just fine when I tested
it last time. Do you have any other technology in mind that you think
is better suited?
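
To illustrate (not the patch's actual worker code): on the Unix side,
where the patch uses pipes, such an exchange is just short text lines
over a pair of pipe descriptors.

#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* The master sends a short text command to a worker and reads a short
 * text status line back; the message formats follow the examples above. */
static void
dispatch_to_worker(int to_worker, int from_worker, int dump_id)
{
    char        cmd[64];
    char        status[128];
    ssize_t     n;

    snprintf(cmd, sizeof(cmd), "DUMP %d\n", dump_id);
    if (write(to_worker, cmd, strlen(cmd)) < 0)
        return;

    n = read(from_worker, status, sizeof(status) - 1);
    if (n > 0)
    {
        status[n] = '\0';
        printf("worker replied: %s", status);
    }
}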

Joachim


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Joachim Wieland <joe(at)mcknight(dot)de>
Cc: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-02-05 06:47:06
Message-ID: AANLkTi=gCG2H1Qup3bJ_OJLuEYcyi5_RxYxcnWsCXbPc@mail.gmail.com
Lists: pgsql-hackers

On Sat, Feb 5, 2011 at 04:50, Joachim Wieland <joe(at)mcknight(dot)de> wrote:
> On Thu, Feb 3, 2011 at 11:46 PM, Itagaki Takahiro
> <itagaki(dot)takahiro(at)gmail(dot)com> wrote:
>> It might be better to remove the Windows-specific code from the first try.
>> I doubt the Windows message queue is the best API in such a console-based
>> application. I hope we can use the same implementation on all
>> platforms for inter-process/thread communication.
>
> Windows doesn't support pipes, but offers message queues to
> exchange messages. Parallel pg_dump only exchanges messages in the
> form of "DUMP 39209" or "RESTORE OK 48 23 93"; it doesn't exchange any
> large chunks of binary data, just these small textual messages. The
> messages also stay within the same process; they are just sent between
> the different threads. The Windows part worked just fine when I tested
> it last time. Do you have any other technology in mind that you think
> is better suited?

I haven't been following this thread in detail or read the code, but
our /port directory contains a pipe() implementation for Windows
that's used for the syslogger at least. Look in the code for pgpipe().
If using that one works, then that should probably be used rather than
something completely custom.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Joachim Wieland <joe(at)mcknight(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-02-06 19:12:22
Message-ID: AANLkTimYX5h1UVZABr5qWbUE7H=VeX2sb84bnVZqugoh@mail.gmail.com
Lists: pgsql-hackers

On Tue, Feb 1, 2011 at 11:32 PM, Joachim Wieland <joe(at)mcknight(dot)de> wrote:
> On Sun, Jan 30, 2011 at 5:26 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> The parallel pg_dump portion of this patch (i.e. the still-uncommitted
>> part) no longer applies.  Please rebase.
>
> Here is a rebased version with some minor changes as well. I haven't
> tested it on Windows yet but will do so as soon as the Unix part has
> been reviewed.
>

code review:

Something I found, and it is a very simple one, is this warning (there's
a similar issue in _StartMasterParallel with the buf variable):
"""
pg_backup_directory.c: In function ‘_EndMasterParallel’:
pg_backup_directory.c:856: warning: ‘status’ may be used uninitialized
in this function
"""

I guess the huge amount of info the patch is showing is just for
debugging and will be removed before commit, right?

functional review:

It works well most of the time, just a few points:
- if I interrupt the process the connections stay; I guess it could
catch the signal and finish the connections
- if I have an exclusive lock on a table and a worker starts dumping
it, it fails because it can't take the lock, but it just says "it was
ok"; I would prefer an error

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: PostgreSQL support and training


From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Joachim Wieland <joe(at)mcknight(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-02-07 06:51:56
Message-ID: AANLkTimF9jrAcyK42ALSAKk8y3EYP1VX5jk8o02vc0rQ@mail.gmail.com
Lists: pgsql-hackers

On Sun, Feb 6, 2011 at 2:12 PM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
> On Tue, Feb 1, 2011 at 11:32 PM, Joachim Wieland <joe(at)mcknight(dot)de> wrote:
>> On Sun, Jan 30, 2011 at 5:26 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> The parallel pg_dump portion of this patch (i.e. the still-uncommitted
>>> part) no longer applies.  Please rebase.
>>
>> Here is a rebased version with some minor changes as well. I haven't
>> tested it on Windows yet but will do so as soon as the Unix part has
>> been reviewed.
>>
>
> code review:
>

Ah! Two other things I forgot:

- there are no docs
- pg_dump and pg_restore are inconsistent:
pg_dump requires the directory to be provided with the -f option:
pg_dump -Fd -f dir_dump
pg_restore passes the directory as an argument for -Fd: pg_restore -Fd dir_dump

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: PostgreSQL support and training


From: Joachim Wieland <joe(at)mcknight(dot)de>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-02-08 03:42:50
Message-ID: AANLkTi=Z60kBAWGgBmpUovXbLOxSV-CcPxsOHWFoFs7u@mail.gmail.com
Lists: pgsql-hackers

Hi Jaime,

thanks for your review!

On Sun, Feb 6, 2011 at 2:12 PM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
> code review:
>
> Something I found, and it is a very simple one, is this warning (there's
> a similar issue in _StartMasterParallel with the buf variable):
> """
> pg_backup_directory.c: In function ‘_EndMasterParallel’:
> pg_backup_directory.c:856: warning: ‘status’ may be used uninitialized
> in this function
> """

Cool. My compiler didn't tell me about this.

> I guess the huge amount of info the patch is showing is just for
> debugging and will be removed before commit, right?

That's right.

> functional review:
>
> It works well most of the time, just a few points:
> - if I interrupt the process the connections stay; I guess it could
> catch the signal and finish the connections

Hm, well, recovering gracefully from errors could be improved. In
your example you would signal the children implicitly because the
parent process dies and the pipes to the children would get broken as
well. Of course the parent could terminate the children more actively,
but it might not be the best option to just kill them, as then there
will be a lot of "unexpected EOF" messages in the log. So if an
error condition comes up in the parent (as in your example, because
you canceled the process), then ideally the parent should signal the
children with a non-lethal signal, and the children should catch this
"please terminate" signal and exit cleanly but as soon as possible. If
the error case comes up in a child however, then we'd need to make
sure that the user sees the error message from the child. This should
work well as-is, but currently it could happen that the parent exits
before all of the children have exited. I'll investigate this a bit...
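
A minimal sketch of that "please terminate" idea (not from the patch):
the handler only sets a flag, and the worker's main loop checks it and
exits cleanly as soon as it can.

#include <signal.h>

static volatile sig_atomic_t wants_exit = 0;

static void
handle_terminate(int signo)
{
    (void) signo;
    wants_exit = 1;             /* async-signal-safe: just set a flag */
}

static void
setup_worker_signals(void)
{
    struct sigaction sa;

    sa.sa_handler = handle_terminate;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGTERM, &sa, NULL);
}

/*
 * The worker main loop would then look like:
 *
 *     while (!wants_exit && more_work_to_do())
 *         process_next_item();
 *     cleanup_and_exit();
 */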

> - if I have an exclusive lock on a table and a worker starts dumping
> it, it fails because it can't take the lock, but it just says "it was
> ok"; I would prefer an error

I'm getting a clear

pg_dump: [Archivierer] could not lock table public.c: ERROR: could
not obtain lock on relation "c"

but I'll look into this as well.

Regarding your other post:

> - there are no docs

True...

> - pg_dump and pg_restore are inconsistent:
> pg_dump requires the directory to be provided with the -f option:
> pg_dump -Fd -f dir_dump
> pg_restore passes the directory as an argument for -Fd: pg_restore -Fd dir_dump

Well, this is the case with pg_dump and pg_restore currently as well. -F
is the switch for the format and it just takes "d" as the format. The
dir_dump is an argument without any switch.

See the output for the --help switches:

Usage:
pg_dump [OPTION]... [DBNAME]

Usage:
pg_restore [OPTION]... [FILE]

So in either case you don't need to give a switch for what you have.
If you run pg_dump, you don't give a switch for the database but you
need to give one for the output (-f), and with pg_restore you don't give
a switch for the file that you're restoring but you'd need to give -d
for restoring to a database.

Joachim


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Joachim Wieland <joe(at)mcknight(dot)de>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-02-08 04:34:07
Message-ID: AANLkTinjxM0ZRsocLLogbGUpbDNbAbtPTSWDYD9B4xcO@mail.gmail.com
Lists: pgsql-hackers

On Mon, Feb 7, 2011 at 10:42 PM, Joachim Wieland <joe(at)mcknight(dot)de> wrote:
>> I guess the huge amount of info the patch is showing is just for
>> debugging and will be removed before commit, right?
>
> That's right.

So how close are we to having a committable version of this? Should
we push this out to 9.2?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Joachim Wieland <joe(at)mcknight(dot)de>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-02-09 01:31:00
Message-ID: AANLkTinCT_EFYBd2+-yO9kuA4a16iXw1U-F5T5BYBagm@mail.gmail.com
Lists: pgsql-hackers

On Tue, Feb 8, 2011 at 13:34, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> So how close are we to having a committable version of this?  Should
> we push this out to 9.2?

I think so. The feature is pretty attractive, but more work is required:
* Re-base on the synchronized snapshots patch
* Consider using pipes on Windows as well.
* Research the libpq + fork() issue. We have a warning in the docs:
http://developer.postgresql.org/pgdocs/postgres/libpq-connect.html
| On Unix, forking a process with open libpq connections can lead to
unpredictable results

--
Itagaki Takahiro


From: Joachim Wieland <joe(at)mcknight(dot)de>
To: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-02-09 03:54:07
Message-ID: AANLkTi=9SEkMdG0wkEGPCsVh9gd6jYohFXDO6gwL+5vo@mail.gmail.com
Lists: pgsql-hackers

On Tue, Feb 8, 2011 at 8:31 PM, Itagaki Takahiro
<itagaki(dot)takahiro(at)gmail(dot)com> wrote:
> On Tue, Feb 8, 2011 at 13:34, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> So how close are we to having a committable version of this?  Should
>> we push this out to 9.2?
>
> I think so. The feature is pretty attractive, but more work is required:
>  * Re-base on the synchronized snapshots patch
>  * Consider using pipes on Windows as well.
>  * Research the libpq + fork() issue. We have a warning in the docs:
> http://developer.postgresql.org/pgdocs/postgres/libpq-connect.html
> | On Unix, forking a process with open libpq connections can lead to
> unpredictable results

Just for the record, once the sync snapshot patch is committed, there
is no need to do fancy libpq + fork() combinations anyway.
Unfortunately, so far no committer has commented on the synchronized
snapshot patch at all.

I am not fighting for getting parallel pg_dump done in 9.1, as I don't
really have a personal use case for the patch. However it would be the
irony of the year if we shipped 9.1 with a synchronized snapshot patch
but no parallel dump :-)

Joachim


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Joachim Wieland <joe(at)mcknight(dot)de>
Cc: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump directory archive format / parallel pg_dump
Date: 2011-02-10 15:22:31
Message-ID: AANLkTinaPv4iWeSj+BRXMrEJxf0+y9t8Ym7CUZOz=yX9@mail.gmail.com
Lists: pgsql-hackers

On Tue, Feb 8, 2011 at 10:54 PM, Joachim Wieland <joe(at)mcknight(dot)de> wrote:
> On Tue, Feb 8, 2011 at 8:31 PM, Itagaki Takahiro
> <itagaki(dot)takahiro(at)gmail(dot)com> wrote:
>> On Tue, Feb 8, 2011 at 13:34, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> So how close are we to having a committable version of this?  Should
>>> we push this out to 9.2?
>>
>> I think so. The feature is pretty attractive, but more work is required:
>>  * Re-base on the synchronized snapshots patch
>>  * Consider using pipes on Windows as well.
>>  * Research the libpq + fork() issue. We have a warning in the docs:
>> http://developer.postgresql.org/pgdocs/postgres/libpq-connect.html
>> | On Unix, forking a process with open libpq connections can lead to
>> unpredictable results
>
> Just for the record, once the sync snapshot patch is committed, there
> is no need to do fancy libpq + fork() combinations anyway.
> Unfortunately, so far no committer has commented on the synchronized
> snapshot patch at all.
>
> I am not fighting for getting parallel pg_dump done in 9.1, as I don't
> really have a personal use case for the patch. However it would be the
> irony of the year if we shipped 9.1 with a synchronized snapshot patch
> but no parallel dump  :-)

True. But it looks like there are some outstanding items from
previous reviews that you've yet to address, which makes pushing it
out seem fairly reasonable...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company