Re: pg_dump additional options for performance

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: pg_dump additional options for performance
Date: 2008-02-06 15:13:24
Message-ID: 1202310804.29242.54.camel@ebony.site
Lists: pgsql-hackers

pg_dump allows you to specify -s --schema-only, or -a --data-only.

The -s option creates the table, as well as creating constraints and
indexes. These objects need to be dropped prior to loading, if we are to
follow the performance recommendations in the docs. But the only way to
do that is to manually edit the script to produce a cut down script.

So it would be good if we could dump objects in 3 groups
1. all commands required to re-create table
2. data
3. all commands required to complete table after data load

My proposal is to provide two additional modes:
--schema-pre-load corresponding to (1) above
--schema-post-load corresponding to (3) above

This would then allow this sequence of commands

pg_dump --schema-pre-load
pg_dump --data-only
pg_dump --schema-post-load

to be logically equivalent, but faster than

pg_dump --schema-only
pg_dump --data-only

both forms of which are equivalent to just

pg_dump

[Assuming data isn't changing between invocations...]
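
For illustration, a full dump/reload cycle using the proposed options might
look like the sketch below (--schema-pre-load and --schema-post-load are the
proposed switches, not options pg_dump has today):

pg_dump --schema-pre-load mydb > pre.sql
pg_dump --data-only mydb > data.sql
pg_dump --schema-post-load mydb > post.sql

psql -d newdb -f pre.sql    # bare tables, no indexes or constraints yet
psql -d newdb -f data.sql   # bulk load into the bare tables
psql -d newdb -f post.sql   # build indexes and constraints after the load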

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-06 15:41:56
Message-ID: 20080206154156.GK9549@svr2.hagander.net
Lists: pgsql-hackers

On Wed, Feb 06, 2008 at 03:13:24PM +0000, Simon Riggs wrote:
> pg_dump allows you to specify -s --schema-only, or -a --data-only.
>
> The -s option creates the table, as well as creating constraints and
> indexes. These objects need to be dropped prior to loading, if we are to
> follow the performance recommendations in the docs. But the only way to
> do that is to manually edit the script to produce a cut down script.
>
> So it would be good if we could dump objects in 3 groups
> 1. all commands required to re-create table
> 2. data
> 3. all commands required to complete table after data load
>
> My proposal is to provide two additional modes:
> --schema-pre-load corresponding to (1) above
> --schema-post-load corresponding to (3) above
>
> This would then allow this sequence of commands
>
> pg_dump --schema-pre-load
> pg_dump --data-only
> pg_dump --schema-post-load
>
> to be logically equivalent, but faster than
>
> pg_dump --schema-only
> pg_dump --data-only
>
> both forms of which are equivalent to just
>
> pg_dump
>
>
> [Assuming data isn't changing between invocations...]

I've been considering just this. Another option I came up with was a more
generic switch where you'd have:
pg_dump --what=tables
pg_dump --what=indexes,constraints

or something like that. Would give more flexibility, but I'm not sure if
that's worthwhile.

Having the ability to do just the filtering that you're talking about would
be very handy - I've needed it more than once.

Where would you load primary keys and such constraints? Pre- or post-dump? I
think the case could be made for either one...

//Magnus


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-06 15:47:58
Message-ID: 47A9D6AE.4090807@dunslane.net
Lists: pgsql-hackers

Simon Riggs wrote:
> My proposal is to provide two additional modes:
> --schema-pre-load corresponding to (1) above
> --schema-post-load corresponding to (3) above
>
> This would then allow this sequence of commands
>
> pg_dump --schema-pre-load
> pg_dump --data-only
> pg_dump --schema-post-load
>
> to be logically equivalent, but faster than
>
> pg_dump --schema-only
> pg_dump --data-only
>

It would actually be better, as well as faster, because it will be
guaranteed to work :-) There are known cases where schema-only followed
by data-only fails.

One more note: we need to make sure the corresponding new modes are also
added to pg_restore.

cheers

andrew


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-06 15:56:19
Message-ID: 1202313379.29242.90.camel@ebony.site
Lists: pgsql-hackers

On Wed, 2008-02-06 at 16:41 +0100, Magnus Hagander wrote:

> Where would you load primary keys and such constraints? Pre- or post-dump? I
> think the case could be made for either one...

Post dump. If the constraint was successfully in place when we performed
the dump then it should work successfully after the load.

I like your syntax/idea as well. I think that would be worth doing in
addition for when you want fine grained control. (I'd add the ability to
dump any named object, allowing you to dump individual indexes,
functions, etc..)

I don't think the two syntaxes compete. I want to be able to say
"everything before", "data" and "everything after" without having to
remember to specify --what=tables,views,sequences,kitchensink etc.
Especially since you probably won't notice you've missed out an object
type (e.g. sequences) until you try to run an application. Too late!

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: "Zeugswetter Andreas ADI SD" <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Magnus Hagander" <magnus(at)hagander(dot)net>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump additional options for performance
Date: 2008-02-06 16:10:00
Message-ID: E1539E0ED7043848906A8FF995BDA57902C24838@m0143.s-mxs.net
Lists: pgsql-hackers

Simon wrote:
> My proposal is to provide two additional modes:
> --schema-pre-load corresponding to (1) above
> --schema-post-load corresponding to (3) above

Sounds nice.
For a large schema we might rather want one switch that dumps 2 files, no?
Probably also better from an MVCC perspective.

Andreas


From: Richard Huxton <dev(at)archonet(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-06 16:31:56
Message-ID: 47A9E0FC.5000601@archonet.com
Lists: pgsql-hackers

Simon Riggs wrote:
> On Wed, 2008-02-06 at 16:41 +0100, Magnus Hagander wrote:
>
> I don't think the two syntaxes compete. I want to be able to say
> "everything before", "data" and "everything after" without having to
> remember to specify --what=tables,views,sequences,kitchensink etc.
> Especially since you probably won't notice you've missed out an object
> type (e.g. sequences) until you try to run an application. Too late!

Is this not a job for the -l / -L mode of pg_restore, but ported to pg_dump?

With a bit of tweaking to the format you could do something like:

pg_dump -l mydb | grep BEFORE > obj_list.txt
pg_dump -L obj_list.txt mydb > mydb.before.schema
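
For comparison, the existing -l / -L workflow with pg_restore against a
custom-format dump already works roughly like this, except that the
filtering has to be done by editing the list file by hand:

pg_dump -Fc mydb > mydb.dump
pg_restore -l mydb.dump > obj_list.txt
# edit obj_list.txt, keeping only the entries wanted for this pass
pg_restore -L obj_list.txt -d newdb mydb.dump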

--
Richard Huxton
Archonet Ltd


From: Decibel! <decibel(at)decibel(dot)org>
To: Zeugswetter Andreas ADI SD <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-08 17:09:35
Message-ID: 20080208170934.GW1212@decibel.org
Lists: pgsql-hackers

On Wed, Feb 06, 2008 at 05:10:00PM +0100, Zeugswetter Andreas ADI SD wrote:
> Simon wrote:
> > My proposal is to provide two additional modes:
> > --schema-pre-load corresponding to (1) above
> > --schema-post-load corresponding to (3) above
>
> Sounds nice.
> For a large schema we might rather want one switch that dumps 2 files, no?
> Probably also better from an MVCC perspective.

+1

For that matter, it'd be better if you could just get all 3 files (pre,
data, post) in one shot with one transaction; that would guarantee you a
clean dump.
--
Decibel!, aka Jim C. Nasby, Database Architect decibel(at)decibel(dot)org
Give your computer some brain candy! www.distributed.net Team #1828


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Decibel! <decibel(at)decibel(dot)org>
Cc: Zeugswetter Andreas ADI SD <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-08 20:01:43
Message-ID: 20080208200143.GJ31022@alvh.no-ip.org
Lists: pgsql-hackers

Decibel! wrote:

> For that matter, it'd be better if you could just get all 3 files (pre,
> data, post) in one shot with one transaction; that would guarantee you a
> clean dump.

+1.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Decibel! <decibel(at)decibel(dot)org>, Zeugswetter Andreas ADI SD <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-11 10:13:15
Message-ID: 1202724795.4247.179.camel@ebony.site
Lists: pgsql-hackers

On Fri, 2008-02-08 at 17:01 -0300, Alvaro Herrera wrote:
> Decibel! wrote:
>
> > For that matter, it'd be better if you could just get all 3 files (pre,
> > data, post) in one shot with one transaction; that would guarantee you a
> > clean dump.
>
> +1.

OK, that seems possible.

What do you think the syntax should be for that?

If you specify --data-only you get 1 file
If you specify --schema-only you get either 1 or 2 files
and if you don't specify either you get either 1 or 3 files

How do we name the files?

--pre-schema-file
--data-file
--post-schema-file

If any of the above 3 files are named then --file is treated as an
error.

That allows you to do this:

* dump all 3 files in one go

pg_dump --pre-schema-file=pre_schema.dmp --data-file=data.dmp
--post-schema-file=post_schema.dmp

* dump 2 files for schema

pg_dump --pre-schema-file=pre_schema.dmp
--post-schema-file=post_schema.dmp --schema-only

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Decibel! <decibel(at)decibel(dot)org>, Zeugswetter Andreas ADI SD <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-11 15:15:17
Message-ID: 20080211151517.GD9245@alvh.no-ip.org
Lists: pgsql-hackers

Simon Riggs wrote:

> What do you think the syntax should be for that?
>
> If you specify --data-only you get 1 file
> If you specify --schema-only you get either 1 or 2 files
> and if you don't specify either you get either 1 or 3 files
>
> How do we name the files?
>
> --pre-schema-file
> --data-file
> --post-schema-file

I think this is pretty unwieldy. Perhaps we could have a parameter that
receives a single file name and appends or prepends something to it to
create the three file names.

--multidump-prefix=foobar
and it creates foobar.1.predata, foobar.2.data, foobar.3.postdata

or something like that? The number would help to sort them
appropriately, and the string would ensure that you know what each file
is ... perhaps we could have %-escapes in the name to expand to both of
these? Perhaps we could have other %-escapes for things like database
name --- so you could say --multidump-filename=%d.%n.%t.dump ... but
then it would be nice to have strftime escapes too.

Or is this too complex?

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, "Decibel!" <decibel(at)decibel(dot)org>, Zeugswetter Andreas ADI SD <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-11 15:29:44
Message-ID: 47B069E8.3020303@dunslane.net
Lists: pgsql-hackers

Alvaro Herrera wrote:
> --multidump-prefix=foobar
> and it creates foobar.1.predata, foobar.2.data, foobar.3.postdata
>
> or something like that? The number would help to sort them
> appropriately, and the string would ensure that you know what each file
> is ... perhaps we could have %-escapes in the name to expand to both of
> these? Perhaps we could have other %-escapes for things like database
> name --- so you could say --multidump-filename=%d.%n.%t.dump ... but
> then it would be nice to have strftime escapes too.
>
> Or is this too complex?
>

Yes, I think it is. We do not have to be infinitely flexible. KISS seems
apposite.

cheers

andrew


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Decibel! <decibel(at)decibel(dot)org>, Zeugswetter Andreas ADI SD <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-11 15:41:32
Message-ID: 1202744492.16770.109.camel@ebony.site
Lists: pgsql-hackers

On Mon, 2008-02-11 at 10:29 -0500, Andrew Dunstan wrote:
>
> Alvaro Herrera wrote:
> > --multidump-prefix=foobar
> > and it creates foobar.1.predata, foobar.2.data, foobar.3.postdata
> >
> > or something like that? The number would help to sort them
> > appropriately, and the string would ensure that you know what each file
> > is ... perhaps we could have %-escapes in the name to expand to both of
> > these? Perhaps we could have other %-escapes for things like database
> > name --- so you could say --multidump-filename=%d.%n.%t.dump ... but
> > then it would be nice to have strftime escapes too.
> >
> > Or is this too complex?
> >
>
> Yes, I think it is. We do not have to be infinitely flexible. KISS seems
> apposite.

What syntax do you suggest?

How about we use the --file as the prefix?
and just use a postfix of .1 and .2 and .3

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, "Decibel!" <decibel(at)decibel(dot)org>, Zeugswetter Andreas ADI SD <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-11 15:50:11
Message-ID: 47B06EB3.2020704@dunslane.net
Lists: pgsql-hackers

Simon Riggs wrote:
> On Mon, 2008-02-11 at 10:29 -0500, Andrew Dunstan wrote:
>
>> Alvaro Herrera wrote:
>>
>>> --multidump-prefix=foobar
>>> and it creates foobar.1.predata, foobar.2.data, foobar.3.postdata
>>>
>>> or something like that? The number would help to sort them
>>> appropriately, and the string would ensure that you know what each file
>>> is ... perhaps we could have %-escapes in the name to expand to both of
>>> these? Perhaps we could have other %-escapes for things like database
>>> name --- so you could say --multidump-filename=%d.%n.%t.dump ... but
>>> then it would be nice to have strftime escapes too.
>>>
>>> Or is this too complex?
>>>
>>>
>> Yes, I think it is. We do not have to be infinitely flexible. KISS seems
>> apposite.
>>
>
> What syntax do you suggest?
>
> How about we use the --file as the prefix?
> and just use a postfix of .1 and .2 and .3
>

seems reasonable.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Decibel! <decibel(at)decibel(dot)org>, Zeugswetter Andreas ADI SD <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-11 16:51:07
Message-ID: 2816.1202748667@sss.pgh.pa.us
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> I think this is pretty unwieldy.

I agree. Since any multiple-output-file case can't usefully use stdout,
I think we should combine the switches and just have one switch that
says both that you want separated output and what the target filename
is. Thus something like

--pre-schema-file = foo
--data-file = bar
--post-schema-file = baz

where specifying any of these suppresses the "normal" output to stdout.
So, if you give just a subset of them, you get just subset output.

With this design, --schema-only, --data-only, and --file are obsolete,
and we should probably throw an error if any of them are used in
combination with these switches.

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Decibel! <decibel(at)decibel(dot)org>, Zeugswetter Andreas ADI SD <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-11 17:11:00
Message-ID: 1202749860.16770.123.camel@ebony.site
Lists: pgsql-hackers

On Mon, 2008-02-11 at 11:51 -0500, Tom Lane wrote:
> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> > I think this is pretty unwieldy.
>
> I agree. Since any multiple-output-file case can't usefully use stdout,
> I think we should combine the switches and just have one switch that
> says both that you want separated output and what the target filename
> is. Thus something like
>
> --pre-schema-file = foo
> --data-file = bar
> --post-schema-file = baz
>
> where specifying any of these suppresses the "normal" output to stdout.
> So, if you give just a subset of them, you get just subset output.
>
> With this design, --schema-only, --data-only, and --file are obsolete,
> and we should probably throw an error if any of them are used in
> combination with these switches.

Looks good from here.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, "Decibel!" <decibel(at)decibel(dot)org>, Zeugswetter Andreas ADI SD <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-11 17:15:19
Message-ID: 47B082A7.2040300@hagander.net
Lists: pgsql-hackers

Simon Riggs wrote:
> On Mon, 2008-02-11 at 11:51 -0500, Tom Lane wrote:
>> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
>>> I think this is pretty unwieldy.
>> I agree. Since any multiple-output-file case can't usefully use stdout,
>> I think we should combine the switches and just have one switch that
>> says both that you want separated output and what the target filename
>> is. Thus something like
>>
>> --pre-schema-file = foo
>> --data-file = bar
>> --post-schema-file = baz
>>
>> where specifying any of these suppresses the "normal" output to stdout.
>> So, if you give just a subset of them, you get just subset output.
>>
>> With this design, --schema-only, --data-only, and --file are obsolete,
>> and we should probably throw an error if any of them are used in
>> combination with these switches.
>
> Looks good from here.

+1

Simon, are you planning to do this? (assuming everybody agrees on the
syntax)

//Magnus


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Decibel! <decibel(at)decibel(dot)org>, Zeugswetter Andreas ADI SD <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-11 17:30:13
Message-ID: 1202751013.16770.136.camel@ebony.site
Lists: pgsql-hackers

On Mon, 2008-02-11 at 18:15 +0100, Magnus Hagander wrote:

> Simon, are you planning to do this? (assuming everybody agrees on the
> syntax)

Yes, will do.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "Decibel!" <decibel(at)decibel(dot)org>, Zeugswetter Andreas ADI SD <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-11 17:42:46
Message-ID: 47B08916.60605@dunslane.net
Lists: pgsql-hackers

Tom Lane wrote:
> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
>
>> I think this is pretty unwieldy.
>>
>
> I agree. Since any multiple-output-file case can't usefully use stdout,
> I think we should combine the switches and just have one switch that
> says both that you want separated output and what the target filename
> is. Thus something like
>
> --pre-schema-file = foo
> --data-file = bar
> --post-schema-file = baz
>
> where specifying any of these suppresses the "normal" output to stdout.
> So, if you give just a subset of them, you get just subset output.
>
> With this design, --schema-only, --data-only, and --file are obsolete,
> and we should probably throw an error if any of them are used in
> combination with these switches.
>
>
>

this looks good. But arguably it's really pre-data and post-data.
pre-schema would be something that comes before the schema, no? Or maybe
it should be {pre,post}-data-schema-file.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "Decibel!" <decibel(at)decibel(dot)org>, Zeugswetter Andreas ADI SD <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-11 17:46:28
Message-ID: 4075.1202751988@sss.pgh.pa.us
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Tom Lane wrote:
>> --pre-schema-file = foo
>> --data-file = bar
>> --post-schema-file = baz

> this looks good. But arguably it's really pre-data and post-data.
> pre-schema would be something that comes before the schema, no? Or maybe
> it should be {pre,post}-data-schema-file.

I was thinking that the names were a bit long, but I'm not sure we can
get away with omitting "-file". --pre-file, --data-file, --post-file
are nice and short, but perhaps fail to convey the meaning?

regards, tom lane


From: "Christopher Browne" <cbbrowne(at)gmail(dot)com>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, Decibel! <decibel(at)decibel(dot)org>, "Zeugswetter Andreas ADI SD" <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>, "Magnus Hagander" <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-11 17:56:29
Message-ID: d6d6637f0802110956u505c0787jdf12a932b6f909aa@mail.gmail.com
Lists: pgsql-hackers

On Feb 11, 2008 3:41 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Mon, 2008-02-11 at 10:29 -0500, Andrew Dunstan wrote:
> >
> > Alvaro Herrera wrote:
> > > --multidump-prefix=foobar
> > > and it creates foobar.1.predata, foobar.2.data, foobar.3.postdata
> > >
> > > or something like that? The number would help to sort them
> > > appropriately, and the string would ensure that you know what each file
> > > is ... perhaps we could have %-escapes in the name to expand to both of
> > > these? Perhaps we could have other %-escapes for things like database
> > > name --- so you could say --multidump-filename=%d.%n.%t.dump ... but
> > > then it would be nice to have strftime escapes too.
> > >
> > > Or is this too complex?
> > >
> >
> > Yes, I think it is. We do not have to be infinitely flexible. KISS seems
> > apposite.
>
> What syntax do you suggest?
>
> How about we use the --file as the prefix?
> and just use a postfix of .1 and .2 and .3

It seems better to me to have a suffix that suggests some sort of
meaning. I'm not sure of the ideal names, but starting with:
.pre-schema, .data, and .post-schema as possibilities seems like a
route to get to possibly-better names...

--
http://linuxfinances.info/info/linuxdistributions.html
"The definition of insanity is doing the same thing over and over and
expecting different results." -- assortedly attributed to Albert
Einstein, Benjamin Franklin, Rita Mae Brown, and Rudyard Kipling


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Zeugswetter Andreas ADI SD <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-11 20:07:01
Message-ID: 1202760421.21589.23.camel@dogma.ljc.laika.com
Lists: pgsql-hackers

On Wed, 2008-02-06 at 17:10 +0100, Zeugswetter Andreas ADI SD wrote:
> Simon wrote:
> > My proposal is to provide two additional modes:
> > --schema-pre-load corresponding to (1) above
> > --schema-post-load corresponding to (3) above
>
> Sounds nice.
> For a large schema we might rather want one switch that dumps 2 files, no?
> Probably also better from an MVCC perspective.
>

Assuming pg_dump/pg_restore supports pre/data/post phases, as Simon
suggests, the multi-file options could also be handled by -Fc and using
pg_restore, correct?

Or would the multi-file options provide any new functionality?

Regards,
Jeff Davis


From: Decibel! <decibel(at)decibel(dot)org>
To: Christopher Browne <cbbrowne(at)gmail(dot)com>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Zeugswetter Andreas ADI SD" <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>, "Magnus Hagander" <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-11 23:21:26
Message-ID: 7B1B3E77-650F-4253-9D8E-B0B1AEAEEB42@decibel.org
Lists: pgsql-hackers

On Feb 11, 2008, at 11:56 AM, Christopher Browne wrote:
> It seems better to me to have a suffix that suggests some sort of
> meaning. I'm not sure of the ideal names, but starting with:
> .pre-schema, .data, and .post-schema as possibilities seems like a
> route to get to possibly-better names...

+1

BTW, if people want to get anal with the names I think it's fine to
also have options to explicitly name the files, but for normal usage
I'd much rather have a simple option flag that just appends stuff to
--file.
--
Decibel!, aka Jim C. Nasby, Database Architect decibel(at)decibel(dot)org
Give your computer some brain candy! www.distributed.net Team #1828


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-19 22:18:05
Message-ID: 1203459485.3846.123.camel@dogma.ljc.laika.com
Lists: pgsql-hackers

On Wed, 2008-02-06 at 15:13 +0000, Simon Riggs wrote:
> The -s option creates the table, as well as creating constraints and
> indexes. These objects need to be dropped prior to loading, if we are to
> follow the performance recommendations in the docs. But the only way to
> do that is to manually edit the script to produce a cut down script.
>
> So it would be good if we could dump objects in 3 groups
> 1. all commands required to re-create table
> 2. data
> 3. all commands required to complete table after data load
>
> My proposal is to provide two additional modes:
> --schema-pre-load corresponding to (1) above
> --schema-post-load corresponding to (3) above

Another thought:

We could also break step #3 into two steps: those objects required for
correctness (e.g. unique indexes, other constraints, etc); and those
objects that are merely for performance (i.e. non-constraining indexes).
Let's call these steps #3A and #3B.

After reading the thread here:

http://archives.postgresql.org/pgsql-performance/2008-02/msg00211.php

it's clear that building indexes can take a long time, and they aren't
strictly required for correct database operation. Often, it's perfectly
reasonable to operate the database without a few of the indexes, so long
as they don't imply a constraint.

Step #3B could be done with "CONCURRENTLY" to allow uninterrupted
operation.
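
For example, a step #3B item could simply be emitted as something like the
following (table and index names here are just placeholders):

CREATE INDEX CONCURRENTLY orders_created_at_idx ON orders (created_at);

One caveat: CREATE INDEX CONCURRENTLY can't run inside a transaction block,
so such statements would have to be issued outside any single-transaction
restore.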

We could even allow pg_restore to run step #3B in multiple sessions
building different indexes to use multiple processor cores.

This is just a thought. I can see that it's getting fairly complex, and
it may be better to just leave these things up to the DBA.

Regards,
Jeff Davis


From: "Jochem van Dieten" <jochemd(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-24 13:22:32
Message-ID: f96a9b830802240522u2dbf4cdci51b2bfc58ecabd25@mail.gmail.com
Lists: pgsql-hackers

On Mon, Feb 11, 2008 at 5:51 PM, Tom Lane wrote:
> I agree. Since any multiple-output-file case can't usefully use stdout,
> I think we should combine the switches and just have one switch that
> says both that you want separated output and what the target filename
> is. Thus something like
>
> --pre-schema-file = foo
> --data-file = bar
> --post-schema-file = baz

Or we could have a switch that specifies a directory and have pg_dump
split the dump not just into pre-schema, data and post-schema, but also
split the data into one file per table. That would greatly facilitate
a parallel restore of the data through multiple connections.
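
As a sketch (file and directory names invented), the output directory might
look something like:

mydb.dump/
    pre-schema.sql
    data/
        public.orders.dat
        public.customers.dat
        public.lineitem.dat
    post-schema.sql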

Jochem


From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: "Jochem van Dieten" <jochemd(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-25 05:39:32
Message-ID: 2e78013d0802242139i6642bef4y215909808f7c1960@mail.gmail.com
Lists: pgsql-hackers

On Sun, Feb 24, 2008 at 6:52 PM, Jochem van Dieten <jochemd(at)gmail(dot)com> wrote:
>
> Or we could have a switch that specifies a directory and have pg_dump
> split the dump not just in pre-schema, data and post-schema, but also
> split the data in a file for each table. That would greatly facilitate
> a parallel restore of the data through multiple connections.
>

How about having a single switch like --optimize <level>, so that pg_dump
behaves differently based on the "level"? For example, if optimization is
turned off (i.e. -O0), pg_dump just dumps the schema and data. At level 1,
it will dump the pre-schema, data and post-schema.

We can then add more levels and optimize it further. For example, postponing
the creation of non-constraining indexes, splitting the data into
multiple files etc.
I can also think of adding constructs to the dump so that we can identify what
can be restored in parallel, and have pg_restore use that information
during restore.
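
As a rough sketch of the usage (none of these levels exist today, so the
exact behaviour at each level is open):

pg_dump --optimize=0 mydb > mydb.sql   # plain schema + data, as now
pg_dump --optimize=1 mydb              # split into pre-schema, data, post-schema
pg_dump --optimize=2 mydb              # also defer non-constraining indexes,
                                       # split data per table, emit parallel hints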

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


From: "Tom Dunstan" <pgsql(at)tomd(dot)cc>
To: "Jochem van Dieten" <jochemd(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-25 09:06:09
Message-ID: ca33c0a30802250106w5c9d2b8ey56ea358c6c950f43@mail.gmail.com
Lists: pgsql-hackers

On Sun, Feb 24, 2008 at 6:52 PM, Jochem van Dieten <jochemd(at)gmail(dot)com> wrote:
> Or we could have a switch that specifies a directory and have pg_dump
> split the dump not just in pre-schema, data and post-schema, but also
> split the data in a file for each table. That would greatly facilitate
> a parallel restore of the data through multiple connections.

<delurk>

I'll admit to thinking something similar while reading this thread,
mostly because having to specify multiple filenames just to do a dump
and then do them all on the way back in seemed horrible. My idea was
to stick the multiple streams into a structured container file rather
than a directory though - a zip file a la JAR/ODF leapt to mind. That
has the nice property of being a single dump file with optional built
in compression that could store all the data as separate streams and
would allow a smart restore program to do as much in parallel as makes
sense. Mucking around with directories or three different filenames or
whatever is a pain. I'll bet most users want to say "pg_dump
--dump-file=foo.zip foo", back up foo.zip as appropriate, and when
restoring say "pg_restore --dump-file=foo.zip -j 4" or whatever and
have pg_restore do the rest. The other nice thing about using a zip
file as a container is that you can inspect it with standard tools if
you need to.

Another thought is that doing things this way would allow us to add
extra metadata to the dump in later versions without giving the user
yet another command line switch for an extra file. Or even, thinking a
bit more outside the box, allow us to store data in binary format if
that's what the user wants at some point (thinking of the output from
binary io rather than on disk representation, obviously). Exposing all
the internals of this stuff via n command line args is pretty
constraining - it would be nice if pg_dump just produced the most
efficient dump, and if we decide at a later date that that means doing
things a bit differently, then we bump the dump file version and just
do it.

Just a thought...

Cheers

Tom


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 05:39:29
Message-ID: 24991.1204004369@sss.pgh.pa.us
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> ... So it would be good if we could dump objects in 3 groups
> 1. all commands required to re-create table
> 2. data
> 3. all commands required to complete table after data load

[ much subsequent discussion snipped ]

BTW, what exactly was the use-case for this? The recent discussions
about parallelizing pg_restore make it clear that the all-in-one
dump file format still has lots to recommend it. So I'm just wondering
what the actual advantage of splitting the dump into multiple files
will be. It clearly makes life more complicated; what are we buying?

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 11:06:02
Message-ID: 1204023962.4252.231.camel@ebony.site
Lists: pgsql-hackers

On Tue, 2008-02-26 at 00:39 -0500, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > ... So it would be good if we could dump objects in 3 groups
> > 1. all commands required to re-create table
> > 2. data
> > 3. all commands required to complete table after data load
>
> [ much subsequent discussion snipped ]
>
> BTW, what exactly was the use-case for this? The recent discussions
> about parallelizing pg_restore make it clear that the all-in-one
> dump file format still has lots to recommend it. So I'm just wondering
> what the actual advantage of splitting the dump into multiple files
> will be. It clearly makes life more complicated; what are we buying?

One file is convenient, but not fast.

Convenience should be the default, with speed as an option.

A current pg_dump spends 99% of its time in the unload phase, and the
unload of each table can be parallelised, potentially.

A current pg_dump file has many potentially parallelisable tasks during
the *reload* process:
* the individual COPY statements with respect to each other
* the data within a COPY statement for a single table
* the addition of indexes
* some FK checks can be run in parallel, say 25-75% of them, typically
all of the very largest

Also, writing data to multiple drives can increase performance of both
unload and reload.

ISTM the easiest way to gain the most benefit from parallelisation is to
allow the COPY TO (unload) operation to be parallelised on pg_dump,
splitting the files into one per table. The COPY FROM (load) operation
can then be parallelised fairly easily to match.

So that would mean we would run an unload like this

pg_dump --pre-schema-file=f1 --save-snapshot --snapshot-id=X
pg_dump -t bigtable --data-file=f2.1 --snapshot-id=X
pg_dump -t bigtable2 --data-file=f2.2 --snapshot-id=X
pg_dump -T bigtable -T bigtable2 --data-file=f2.3 --snapshot-id=X

... (add other tables to be run in parallel)

pg_dump --post-schema-file=f3 --snapshot-id=X

and then run the reload like this

psql -f f1
psql -f f2.1 &
psql -f f2.2 &
psql -f f2.3 &
wait
psql -f f3

using shell notation for parallel operations just to simplify things.

This idea does *not* rely on the idea of saved snapshots, but they do
play well together. I hope to write both (as agreed on the other thread) and
to discuss all the syntax for the snapshot stuff separately.

We could go to even greater lengths by expressing each of the reload
dependencies explicitly into the pg_dump output, rather than just
implicitly via output ordering. It would then be possible to have a
concurrent psql/pg_restore to execute multiple threads and execute tasks
that respect the dependencies. That would be cute, but it sounds way
over-cooked to me and about 10 times more work to extract the maximum
possible parallelizability, when we can get 2-3 times performance with
some much simpler changes.

Note that another approach might be to attempt to parallelise each load
individually, but that again is a lot of work and often not required.

So speeding up the data dump/reload portion seems quick and easy.

(Plus, I've pretty much finished writing the pg_dump changes and was
going to publish it today... but that doesn't make it right, I know.)

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 11:22:56
Message-ID: 1204024976.4252.241.camel@ebony.site
Lists: pgsql-hackers

On Tue, 2008-02-19 at 14:18 -0800, Jeff Davis wrote:
> On Wed, 2008-02-06 at 15:13 +0000, Simon Riggs wrote:
> > The -s option creates the table, as well as creating constraints and
> > indexes. These objects need to be dropped prior to loading, if we are to
> > follow the performance recommendations in the docs. But the only way to
> > do that is to manually edit the script to produce a cut down script.
> >
> > So it would be good if we could dump objects in 3 groups
> > 1. all commands required to re-create table
> > 2. data
> > 3. all commands required to complete table after data load
> >
> > My proposal is to provide two additional modes:
> > --schema-pre-load corresponding to (1) above
> > --schema-post-load corresponding to (3) above
>
> Another thought:
>
> We could also break step #3 into two steps: those objects required for
> correctness (e.g. unique indexes, other constraints, etc); and those
> objects that are merely for performance (i.e. non-constraining indexes).
> Let's call these steps #3A and #3B.
>
> After reading the thread here:
>
> http://archives.postgresql.org/pgsql-performance/2008-02/msg00211.php
>
> it's clear that building indexes can take a long time, and they aren't
> strictly required for correct database operation. Often, it's perfectly
> reasonable to operate the database without a few of the indexes, so long
> as they don't imply a constraint.
>
> Step #3B could be done with "CONCURRENTLY" to allow uninterrupted
> operation.
>
> We could even allow pg_restore to run step #3B in multiple sessions
> building different indexes to use multiple processor cores.
>
> This is just a thought. I can see that it's getting fairly complex, and
> it may be better to just leave these things up to the DBA.

My thinking is to do either:

* keep it as simple as possible to allow the DBA to manually improve
performance

* express dependency information in the pg_dump output to allow some
level of parallelism to use that information to advantage automatically

Doing only the second one seems like a lot of work and would prevent a DBA
from being able to code things a particular way for their install. It
seems possible to get good benefit from the first one and still do
the second part later.

If we can work out some notation or grammar to put into the script
output, then we might imagine having a concurrent psql that spawns
multiple sessions and executes it. With the correct grammar all we would
need is an additional psql option --parallel=N, and then psql would work
its way through the script keeping N sessions busy when we reload.
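
Just to make that concrete, the notation could be as simple as comment-based
markers that a concurrent psql understands and a plain psql ignores (purely
hypothetical syntax):

-- !parallel-group 1
\copy orders from 'f2.1'
-- !parallel-group 1
\copy customers from 'f2.2'
-- !barrier  (wait for everything above to finish before continuing)
CREATE INDEX orders_created_idx ON orders (created_at);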

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 11:31:38
Message-ID: 20080226113138.GM528@svr2.hagander.net
Lists: pgsql-hackers

On Tue, Feb 26, 2008 at 12:39:29AM -0500, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > ... So it would be good if we could dump objects in 3 groups
> > 1. all commands required to re-create table
> > 2. data
> > 3. all commands required to complete table after data load
>
> [ much subsequent discussion snipped ]
>
> BTW, what exactly was the use-case for this? The recent discussions
> about parallelizing pg_restore make it clear that the all-in-one
> dump file format still has lots to recommend it. So I'm just wondering
> what the actual advantage of splitting the dump into multiple files
> will be. It clearly makes life more complicated; what are we buying?

One use-case would be when you have to make some small change to the schema
while reloading it, one that's still compatible with the data format. Then
you'd dump the schema (without indexes and such), *edit* that file, and then
reload things. It's a lot easier to edit the file if it's not hundreds
of gigabytes...
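
That is, something along these lines (using the switch names proposed earlier
in this thread, which don't exist yet):

pg_dump --schema-pre-load mydb > pre.sql
vi pre.sql                                # tweak the schema by hand
psql -d newdb -f pre.sql
pg_dump --data-only mydb | psql -d newdb
pg_dump --schema-post-load mydb | psql -d newdb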

//Magnus


From: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 11:46:13
Message-ID: 200802261246.16194.dfontaine@hi-media.com
Lists: pgsql-hackers

On Tuesday 26 February 2008, Simon Riggs wrote:
> So that would mean we would run an unload like this
>
> pg_dump --pre-schema-file=f1 --save-snapshot -snapshot-id=X
> pg_dump -t bigtable --data-file=f2.1 --snapshot-id=X
> pg_dump -t bigtable2 --data-file=f2.2 --snapshot-id=X
> pg_dump -T bigtable -T bigtable2 --data-file=f2.3 --snapshot-id=X

As a user I'd really prefer all of this to be much more transparent, and could
well imagine the -Fc format to be some kind of TOC + zip of table data + post
load instructions (organized per table), or something like this.
In fact just what you described, all embedded in a single file.

And I'd much prefer it if this (new?) format was trustworthy enough to be the
new default format of -Fc dumps. Then we could add some *simple* command line
parameter to control the threading behavior of the dump and reload process,
a la make -j. We could even support some option for the user to tell us which
disk arrays to use for parallel dumping.

pg_dump -j2 --dumpto=/mount/sda:/mount/sdb ... > mydb.dump
pg_restore -j4 ... mydb.dump

Then the trick would certainly be to use your work internally to feed a newer
dump format, which may or may not look exactly like the current one... and
the user would not have to mess around to get a coherent optimized dump.

The other comments on this thread about playing with the schema before and
after restoring seem to be related to pg_restore facilities, not to how
you dump at all. I know I'd like to have a single simple pg_dump tool,
then some flexible pg_restore options for using the dump. We already
have --schema-only and --data-only; what about having some more stuff here?

pg_restore --with-my-new-schema file.sql --no-index --no-fks [etc]

Hope this helps, regards,
--
dim


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 12:05:40
Message-ID: 1204027540.4252.251.camel@ebony.site
Lists: pgsql-hackers

On Tue, 2008-02-26 at 12:46 +0100, Dimitri Fontaine wrote:
> On Tuesday 26 February 2008, Simon Riggs wrote:
> > So that would mean we would run an unload like this
> >
> > pg_dump --pre-schema-file=f1 --save-snapshot -snapshot-id=X
> > pg_dump -t bigtable --data-file=f2.1 --snapshot-id=X
> > pg_dump -t bigtable2 --data-file=f2.2 --snapshot-id=X
> > pg_dump -T bigtable -T bigtable2 --data-file=f2.3 --snapshot-id=X
>
> As a user I'd really prefer all of this to be much more transparent, and could
> well imagine the -Fc format to be some kind of TOC + zip of table data + post
> load instructions (organized per table), or something like this.
> In fact just what you described, all embedded in a single file.

If it's in a single file then it won't perform as well as if it's in separate
files. We can put separate files on separate drives. We can begin
reloading one table while another is still unloading. The OS will
perform readahead for us on separate files, whereas with one file it will
look like random I/O, etc.

I'm not proposing we change things to use separate files in all cases.
Just when you want to use separate files, you can.

> And I'd much prefer it if this (new?) format was trustworthy enough to be the
> new default format of -Fc dumps. Then we could add some *simple* command line
> parameter to control the threading behavior of the dump and reload process,
> ala make -j. We could even support some option for the user to tell us which
> disk arrays to use for parallel dumping.
>
> pg_dump -j2 --dumpto=/mount/sda:/mount/sdb ... > mydb.dump
> pg_restore -j4 ... mydb.dump

I like the -j syntax.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: "Tom Dunstan" <pgsql(at)tomd(dot)cc>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: "Dimitri Fontaine" <dfontaine(at)hi-media(dot)com>, pgsql-hackers(at)postgresql(dot)org, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 12:49:12
Message-ID: ca33c0a30802260449r45c19725iea4b03c0f02c8b37@mail.gmail.com
Lists: pgsql-hackers

On Tue, Feb 26, 2008 at 5:35 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Tue, 2008-02-26 at 12:46 +0100, Dimitri Fontaine wrote:
> > As a user I'd really prefer all of this to be much more transparent, and could
> > well imagine the -Fc format to be some kind of TOC + zip of table data + post
> > load instructions (organized per table), or something like this.
> > In fact just what you described, all embedded in a single file.
>
> If its in a single file then it won't perform as well as if its separate
> files. We can put separate files on separate drives. We can begin
> reloading one table while another is still unloading. The OS will
> perform readahead for us on single files whereas on one file it will
> look like random I/O. etc.

Yeah, writing multiple unknown-length streams to a single file in
parallel is going to be all kinds of painful, and this use case seems
to be the biggest complaint against a zip file kind of approach. I
didn't know about the custom file format when I suggested the zip file
one yesterday*, but a zip or equivalent has the major benefit of
allowing the user to do manual inspection / tweaking of the dump
because the file format is one that can be manipulated by standard
tools. And zip wins over tar because it's indexed - if you want to
extract just the schema and hack on it you don't need to touch your
multi-GBs of data.

Perhaps a compromise: we specify a file system layout for table data
files, pre/post scripts and other metadata that we want to be made
available to pg_restore. By default, it gets dumped into a zip file /
whatever, but a user who wants to get parallel unloads can pass a flag
that tells pg_dump to stick it into a directory instead, with exactly
the same file layout. Or how about this: if the filename given to
pg_dump is a directory, spit out files in there, otherwise
create/overwrite a single file.

While it's a bit fiddly, putting data on separate drives would then
involve something like symlinking the tablename inside the dump dir
off to an appropriate mount point, but that's probably not much worse
than running n different pg_dump commands specifying different files.
Heck, if you've got lots of data and want very particular behavior,
you've got to specify it somehow. :)
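
Something like this, say (the directory output mode and the file names here
are still hypothetical):

mkdir /fastdisk1/mydb.dir /fastdisk2/mydb.dir mydb.dir
ln -s /fastdisk1/mydb.dir/bigtable.dat mydb.dir/bigtable.dat
ln -s /fastdisk2/mydb.dir/bigtable2.dat mydb.dir/bigtable2.dat
pg_dump --file=mydb.dir mydb    # sees a directory, writes per-table files into it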

Cheers

Tom

* The custom file format does not seem well advertised. None of the
examples on the pg_dump page use it, and I've never come across it in
my travels on the vast interwebs. Heck, I've even hacked on pg_dump
and I didn't know about it :). I won't suggest advertising it more
while this discussion is going on though, since it may be obsoleted by
whatever the final outcome is here.


From: "Zeugswetter Andreas ADI SD" <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Dimitri Fontaine" <dfontaine(at)hi-media(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 13:01:08
Message-ID: E1539E0ED7043848906A8FF995BDA57902D064A0@m0143.s-mxs.net
Lists: pgsql-hackers


> > As a user I'd really prefer all of this to be much more transparent, and could
> > well imagine the -Fc format to be some kind of TOC + zip of table data + post
> > load instructions (organized per table), or something like this.
> > In fact just what you described, all embedded in a single file.
>
> If its in a single file then it won't perform as well as if its separate
> files. We can put separate files on separate drives. We can begin
> reloading one table while another is still unloading. The OS will
> perform readahead for us on single files whereas on one file it will
> look like random I/O. etc.

Well, nowadays this is not generally true. You would most likely be
using stripesets, so you don't need more than one file for reading to be
using more than one spindle in parallel.
Also different threads or processes would be used, so readahead would be
done in parallel at different offsets in the file.

Andreas


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Dunstan <pgsql(at)tomd(dot)cc>
Cc: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 13:05:11
Message-ID: 1204031111.4252.255.camel@ebony.site
Lists: pgsql-hackers

On Tue, 2008-02-26 at 18:19 +0530, Tom Dunstan wrote:
> On Tue, Feb 26, 2008 at 5:35 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > On Tue, 2008-02-26 at 12:46 +0100, Dimitri Fontaine wrote:
> > > As a user I'd really prefer all of this to be much more transparent, and could
> > > well imagine the -Fc format to be some kind of TOC + zip of table data + post
> > > load instructions (organized per table), or something like this.
> > > In fact just what you described, all embedded in a single file.
> >
> > If its in a single file then it won't perform as well as if its separate
> > files. We can put separate files on separate drives. We can begin
> > reloading one table while another is still unloading. The OS will
> > perform readahead for us on single files whereas on one file it will
> > look like random I/O. etc.
>
> Yeah, writing multiple unknown-length streams to a single file in
> parallel is going to be all kinds of painful, and this use case seems
> to be the biggest complaint against a zip file kind of approach. I
> didn't know about the custom file format when I suggested the zip file
> one yesterday*, but a zip or equivalent has the major benefit of
> allowing the user to do manual inspection / tweaking of the dump
> because the file format is one that can be manipulated by standard
> tools. And zip wins over tar because it's indexed - if you want to
> extract just the schema and hack on it you don't need to touch your
> multi-GBs of data.
>
> Perhaps a compromise: we specify a file system layout for table data
> files, pre/post scripts and other metadata that we want to be made
> available to pg_restore. By default, it gets dumped into a zip file /
> whatever, but a user who wants to get parallel unloads can pass a flag
> that tells pg_dump to stick it into a directory instead, with exactly
> the same file layout. Or how about this: if the filename given to
> pg_dump is a directory, spit out files in there, otherwise
> create/overwrite a single file.
>
> While it's a bit fiddly, putting data on separate drives would then
> involve something like symlinking the tablename inside the dump dir
> off to an appropriate mount point, but that's probably not much worse
> than running n different pg_dump commands specifying different files.
> Heck, if you've got lots of data and want very particular behavior,
> you've got to specify it somehow. :)

Separate files seems much simpler...

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Dunstan <pgsql(at)tomd(dot)cc>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 13:28:11
Message-ID: 47C413EB.8080505@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs wrote:
> Separate files seems much simpler...
>
>

Yes, we need to stick to the KISS principle.

ISTM that we could simply invent a new archive format of "d" for directory.

BTW, parallel dumping might be important, but is really much less so
than parallel restoring in my book.

cheers

andrew


From: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To: "Tom Dunstan" <pgsql(at)tomd(dot)cc>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 13:38:06
Message-ID: 200802261438.08519.dfontaine@hi-media.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tuesday 26 February 2008, Tom Dunstan wrote:
> On Tue, Feb 26, 2008 at 5:35 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > On Tuesday 26 February 2008, Dimitri Fontaine wrote:
> >> We could even support some option for the user to tell us which
> >> disk arrays to use for parallel dumping.
> >>
> >> pg_dump -j2 --dumpto=/mount/sda:/mount/sdb ... > mydb.dump
> >> pg_restore -j4 ... mydb.dump
> >
> > If its in a single file then it won't perform as well as if its separate
> > files. We can put separate files on separate drives. We can begin
> > reloading one table while another is still unloading. The OS will
> > perform readahead for us on single files whereas on one file it will
> > look like random I/O. etc.
>
> Yeah, writing multiple unknown-length streams to a single file in
> parallel is going to be all kinds of painful
[...]
> While it's a bit fiddly, putting data on separate drives would then
> involve something like symlinking the tablename inside the dump dir
> off to an appropriate mount point, but that's probably not much worse
> than running n different pg_dump commands specifying different files.
> Heck, if you've got lots of data and want very particular behavior,
> you've got to specify it somehow. :)

What I meant with the --dumpto=/mount/sda:/mount/sdb idea was that pg_dump
would unload data to those dirs (filesystems/disk arrays/whatever) and then
prepare the final zip file from there.

We could even have the --dumpto option associate each entry with a process,
or define a special TOC syntax that allows for complex setups: pg_dump would
first dump a TOC that you edit, and the edited version would then control the
parallel unloading, which disks to use for which tables, etc.

Those are exactly your ideas, just with an attempt to make them look clear and
simple from the user's point of view, at the cost of some more work done by
the tools.
--
dim


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Dunstan <pgsql(at)tomd(dot)cc>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 14:12:26
Message-ID: 20080226141226.GQ528@svr2.hagander.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 26, 2008 at 08:28:11AM -0500, Andrew Dunstan wrote:
>
>
> Simon Riggs wrote:
> >Separate files seems much simpler...
> >
> >
>
> Yes, We need to stick to the KISS principle.
>
> ISTM that we could simply invent a new archive format of "d" for directory.

Yeah, you can always ZIP (or whatever) the resulting directory when you're
done..
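
For instance (the -Fd switch is hypothetical here, and the file names are
invented):

pg_dump -Fd -f mydb.dumpdir mydb       # one file per table, plus a TOC
tar czf mydb.dump.tar.gz mydb.dumpdir  # single file for the backup system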

But looking at it from a "backup tool perspective", say if you want to
integrate it into your network backup solution, that might make it harder.
Being able to deliver over a single, or over multiple, pipes is what's
needed there. If you need to dump it to disk first and can only "pick it
up" later, that'll require a lot more I/O and disk space.

But I'm not sure that's a concern we need to think about in this case,
just wanted to mention it.

> BTW, parallel dumping might be important, but is really much less so
> than parallel restoring in my book.

By far. The only case where you want the backup to max out your system
would be during an "offline upgrade"... You don't want a regular backup to
max things out, because it will slow other things down. Whereas if you're
doing a restore, you most likely want your data back up ASAP.

//Magnus


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Dunstan <pgsql(at)tomd(dot)cc>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 14:15:33
Message-ID: 1204035333.4252.285.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2008-02-26 at 08:28 -0500, Andrew Dunstan wrote:
>
> Simon Riggs wrote:
> > Separate files seems much simpler...
> >
> >
>
> Yes, We need to stick to the KISS principle.
>
> ISTM that we could simply invent a new archive format of "d" for directory.

...and then dump each table to a separate file?

If we dumped all the tables to separate files we would then be able to
have a COPY statement *without* inline data - we just give a relative
address to the data file (I know we can't do that currently). So, yes, I
like it.
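
Roughly like this (layout and names invented, purely to illustrate):

mydb.dump/
    toc.dat          # TOC and dependency info
    orders.dat       # raw COPY data, one file per table
    lineitem.dat

# load one table's data out of line, from the client side:
psql -d target -c "\copy orders from 'mydb.dump/orders.dat'"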

> BTW, parallel dumping might be important, but is really much less so
> than parallel restoring in my book.

Yeah, but unload time creates a long-running transaction that is a PITA,
so the unload time matters for two reasons.

Splitting up the dump is the enabler for splitting up the load.

If we had concurrent psql *and* out-of-line data then we would be able
to load many tables at once. But the single psql process would then be
the bottleneck for reading data, so we really need multiple client
processes, each with its own session to the database. Moving the
bottleneck isn't solving the problem longer term. Sure, psql could spawn
lots of additional processes, but that's a lot more work than just
splitting up the files.
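
Even a shell-level version of that gets most of the benefit (file names
invented):

# after loading the pre-load schema, load the big tables concurrently
psql -d target -f orders.sql   &
psql -d target -f lineitem.sql &
wait
psql -d target -f post-load.sql   # indexes, constraints, etc.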

My feeling is that trying to do everything automagically will end up as a
long project with some restrictions in the final result.

I would very much like to KISS and just split the files up at dump time
and then allow them to be executed concurrently on load. It's not really
a problem to put the various files in a directory manually.

Most use cases of this involve a few really big tables and lots of small
ones, and manually coding an unload/reload is not a big issue in
the scheme of things. It will be easy to use, performant, and reliable, and
we can work on other backend software in the time saved by doing this.

There's no real problem with allowing split-file unloads *and* later
having an automagical solution.

So I'm saying:

* let's go for the 3 split-file options (called whatever...)

* we might find a really cool automagical solution *later*. That would
probably look like this
- parallel dumping from spawned sessions on pg_dump
- dumps to multiple files in same directory
- parallel reload using pg_restore

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Dunstan <pgsql(at)tomd(dot)cc>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 14:27:07
Message-ID: 1204036027.4252.289.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2008-02-26 at 15:12 +0100, Magnus Hagander wrote:
> On Tue, Feb 26, 2008 at 08:28:11AM -0500, Andrew Dunstan wrote:
> >
> >
> > Simon Riggs wrote:
> > >Separate files seems much simpler...
> > >
> > >
> >
> > Yes, We need to stick to the KISS principle.
> >
> > ISTM that we could simply invent a new archive format of "d" for directory.
>
> Yeah, you can always ZIP (or whatever) the resulting directory when you're
> done..
>
> But looking at it from a "backup tool perspective", like if you want to
> integrate it in your network backup solution, that might make it harder.
> Being able to deliver over a single, or over multiple, pipes is what's
> needed there. If you need to dump it to disk first and can only "pick it
> up" later, that'll require a lot more I/O and disk space.
>
> But I'm not sure that's a concern we need to think about in this case,
> just wanted to mention it.

If I'm using a network backup solution I'd be using physical backup,
which can already be parallelised.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Dunstan <pgsql(at)tomd(dot)cc>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 14:27:46
Message-ID: 47C421E2.50209@cheapcomplexdevices.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Magnus Hagander wrote:
> On Tue, Feb 26, 2008 at 08:28:11AM -0500, Andrew Dunstan wrote:
>> Simon Riggs wrote:
>>> Separate files seems much simpler...
>> Yes, We need to stick to the KISS principle.
>> ISTM that we could simply invent a new archive format of "d" for directory.
> Yeah, you can always ZIP (or whatever) the resulting directory when you're
> done..

I could imagine this being very rsync friendly too.
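
e.g. (host and paths invented):

rsync -a mydb.dumpdir/ backuphost:/srv/dumps/mydb/

With rsync's delta transfer, table files whose contents haven't changed
between dumps should cost relatively little to re-send.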


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 15:03:57
Message-ID: 2284.1204038237@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Magnus Hagander <magnus(at)hagander(dot)net> writes:
> On Tue, Feb 26, 2008 at 12:39:29AM -0500, Tom Lane wrote:
>> BTW, what exactly was the use-case for this?

> One use-case would be when you have to make some small change to the schema
> while reloading it, that's still compatible with the data format. Then
> you'd dump schema-no-indexes-and-stuff, then *edit* that file, before
> reloading things. It's a lot easier to edit the file if it's not hundreds
> of gigabytes..

This is a use-case for having switches that *extract* convenient subsets
of a dump archive. It does not mandate having pg_dump emit multiple
files. You could extract, say, the pre-data schema into a text SQL
script, edit it, load it, then extract the data and remaining script
directly into the database from the dump file.
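
For example (the pre/post switch names below are only placeholders for
whatever we end up calling them):

pg_restore --schema-pre-data mydb.dump > pre.sql    # extract, hand-edit
psql -d target -f pre.sql                           # then load it
pg_restore --data-only -d target mydb.dump
pg_restore --schema-post-data -d target mydb.dump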

In short, what I think we need here is just some more conveniently
defined extraction filter switches than --schema-only and --data-only.
There's no need for any fundamental change to pg_dump's architecture.

Yes, I've read the subsequent discussion about a "directory" output
format. I think it's pointless complication --- or at least, that it's
a performance hack rather than a functionality one, with no chance of
any actual performance gain until we've parallelized pg_restore, and
with zero existing evidence that any gain would be had even then.

BTW, if we avoid fooling with the definition of the archive format,
that also means that the extraction-switch patch should be relatively
independent of parallelization work, so the work could proceed
concurrently.

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 15:15:39
Message-ID: 47C42D1B.90708@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Magnus Hagander <magnus(at)hagander(dot)net> writes:
>
>> On Tue, Feb 26, 2008 at 12:39:29AM -0500, Tom Lane wrote:
>>
>>> BTW, what exactly was the use-case for this?
>>>
>
>
>> One use-case would be when you have to make some small change to the schema
>> while reloading it, that's still compatible with the data format. Then
>> you'd dump schema-no-indexes-and-stuff, then *edit* that file, before
>> reloading things. It's a lot easier to edit the file if it's not hundreds
>> of gigabytes..
>>
>
> This is a use-case for having switches that *extract* convenient subsets
> of a dump archive. It does not mandate having pg_dump emit multiple
> files. You could extract, say, the pre-data schema into a text SQL
> script, edit it, load it, then extract the data and remainining script
> directly into the database from the dump file.
>
> In short, what I think we need here is just some more conveniently
> defined extraction filter switches than --schema-only and --data-only.
> There's no need for any fundamental change to pg_dump's architecture.
>
> Yes, I've read the subsequent discussion about a "directory" output
> format. I think it's pointless complication --- or at least, that it's
> a performance hack rather than a functionality one, with no chance of
> any actual performance gain until we've parallelized pg_restore, and
> with zero existing evidence that any gain would be had even then.
>
> BTW, if we avoid fooling with the definition of the archive format,
> that also means that the extraction-switch patch should be relatively
> independent of parallelization work, so the work could proceed
> concurrently.
>
>
>

I agree that they are really independent. There are enough reasons for
splitting the schema output into pre-data and post-data sections that we
should do that forthwith.

cheers

andrew


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 15:15:59
Message-ID: 1204038959.4252.321.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2008-02-26 at 10:03 -0500, Tom Lane wrote:
> Magnus Hagander <magnus(at)hagander(dot)net> writes:
> > On Tue, Feb 26, 2008 at 12:39:29AM -0500, Tom Lane wrote:
> >> BTW, what exactly was the use-case for this?
>
> > One use-case would be when you have to make some small change to the schema
> > while reloading it, that's still compatible with the data format. Then
> > you'd dump schema-no-indexes-and-stuff, then *edit* that file, before
> > reloading things. It's a lot easier to edit the file if it's not hundreds
> > of gigabytes..
>
> This is a use-case for having switches that *extract* convenient subsets
> of a dump archive. It does not mandate having pg_dump emit multiple
> files. You could extract, say, the pre-data schema into a text SQL
> script, edit it, load it, then extract the data and remainining script
> directly into the database from the dump file.
>
> In short, what I think we need here is just some more conveniently
> defined extraction filter switches than --schema-only and --data-only.
> There's no need for any fundamental change to pg_dump's architecture.

> Yes, I've read the subsequent discussion about a "directory" output
> format. I think it's pointless complication --- or at least, that it's
> a performance hack rather than a functionality one, with no chance of
> any actual performance gain until we've parallelized pg_restore, and
> with zero existing evidence that any gain would be had even then.
>
> BTW, if we avoid fooling with the definition of the archive format,
> that also means that the extraction-switch patch should be relatively
> independent of parallelization work, so the work could proceed
> concurrently.

So if I understand:

* we add switches to pg_dump to dump out separate files with --pre,
--post and --data (or other names) [TODO: Simon]

* we add switches to pg_restore to load/dump from the single archive
file the subsets of --pre, --post, --data [TODO: Magnus]

Everybody agree?

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: David BOURIAUD <david(dot)bouriaud(at)ac-rouen(dot)fr>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 15:23:18
Message-ID: 200802261623.21062.david.bouriaud@ac-rouen.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tuesday 26 February 2008, Tom Lane wrote:
>
> In short, what I think we need here is just some more conveniently
> defined extraction filter switches than --schema-only and --data-only.
> There's no need for any fundamental change to pg_dump's architecture.

Forgive me if what I say below is completely pointless, but I think at
this point that the premise of this discussion might be wrong. If the decision
made here is to keep pg_dump simple, then maybe it could be left as it
is, and another tool created just to extract some parts of a database, either
schema or data.
As far as I understand what is said here, pg_dump is meant to be a tool used
to make a backup of a database so it can be used somewhere else. So let it be
as it is.
What I intended to say in my first post is that it would be great to have a
tool that lets one get a partial dump of a database at a given time, so as
to modify it (or not) and to alter the database afterwards (or not).

I usually work on many databases at a time, and sometimes I have to quickly
fix a function, add a trigger to a table... Sometimes I create an SQL file and
save my work before passing the command set to psql, but sometimes I don't
have much time and type the code directly into psql.
So far so good, the code works, until a problem is found, and then I don't
have any source file to work on... unless I use pg_dump and search the huge
file for the code I want to modify.
I hope you see what I mean. Since the idea was to dump information about the
structure of a table, function, type or whatever object one could want from
the database, I asked for options for pg_dump, but maybe a new tool (based on
pg_dump?) could satisfy everyone?


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 15:28:17
Message-ID: 20080226152817.GT528@svr2.hagander.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 26, 2008 at 03:15:59PM +0000, Simon Riggs wrote:
> On Tue, 2008-02-26 at 10:03 -0500, Tom Lane wrote:
> > Magnus Hagander <magnus(at)hagander(dot)net> writes:
> > > On Tue, Feb 26, 2008 at 12:39:29AM -0500, Tom Lane wrote:
> > >> BTW, what exactly was the use-case for this?
> >
> > > One use-case would be when you have to make some small change to the schema
> > > while reloading it, that's still compatible with the data format. Then
> > > you'd dump schema-no-indexes-and-stuff, then *edit* that file, before
> > > reloading things. It's a lot easier to edit the file if it's not hundreds
> > > of gigabytes..
> >
> > This is a use-case for having switches that *extract* convenient subsets
> > of a dump archive. It does not mandate having pg_dump emit multiple
> > files. You could extract, say, the pre-data schema into a text SQL
> > script, edit it, load it, then extract the data and remainining script
> > directly into the database from the dump file.
> >
> > In short, what I think we need here is just some more conveniently
> > defined extraction filter switches than --schema-only and --data-only.
> > There's no need for any fundamental change to pg_dump's architecture.
>
> > Yes, I've read the subsequent discussion about a "directory" output
> > format. I think it's pointless complication --- or at least, that it's
> > a performance hack rather than a functionality one, with no chance of
> > any actual performance gain until we've parallelized pg_restore, and
> > with zero existing evidence that any gain would be had even then.
> >
> > BTW, if we avoid fooling with the definition of the archive format,
> > that also means that the extraction-switch patch should be relatively
> > independent of parallelization work, so the work could proceed
> > concurrently.
>
> So if I understand:
>
> * we add switches to pg_dump to dump out separate files with --pre,
> --post and --data (or other names) [TODO: Simon]
>
> * we add switches to pg_restore to load/dump from the single archive
> file the subsets of --pre, --post, --data [TODO: Magnus]

Not entirely sure how I ended up volunteering for that one ;-) And
frankly, if we're going to add the same switches with the same meaning to
both, it's probably more efficient if the same person does them both...

//Magnus


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 15:47:08
Message-ID: 1204040828.4252.346.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2008-02-26 at 16:28 +0100, Magnus Hagander wrote:
> On Tue, Feb 26, 2008 at 03:15:59PM +0000, Simon Riggs wrote:

> > So if I understand:
> >
> > * we add switches to pg_dump to dump out separate files with --pre,
> > --post and --data (or other names) [TODO: Simon]
> >
> > * we add switches to pg_restore to load/dump from the single archive
> > file the subsets of --pre, --post, --data [TODO: Magnus]
>
> Not entirely sure how I rended up volounteering for that one ;-) And
> frankly, if we're going to add the same switches with the same meaning to
> both, it's probably more efficient if the same person does them both...

OK, I'll do them both.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 15:48:55
Message-ID: 3002.1204040935@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> My thinking is to do either:
> * keep it as simple as possible to allow DBA to manually improve
> performance
> * express dependency information in the pg_dump output to allow some
> level of parallelism to use that information to advantage automatically

I'm astonished at how much pontificating is going on in this thread
from people who seem unaware of what pg_dump *already* does.

If you don't already know exactly what is in an -Fc dump, you are
unqualified to be discussing how to improve it.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 16:59:09
Message-ID: 4064.1204045149@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> So if I understand:

> * we add switches to pg_dump to dump out separate files with --pre,
> --post and --data (or other names) [TODO: Simon]

> * we add switches to pg_restore to load/dump from the single archive
> file the subsets of --pre, --post, --data [TODO: Magnus]

Oh dear. Apparently you do not understand that *these are the same
switches*. The way pg_dump is structured, there is only one
implementation to be done for both cases.

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 17:16:51
Message-ID: 1204046211.4252.363.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2008-02-26 at 10:48 -0500, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > My thinking is to do either:
> > * keep it as simple as possible to allow DBA to manually improve
> > performance
> > * express dependency information in the pg_dump output to allow some
> > level of parallelism to use that information to advantage automatically
>
> I'm astonished at how much pontificating is going on in this thread
> from people who seem unaware of what pg_dump *already* does.
>
> If you don't already know exactly what is in an -Fc dump, you are
> unqualified to be discussing how to improve it.

I've not been advocating improving pg_restore, which is where the -Fc
tricks come in.

pg_dump can write text files that can be input to psql. pg_dump already
has the dependency information *but* it doesn't write the dependency
info to a psql-able file. But it could, which is what I meant.

If we created a grammar for psql that allowed dependency information to
be expressed then pg_dump and pg_restore could generate scripts with
their dependency info expressed in the psql grammar.

I see you thought I meant pg_restore. I don't think extending
pg_restore in that way is of sufficiently generic use to make it
worthwhile. Extending psql would be worth it, since not all psql scripts
come from pg_dump.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 17:25:15
Message-ID: 1204046715.4252.371.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2008-02-26 at 11:59 -0500, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > So if I understand:
>
> > * we add switches to pg_dump to dump out separate files with --pre,
> > --post and --data (or other names) [TODO: Simon]
>
> > * we add switches to pg_restore to load/dump from the single archive
> > file the subsets of --pre, --post, --data [TODO: Magnus]
>
> Oh dear. Apparently you do not understand that *these are the same
> switches*. The way pg_dump is structured, there is only one
> implementation to be done for both cases.

Yes, agreed. I was trying so hard to slip away from doing it that I didn't
notice... that code is dog-ugly and it hurts my head, so I was looking
forward to moving on to something else :-)

I'm just testing the patch now.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Dunstan <pgsql(at)tomd(dot)cc>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 17:27:13
Message-ID: Pine.GSO.4.64.0802261209430.204@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 26 Feb 2008, Simon Riggs wrote:

> Splitting up the dump is the enabler for splitting up the load.

While the pg_dump split train seems to be leaving the station, I feel
compelled to point out that this focus does nothing to help people who are
bulk-loading data that came from somewhere else. If my data is already in
PostgreSQL, and I'm doing a dump/load, I can usually split the data easily
enough right now with existing tools via COPY (SELECT...)
TO. Some tools within pg_dump would be nice, but I don't need them that
much. It's the gigantic files that came from some other DB I don't even have
access to that I struggle with loading efficiently.

The work Dimitri is doing is wandering in that direction and that may be
enough. I note that something that addresses loading big files regardless
of source could also work on PostgreSQL dumps, while a pg_dump focused
effort helps nothing but that specific workflow. I wonder if doing too
much work on the pg_dump path is the best use of someone's time when the
more general case will need to be addressed one day anyway.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Dunstan <pgsql(at)tomd(dot)cc>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 18:02:17
Message-ID: 5327.1204048937@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith <gsmith(at)gregsmith(dot)com> writes:
> While the pg_dump split train seems to be leaving the station, I feel
> compelled to point out that focus does nothing to help people who are
> bulk-loading data that came from somewhere else.

What are you imagining here ... a plain SQL script containing
database-independent INSERT commands? That's going to suck compared
to COPY no matter what.

If you're imagining that it's at least pg_dump output that came from
someplace else, we can probably speed it up using the ideas being
kicked around here.

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Dunstan <pgsql(at)tomd(dot)cc>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 18:13:21
Message-ID: 1204049601.4252.399.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2008-02-26 at 12:27 -0500, Greg Smith wrote:
> On Tue, 26 Feb 2008, Simon Riggs wrote:
>
> > Splitting up the dump is the enabler for splitting up the load.
>
> While the pg_dump split train seems to be leaving the station, I feel
> compelled to point out that focus does nothing to help people who are
> bulk-loading data that came from somewhere else. If my data is already in
> PostgreSQL, and I'm doing a dump/load, I can usually split the data easily
> enough with existing tools to handle that right now via COPY (SELECT...)
> TO. Some tools within pg_dump would be nice, but I don't need them that
> much. It's gigantic files that came from some other DB I don't even have
> access to that I struggle with loading efficiently.

Don't write them as gigantic files... :-)

> The work Dimitri is doing is wandering in that direction and that may be
> enough. I note that something that addresses loading big files regardless
> of source could also work on PostgreSQL dumps, while a pg_dump focused
> effort helps nothing but that specific workflow. I wonder if doing too
> much work on the pg_dump path is the best use of someone's time when the
> more general case will need to be addressed one day anyway.

I take your points. I've spent as much time discussing this as coding
it, which is OK cos coding the wrong thing is also a waste of time.

If we have individual load files then we will be able to use a parallel
load utility on them, whether they come from pg_dump or otherwise.
Neither of the current pg_dump output formats allows that.
--data-only does allow it, but there's no way to dump a sensible schema
to go with it, which is what this thread is about.

Dimitri is working on a parallel loader that allows error handling. I
don't think it's worth duplicating that effort by attempting to make COPY
work in parallel. Even if we did, the single client could easily become
a bottleneck. So an external utility that spawns multiple processes and
runs multiple concurrent COPY statements sounds quite good to me.

He's asked for help, if anybody can assist.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 18:18:04
Message-ID: 5513.1204049884@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> I've not been advocating improving pg_restore, which is where the -Fc
> tricks come in.
> ...
> I see you thought I meant pg_restore. I don't thinking extending
> pg_restore in that way is of sufficiently generic use to make it
> worthwhile. Extending psql would be worth it, since not all psql scripts
> come from pg_dump.

OK, the reason I didn't grasp what you are proposing is that it's insane.

We can easily, and backwards-compatibly, improve pg_restore to do
concurrent restores. Trying to make psql do something like this will
require a complete rewrite, and there is no prospect that it will work
for any input that didn't come from (an updated version of) pg_dump
anyway. Furthermore you will have to write a whole bunch of new code
just to duplicate what pg_dump/pg_restore already do, ie store/retrieve
the TOC and dependency info in a program-readable fashion.

Since the performance advantages are still somewhat hypothetical,
I think we should reach for the low-hanging fruit first. If concurrent
pg_restore really does prove to be the best thing since sliced bread,
*then* would be the time to start thinking about whether it's possible
to do the same thing in less-constrained scenarios.

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 18:29:39
Message-ID: 47C45A93.8090603@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Since the performance advantages are still somewhat hypothetical,
> I think we should reach for the low-hanging fruit first. If concurrent
> pg_restore really does prove to be the best thing since sliced bread,
> *then* would be the time to start thinking about whether it's possible
> to do the same thing in less-constrained scenarios.
>
>
>

Amen. pg_restore is where all the work should be going first, I believe.

cheers

andrew


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 18:32:59
Message-ID: 1204050779.16886.24.camel@dogma.ljc.laika.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2008-02-26 at 17:16 +0000, Simon Riggs wrote:
> I've not been advocating improving pg_restore, which is where the -Fc
> tricks come in.

Oh, then it was a miscommunication on my part, because I was talking
about pg_restore, and I assumed you were as well.

Regards,
Jeff Davis


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 18:44:44
Message-ID: 20080226104444.4138b2c5@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tue, 26 Feb 2008 10:32:59 -0800
Jeff Davis <pgsql(at)j-davis(dot)com> wrote:

> On Tue, 2008-02-26 at 17:16 +0000, Simon Riggs wrote:
> > I've not been advocating improving pg_restore, which is where the
> > -Fc tricks come in.
>
> Oh, then it was a miscommunication on my part, because I was talking
> about pg_restore, and I assumed you were as well.
>

IMO the place to start is COPY, which is, per my tests, slow.
Multi-worker connection restore is great, and I have proven that with
some work it can provide OK results, but it is certainly not acceptable.

Let's not fix the middle and ignore the beginning.

Sincerely,

Joshua D. Drake

- --
The PostgreSQL Company since 1997: http://www.commandprompt.com/
PostgreSQL Community Conference: http://www.postgresqlconference.org/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL SPI Liaison | SPI Director | PostgreSQL political pundit

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHxF4fATb/zqfZUUQRAty+AJ0VmYoY1wZ+Sijb/eRiDL+LedKrHgCfZuIn
SQTgRqm14gVfubz3u3o54mc=
=uIYl
-----END PGP SIGNATURE-----


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 18:58:25
Message-ID: 6124.1204052305@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Joshua D. Drake" <jd(at)commandprompt(dot)com> writes:
> IMO the place to start is COPY which is per my tests, slow. Multi
> worker connection restore is great and I have proven that with some
> work it can provide o.k. results but it is certainly not acceptable.

It was already pointed out to you that we can hope for only incremental
speedups in COPY per se. Don't be too quick to dismiss the discussion
of large-grain parallelism, because I don't see anything else within
reach that might give integer multiples rather than percentage points.

regards, tom lane


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 19:08:31
Message-ID: 20080226110831.2957271c@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tue, 26 Feb 2008 13:58:25 -0500
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> "Joshua D. Drake" <jd(at)commandprompt(dot)com> writes:
> > IMO the place to start is COPY which is per my tests, slow. Multi
> > worker connection restore is great and I have proven that with some
> > work it can provide o.k. results but it is certainly not acceptable.
>
> It was already pointed out to you that we can hope for only
> incremental speedups in COPY per se. Don't be too quick to dismiss
> the discussion of large-grain parallelism, because I don't see
> anything else within reach that might give integer multiples rather
> than percentage points.

Oh please don't think I don't think this discussion is important, I
do. I would just hate to see us build only three corners of a foundation. I
also don't buy the "incremental speedups" argument. This may be arrogant of
me, but that type of thought process is what allows for mediocre results.

HOT works because EDB refused to accept the inherent limitations of
PostgreSQL. COPY is no different in that respect. Maybe it can't go
exponentially faster, but the math says, "if done correctly, it can". We
just haven't figured out how.

Sincerely,

Joshua D. Drake

>
> regards, tom lane
>

- --
The PostgreSQL Company since 1997: http://www.commandprompt.com/
PostgreSQL Community Conference: http://www.postgresqlconference.org/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL SPI Liaison | SPI Director | PostgreSQL political pundit

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHxGOxATb/zqfZUUQRAs3oAJ4vpHEiRDUXPm1wqHWyIigVevoTbwCeJXa0
8bZAkOntTC9i/AHC3/L5w+8=
=gsUl
-----END PGP SIGNATURE-----


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 19:17:24
Message-ID: Pine.GSO.4.64.0802261408480.20468@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 26 Feb 2008, Tom Lane wrote:

> What are you imagining here ... a plain SQL script containing
> database-independent INSERT commands? That's going to suck compared
> to COPY no matter what.

Think 100GB+ of data that's in a CSV or delimited file. Right now the
best import path is with COPY, but it won't execute very fast as a single
process. Splitting the file manually will take a long time (time that
could be spent loading instead) and substantially increase disk usage, so
the ideal approach would figure out how to load in parallel across all
available CPUs against that single file.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 19:18:36
Message-ID: 20080226191836.GL5763@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> "Joshua D. Drake" <jd(at)commandprompt(dot)com> writes:
> > IMO the place to start is COPY which is per my tests, slow. Multi
> > worker connection restore is great and I have proven that with some
> > work it can provide o.k. results but it is certainly not acceptable.
>
> It was already pointed out to you that we can hope for only incremental
> speedups in COPY per se. Don't be too quick to dismiss the discussion
> of large-grain parallelism, because I don't see anything else within
> reach that might give integer multiples rather than percentage points.

Well, one idea would be dividing the input file into similarly-sized parts
and giving each one to a different COPY process. This would help in
cases where you have a single very large table to restore.
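
A crude version of that can be scripted today, at the cost of the extra
disk space Greg mentioned (table and file names invented):

split -l 10000000 big_table.csv chunk_
for f in chunk_*; do
    psql -d mydb -c "\copy big_table from '$f' csv" &
done
wait

The interesting part would be doing the equivalent directly from offsets
within the single input file, without materializing the chunks.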

Another thing we could do is selective binary output/input for bytea
columns, to avoid the escaping step.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 19:34:16
Message-ID: 1204054456.4252.420.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2008-02-26 at 13:18 -0500, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > I've not been advocating improving pg_restore, which is where the -Fc
> > tricks come in.
> > ...
> > I see you thought I meant pg_restore. I don't thinking extending
> > pg_restore in that way is of sufficiently generic use to make it
> > worthwhile. Extending psql would be worth it, since not all psql scripts
> > come from pg_dump.
>
> OK, the reason I didn't grasp what you are proposing is that it's insane.

I love you too.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 19:47:55
Message-ID: 20080226114755.16e447f2@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tue, 26 Feb 2008 14:17:24 -0500 (EST)
Greg Smith <gsmith(at)gregsmith(dot)com> wrote:

> On Tue, 26 Feb 2008, Tom Lane wrote:
>
> > What are you imagining here ... a plain SQL script containing
> > database-independent INSERT commands? That's going to suck compared
> > to COPY no matter what.
>
> Think 100GB+ of data that's in a CSV or delimited file. Right now
> the best import path is with COPY, but it won't execute very fast as
> a single process. Splitting the file manually will take a long time
> (time that could be spend loading instead) and substantially increase
> disk usage, so the ideal approach would figure out how to load in
> parallel across all available CPUs against that single file.

You mean load from position? That would be very, very cool.

Joshua D. Drake

- --
The PostgreSQL Company since 1997: http://www.commandprompt.com/
PostgreSQL Community Conference: http://www.postgresqlconference.org/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL SPI Liaison | SPI Director | PostgreSQL political pundit

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHxGzrATb/zqfZUUQRAujfAJ9xUIbj/DwN1QoyzR8a6O7B7FfK/wCffl3T
Ruu+TzTVMoDkj0a5pHMePX0=
=jlVQ
-----END PGP SIGNATURE-----


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 19:50:35
Message-ID: 1204055435.4252.430.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2008-02-26 at 16:18 -0300, Alvaro Herrera wrote:
> Tom Lane wrote:
> > "Joshua D. Drake" <jd(at)commandprompt(dot)com> writes:
> > > IMO the place to start is COPY which is per my tests, slow. Multi
> > > worker connection restore is great and I have proven that with some
> > > work it can provide o.k. results but it is certainly not acceptable.
> >
> > It was already pointed out to you that we can hope for only incremental
> > speedups in COPY per se. Don't be too quick to dismiss the discussion
> > of large-grain parallelism, because I don't see anything else within
> > reach that might give integer multiples rather than percentage points.
>
> Well, one idea would be dividing the input file in similarly-sized parts
> and giving each one to a different COPY process. This would help in
> cases where you have a single very large table to restore.
>
> Another thing we could do is selective binary output/input for bytea
> columns, to avoid the escaping step.

This is exactly what Dimitri is working on.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 19:55:29
Message-ID: 1204055729.4252.437.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2008-02-26 at 13:18 -0500, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > I've not been advocating improving pg_restore, which is where the -Fc
> > tricks come in.
> > ...
> > I see you thought I meant pg_restore. I don't thinking extending
> > pg_restore in that way is of sufficiently generic use to make it
> > worthwhile. Extending psql would be worth it, since not all psql scripts
> > come from pg_dump.
>
> OK, the reason I didn't grasp what you are proposing is that it's insane.
>
> We can easily, and backwards-compatibly, improve pg_restore to do
> concurrent restores. Trying to make psql do something like this will
> require a complete rewrite, and there is no prospect that it will work
> for any input that didn't come from (an updated version of) pg_dump
> anyway. Furthermore you will have to write a whole bunch of new code
> just to duplicate what pg_dump/pg_restore already do, ie store/retrieve
> the TOC and dependency info in a program-readable fashion.
>
> Since the performance advantages are still somewhat hypothetical,
> I think we should reach for the low-hanging fruit first. If concurrent
> pg_restore really does prove to be the best thing since sliced bread,
> *then* would be the time to start thinking about whether it's possible
> to do the same thing in less-constrained scenarios.

Working late isn't very helpful if this type of comment is all it
delivers.

I've said that the low-hanging fruit is split files from pg_dump, hence
the title of this thread.

I've said I'm *not* planning to work on concurrent pg_restore or
concurrent psql, and I don't regard either of them as low-hanging fruit.
But concurrent pg_restore is a big waste of time, however low it hangs.
If you have a performance problem on load, putting all your data into a
single huge file is not the right starting point for solving your load
problem, and, as Greg points out, data files from other sources are never
in pg_dump format.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 20:03:47
Message-ID: 7088.1204056227@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Joshua D. Drake" <jd(at)commandprompt(dot)com> writes:
> HOT works because EDB refused to accept the inherit limitations of
> PostgreSQL. COPY is no different in that aspect. Maybe it can't go
> exponentially faster but the math says, "if done correctly, it can".

You can always make it faster if it doesn't have to give the right
answer ;-). Or in more practical terms in this case, we have to balance
speed against potentially-large costs in maintainability, datatype
extensibility, and suchlike issues if we are going to try to get more
than percentage points out of straight COPY.

regards, tom lane


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Jeff Davis" <pgsql(at)j-davis(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 20:14:49
Message-ID: 874pbvfpae.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
>> I've not been advocating improving pg_restore, which is where the -Fc
>> tricks come in.
>> ...
>> I see you thought I meant pg_restore. I don't thinking extending
>> pg_restore in that way is of sufficiently generic use to make it
>> worthwhile. Extending psql would be worth it, since not all psql scripts
>> come from pg_dump.
>
> OK, the reason I didn't grasp what you are proposing is that it's insane.
>
> We can easily, and backwards-compatibly, improve pg_restore to do
> concurrent restores. Trying to make psql do something like this will
> require a complete rewrite, and there is no prospect that it will work
> for any input that didn't come from (an updated version of) pg_dump
> anyway.

I didn't read everything in the thread previously, so I'm not sure if this is
what Simon had in mind. But I think one thing that could be done in parallel
even in psql scripts is index builds. That doesn't help speed up COPY, but it
does speed up a case where we are currently limited to occupying a single
CPU. And I would expect it to play well with synchronized scans too.

The "complete rewrite" in this case would be the "concurrent psql" patch I
submitted a while back. I think it's a bit of a mess right now because I was
trying to chase down some bugs with SIGINT handling, so I've been thinking of
rewriting it.

I think this is a low-hanging fruit which would help a lot of users. The
ability to load multiple COPY dumps would be the other piece of the puzzle but
personally I'm not sure how to tackle that.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Jeff Davis" <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 22:36:29
Message-ID: 9194.1204065389@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>> We can easily, and backwards-compatibly, improve pg_restore to do
>> concurrent restores. Trying to make psql do something like this will
>> require a complete rewrite, and there is no prospect that it will work
>> for any input that didn't come from (an updated version of) pg_dump
>> anyway.

> The "complete rewrite" in this case would be the "concurrent psql" patch I
> submitted a while back.

Uh, no, that's not even the tip of the iceberg.

The problem with trying to manhandle psql for this purpose is that psql
is purely a reactive engine: it does what you tell it, when you tell it
to do it, and it knows nothing worth mentioning about the semantics of
the specific SQL commands you're passing through it. The
concurrent-sessions feature is cool but it does not alter that
fundamental property of the program. To make psql do the sort of things
being spoken of here would be completely outside its realm; it would
need to decide *on its own* when to issue what, and it would have to
acquire a whole lot of knowledge it doesn't now have in order to make
those decisions. That would be bolting on a ton of code that is
unrelated to psql's normal purposes, and would likely even interfere with
using psql for its normal purposes. (Would you like psql to suddenly
start making its own decisions about whether to submit a command you've
given it?)

I think a sane way to think about what Simon would like to accomplish
is not "turn psql into a parallel job scheduler" but "teach pg_restore
how to do parallel scheduling, and then see if it can be made to do
anything useful with plain-text instead of archive-dump input".
At least that way you're talking about something that's within the scope
of the program's purpose.

regards, tom lane


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Jeff Davis" <pgsql(at)j-davis(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump additional options for performance
Date: 2008-02-27 02:03:45
Message-ID: 87ve4bduke.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> I think a sane way to think about what Simon would like to accomplish
> is not "turn psql into a parallel job scheduler" but "teach pg_restore
> how to do parallel scheduling, and then see if it can be made to do
> anything useful with plain-text instead of archive-dump input".
> At least that way you're talking about something that's within the scope
> of the program's purpose.

Er, yes, that's what I meant. My point, though, was just that for single
commands which take a long time to run on the server, there is no additional
work needed in the client to handle them. That is, as opposed to handling
multiple COPYs in parallel, where psql has to be involved in the actual data
transfer and so you need multiple client processes as well as server processes.
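
(For illustration, the effect being described can already be approximated by
driving two ordinary psql sessions from a shell; the table and index names
below are made up.)

    # Two long-running index builds occupy two backends; the client side
    # does nothing but wait for them to finish.
    psql -d mydb -c "CREATE INDEX big_a_idx ON big_table (a);" &
    psql -d mydb -c "CREATE INDEX big_b_idx ON big_table (b);" &
    wait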

I think I'm sorry I got involved here though since I specifically do not have
either the full context of this discussion or any background in
pg_dump/pg_restore code.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-27 08:09:27
Message-ID: 1204099767.4252.466.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2008-02-26 at 20:14 +0000, Gregory Stark wrote:
> "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>
> > Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> >> I've not been advocating improving pg_restore, which is where the -Fc
> >> tricks come in.
> >> ...
> >> I see you thought I meant pg_restore. I don't think extending
> >> pg_restore in that way is of sufficiently generic use to make it
> >> worthwhile. Extending psql would be worth it, since not all psql scripts
> >> come from pg_dump.
> >
> > OK, the reason I didn't grasp what you are proposing is that it's insane.
> >
> > We can easily, and backwards-compatibly, improve pg_restore to do
> > concurrent restores. Trying to make psql do something like this will
> > require a complete rewrite, and there is no prospect that it will work
> > for any input that didn't come from (an updated version of) pg_dump
> > anyway.
>
> I didn't read everything in the thread previously so I'm not sure if this is
> what Simon had in mind. But I think one thing that could be done in parallel
> even in psql scripts is index builds. That doesn't help speed up COPY but it
> does speed up a case where we currently are limited by only occupying a single
> cpu. And I would expect it to play well with synchronized scans too.
>
> The "complete rewrite" in this case would be the "concurrent psql" patch I
> submitted a while back. I think it's a bit of a mess right now because I was
> trying to chase down some bugs with sigint handling so I've been thinking of
> rewriting it.
>
> I think this is low-hanging fruit which would help a lot of users. The
> ability to load multiple COPY dumps would be the other piece of the puzzle but
> personally I'm not sure how to tackle that.

The current design for concurrent psql includes commands that say which
session a command should be run on. Switches between sessions are
explicit. That is good, but prevents us from easily saying "use N
sessions to make it go faster" because we already hardwired the commands
to the sessions.

If we were able to express dependency info then we would be able to alter the
amount of parallelism. That would require us to
* identify each command
* identify its dependents

possibly like this

<psql id="5" absolute-dependents="3,4">
Some SQL...
</psql>

the current default behaviour is this

<psql id="5" relative-dependents="-1"> ...

That's a leap ahead of concurrent psql.

I'd rather we had concurrent psql as it is now than attempt to leap
ahead too far, but the dependent definition approach seems likely to
yield benefits in the long run.
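
(A purely illustrative sketch of that annotation, reading "dependents" as the
commands a given command depends on, which is how the -1 default above
behaves; the table and index names are made up.)

    <psql id="1">
    CREATE TABLE orders (id int, placed date, total numeric);
    </psql>
    <psql id="2" absolute-dependents="1">
    COPY orders FROM '/data/orders.dat';
    </psql>
    <psql id="3" absolute-dependents="2">
    CREATE INDEX orders_placed_idx ON orders (placed);
    </psql>
    <psql id="4" absolute-dependents="2">
    CREATE INDEX orders_total_idx ON orders (total);
    </psql>

Commands 3 and 4 depend only on 2, so a scheduler given two or more sessions
could run both index builds concurrently; with a single session it degenerates
to the current serial behaviour.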

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: pg_dump additional options for performance
Date: 2008-02-27 09:52:47
Message-ID: 200802271052.50006.dfontaine@hi-media.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tuesday 26 February 2008, Tom Lane wrote:
> Or in more practical terms in this case, we have to balance
> speed against potentially-large costs in maintainability, datatype
> extensibility, and suchlike issues if we are going to try to get more
> than percentage points out of straight COPY.

Could COPY begin by checking the types involved in the table and use some
internal knowledge about in-core types to avoid extensibility costs, if any?
OK, that sounds like a maintainability cost :)

Or maybe just provide an option to pg_dump to force use of the binary COPY
format, which would then allow pg_restore to skip the data parsing altogether.
If that's not actually the case (i.e. binary COPY doesn't avoid the data
parsing), maybe it's time for another COPY format to be invented?
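
(For reference, and only as an illustration of the existing facility such an
option would build on: COPY already supports a binary format. The table and
file names below are made up; server-side COPY also requires superuser.)

    psql -d mydb -c "COPY orders TO '/tmp/orders.bin' WITH BINARY;"
    psql -d mydb -c "COPY orders FROM '/tmp/orders.bin' WITH BINARY;"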

On binary compatibility between architectures, I'm wondering whether running
pg_dump in binary format from the new architecture couldn't be a solution.
Of course, if you only have the binary archives, have lost server A, and need
to get the data onto server B, which doesn't share A's architecture, you're
not in a comfortable situation. But a pg_dump binary option would make it
clear that you don't want to use it for your regular backups...
And it wouldn't help when the data isn't coming from PostgreSQL. It could
still be a common enough use case to be worth bothering with?

Just trying to throw some ideas into the game, hoping this is more helpful
than not,
--
dim

They did not know it was impossible, so they did it! -- Mark Twain


From: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com>
Subject: Re: pg_dump additional options for performance
Date: 2008-02-27 10:19:28
Message-ID: 200802271119.28655.dfontaine@hi-media.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tuesday 26 February 2008, Joshua D. Drake wrote:
> > Think 100GB+ of data that's in a CSV or delimited file. Right now
> > the best import path is with COPY, but it won't execute very fast as
> > a single process. Splitting the file manually will take a long time
> > (time that could be spent loading instead) and substantially increase
> > disk usage, so the ideal approach would figure out how to load in
> > parallel across all available CPUs against that single file.
>
> You mean load from position? That would be very, very cool.

Did I mention pgloader now does exactly this when configured like this:
http://pgloader.projects.postgresql.org/dev/pgloader.1.html#_parallel_loading
section_threads = N
split_file_reading = True

IIRC, Simon and Greg Smith asked for pgloader to get those parallel loading
features in order to gather some initial results and ideas about the
performance gain, as a first step in designing a parallel COPY implementation
in the backend.

Hope this helps,
--
dim


From: Decibel! <decibel(at)decibel(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Jeff Davis" <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-29 21:42:12
Message-ID: FE5A1CC4-A618-4C37-AEAF-1D636AC3E7B3@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Feb 26, 2008, at 4:36 PM, Tom Lane wrote:
> I think a sane way to think about what Simon would like to accomplish
> is not "turn psql into a parallel job scheduler"

My $0.02: I often find myself wishing I could perform parallel
operations in psql. There was a proposal for that that came up during
8.3 development; whatever happened to it?
--
Decibel!, aka Jim C. Nasby, Database Architect decibel(at)decibel(dot)org
Give your computer some brain candy! www.distributed.net Team #1828


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: "Decibel!" <decibel(at)decibel(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gregory Stark <stark(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-03-04 01:31:54
Message-ID: 200803040131.m241VsO08255@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Decibel! wrote:
> On Feb 26, 2008, at 4:36 PM, Tom Lane wrote:
> > I think a sane way to think about what Simon would like to accomplish
> > is not "turn psql into a parallel job scheduler"
>
>
> My $0.02: I often find myself wishing I could perform parallel
> operations in psql. There was a proposal for that that came up during
> 8.3 development; whatever happened to it?

The concurrent psql patch was never updated to an acceptable state for
it to be reviewed.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-03-04 01:33:08
Message-ID: 200803040133.m241X8r08435@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > I've not been advocating improving pg_restore, which is where the -Fc
> > tricks come in.
> > ...
> > I see you thought I meant pg_restore. I don't think extending
> > pg_restore in that way is of sufficiently generic use to make it
> > worthwhile. Extending psql would be worth it, since not all psql scripts
> > come from pg_dump.
>
> OK, the reason I didn't grasp what you are proposing is that it's insane.
>
> We can easily, and backwards-compatibly, improve pg_restore to do
> concurrent restores. Trying to make psql do something like this will
> require a complete rewrite, and there is no prospect that it will work
> for any input that didn't come from (an updated version of) pg_dump
> anyway. Furthermore you will have to write a whole bunch of new code
> just to duplicate what pg_dump/pg_restore already do, ie store/retrieve
> the TOC and dependency info in a program-readable fashion.
>
> Since the performance advantages are still somewhat hypothetical,
> I think we should reach for the low-hanging fruit first. If concurrent
> pg_restore really does prove to be the best thing since sliced bread,
> *then* would be the time to start thinking about whether it's possible
> to do the same thing in less-constrained scenarios.

Added to TODO based on this discussion:

o Allow pg_dump to utilize multiple CPUs and I/O channels by dumping
multiple objects simultaneously

The difficulty with this is getting multiple dump processes to
produce a single dump output file.
http://archives.postgresql.org/pgsql-hackers/2008-02/msg00205.php

o Allow pg_restore to utilize multiple CPUs and I/O channels by
restoring multiple objects simultaneously

This might require a pg_restore flag to indicate how many
simultaneous operations should be performed. Only pg_dump's
-Fc format has the necessary dependency information.

o To better utilize resources, restore data, primary keys, and
indexes for a single table before restoring the next table

Hopefully this will allow the CPU-I/O load to be more uniform
for simultaneous restores. The idea is to start data restores
for several objects, and once the first object is done, to move
on to its primary keys and indexes. Over time, simultaneous
data loads and index builds will be running.

o To better utilize resources, allow pg_restore to check foreign
keys simultaneously, where possible
o Allow pg_restore to create all indexes of a table
concurrently, via a single heap scan

This requires a pg_dump -Fc file because that format contains
the required dependency information.
http://archives.postgresql.org/pgsql-general/2007-05/msg01274.php

o Allow pg_restore to load different parts of the COPY data
simultaneously

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-03-04 01:35:17
Message-ID: 200803040135.m241ZHo08949@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > So if I understand:
>
> > * we add switches to pg_dump to dump out separate files with --pre,
> > --post and --data (or other names) [TODO: Simon]
>
> > * we add switches to pg_restore to load/dump from the single archive
> > file the subsets of --pre, --post, --data [TODO: Magnus]
>
> Oh dear. Apparently you do not understand that *these are the same
> switches*. The way pg_dump is structured, there is only one
> implementation to be done for both cases.

Uh, exactly what switches need to be added? Just the pg_restore switch
to tell pg_restore how many concurrent sessions to run?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-03-04 01:56:27
Message-ID: 4106.1204595787@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> Added to TODO based on this discussion:

> o To better utilize resources, restore data, primary keys, and
> indexes for a single table before restoring the next table

That idea seems quite misguided, and certainly was not anywhere in the
prior discussion. If you add an ordering constraint as strong as that,
you'll mostly just eliminate chances for parallelism. Also, the
following para describes something completely different:

> Hopefully this will allow the CPU-I/O load to be more uniform
> for simultaneous restores. The idea is to start data restores
> for several objects, and once the first object is done, to move
> on to its primary keys and indexes. Over time, simultaneous
> data loads and index builds will be running.

regards, tom lane


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-03-04 02:48:27
Message-ID: 200803040248.m242mRv20583@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > Added to TODO based on this discussion:
>
> > o To better utilize resources, restore data, primary keys, and
> > indexes for a single table before restoring the next table
>
> That idea seems quite misguided, and certainly was not anywhere in the
> prior discussion. If you add an ordering constraint as strong as that,
> you'll mostly just eliminate chances for parallelism.

The idea comes from this email, which no one objected to:

http://search.postgresql.org/search?m=1&q=pg_dump+create+copy&l=&d=&s=

You could avoid the ~8% spent in XLogInsert in PostgreSQL 8.3 by
creating the table (or truncating it) in the same transaction as the
COPY.

In the same transaction? Oh that's interesting. So that might be a TODO
right there. Change pg_dump so it does:

create,copy,create,copy,index

Instead of:

create,create,copy,copy,index
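
(A minimal sketch of that first ordering, with a made-up table; in 8.3 the
COPY can skip WAL because the table is created in the same transaction,
provided WAL archiving is off.)

    psql -d mydb <<'EOF'
    BEGIN;
    CREATE TABLE orders (id int, placed date, total numeric);
    -- With archiving off, this COPY is not WAL-logged because the table
    -- was created in the same transaction.
    COPY orders FROM '/data/orders.dat';
    COMMIT;
    CREATE INDEX orders_placed_idx ON orders (placed);
    EOF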

> Also, the
> following para describes something completely different:
>
> > Hopefully this will allow the CPU-I/O load to be more uniform
> > for simultaneous restores. The idea is to start data restores
> > for several objects, and once the first object is done, to move
> > on to its primary keys and indexes. Over time, simultaneous
> > data loads and index builds will be running.

OK, let me remove the entire item and we can always re-add it if someone
wants to make a case for it.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-03-04 05:43:02
Message-ID: 20080303214302.1272d4a2@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, 3 Mar 2008 20:33:08 -0500 (EST)
Bruce Momjian <bruce(at)momjian(dot)us> wrote:

> Added to TODO based on this discussion:
>
> o Allow pg_dump to utilize multiple CPUs and I/O channels by
> dumping multiple objects simultaneously
>
> The difficulty with this is getting multiple dump processes
> to produce a single dump output file.
> http://archives.postgresql.org/pgsql-hackers/2008-02/msg00205.php

Isn't part of this problem also a consistent snapshot?

Joshua D. Drake

--
The PostgreSQL Company since 1997: http://www.commandprompt.com/
PostgreSQL Community Conference: http://www.postgresqlconference.org/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL SPI Liaison | SPI Director | PostgreSQL political pundit



From: Bruce Momjian <bruce(at)momjian(dot)us>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-03-04 13:53:34
Message-ID: 200803041353.m24DrYM12083@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Joshua D. Drake wrote:
> On Mon, 3 Mar 2008 20:33:08 -0500 (EST)
> Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>
> > Added to TODO based on this discussion:
> >
> > o Allow pg_dump to utilize multiple CPUs and I/O channels by
> > dumping multiple objects simultaneously
> >
> > The difficulty with this is getting multiple dump processes
> > to produce a single dump output file.
> > http://archives.postgresql.org/pgsql-hackers/2008-02/msg00205.php
>
> Isn't part of this problem also a consistent snapshot?

Yes, that might also be an issue, though I thought we had some ideas
about snapshot sharing.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Bruce Momjian" <bruce(at)momjian(dot)us>
Cc: "Decibel!" <decibel(at)decibel(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Jeff Davis" <pgsql(at)j-davis(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump additional options for performance
Date: 2008-03-10 01:40:22
Message-ID: 871w6jgxvd.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Bruce Momjian" <bruce(at)momjian(dot)us> writes:

> Decibel! wrote:
>> On Feb 26, 2008, at 4:36 PM, Tom Lane wrote:
>> > I think a sane way to think about what Simon would like to accomplish
>> > is not "turn psql into a parallel job scheduler"
>>
>>
>> My $0.02: I often find myself wishing I could perform parallel
>> operations in psql. There was a proposal for that that came up during
>> 8.3 development; whatever happened to it?
>
> The concurrent psql patch was never updated to an acceptable state for
> it to be reviewed.

Well, I got tied in knots trying to get the SIGINT handling working
properly. I want to rewrite it from scratch to try to produce a cleaner
version.

But that doesn't mean nobody should look at the patch and give comments. It
would be silly for me to rewrite it keeping things basically the way they are
now, just cleaned up -- only to then find out that people disagree with the
basic approach and don't accept it anyway. This may be a trivial example, but
some pretty major patches which required a lot of work have been submitted
before where nobody read any of the WIP patches, and basic questions were
then raised long after a lot of work had been put in.

We seem to often think of "review" as equivalent to "commit". But patch
authors often want to get some confirmation that they're on the right track
before they move on to the next step. As a result many patches kind of get
stuck in a catch-22 where they're not ready for review and not ready for
further development either.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-03-23 00:32:20
Message-ID: 200803230032.m2N0WKh19121@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Added to TODO:

o Allow pre/data/post files when dumping a single object, for
performance reasons

http://archives.postgresql.org/pgsql-hackers/2008-02/msg00205.php

---------------------------------------------------------------------------

Simon Riggs wrote:
> pg_dump allows you to specify -s --schema-only, or -a --data-only.
>
> The -s option creates the table, as well as creating constraints and
> indexes. These objects need to be dropped prior to loading, if we are to
> follow the performance recommendations in the docs. But the only way to
> do that is to manually edit the script to produce a cut down script.
>
> So it would be good if we could dump objects in 3 groups
> 1. all commands required to re-create table
> 2. data
> 3. all commands required to complete table after data load
>
> My proposal is to provide two additional modes:
> --schema-pre-load corresponding to (1) above
> --schema-post-load corresponding to (3) above
>
> This would then allow this sequence of commands
>
> pg_dump --schema-pre-load
> pg_dump --data-only
> pg_dump --schema-post-load
>
> to be logically equivalent, but faster than
>
> pg_dump --schema-only
> pg_dump --data-only
>
> both forms of which are equivalent to just
>
> pg_dump
>
>
> [Assuming data isn't changing between invocations...]
>
> --
> Simon Riggs
> 2ndQuadrant http://www.2ndQuadrant.com
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-03-23 15:45:15
Message-ID: 20080323154515.GB4554@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian wrote:
>
> Added to TODO:
>
> o Allow pre/data/post files when dumping a single object, for
> performance reasons
>
> http://archives.postgresql.org/pgsql-hackers/2008-02/msg00205.php

"When dumping a single object"?? Do you mean database?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-03-24 14:17:11
Message-ID: 200803241417.m2OEHBE17670@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera wrote:
> Bruce Momjian wrote:
> >
> > Added to TODO:
> >
> > o Allow pre/data/post files when dumping a single object, for
> > performance reasons
> >
> > http://archives.postgresql.org/pgsql-hackers/2008-02/msg00205.php
>
> "When dumping a single object"?? Do you mean database?

I have updated the wording:

o Allow pre/data/post files when schema and data are dumped
separately, for performance reasons

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +