Re: pg_dump additional options for performance

From: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 11:46:13
Message-ID: 200802261246.16194.dfontaine@hi-media.com
Lists: pgsql-hackers

On Tuesday 26 February 2008, Simon Riggs wrote:
> So that would mean we would run an unload like this
>
> pg_dump --pre-schema-file=f1 --save-snapshot --snapshot-id=X
> pg_dump -t bigtable --data-file=f2.1 --snapshot-id=X
> pg_dump -t bigtable2 --data-file=f2.2 --snapshot-id=X
> pg_dump -T bigtable -T bigtable2 --data-file=f2.3 --snapshot-id=X

As a user I'd really prefer all of this to be much more transparent, and could
well imagine the -Fc format being some kind of TOC + zip of table data +
post-load instructions (organized per table), or something like this.
In fact, just what you described, all embedded in a single file.
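
To make the single-file idea a bit more concrete, here is a rough Python
sketch of such a container: one zip archive holding a TOC plus, per table,
its data and its post-load instructions. This is only an illustration and not
the actual -Fc layout; the member names (toc.json, data/<table>.copy) and the
shape of the TOC entries are assumptions made up for the example.

```python
import io
import json
import zipfile


def write_archive(fileobj, tables):
    """Write a hypothetical single-file dump: TOC + per-table data.

    tables maps a table name to a dict with keys:
      "pre"  : schema SQL to run before loading,
      "rows" : list of data lines,
      "post" : list of SQL statements to run after loading (indexes, FKs).
    """
    toc = []
    with zipfile.ZipFile(fileobj, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, table in tables.items():
            member = f"data/{name}.copy"  # hypothetical member naming
            zf.writestr(member, "\n".join(table["rows"]))
            toc.append({
                "table": name,
                "pre": table["pre"],
                "data": member,
                "post": table["post"],
            })
        # The TOC is written last, once every data member is known.
        zf.writestr("toc.json", json.dumps(toc, indent=2))


def read_toc(fileobj):
    """Return the TOC without touching any table data."""
    with zipfile.ZipFile(fileobj) as zf:
        return json.loads(zf.read("toc.json"))


def read_table_data(fileobj, entry):
    """Return the data lines for one TOC entry."""
    with zipfile.ZipFile(fileobj) as zf:
        text = zf.read(entry["data"]).decode("utf-8")
    return text.split("\n") if text else []


# Usage: build an archive in memory and list what it contains.
archive = io.BytesIO()
write_archive(archive, {
    "bigtable": {
        "pre": "CREATE TABLE bigtable (id int);",
        "rows": ["1", "2", "3"],
        "post": ["CREATE INDEX bigtable_id_idx ON bigtable (id);"],
    },
})
toc = read_toc(archive)
```

Because each table's data sits in its own member, a reader can pick one table
out of the file without scanning the others, which is what per-table parallel
restore would need.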

And I'd much prefer it if this (new?) format were trustworthy enough to be the
new default format of -Fc dumps. Then we could add some *simple* command line
parameter to control the threading behavior of the dump and reload process,
à la make -j. We could even support some option for the user to tell us which
disk arrays to use for parallel dumping.

pg_dump -j2 --dumpto=/mount/sda:/mount/sdb ... > mydb.dump
pg_restore -j4 ... mydb.dump
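
As a sketch of what the tool could do internally with those two parameters,
the Python below spreads tables round-robin over the given directories and
runs the per-table dumps on a pool of N workers. Everything here is
hypothetical: plan_jobs, run_jobs and the dump_one callback are names invented
for the example, and the sketch leaves out the shared snapshot that would be
needed to keep the tables consistent with each other.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle


def plan_jobs(tables, dumpto):
    """Assign each table an output path, cycling over the target directories.

    dumpto is a colon-separated list such as "/mount/sda:/mount/sdb".
    """
    targets = cycle(dumpto.split(":"))
    return [(table, f"{next(targets)}/{table}.data") for table in tables]


def run_jobs(jobs, workers, dump_one):
    """Run dump_one(table, path) for every job on a pool of `workers` threads.

    Results come back in the same order as `jobs`.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda job: dump_one(*job), jobs))


# Usage with a stand-in for the real per-table dump.
def fake_dump(table, path):
    return f"dumped {table} to {path}"


jobs = plan_jobs(["bigtable", "bigtable2", "others"], "/mount/sda:/mount/sdb")
results = run_jobs(jobs, workers=2, dump_one=fake_dump)
```

The user only picks the degree of parallelism and the target directories; how
tables get split across workers stays an internal decision.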

Then the trick would certainly be to use your work internally to feed a newer
dump format, which may or may not look exactly like the current one... and
the user would not have to mess around to get a coherent, optimized dump.

The other comments on this thread about playing with the schema before and
after restoring seem to be about pg_restore facilities, not about how you
want to dump. I know I'd like to have a single simple pg_dump tool, then some
flexible pg_restore options for using the dump. We already
have --schema-only and --data-only, what about having some more stuff here?

pg_restore --with-my-new-schema file.sql --no-index --no-fks [etc]
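
The point is that such options only need to filter what is already in the
dump. A minimal Python sketch, assuming the dump exposes a list of entries
each tagged with a kind (the "kind" field, its values and select_entries are
assumptions for illustration, not an existing pg_restore interface):

```python
def select_entries(entries, no_index=False, no_fks=False):
    """Return the dump entries to restore, dropping the kinds opted out of."""
    skipped = set()
    if no_index:
        skipped.add("index")
    if no_fks:
        skipped.add("fk")
    return [entry for entry in entries if entry["kind"] not in skipped]


# Usage: a tiny hypothetical list of dump entries for one table.
entries = [
    {"kind": "table", "name": "bigtable"},
    {"kind": "data", "name": "bigtable"},
    {"kind": "index", "name": "bigtable_id_idx"},
    {"kind": "fk", "name": "bigtable_owner_fkey"},
]
to_restore = select_entries(entries, no_index=True, no_fks=True)
```

The dump itself stays complete; each restore decides what to leave out.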

Hope this helps, regards,
--
dim
